Debugging

Debugging is the cornerstone of being a programmer. The first meaning of the verb "debug" is to remove errors, but the meaning that really matters is to see into the execution of a program by examining it. A programmer that cannot debug effectively is blind.

Idealists, those who think design, analysis, complexity theory, and the like are more fundamental than debugging, are not working programmers. The working programmer does not live in an ideal world. Even if you are perfect, you are surrounded by and must interact with code written by major software companies, organizations like GNU, and your colleagues. Most of this code is imperfect and imperfectly documented. Without the ability to gain visibility into the execution of this code, the slightest bump will throw you permanently. Often this visibility can be gained only by experimentation: that is, debugging.

Debugging is about the running of programs, not programs themselves. If you buy something from a major software company, you usually don't get to see the program. But there will still arise places where the code does not conform to the documentation (crashing your entire machine is a common and spectacular example), or where the documentation is mute. More commonly, you create an error, examine the code you wrote, and have no clue how the error can be occurring. Inevitably, this means some assumption you are making is not quite correct or some condition arises that you did not anticipate. Sometimes, the magic trick of staring into the source code works. When it doesn't, you must debug.

To get visibility into the execution of a program you must be able to execute the code and observe something about it. Sometimes this is visible, like what is being displayed on a screen, or the delay between two events. In many other cases, it involves things that are not meant to be visible, like the state of some variables inside the code, which lines of code are actually being executed, or whether certain assertions hold across a complicated data structure. These hidden things must be revealed.

The common ways of looking into the ā€˜innardsā€™ of an executing program can be categorized as:

  • Using a debugging tool,

  • Printlining - Making a temporary modification to the program, typically adding lines that print information out, and

  • Logging - Creating a permanent window into the programs execution in the form of a log.

Debugging tools are wonderful when they are stable and available, but printlining and logging are even more important. Debugging tools often lag behind language development, so at any point in time they may not be available. In addition, because the debugging tool may subtly change the way the program executes it may not always be practical. Finally, there are some kinds of debugging, such as checking an assertion against a large data structure, that require writing code and changing the execution of the program. It is good to know how to use debugging tools when they are stable, but it is critical to be able to employ the other two methods.

Some beginners fear debugging when it requires modifying code. This is understandable - it is a little like exploratory surgery. But you have to learn to poke at the code and make it jump; you have to learn to experiment on it and understand that nothing that you temporarily do to it will make it worse. If you feel this fear, seek out a mentor - we lose a lot of good programmers at the delicate onset of their learning to this fear.

How to Debug by Splitting the Problem Space

Debugging is fun, because it begins with a mystery. You think it should do something, but instead it does something else. It is not always quite so simple---any examples I can give will be contrived compared to what sometimes happens in practice. Debugging requires creativity and ingenuity. If there is a single key to debugging it is to use the divide and conquer technique on the mystery.

Suppose, for example, you created a program that should do ten things in a sequence. When you run it, it crashes. Since you didn't program it to crash, you now have a mystery. When you look at the output, you see that the first seven things in the sequence were run successfully. The last three are not visible from the output, so now your mystery is smaller: ā€˜It crashed on thing #8, #9, or #10.ā€™

Can you design an experiment to see which thing it crashed on? Sure. You can use a debugger or we can add printline statements (or the equivalent in whatever language you are working in) after #8 and #9. When we run it again, our mystery will be smaller, such as ā€˜It crashed on thing #9.ā€™ I find that bearing in mind exactly what the mystery is at any point in time helps keep one focused. When several people are working together under pressure on a problem it is easy to forget what the most important mystery is.

The key to divide and conquer as a debugging technique is the same as it is for algorithm design: as long as you do a good job splitting the mystery in the middle, you won't have to split it too many times, and you will be debugging quickly. But what is the middle of a mystery? This is where true creativity and experience comes in.

To a true beginner, the space of all possible errors looks like every line in the source code. You don't have the vision you will later develop to see the other dimensions of the program, such as the space of executed lines, the data structure, the memory management, the interaction with foreign code, the code that is risky, and the code that is simple. For the experienced programmer, these other dimensions form an imperfect but very useful mental model of all the things that can go wrong. Having that mental model is what helps one find the middle of the mystery effectively.

Once you have evenly subdivided the space of all that can go wrong, you must try to decide in which space the error lies. In the simple case where the mystery is: ā€˜Which single unknown line makes my program crash?ā€™, you can ask yourself: ā€˜Is the unknown line executed before or after this line that I judge to be executed in the middle of the running program?ā€™ Usually you will not be so lucky as to know that the error exists in a single line, or even a single block. Often the mystery will be more like: ā€˜Either there is a pointer in that graph that points to the wrong node, or my algorithm that adds up the variables in that graph doesn't work.ā€™ In that case you may have to write a small program to check that the pointers in the graph are all correct in order to decide which part of the subdivided mystery can be eliminated.

How to Remove an Error

I've intentionally separated the act of examining a program's execution from the act of fixing an error. But of course, debugging does also mean removing the bug. Ideally you will have perfect understanding of the code and will reach an ā€˜A-Ha!ā€™ moment where you perfectly see the error and how to fix it. But since your program will often use insufficiently documented systems into which you have no visibility, this is not always possible. In other cases the code is so complicated that your understanding cannot be perfect.

In fixing a bug, you want to make the smallest change that fixes the bug. You may see other things that need improvement; but don't fix those at the same time. Attempt to employ the scientific method of changing one thing and only one thing at a time. The best process for this is to be able to easily reproduce the bug, then put your fix in place, and then rerun the program and observe that the bug no longer exists. Of course, sometimes more than one line must be changed, but you should still conceptually apply a single atomic change to fix the bug.

Sometimes, there are really several bugs that look like one. It is up to you to define the bugs and fix them one at a time. Sometimes it is unclear what the program should do or what the original author intended. In this case, you must exercise your experience and judgment and assign your own meaning to the code. Decide what it should do, and comment it or clarify it in some way and then make the code conform to your meaning. This is an intermediate or advanced skill that is sometimes harder than writing the original function in the first place, but the real world is often messy. You may have to fix a system you cannot rewrite.

How to Debug Using a Log

Logging is the practice of writing a system so that it produces a sequence of informative records, called a log. Printlining is just producing a simple, usually temporary, log. Absolute beginners must understand and use logs because their knowledge of programming is limited; system architects must understand and use logs because of the complexity of the system. The amount of information that is provided by the log should be configurable, ideally while the program is running. In general, logs offer three basic advantages:

  • Logs can provide useful information about bugs that are hard to reproduce (such as those that occur in the production environment but that cannot be reproduced in the test environment).

  • Logs can provide statistics and data relevant to performance, such as the time passing between statements.

  • When configurable, logs allow general information to be captured in order to debug unanticipated specific problems without having to modify and/or redeploy the code just to deal with those specific problems.

The amount to output into the log is always a compromise between information and brevity. Too much information makes the log expensive and produces scroll blindness, making it hard to find the information you need. Too little information and it may not contain what you need. For this reason, making what is output configurable is very useful. Typically, each record in the log will identify its position in the source code, the thread that executed it if applicable, the precise time of execution, and, commonly, an additional useful piece of information, such as the value of some variable, the amount of free memory, the number of data objects, etc. These log statements are sprinkled throughout the source code, particularly at major functionality points and around risky code. Each statement can be assigned a level and will only output a record if the system is currently configured to output that level. You should design the log statements to address problems that you anticipate. Anticipate the need to measure performance.

If you have a permanent log, printlining can now be done in terms of the log records, and some of the debugging statements will probably be permanently added to the logging system.

How to Deal with Intermittent Bugs

The intermittent bug is a cousin of the 50-foot-invisible-scorpion-from-outer-space kind of bug. This nightmare occurs so rarely that it is hard to observe, yet often enough that it can't be ignored. You can't debug because you can't find it.

Although after 8 hours you will start to doubt it, the intermittent bug has to obey the same laws of logic everything else does. What makes it hard is that it occurs only under unknown conditions. Try to record the circumstances under which the bug does occur, so that you can guess what the variability really is. The condition may be related to data values, such as ā€˜This only happens when we enter Wyoming as a value.ā€™ If that is not the source of variability, the next suspect should be improperly synchronized concurrency.

Try, try, try to reproduce the bug in a controlled way. If you can't reproduce it, set a trap for it by building a logging system, a special one if you have to, that can log what you guess you need when it really does occur. Resign yourself to that if the bug only occurs in production and not at your whim, this may be a long process. The hints that you get from the log may not provide the solution but may give you enough information to improve the logging. The improved logging system may take a long time to be put into production. Then, you have to wait for the bug to reoccur to get more information. This cycle can go on for some time.

The stupidest intermittent bug I ever created was in a multi-threaded implementation of a functional programming language for a class project. I had very carefully ensured correct concurrent evaluation of the functional program, good utilization of all the CPUs available (eight, in this case). I simply forgot to synchronize the garbage collector. The system could run a long time, often finishing whatever task I began, before anything noticeable went wrong. I'm ashamed to admit I had begun to question the hardware before my mistake dawned on me.

At work we recently had an intermittent bug that took us several weeks to find. We have multi-threaded application servers in Javaā„¢ behind Apacheā„¢ web servers. To maintain fast page turns, we do all I/O in small set of four separate threads that are different than the page-turning threads. Every once in a while these would apparently get ā€˜stuckā€™ and cease doing anything useful, so far as our logging allowed us to tell, for hours. Since we had four threads, this was not in itself a giant problem - unless all four got stuck. Then the queues emptied by these threads would quickly fill up all available memory and crash our server. It took us about a week to figure this much out, and we still didn't know what caused it, when it would happen, or even what the threads where doing when they got ā€˜stuckā€™.

This illustrates some risk associated with third-party software. We were using a licensed piece of code that removed HTML tags from text. Although we had the source code (thank goodness!) we had not studied it carefully until by turning up the logging on our servers we finally realized that the email threads were getting stuck in this problematic licensed code.

The program performed well except on some long and unusual kinds of texts. On these texts, the code was quadratic or worse. This means that the processing time was proportional to the square of the length of the text. Had these texts occurred commonly, we would have found the bug right away. If they had never occurred at all, we would never have had a problem. As it happens, it took us weeks to finally understand and resolve the problem.

How to Conduct Experiments

The late, great Edsger Dijkstra has eloquently explained that Computer Science is not an experimental science[ExpCS] and doesn't depend on electronic computers. As he puts it referring to the 1960s [Knife],

...the harm was done: the topic became known as ā€œcomputer scienceā€ - which, actually, is like referring to surgery as ā€œknife scienceā€ - and it was firmly implanted in people's minds that computing science is about machines and their peripheral equipment.

Programming ought not to be an experimental science, but most working programmers do not have the luxury of engaging in what Dijkstra means by computing science. We must work in the realm of experimentation, just as some, but not all, physicists do. If thirty years from now programming can be performed without experimentation, it will be a great accomplishment of Computer Science.

The kinds of experiments you will have to perform include:

  • Testing systems with small examples to verify that they conform to the documentation or to understand their response when there is no documentation,

  • Testing small code changes to see if they actually fix a bug,

  • Measuring the performance of a system under two different conditions due to imperfect knowledge of their performance characteristics,

  • Checking the integrity of data, and

  • Collecting statistics that may hint at the solution to difficult or hard-to-repeat bugs.

I don't think in this essay I can explain the design of experiments; you will have to study and practice. However, I can offer two bits of advice.

First, try to be very clear about your hypothesis, or the assertion that you are trying to test. It also helps to write the hypothesis down, especially if you find yourself confused or are working with others.

You will often find yourself having to design a series of experiments, each of which is based on the knowledge gained from the last experiment. Therefore, you should design your experiments to provide the most information possible. Unfortunately, this is in tension with keeping each experiment simple - you will have to develop this judgement through experience.

Last updated