Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it? - Brian Kernighan, 1974
If you have programmed, you have debugged. Almost any program will have bugs the first time it runs. Debugging is an art in itself, and one talk can't possibly make you an expert debugger. This talk is mainly about some tools that make debugging easier. These tools let you see into your programs in greater detail, which helps you debug better.
Adding print statements tends to be the first way of debugging. It works, but is usually slow, since you have to re-run the program every time you change what is printed. Still, it is a good first step for small and fast programs.
Working in an interactive shell is a good system, but the things you type are temporary. They can't be reused as part of other functions or saved for later. This makes it harder to use for big programs.
A debugger is a tool that can inspect other processes and view internal state. It is the equivalent of a medical imaging device for programs. By being able to see directly inside of running processes, your debugging efficiency can increase greatly. Imagine a doctor trying to diagnose hard problems without advanced imaging.
Use the scientific method
Specify a condition that must be true at a certain point of time
    assert(n_links % 2 == 0);
If condition is false, program fails and prints the exact location in code
Catch problems before they become hard bugs or wrong results
For speed, can be removed in production code (e.g. -DNDEBUG)
One component of software testing
Software testing is extremely important, because it lets you ensure quality by actually running code. Assertions are a small part of testing: each one can only check things at a single point in the code. On the other hand, they are embedded within the code, so they run on all of the real data going through the program.
Assertions are used for conditions that must always be true at a certain point in code. If one is ever not true, something serious must be wrong, and the program fails with a useful error message. As a real example, I once used an assertion when I was counting something that should always be even. One day, the assertion started failing. It turned out that my initial input data wasn't prepared properly.
Assertions should be used for conditions that should never be false and are a property of the code itself. They shouldn't be used for things like checking arguments: arguments can be expected to be wrong sometimes, and assertions may be compiled out of an optimized build (for example with -DNDEBUG), which would silently remove the check.
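The split between argument checks and assertions can be sketched in Python as well. This is only an illustration: the function name count_link_endpoints and its input format are made up for the example. Python's assert statement is skipped when the interpreter runs with -O, which plays a similar role to -DNDEBUG in C.

```python
def count_link_endpoints(links):
    """Count endpoints in a list of (a, b) links (hypothetical helper)."""
    # Argument check: bad input is expected to happen sometimes, so raise
    # a real exception that stays active even when assertions are disabled.
    if any(len(link) != 2 for link in links):
        raise ValueError("every link must have exactly two endpoints")

    n_endpoints = sum(len(link) for link in links)

    # Internal invariant: a property of the code itself. If this ever
    # fails, something serious is wrong. Skipped under "python -O".
    assert n_endpoints % 2 == 0, f"odd endpoint count: {n_endpoints}"
    return n_endpoints


print(count_link_endpoints([(1, 2), (2, 3)]))  # prints 4
```

The ValueError protects against bad callers in every build, while the assertion documents something the code itself guarantees.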
Debugging is a concept that exists across programming languages. Creating a debugger is a necessary step of creating any programming language, toolchain, or operating system.
Basically, whatever you do, you should be able to find a debugger for it. Most of the operations I describe below should work with your environment. The commands within the debuggers seem to be fairly standard.
Debuggers exist not just for "normal" programs like the ones we use here, but also for operating system kernels (which have to be debugged at a very low level, perhaps over an external network connection, since a kernel can't pause to debug itself), embedded devices (which may need a dedicated cable attached to the circuit board), remote processes (with the debugger running as a server over a network link), and so on.
- Debugging is actually an interface, so friendlier front-ends can be available. For example:
  - The "Data Display Debugger" (ddd) is a graphical front-end for gdb.
  - pudb is a console (ncurses) based Python debugger.
  - Most IDEs (e.g. Emacs, Spyder, ...) integrate debuggers somehow.
- Different C compilers may have different debuggers. You may have to search a bit to find the right debugger for your language, compiler, and architecture.
- Matlab:
  - http://se.mathworks.com/help/matlab/debugging-code.html
  - Tutorial: http://se.mathworks.com/help/matlab/matlab_prog/debugging-process-and-features.html#brqxeeu-177
- Bash: http://sourceforge.net/projects/bashdb/
- R: http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/
More formal definitions:
- execution frame:
  - All the context within a function call. In C, it is all the variables available for use at a certain line. In Python, it is basically the local variables (locals()) and the global variables of the module (globals()).
- call stack:
  - A data structure that stores the active subroutines of a running program. At the bottom of the stack is the main function, then the first function called, then the second function called, and so on. Python exception tracebacks are a listing of the stack.
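Both definitions can be seen directly from Python. The following is a small sketch, with made-up function names outer and inner, that looks at the current frame's local variables and lists the call stack in the same order a traceback would (outermost call first).

```python
import traceback


def inner(x):
    y = x * 2  # a local variable living in inner()'s frame
    # locals() is the local part of the current execution frame.
    frame_locals = dict(locals())
    # extract_stack() lists the active calls, outermost first,
    # in the same order an exception traceback prints them.
    callers = [entry.name for entry in traceback.extract_stack()]
    return frame_locals, callers


def outer():
    return inner(21)


frame_locals, callers = outer()
print(frame_locals)   # {'x': 21, 'y': 42}
print(callers[-2:])   # ['outer', 'inner']
```

Module-level names such as outer and inner themselves are found through globals() rather than locals().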
In C, you must compile with debugging symbols.
Since compiled C programs are basically raw machine code, the executable doesn't record which source line each machine instruction came from, nor variable names, nor anything else human-understandable.
Compile using the -g option:
    $ gcc -g filename.c
Python, being interpreted, always has the source code available, so nothing special is needed.
Other languages or compiler options may vary.
Note: according to current plan, these exercises will not be done in class, but are left in for reference.
In this set of exercises, we will compile a C program with debugging symbols and run it through the debugger in different ways.
In /triton/scip/debug/, there is a program error.c that has an error in it. Copy this file to your working directory, compile, and run it.
    $ gcc error.c
    $ ./a.out
    ...
    Segmentation fault
We see that there is a fatal error (by design). How can we see where it is? First, recompile with debugging symbols enabled. This is needed so that debuggers are able to see what line corresponds with each compiled instruction. In gcc, this is done with the -g flag.
    $ gcc -g error.c
If you run the code normally, nothing appears different. We have to start the program under control of the debugger. For gcc, the debugger is gdb.
    $ gdb a.out
    GNU gdb (GDB) Red Hat Enterprise Linux (7.2-75.el6)
    ...
    Reading symbols from /home/darstr1/scip2015/debugging/a.out...done.
    (gdb)
We end up in an interactive gdb shell. The program doesn't start running until we say to.
First, we tell the program to run using the run command:
    (gdb) run
    Starting program: /home/darstr1/scip2015/debugging/a.out
    ...
    Program received signal SIGSEGV, Segmentation fault.
    0x000000000040055c in main () at error.c:14
    14        printf("%d, %d, %x\n", i, *pointers[i]);
    (gdb)
We see that it runs, and when the error occurs we drop back to the interactive shell for more work.
Explore the following commands: l or list (show the source around the current line), bt or backtrace (show the call stack).
Let's figure out what the problem is. Use p or print to examine the variables on the failing line.
    (gdb) print pointers[i]
    $1 = (int *) 0x500000000
    (gdb) print i
    $2 = 5
    (gdb) print *pointers[i]
    Cannot access memory at address 0x500000000
It turns out that pointers[5] holds an invalid memory address. We investigate the definition of pointers and see that it has size 5, so valid indexes go only from 0 to 4. Our loop counter has an off-by-one error.
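The same kind of post-mortem inspection can be sketched in Python. The program below is not the contents of error.c (which are not shown here); it is a made-up Python analogue with the same off-by-one mistake. Instead of an interactive print, it walks the exception's traceback down to the failing frame and reads that frame's local variables.

```python
def fill_and_print():
    array = [10, 20, 30, 40, 50]
    pointers = [None] * 5
    for i in range(5):
        pointers[i] = array[i]
    # Off-by-one: range(6) walks one index past the end of the list.
    for i in range(6):
        print(i, pointers[i])


def innermost_locals(exc):
    """Return the local variables of the frame where exc was raised."""
    tb = exc.__traceback__
    while tb.tb_next is not None:   # walk down to the failing frame
        tb = tb.tb_next
    return dict(tb.tb_frame.f_locals)


try:
    fill_and_print()
except IndexError as exc:
    state = innermost_locals(exc)
    print("failed with i =", state["i"],
          "and len(pointers) =", len(state["pointers"]))
```

Interactively, pdb.post_mortem() gives a prompt at the same frame, where p i and p pointers work much like the gdb commands above.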
Sometimes there isn't a fatal error, but there is a notable bug. Or maybe we want to make the debugger stop a few lines before our error, so we can examine the lead-up. We can do this using breakpoints.
Start the debugger again on a.out from error.c from the previous exercise:
    $ gdb a.out
Set a breakpoint using the b command:
    (gdb) b 8
    Breakpoint 1 at 0x400517: file error.c, line 8.
Now run the program:
    (gdb) run
    Starting program: /home/darstr1/scip2015/debugging/a.out

    Breakpoint 1, main () at error.c:9
    9         for (i=0 ; i<5 ; i++) {
The program runs and stops at this line. You can now do all of the normal commands as in the last exercise.
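In Python, the built-in breakpoint() plays a similar role: by default it stops at that line and opens pdb. The sketch below is an assumption-laden demo rather than normal usage: it swaps sys.breakpointhook for a made-up recording_hook so the example runs without a prompt, and it uses sys._getframe, which is specific to CPython.

```python
import copy
import sys

captured = []  # snapshots taken each time the breakpoint is hit


def recording_hook(*args, **kwargs):
    """Stand-in for pdb: record the caller's locals instead of prompting."""
    caller = sys._getframe(1)  # the frame that called breakpoint()
    # Deep copy, so later changes to the variables don't alter the snapshot.
    captured.append(copy.deepcopy(dict(caller.f_locals)))


def build_squares(n):
    squares = []
    for i in range(n):
        if i == 2:
            breakpoint()  # execution "stops" here, before the append
        squares.append(i * i)
    return squares


original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook  # assumption: non-interactive demo only
try:
    result = build_squares(4)
finally:
    sys.breakpointhook = original_hook

snapshot = captured[0]
print(snapshot["i"], snapshot["squares"])  # 2 [0, 1]
print(result)                              # [0, 1, 4, 9]
```

With the default hook left in place, the same breakpoint() call would give an interactive (Pdb) prompt at that line.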
We continue from the previous exercise. Once we are stopped at a breakpoint, we can step through the program line-by-line and see what is going on.
The command next runs the current line and goes to the next.
    (gdb) next
    10          pointers[i] = &array[i];
    (gdb) print i
    $1 = 0
    (gdb) next
    9         for (i=0 ; i<5 ; i++) {
    (gdb) next
    10          pointers[i] = &array[i];
    (gdb) print i
    $2 = 1
We see that each line in the loop executes, one by one. We can print and interact with each line in sequence.
Once you are done, you can use cont (short for continue) to run until the next breakpoint or error occurs.
    (gdb) cont
    ...
    Program received signal SIGSEGV, Segmentation fault.
    0x000000000040055c in main () at error.c:14
    14        printf("%d, %d, %x\n", i, *pointers[i]);
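To get a feel for what stepping does under the hood, here is a rough Python sketch using sys.settrace, the hook that Python debuggers such as pdb are built on. The function fill and the tracer are made up for the example. Instead of pausing at each line like next does, the tracer just records the line and the current value of i.

```python
import sys

steps = []  # (line offset inside fill(), value of i) for each executed line


def tracer(frame, event, arg):
    """Trace function: record every line executed inside fill()."""
    if frame.f_code.co_name != "fill":
        return None  # do not trace other functions
    if event == "line":
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        steps.append((offset, dict(frame.f_locals).get("i")))
    return tracer  # keep tracing this frame


def fill():
    pointers = [None] * 3
    for i in range(3):
        pointers[i] = i * 10
    return pointers


previous = sys.gettrace()
sys.settrace(tracer)
try:
    fill()
finally:
    sys.settrace(previous)  # always restore the previous trace function

for offset, i in steps:
    print(f"line +{offset}: i = {i}")
```

The recorded steps alternate between the for line and the loop body, just like the gdb session above.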
Let's say you have already started running a program, and you need to see what is going on inside of it. What can you do?
In scip/debugging, there is a program attaching.c. Compile it with debugging symbols.
    $ gcc -g attaching.c
Once this program starts, it will enter an infinite loop consuming CPU. It will print its process ID. We will open a separate shell on triton, and attach to this process using gdb -p PID.
In shell 1:
    $ ./a.out
    This process id is 4395
In shell 2:
    $ gdb -p 4395
Now, you can do all of the normal things. Do this at least:
Don't forget to kill the process (with Control-C) once you detach the debugger, or else you'll keep occupying the processor on the frontend node - a big no-no.
- C:
- Python:
- A good introduction to using pdb: https://pythonconquerstheuniverse.wordpress.com/2009/09/10/debugging-in-python/
- Matlab:
  - http://se.mathworks.com/help/matlab/debugging-code.html
  - Tutorial: http://se.mathworks.com/help/matlab/matlab_prog/debugging-process-and-features.html#brqxeeu-177
- Bash: http://sourceforge.net/projects/bashdb/
- R: http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/