
C/C++: Debugging Memory with Valgrind

One of the main problems in C and C++ programming is managing dynamically allocated memory. Java has garbage collection, an automated process that reclaims dynamic memory you no longer need, so you don’t have to. C and C++ have nothing like this, so you are responsible for deallocating (freeing) all the memory you allocate.
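As a minimal sketch of that responsibility (the function name make_greeting is made up for illustration), a function that returns heap memory hands the duty of freeing it to its caller:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string that the
   caller owns and must eventually pass to free(). */
char *make_greeting(const char *name)
{
    const char *prefix = "Hello, ";
    /* +2: one byte for '!' and one for the terminating NUL */
    size_t len = strlen(prefix) + strlen(name) + 2;
    char *text = malloc(len);
    if (text == NULL)
        return NULL;                 /* allocation failed */
    snprintf(text, len, "%s%s!", prefix, name);
    return text;
}
```

A caller would use it along the lines of: char *g = make_greeting("world"); puts(g); free(g); Leaving out that final free() is the kind of leak the rest of this page is about.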

Many tools have been created to help programmers figure out whether they are correctly freeing all allocated memory; the one we are using is called Valgrind. See https://www.valgrind.org/ for more info. Memory checking is just one of Valgrind’s capabilities; it is a really cool open source tool.

Preliminaries

C and C++ compilers have an option flag for including debugging information in the object code and executable. Our gcc/g++ compilers use “-g” for this. When this is used, debuggers have more information to connect the executable back to the original source code. This is great! You don’t want to deploy a real program into production with debugging information in it, but in development it is very helpful. Use it!

To use it, just add “-g” to the CFLAGS variable in your Makefile.

BTW, learning to use the GNU debugger, called “gdb”, is a very good idea…

Valgrind Use

With an executable built with -g, you use valgrind by running your program under its control. A good setup is the following:

valgrind --leak-check=full --show-leak-kinds=all --track-origins=yes ./mycc test3.c

In the example above, my compiler program is “./mycc” and I am giving it “test3.c” as input (note: this example is from my compilers class, which is why the program is operating on program source code!). Everything before that is options for valgrind. You can put this line as the action below a “memcheck” rule in your Makefile (be sure the line begins with a tab character, which ‘make’ requires for action lines).

Example

When I run valgrind on my Compiler-2 code, I get lots of output, but down at the end is the summary:

==2917== LEAK SUMMARY:
==2917==    definitely lost: 7,065 bytes in 44 blocks
==2917==    indirectly lost: 0 bytes in 0 blocks
==2917==      possibly lost: 0 bytes in 0 blocks
==2917==    still reachable: 17,090 bytes in 10 blocks
==2917==         suppressed: 0 bytes in 0 blocks

This summary tells me that my program lost about 7K of memory in 44 separate leaked blocks. Your goal is to have ZERO bytes lost (you can ignore the “still reachable” line here: that memory still had a pointer to it when the program exited, so it is not counted as lost). I can scroll up through the valgrind output, which has a detailed record for each of those memory leaks. This is great! One example of these:

==2917== 494 bytes in 4 blocks are definitely lost in loss record 9 of 16
==2917==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2917==    by 0x401505: yyparse (parser.y:80)
==2917==    by 0x401D78: main (parser.y:168)

The lower three lines are the call stack: main() called yyparse() at line 168 of my source file parser.y, and then yyparse() called malloc() at line 80 of parser.y. That line is inside an action block of a production rule, which yacc embeds in the yyparse() function. This tells you where the memory was allocated, but it does not tell you where and when you should free it. You have to figure that out!

So if I go look at my yacc code, line 80 is the malloc() in this production rule:

funcall: ID LPAREN arguments RPAREN 
     {
        if (debug) fprintf(stderr,"function call!\n");
        char *code = (char*) malloc(32+strlen($3));
        sprintf(code,"%s\tcall\t%s@PLT\n",$3,$1);
        $$ = code;
        argnum = 0;
     }

So this is telling me that the code string returned as $$ from this action block is never freed. But where to free it? Well, I need to figure out where and when I am done with it! Fortunately, my “funcall” nonterminal is only used in one place, in this production rule:

statement: funcall SEMICOLON
     {
        if (debug) fprintf(stderr,"statement def!\n");
        $$ = $1;
     }

So when I look at this, I see that I am just passing this same code string along, so I am not done with it here! Now where is “statement” used? It is in this production rule:

statements: /* empty */ { $$ = ""; }
     | statement statements
     {
        if (debug) fprintf(stderr,"statements def!\n");
        char *code = (char*) malloc(strlen($1)+strlen($2)+5);
        strcpy(code,$1);
        strcat(code,$2);
        $$ = code;
     }

Now I can see that after I create a new code string here and copy both $1 and $2 into it, I am done with $1 and $2, and I can free them! So my code should be changed to:

statements: /* empty */ { $$ = ""; }
     | statement statements
     {
        if (debug) fprintf(stderr,"statements def!\n");
        char *code = (char*) malloc(strlen($1)+strlen($2)+5);
        strcpy(code,$1);
        strcat(code,$2);
        $$ = code;
        free($1);
        free($2);
     }

Not only have I taken care of freeing my “funcall” code string (which came through the “statement” rule), I’ve also seen that my previous “statements” code string needs to be freed at this point, so I’ve taken care of even more memory leaks!

In general, whenever an allocated code string has been copied into a new code string, you can and should free it. But this is a general rule – you need to understand your own code and be sure you don’t free something you are still using.
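The same rule can be applied to the “funcall” action, sketched here in plain C outside of yacc. The helper name build_call is made up, and the sketch assumes both inputs were heap-allocated (for example with malloc() or strdup()) and are not used anywhere else afterwards; check that against your own grammar before copying the idea. It also measures the output first instead of guessing a size, since a fixed allowance of 32 extra bytes has to cover both the literal text and the function name, and a long enough name would overflow it:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper mirroring the funcall action: builds the code
   string for a call, then frees the two inputs it has copied.
   Assumption: both arguments are heap-allocated and no longer
   needed by anyone else once this function returns. */
char *build_call(char *args_code, char *name)
{
    char *code = NULL;
    /* First pass: with a NULL buffer and size 0, snprintf only
       reports how many characters the output needs. */
    int needed = snprintf(NULL, 0, "%s\tcall\t%s@PLT\n", args_code, name);
    if (needed >= 0) {
        size_t size = (size_t)needed + 1;    /* +1 for the NUL */
        code = malloc(size);
        if (code != NULL)
            snprintf(code, size, "%s\tcall\t%s@PLT\n", args_code, name);
    }
    /* The text has been copied (or the build failed), so the
       inputs are finished with either way. */
    free(args_code);
    free(name);
    return code;    /* caller owns this; NULL on failure */
}
```

Having the helper take ownership of its inputs keeps the rule simple: whoever receives a code string is the one who frees it.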

If I make the change to the “statements” rule above, I’ll probably get a segmentation fault or a similar crash: an error! You can use “gdb” to track the error down, but if you look at my “empty” statements rule, you can see that I do not allocate a string there; I just return a constant empty string. My “free($2)” in the other action block will then try to free that constant, which was never allocated with malloc()! So I really should be careful to make my code obey this rule: every action block should return a newly allocated string. With this rule, I can always safely free() each code string after I copy it into a new one. So my code above should look like this (the change is the strdup("") in the empty rule):

statements: /* empty */ { $$ = strdup(""); }
     | statement statements
     {
        if (debug) fprintf(stderr,"statements def!\n");
        char *code = (char*) malloc(strlen($1)+strlen($2)+5);
        strcpy(code,$1);
        strcat(code,$2);
        $$ = code;
        free($1);
        free($2);
     }
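To see why the rule works, here is a minimal sketch in plain C, outside of yacc, of what the two “statements” actions do as the parser reduces a list of statements. The names empty_code, join_and_free, and join_all are made up for illustration, and the sketch assumes every string handed to join_and_free came from malloc():

```c
#include <stdlib.h>
#include <string.h>

/* Like the empty rule: a fresh, freeable empty string
   (same effect as strdup("")). */
char *empty_code(void)
{
    char *s = malloc(1);
    if (s != NULL)
        s[0] = '\0';
    return s;
}

/* Like the recursive rule: copy both inputs into a new string,
   then free them. Both inputs must be heap-allocated (or NULL). */
char *join_and_free(char *first, char *rest)
{
    char *code = NULL;
    if (first != NULL && rest != NULL) {
        size_t n1 = strlen(first);
        code = malloc(n1 + strlen(rest) + 1);
        if (code != NULL) {
            strcpy(code, first);
            strcpy(code + n1, rest);
        }
    }
    free(first);    /* free(NULL) is a harmless no-op */
    free(rest);
    return code;
}

/* Mimics the right-recursive reduction: start from the empty list
   and prepend each statement, last one first. The parts array is
   only read; each part is copied to the heap before joining. */
char *join_all(const char *parts[], size_t count)
{
    char *acc = empty_code();
    for (size_t i = count; i > 0; i--) {
        const char *src = parts[i - 1];
        char *copy = malloc(strlen(src) + 1);
        if (copy != NULL)
            strcpy(copy, src);
        acc = join_and_free(copy, acc);
    }
    return acc;     /* caller frees; NULL if an allocation failed */
}
```

Because empty_code() returns heap memory, the very first join can free its second argument safely. Had it returned the constant "" instead, that first free() would be handed a pointer malloc() never produced.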

This is the basic idea – now go and free() everything you allocate!

Notes: strings that you created using strdup() in your scanner eventually need to be freed, and so do a few other things, like symbol table entries.
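For the symbol table, one possible shape is sketched below. It assumes a simple linked list whose entries each own a heap copy of the identifier name; the struct and the function names symtab_add and symtab_free are hypothetical, so adapt the idea to whatever structure your own table uses:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical symbol table entry: a singly linked list node that
   owns a heap copy of the identifier's name. */
struct symbol {
    char *name;
    struct symbol *next;
};

/* Adds a symbol at the head of the list and returns the new head.
   The name is copied, so the caller keeps ownership of its own
   string. On allocation failure the list is returned unchanged. */
struct symbol *symtab_add(struct symbol *head, const char *name)
{
    struct symbol *entry = malloc(sizeof *entry);
    if (entry == NULL)
        return head;
    entry->name = malloc(strlen(name) + 1);
    if (entry->name == NULL) {
        free(entry);
        return head;
    }
    strcpy(entry->name, name);
    entry->next = head;
    return entry;
}

/* Frees every entry: first the string the entry owns, then the
   entry itself. Returns the number of entries released. */
size_t symtab_free(struct symbol *head)
{
    size_t freed = 0;
    while (head != NULL) {
        struct symbol *next = head->next;   /* save before freeing */
        free(head->name);
        free(head);
        head = next;
        freed++;
    }
    return freed;
}
```

Calling something like symtab_free() once, after yyparse() has returned and the output has been written, is one place to do this; where it belongs in your compiler depends on when you last need the table.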