C/C++: Basic Compiling and Linking
On many Unix platforms (and others), the C and C++ compilers have traditionally been the commands cc and c++, but the GNU Project of the Free Software Foundation has long provided free, open-source compilers known as gcc and g++, and these are the ones that Linux systems generally use. Even on Linux, you can type cc or c++ and it will usually work, since there is typically a soft link (read a Unix tutorial) from the names cc/c++ to the binary executables named gcc/g++. It is better to use gcc/g++ explicitly, though, in case other compilers (like clang) are installed, too.
Different compilers on different operating systems will use different flags and have different options. The discussion below is specific to gcc/g++, but the ideas are transportable to virtually all compilers.
Basic Compiling and Linking
If you have a single source code file, say “prog.c” or “prog.cpp”, and you just want to compile it to an executable, you can use one of:
gcc prog.c
g++ prog.cpp
as the simplest possible compiling command. On most Unix platforms, if the program is syntactically correct and complete, this will produce a binary executable file named “a.out”. Yes, this is very strange, and it is a very old historical custom on Unix platforms, but that is what you have to live with. If you want the executable to be named something else, then use the “-o” option:
gcc -o prog prog.c
g++ -o prog prog.cpp
This will name the executable “prog”. Note that on Unix platforms, it is customary for an executable to have no extension. On other platforms you might be used to seeing “.exe” or something like that, but on Unix (and thus Linux), an executable file should not have an extension.
The above commands actually do more than just compile. They also perform the step we call linking. The compile step really just takes the source code that you wrote and converts it to CPU instructions, which we call machine code or more usually object code. The compiler can do this for a whole program or for any partial piece of code (with at least complete functions). Object code by itself is not executable. On Unix platforms, object code files generally have a “.o” extension, while some other platforms might use “.obj”. You can tell the C compiler to only do compilation by using the “-c” flag:
gcc -c prog.c
g++ -c prog.cpp
This will result in an object code file named “prog.o”.
The linker is responsible for taking one or more object code files and linking them together to form an executable. Even if there is only one object code file, it still needs to be linked to libraries if it uses external functions (like printf(), for example). Even though you can use the Unix linker directly (it is the command “ld”, and if you want more information, look at its man page), normally we do not; instead, we use the C compiler and let it invoke the linker for us.
We can make the C compiler do the linking step for us simply by giving it object code files instead of source code files:
gcc -o prog prog.o
g++ -o prog prog.o
The above commands will only do linking, since the source file was already compiled into the “prog.o” object code file. Gcc sees that it was given an object code file and so it skips the compile step and goes directly to the linker step. Note that you still need to tell it the output executable file name, or else it will generate “a.out” again!
Compiling and Linking Multiple Files
For all but the smallest programs, our program source code is usually spread across multiple files. In C++, we generally put each class into its own source files (one header file and one code file), but we also separate C programs just for the sake of manageability.
It is possible to compile and link all files in one step, with something like:
gcc -o prog *.c
g++ -o prog *.cpp
In a Makefile we would usually write out the entire list of files, rather than use *.
This would compile all source files and then link them together into an executable program. We typically do not do this, however, because part of the advantage of multiple files is that we only need to recompile whatever we change – everything else should not be recompiled, only re-linked.
So we typically do a compile-only on each source file, with the “-c” flag:
gcc -c file1.c
g++ -c file1.cpp
As above, this compiles the file into an object code file named file1.o (or, on Windows, file1.obj).
Then we create the executable file by using the compiler to do linking:
gcc -o prog *.o
g++ -o prog *.o
(As above, in a Makefile we would usually write out the list of files.)
When the compiler is given object code files, it only does linking, not compiling.
In summary, when we have a program split over multiple source code files, we usually use -c to compile each separate source file into object code, and then use -o execname with all of the object files to create our final executable program.
The Complete Picture
Below is a diagram of the complete picture of what happens when you compile and link your program: preprocessing, compilation to assembly code, assembly into object code, and finally linking. The annotations on the arrows between steps are the gcc options that will stop the process at that step (i.e., to stop after preprocessing, use “gcc -E”; to stop after assembly code generation, use “gcc -S”; etc.).
Compiling Options
There are many options that you can give compilers to change the way that your code is compiled, or to help you with developing your code. Here are a few popular “gcc/g++” options (most compilers will have similar ones):
- -g: compile with debugging support. This option will include debugging information in the output (object code or executable), so that you can use a debugger (such as “gdb”) on your program.
- -W: enable warnings. This option will let the compiler warn you about potential problems in your code. Very useful! I especially recommend using “-Wall” or “-Wpedantic”.
- -O: optimize the code produced. This will invoke various levels of code optimization to produce faster-running programs. Sometimes optimization can make a big difference. You can just use “-O” or you can use higher levels by using “-O2” or “-O3”. Optimization is hard, so your compilation will take longer. NOTE: You must retest after changing to optimized compilations. The output program is different, even though hopefully it acts the same. But it needs to be tested!
- -Idir: use directory for include file searching. This allows you to put project include files anywhere and find them using a search path mechanism. By default, when you do #include <file.h>, only the standard system directories (such as /usr/include) are searched. The -I option can be used multiple times, and each time it adds another directory to search.