Compiling, Linking, and Loading Programs

This page discusses how we go about translating a program from a high level language (like C) into machine code, and then getting that machine code into a runnable state.

The main points we'll make are:

  1. Compilation translates a program in a high level language into assembly code.
  2. Assembly translates a program in assembly code into machine (also known as object) code.
  3. The differences between a high level language and assembly language are
    1. A statement in a high level language generally compiles into more than one statement in assembly code.
    2. A statement in assembly code generally assembles into exactly one machine instruction.
    3. High level languages are intended to be portable across multiple instruction sets; an assembly language is just a human-readable representation of a single instruction set.
  4. Linking combines separately compiled (and assembled) code modules with system libraries to produce executable code. While that's important in many environments, it won't be used in ours.

Note: some compilers skip the assembly step and compile directly to object code.

Compilation

Compiling a program translates it from the high level language into assembly code. We'll show the general idea in terms of a simple program that just flashes the lights on the HC11 miniboard. Here's the program, in C:

#define MOTORS (*(char *)0x1004)
int main()
{
    register unsigned short x; // 16 bit register
    register unsigned char a;  //  8 bit register
    
    a = 0xf0;             // put a value in a that would turn on the lights

    while (1) {
        MOTORS = a;       // write out to motors

        for (x = 0xffff; x != 0; x--) { /* wait */ }

        a = a ^ 0xf;      // reverse the lights' color
    }
}

The compilation process converts this code into assembly code. We'll talk more about just how this works later (and CS370 will talk about it a lot!). For now, since we'll be hand-compiling, we can just think in these terms: for every statement in the C code, we need to find some corresponding statements in assembly code. Here's the assembly code we get, with the corresponding C code appearing as comments.

MOTORS  equ     $1004     * #define MOTORS (*(char *)0x1004)

        org $f800
start                     
*                         * main() {

        ldaa    #$f0      *     a = 0xf0;

oloop                     
*                         *     while (1) {

        staa    MOTORS    *         MOTORS = a;

        ldx     #$ffff    *         for (x = 0xffff;
iloop   dex               *                                  x--)
        bne     iloop     *                          x != 0;

        eora    #$0f      *         a = a ^ 0xf;
        
        bra     oloop     *     }
    
        end     start     * }
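
One way to picture hand-compiling is to first rewrite the C program using only the kinds of steps the assembly code has: simple assignments, labels, and jumps. Here's a rough sketch of that for our flashing loop. So that it can run and stop on an ordinary desktop machine, it makes two assumptions that aren't in the original program: it writes to a plain variable called motors instead of address 0x1004, and it takes a made-up parameter called flashes that limits how many times the outer loop runs.

```c
#include <stdint.h>

/* Stand-in for the memory-mapped MOTORS location at 0x1004
   (an assumption, so the sketch can run on a desktop machine). */
static uint8_t motors;

/* Goto form of the flashing loop. Each commented step lines up with one
   instruction in the assembly code. `flashes` is a hypothetical bound;
   the real program loops forever. Returns the last value written. */
uint8_t flash_lowered(unsigned flashes)
{
    uint16_t x;            /* plays the role of the X register */
    uint8_t  a;            /* plays the role of the A register */

    a = 0xf0;              /* ldaa #$f0   */
oloop:
    if (flashes == 0)      /* not in the original: lets the sketch stop */
        return motors;
    flashes--;
    motors = a;            /* staa MOTORS */
    x = 0xffff;            /* ldx  #$ffff */
iloop:
    x--;                   /* dex         */
    if (x != 0)            /* bne  iloop  */
        goto iloop;
    a = a ^ 0x0f;          /* eora #$0f   */
    goto oloop;            /* bra  oloop  */
}
```

Calling flash_lowered(1) leaves 0xf0 in motors, and flash_lowered(2) leaves 0xff, since the second pass writes the value produced by the exclusive-or.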

Assembly

The assembly process converts the assembly code we got in the last step into machine code (note: some compilers skip the assembly step, and convert directly to machine code). If we assemble this code, using the command

as11 lights.asm -l > lights.lst

we get two new files as a result: lights.lst, a listing of the assembled program, and lights.s19, which holds the machine code in Motorola S-record format. Of the two, lights.lst is more important to us, and lights.s19 is more important to the computer. Let's take a look at lights.lst.

  Assembling lights.asm
0001 1004                         MOTORS  equ     $1004     * #define MOTORS (*(char *)0x1004)
0002                              
0003 f800                               org $f800
0004                              start                     
0005                              *                       * main() {
0006                              
0007 f800 86 f0                           ldaa  #$f0      *     a = 0xf0;
0008                              
0009                              oloop                     
0010                              *                       *     while (1) {
0011                              
0012 f802 b7 10 04                        staa  MOTORS    *         MOTORS = a;
0013                              
0014 f805 ce ff ff                      ldx     #$ffff    *         for (x = 0xffff;
0015 f808 09                      iloop dex               *                                  x--)
0016 f809 26 fd                         bne     iloop     *                          x != 0;
0017                              
0018 f80b 88 0f                         eora    #$0f      *         a = a ^ 0xf;
0019                                    
0020 f80d 20 f3                         bra     oloop     *     }
0021                                
0022 f80f                               end     start     * }


Number of errors 0
Number of warnings 0

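You can see that this is exactly our assembly code, and also the machine code it translates into. The first column is the line number in the source file. On lines that produce code, the second column is the address (in hex) where the instruction is placed, and the bytes after it are the machine code for that instruction. For example, ldaa #$f0 becomes the two bytes 86 f0 at address f800.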
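
One detail in the listing is worth a closer look. The second byte of each branch instruction (the fd after 26, and the f3 after 20) isn't an address. It's a signed 8-bit offset, counted from the address of the instruction that follows the branch. Here's a small sketch of that arithmetic; the function name branch_offset is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offset byte for a 2-byte relative branch located at `branch_addr`
   that jumps to `target`. The offset is counted from the instruction
   after the branch, that is, from branch_addr + 2. */
uint8_t branch_offset(uint16_t branch_addr, uint16_t target)
{
    long distance = (long)target - ((long)branch_addr + 2);

    /* A relative branch can only reach -128..+127 bytes. */
    assert(distance >= -128 && distance <= 127);

    return (uint8_t)distance;   /* two's complement encoding */
}
```

For bne iloop, branch_offset(0xf809, 0xf808) gives 0xfd (that is, -3), and for bra oloop, branch_offset(0xf80d, 0xf802) gives 0xf3 (that is, -13), matching the listing.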

Linking

Think about a program in C. Just about every C program ever written uses the printf() function. But where does it come from? It certainly isn't in your program. The answer is that it is in a system library. Adding that function (and all the other system functions) into your program so it can be used is called linking the program.

For large programs, you don't want to have to compile the whole thing every time you make a change. So, instead of the program being one huge file, it's a bunch of smaller files which are compiled separately. Linking also stitches all this code together.
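
As a sketch of what the linker has to stitch together, here are the pieces of a small program that would normally live in separate files. The file names in the comments, and the functions toggle_lights and flash_twice, are made up for illustration; everything is shown in one block so that it compiles as it stands.

```c
/* ---- lights.h (hypothetical): a declaration only. It tells the
   compiler what toggle_lights looks like, but contains no code. ---- */
unsigned char toggle_lights(unsigned char a);

/* ---- flash.c (hypothetical): calls toggle_lights. Compiled on its
   own, its object code holds an unresolved reference to
   toggle_lights, which the linker fills in. ---- */
unsigned char flash_twice(unsigned char a)
{
    return toggle_lights(toggle_lights(a));
}

/* ---- lights.c (hypothetical): the definition that the linker
   connects the calls in flash.c to. ---- */
unsigned char toggle_lights(unsigned char a)
{
    return (unsigned char)(a ^ 0x0f);
}
```

If only lights.c changes, only lights.c needs to be compiled again; the object code for flash.c is reused and the two are linked again.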


Last modified: Wed Jan 28 13:06:47 MST 2004