This page discusses how we translate a program from a high-level language (like C or Java) into machine code, and then get that machine code into a runnable state.
The main points we'll make are:
High-level languages have constructs like for and while loops, while assembly code will only have branches (though the low-level operations provided by the assembly code are primarily intended to support the high-level concepts available in the high-level language).
Note: some compilers skip the assembly step and compile directly to object code.
Compiling is the first step in translating a program from the high-level language into machine code. We'll show the general idea using a simple program that just adds some numbers together. Here's the program, in C:
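// tiny example program that adds up the numbers 0+1+2+3+4
#include <stdio.h>

char count;
char sum;

int main() {
    sum = 0;
    count = 0;
    while (count < 5) {
        sum = sum + count;
        count = count + 1;
    }
    printf("sum = %d\n", sum);
}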
The compilation process converts this code into assembly code. We'll talk more about just how this works later (and CS370 will talk about it a lot!). For now, since we'll be hand-compiling, we can just think in these terms: for every statement in the C code, we need to find some corresponding statements in assembly code. Here's the assembly code we get, with the corresponding C code appearing as comments.
*// tiny example program that adds up the numbers 0+1+2+3+4
*#include <stdio.h>

* variables
        org     0
count   rmb     1               * char count;
sum     rmb     1               * char sum;

* code
        org     $f800
main
*                               * int main() {
        ldaa    #0              * sum = 0;
        staa    sum
        ldaa    #0              * count = 0;
        staa    count
loop
        ldaa    count           * while (count < 5) {
        cmpa    #5
        bge     out
        ldaa    sum             * sum = sum + count;
        adda    count
        staa    sum
        ldaa    count           * count = count + 1;
        adda    #1
        staa    count
        bra     loop            * }
*                               * printf("sum = %d\n", sum);
out     bra     out
        end     main            * }
The assembly process converts the assembly code we got in the last step into machine code (note: some compilers skip the assembly step, and convert directly to machine code). If we assemble this code, using the command
as11 sum.asm -l > sum.lst
we get two new files as a result: one is called sum.lst, and the other is sum.s19. Of the two, sum.lst is more
important to us, and sum.s19 is more important to the computer.
Let's take a look at sum.lst:
Assembling sum.asm
0001 *// tiny example program that adds up the numbers 0+1+2+3+4
0002 *#include <stdio.h>
The listing contains exactly our assembly code, together with the machine code each line translates into.
Think about a program in C. Just about every C program ever written
uses the printf() function. But where does it come from?
It certainly isn't in your program. The answer is that it is in a
system library. Adding that function (and all the other system
functions) into your program so it can be used is called
linking the program.
For large programs, you don't want to have to compile the whole thing every time you make a change. So, instead of the program being one huge file, it's a bunch of smaller files which are compiled separately. Linking also stitches all this code together.
In the case of the HC11, the total amount of memory available to the processor isn't large enough to make separate assembly worthwhile. So, there is no linking step for this processor.
Finally, the generated machine code has to be loaded into the computer's memory so it can be executed; that's what the S19 file is for. It isn't intended to be read by human beings; it's just something that's really easy for a computer program to read so it can be loaded into memory. Just for grins, we can take a look at the S19 file generated for our summation program:
S121F8008600970186009700960081052C0E96019B00970196008B01970020EC20FEA8
S903F80004
This is called an S19 file because every line starts with either S1 or S9. S1 lines contain code; the S9 line at the end tells where the program should start executing. No, you don't need to actually know the file format!
Both our simulator and the downloader that loads code onto our
computers for this class understand the S19 format. The
tksim11 frontend to the simulator also understands the
.lst files, so it can display the current line being
executed.