Notes on Translating Three-Address Code to Assembly Code for the X86

These notes are no substitute for an X86 reference manual such as the 80386 Programmer's Reference Manual, published by Intel Corporation.

Notes on the X86 Architecture

General Information

The Intel X86 has been the dominant desktop PC processor family for over two decades. X86 is a CISC (complex instruction set computer) architecture, with many opcodes and addressing modes but relatively few, simple registers. Generating code for it involves understanding the instruction set as well as the syntax of the assembler used to translate ASCII assembly language code into machine code object files (in our case, /usr/bin/as is the GNU Assembler).

Assembly code files should end with the suffix ".s". Such files can be assembled to executable code by invoking the C compiler:

% cc -o foo foo.s
Note: running "as foo.s" will not produce an executable module; it produces an object module (.o) that requires linking. For examples of the assembly code produced by the C compiler, use "cc -S". Your compiler should behave in the same manner as the standard C compiler, calling the assembler and linker by default. You may wish to leave .s files around after assembly for debugging purposes, rather than deleting them by default.

Memory Alignment

The X86 architecture does not enforce strict alignment on memory accesses, but accessing 16- or 32-bit values at addresses that are not multiples of 2 or 4 bytes, respectively, can be much slower than an aligned access, so aligning variables at multiples of their size is recommended.
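As a rough illustration of the alignment advice above, the following Python sketch assigns offsets so that each variable starts at a multiple of its own size. The helper names align_up and layout are hypothetical and exist only for this example; they are not part of any assembler or compiler interface.

```python
# Hypothetical helpers (not from any real tool); they only illustrate
# the "align each variable at a multiple of its size" rule.

def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(variables):
    """Give each (name, size) pair an offset that is a multiple of its size."""
    offsets = {}
    next_free = 0
    for name, size in variables:
        next_free = align_up(next_free, size)  # skip padding bytes if needed
        offsets[name] = next_free
        next_free += size
    return offsets
```

For instance, layout([("c", 1), ("i", 4), ("s", 2)]) places c at 0, i at 4 (three bytes of padding are skipped), and s at 8.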

Registers

The X86 is a fairly traditional architecture with a relatively small number of registers; the 80386 had 8 general purpose registers, of which four are usually used for numeric values (%eax, %ebx, %ecx, %edx) and four for stack and pointer values (%ebp, %esp, %esi, %edi). The assembler's names for registers all begin with a percent sign % to distinguish them from ordinary symbols.

There are also 6 segment registers, used in memory address operations, a flags register, and an instruction pointer.

The Stack

The stack grows from high addresses towards low addresses. The stack pointer register is %esp. The frame pointer register is %ebp. A stack frame has the following structure:


[Figure: X86 Stack Frame]

Code Generation

Identifiers

A global identifier id in the source program translates directly to an identifier id in the generated assembly code. A local identifier id translates into an offset from the frame pointer. Be careful with negative offsets, since the memory addressed still refers to bytes extending in a positive direction from the byte referenced. For example, a 4-byte load from %ebp-4 reads the bytes at offsets -4, -3, -2, and -1 from the current frame pointer, not the bytes at -4, -5, -6, and -7. A load from %ebp+4 would indeed read the bytes at offsets 4, 5, 6, and 7.
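The point about negative offsets can be checked with a one-line model. This is only a sketch; bytes_touched is a hypothetical name used for illustration.

```python
# Hypothetical helper: lists which byte offsets (relative to %ebp) a
# memory access of the given size touches, starting at the displacement.

def bytes_touched(displacement, size):
    """Offsets of the bytes read or written by a size-byte access."""
    return list(range(displacement, displacement + size))
```

Here bytes_touched(-4, 4) gives [-4, -3, -2, -1], and bytes_touched(4, 4) gives [4, 5, 6, 7], matching the two cases described above.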

Assembler Directives

Space for uninitialized global variables is allocated with the .comm pseudoinstruction; for example, a four-byte variable named x looks like:
 .comm  x,4,4
For initialized global variables, things are more complex:
.globl x
.data
	.align 4
	.type	 x,@object
	.size	 x,4
x:
	.long 5
Space for global variables is generated one identifier at a time. An identifier id that occupies n bytes of storage with alignment multiple m is allocated as
.comm id, n, m
Examples:
C code            assembler directive
int x, a[5];      .comm x,4,4
                  .comm a,20,4
char y;           .comm y,1,1
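A code generator might produce these directives with a small helper. The sketch below assumes that the alignment of a variable equals its element size, which holds for the examples above but is an assumption, not a rule from these notes; comm_directive is a hypothetical name.

```python
# Hypothetical helper; assumes alignment == element size (true for the
# int and char examples above, not guaranteed for every type).

def comm_directive(name, elem_size, count=1):
    """Return a .comm line for count elements of elem_size bytes each."""
    total_bytes = elem_size * count
    return f"\t.comm\t{name},{total_bytes},{elem_size}"
```

With this helper, comm_directive("a", 4, 5) yields the directive for int a[5], and comm_directive("y", 1) the one for char y.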

Code should all be generated in the "text" segment (.text).
Before starting a portion of code for a function foo, generate
 .align 4
 .global foo
 .type foo,@function
foo:
Code and data can be intermixed by switching segments back and forth. The end of a function's code should be marked with a label and a .size directive, as shown here for main:
.Lfe1:
	.size	 main,.Lfe1-main
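The directives before and after a function body could be emitted as follows. This is a minimal sketch: function_header and function_footer are hypothetical names, and numbering the end labels .Lfe1, .Lfe2, and so on is an assumption based on the single example above.

```python
# Hypothetical emitters for the directives that surround a function.
# The .LfeN naming follows the one example in the notes (an assumption).

def function_header(name):
    """Lines to emit before the first instruction of function name."""
    return [
        "\t.text",
        "\t.align 4",
        f"\t.global {name}",
        f"\t.type {name},@function",
        f"{name}:",
    ]


def function_footer(name, label_number):
    """End label plus the .size directive giving the function's length."""
    end_label = f".Lfe{label_number}"
    return [f"{end_label}:", f"\t.size\t {name},{end_label}-{name}"]
```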

Parameters

Parameters all occupy space on the stack and are generally pushed from back to front (last parameter first); the first four parameters can, if they fit into 32 bits, be passed in registers instead. Data smaller than 4 bytes is passed in a 4-byte position on the stack; data larger than 4 bytes per parameter is longword aligned.

Accessing an actual parameter from within the called function consists of accessing a register (e.g. %eax) or loading memory at k(%ebp), where k is 8 + the offset of the parameter (8 + 4 * i for parameter i, counting from i = 0, if all parameters are passed as 4 bytes). The 8 skips over the saved %ebp and the return address.
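The offset formula can be written down directly. The sketch below assumes every parameter is passed on the stack in a 4-byte slot and that parameters are numbered from 0; param_operand is a hypothetical name.

```python
# Hypothetical helper; assumes all parameters live on the stack in
# equally sized slots and are numbered from 0.

def param_operand(i, slot_size=4):
    """Assembly operand for parameter i: 8 + slot_size * i off %ebp."""
    return f"{8 + slot_size * i}(%ebp)"
```

So the first parameter is 8(%ebp), the second 12(%ebp), and so on.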

Size Conversions

As on most CISC architectures, there are 8- and 16-bit versions of many of the instructions. The instruction ``movsbl src, reg'' loads the source byte from memory and sign-extends it into a 32-bit register value.

A corresponding sequence, "movb i, %al" followed by "movb %al,c", stores the low byte of a 32-bit value i into a single byte of memory c, accomplishing the conversion in the other direction.
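The effect of the two conversions on bit patterns can be modelled in a few lines. This is an illustrative sketch of the arithmetic only, not assembler output; the function names are hypothetical (movsbl_model is named after the instruction it imitates).

```python
# Hypothetical models of the two conversions, working on unsigned
# Python integers that stand for 8-bit and 32-bit patterns.

def movsbl_model(byte_value):
    """Sign-extend an 8-bit pattern to a 32-bit pattern, like movsbl."""
    byte_value &= 0xFF
    if byte_value & 0x80:          # sign bit set: fill the upper 24 bits
        return byte_value | 0xFFFFFF00
    return byte_value


def low_byte(word):
    """Keep only the low 8 bits of a 32-bit pattern, like storing %al."""
    return word & 0xFF
```

For example, the byte 0xFF (-1 as a signed char) widens to 0xFFFFFFFF, while 0x12345678 narrows to 0x78.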

Translating Assignment Statements

Note: for simplicity, the operations below are all translated in terms of 32-bit operands. For character operations the "l" suffix of the instruction is replaced by "b"; any C language mixed-type arithmetic should result in explicit size conversions per the preceding section.

Most of these operations use temporary registers, usually %eax, %ecx, and %edx.

Global operands

Global values have addresses computed as constants at compile time; such 32-bit values are loaded into a register by name.
x := y + z        movl y,%eax
                  movl z,%edx
                  leal (%edx,%eax),%ecx
                  movl %ecx,x             (if x is a global)

Local operands

If in memory, locals and temporaries are accessed by displacing off of %ebp with progressively smaller numbers, e.g. starting with -4(%ebp), -8(%ebp), and so forth. x, y, and z in the code below are constant offsets (such as -4) that the compiler computes and defines for a given local. Some locals may already be in registers and require no load instructions.
x := y + z        movl y(%ebp),%eax
                  movl z(%ebp),%edx
                  leal (%edx,%eax),%ecx
                  movl %ecx,x(%ebp)       (or mark %ecx as holding x)
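The global and local cases differ only in how each operand is written, so one emitter can serve both. The sketch below is a minimal illustration under the assumption that the compiler keeps a dictionary mapping local names to their %ebp offsets; operand, gen_add, and local_offsets are hypothetical names.

```python
# Hypothetical emitter for x := y + z. local_offsets maps local names
# to their (assumed precomputed) offsets from %ebp; any name not in
# the dictionary is treated as a global.

def operand(name, local_offsets):
    """Globals are referenced by name, locals by displacement off %ebp."""
    if name in local_offsets:
        return f"{local_offsets[name]}(%ebp)"
    return name


def gen_add(x, y, z, local_offsets):
    """Emit the four-instruction pattern used in the tables above."""
    return [
        f"\tmovl {operand(y, local_offsets)},%eax",
        f"\tmovl {operand(z, local_offsets)},%edx",
        "\tleal (%edx,%eax),%ecx",
        f"\tmovl %ecx,{operand(x, local_offsets)}",
    ]
```

Calling gen_add("x", "y", "z", {}) gives the all-global sequence, while passing {"x": -4, "y": -8, "z": -12} gives loads from -8(%ebp) and -12(%ebp) and a store to -4(%ebp).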

Translating Jumps

A label is simply an identifier. Labels can be stored as integers in intermediate code and written out prefixed by ".L" (.Ln for label number n). Gotos are simply unconditional jump (jmp) instructions. Pipelining causes branches to be expensive.
goto L                 jmp .Ln
if x op y goto L       movl x,%eax
                       cmpl y(%ebp),%eax
                       jcc .Ln
where cc is a condition code, one of
e     equal
ne    not equal
l     less than
le    less than or equal
ge    greater than or equal
g     greater than
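Choosing the condition code is a table lookup on the relational operator. The sketch below assumes the intermediate code spells operators the way C does and that x and y are already formatted as assembly operands; CONDITION_CODES, gen_goto, and gen_if_goto are hypothetical names. The unconditional case uses the x86 jmp instruction.

```python
# Hypothetical emitters for jumps. Assumes C-style operator spellings
# in the intermediate code and integer label numbers.

CONDITION_CODES = {
    "==": "e", "!=": "ne", "<": "l", "<=": "le", ">=": "ge", ">": "g",
}


def gen_goto(label_number):
    """Unconditional jump to label .L<label_number>."""
    return [f"\tjmp .L{label_number}"]


def gen_if_goto(x, op, y, label_number):
    """Load x, compare it with y, and jump if 'x op y' holds."""
    return [
        f"\tmovl {x},%eax",
        f"\tcmpl {y},%eax",
        f"\tj{CONDITION_CODES[op]} .L{label_number}",
    ]
```

For example, gen_if_goto("x", "<", "-8(%ebp)", 3) ends with the instruction jl .L3.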

Arrays

Global Arrays

The ith element x[i] of global array x, whose elements are n bytes wide, is read or written by adding the offset n * i to the base address. For constant indices things are simple; for example, with 4-byte elements x[2] is written as x+8. To do array subscripting with a variable index, one multiplies the index by the element size and adds it to the base address. If the element size is 2, 4, or 8, an addressing mode is available to perform the multiplication as part of the address calculation.
addr a            movl $a,%edx
x[i] = y          movl i,%edx
                  leal 0(,%edx,4),%eax
                  movl $x,%edx
                  movl y,%esi
                  movl %esi,(%eax,%edx)
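Both the constant-index and variable-index cases can be sketched as small emitters. This example assumes 4-byte elements and global operands named directly, as in the table above; element_address and gen_global_array_store are hypothetical names.

```python
# Hypothetical emitters for global arrays. The store assumes 4-byte
# elements (movl) and global operands referenced by name.

def element_address(array, index, elem_size):
    """Constant index: fold the byte offset into the symbol expression."""
    offset = index * elem_size
    return f"{array}+{offset}" if offset else array


def gen_global_array_store(array, index_var, value_var):
    """Emit x[i] = y using the scaled-index sequence shown above."""
    return [
        f"\tmovl {index_var},%edx",
        "\tleal 0(,%edx,4),%eax",      # %eax = 4 * i
        f"\tmovl ${array},%edx",       # %edx = base address of the array
        f"\tmovl {value_var},%esi",
        "\tmovl %esi,(%eax,%edx)",     # store at base + 4 * i
    ]
```

Here element_address("x", 2, 4) produces x+8, the example given in the text.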

Local Arrays

Computing the address of a local array is done by adding an offset to the frame pointer.
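As a small sketch of that calculation, assume the compiler has recorded the %ebp offset of the array's first element. The names local_element_operand and gen_local_array_address are hypothetical.

```python
# Hypothetical helpers; base_offset is the (assumed known) offset of
# element 0 of the local array from %ebp, e.g. -20 for int a[5].

def local_element_operand(base_offset, index, elem_size):
    """Operand for a constant-index element of a local array."""
    return f"{base_offset + index * elem_size}(%ebp)"


def gen_local_array_address(base_offset):
    """Load the array's base address into %eax with leal."""
    return [f"\tleal {base_offset}(%ebp),%eax"]
```

For an int array whose first element is at -20(%ebp), element 2 is at -12(%ebp).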

Procedures

Entering a Procedure

On entering a procedure, the first instructions generally save the old frame pointer register %ebp, and then set the new %ebp to the current top of stack. The stack pointer is then decremented by the amount of local variable space needed in the procedure.
  pushl %ebp
  movl %esp,%ebp
  subl $n,%esp
where n is the space needed by the stack frame, computed as 4 + space for locals and temporaries; a typical value is something like 48.
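The entry sequence is fixed apart from n, so it can be emitted by a single function. The sketch follows the rule for n stated above; gen_prologue is a hypothetical name.

```python
# Hypothetical emitter for the procedure entry sequence; n is computed
# with the "4 + locals and temporaries" rule from the notes.

def gen_prologue(local_bytes):
    """Save the old frame pointer, set the new one, reserve the frame."""
    n = 4 + local_bytes
    return [
        "\tpushl %ebp",
        "\tmovl %esp,%ebp",
        f"\tsubl ${n},%esp",
    ]
```

With 44 bytes of locals and temporaries, the last line is subl $48,%esp.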

Calling a Procedure

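First, the parameters are pushed, from right to left, onto the stack. After the parameters, a call instruction does the actual call. Under the standard C calling convention, the caller removes the parameters from the stack after the call returns (for example, addl $8,%esp for two 4-byte parameters). For a call such as f(c, i) the code is:
param i           movl i,%eax
                  pushl %eax
param c           movl c,%eax
                  pushl %eax
call f            call f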
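The right-to-left push order falls out of iterating over the argument list in reverse. This sketch assumes every argument is a 4-byte value already formatted as an assembly operand; gen_call is a hypothetical name.

```python
# Hypothetical emitter for a call; arguments are given in source order
# and are assumed to be 4-byte operands such as "i" or "-8(%ebp)".

def gen_call(function, arguments):
    """Push arguments from right to left, then call the function."""
    lines = []
    for arg in reversed(arguments):
        lines.append(f"\tmovl {arg},%eax")
        lines.append("\tpushl %eax")
    lines.append(f"\tcall {function}")
    return lines
```

Calling gen_call("f", ["c", "i"]) reproduces the sequence shown for f(c, i): i is pushed first, then c.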

Return From a Procedure

On the X86, the return value is stored in %eax. Function exit consists of a leave instruction followed by a ret instruction; the C compiler's output aligns this exit code with a .p2align pseudoinstruction. A label is included; if several return statements occur within a single function body, each sets %eax and then jumps to the single copy of the exit code.
return            jmp .L2
return x          movl x,%eax
                  jmp .L2
(function exit)   .p2align 4,,7
               .L2:
                  leave
                  ret
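One way to organize this in a code generator is to emit a jump for every return statement and the shared exit code once per function. The sketch below assumes the exit label has already been chosen; gen_return and gen_exit are hypothetical names.

```python
# Hypothetical emitters for returns. exit_label is the label of the
# function's single exit sequence, e.g. ".L2" (assumed already chosen).

def gen_return(exit_label, value=None):
    """Set %eax if a value is returned, then jump to the exit code."""
    lines = []
    if value is not None:
        lines.append(f"\tmovl {value},%eax")
    lines.append(f"\tjmp {exit_label}")
    return lines


def gen_exit(exit_label):
    """The shared exit sequence, emitted once at the end of the body."""
    return ["\t.p2align 4,,7", f"{exit_label}:", "\tleave", "\tret"]
```

For example, gen_return(".L2", "x") followed by gen_exit(".L2") yields the instructions listed above for return x.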