Notes on Translating Three-Address Code to Assembly Code for the X86
These notes are no substitute for an X86 reference manual such as
the 80386 Programmer's Reference Manual, published by Intel
Corporation.
Notes on the X86 Architecture
General Information
The Intel X86 has been the dominant desktop PC processor family for over
two decades. X86 is a CISC (complex instruction set computer) chip, with
many opcodes and addressing modes, but relatively few and simple registers.
Generating code for it involves understanding the
instruction set as well as the syntax of the assembler used to
take ASCII assembly language code and produce machine code object
files (in our case, /usr/bin/as is the GNU Assembler).
Assembly code files should end with the suffix ".s". Such
files can be assembled to executable code by invoking the C compiler:
% cc -o foo foo.s
Note: running "as foo.s" will not produce an executable
module; it produces an object module (.o) that still requires linking.
For examples of the assembly code produced by the C compiler,
use "cc -S". Your compiler should behave in the same manner
as the standard C compiler, calling the assembler and linker by default.
You may wish to leave .s files around after assembly for debugging purposes,
rather than deleting them by default.
Memory Alignment
The X86 architecture does not strictly require aligned memory access, but
accessing 16- or 32-bit values at addresses that are not multiples of 2 and
4 bytes, respectively, can be much slower than aligned access, so aligning
variables at appropriate multiples of their size is recommended.
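A compiler usually enforces this by rounding each offset up to the next multiple of the variable's alignment before allocating it. A minimal sketch, assuming a hypothetical helper named align_up that is not part of these notes:

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment (alignment > 0)."""
    return (offset + alignment - 1) // alignment * alignment


# Example: after a 1-byte char at offset 4, the next 4-byte int
# should start at offset 8 rather than offset 5.
next_int_offset = align_up(4 + 1, 4)
```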
Registers
The X86 is a fairly traditional architecture with a relatively small
number of registers; the 80386 had 8 general purpose registers, of
which four are usually used for numeric values (%eax, %ebx, %ecx, %edx)
and four for stack and pointer values (%ebp, %esp, %esi, %edi).
The assembler's names for registers all begin with a
percent sign % to distinguish them from ordinary symbols.
There are also 6 segment registers, used in memory address operations,
a flags register, and an instruction pointer.
The Stack
The stack grows from high addresses towards low addresses. The stack
pointer register is %esp. The frame pointer register is %ebp.
A stack frame has the following structure:
[Figure: X86 Stack Frame]
Code Generation
Identifiers
A global identifier id in the source program translates directly to an
identifier id in the generated assembly code. A local identifier
id translates into an offset from the frame pointer. Be
careful with negative offsets, since the memory addressed still refers to
bytes extending in a positive direction from the byte referenced. For
example, a 4-byte load from %ebp-4 reads the bytes at offsets -4, -3, -2, and
-1 from the current frame pointer, not the bytes at -4, -5, -6, and -7. A load
from %ebp+4 reads memory at offsets 4, 5, 6, and 7.
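The offset assignment for locals can be sketched as follows. The function names are hypothetical, and rounding every local up to a 4-byte slot is an assumption made here for simplicity, not something these notes require.

```python
def assign_local_offsets(local_sizes):
    """Give each (name, size) local a negative offset from %ebp.

    Returns (offsets, used), where used is the total bytes reserved.
    Each slot is rounded up to a multiple of 4 bytes (an assumption).
    """
    offsets = {}
    used = 0
    for name, size in local_sizes:
        used += (size + 3) // 4 * 4
        offsets[name] = -used  # the local occupies bytes offset .. offset+size-1
    return offsets, used


def bytes_covered(offset, size):
    """Frame-pointer-relative byte offsets touched by an access of size bytes."""
    return list(range(offset, offset + size))
```

With locals int i, char c, and int a[5], this sketch places i at -4(%ebp), c at -8(%ebp), and a at -28(%ebp), so a[0] is the lowest-addressed element.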
Assembler Directives
Space for uninitialized global variables is allocated with the .comm
pseudoinstruction; for example, a four-byte variable named x looks like:
.comm x,4,4
For initialized global variables, things are more complex:
.globl x
.data
.align 4
.type x,@object
.size x,4
x:
.long 5
Space for global variables is generated one identifier at a time. An
identifier id that occupies n bytes of storage with
alignment multiple m is allocated as
.comm id, n, m
Examples:
C code              assembler directive
int x, a[5];        .comm x,4,4
                    .comm a,20,4
char y;             .comm y,1,1
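A code generator might produce these directives from its symbol table roughly as follows. The name emit_global is hypothetical, and using the element size as the alignment is an assumption that happens to match the examples above.

```python
def emit_global(name, elem_size, count=1):
    """Return a .comm directive for an uninitialized global.

    The total size is elem_size * count; alignment is assumed to equal
    the element size.
    """
    return f"\t.comm {name},{elem_size * count},{elem_size}"


# int x, a[5]; char y;
directives = [emit_global("x", 4), emit_global("a", 4, 5), emit_global("y", 1)]
```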
Code should all be generated in the "text" segment (.text).
Before starting a portion of code for a function foo, generate
.align 4
.global foo
.type foo,@function
foo:
Code and data can be intermixed by switching segments back and forth.
The end of a function's code should be marked with a label and a .size
directive:
.Lfe1:
.size main,.Lfe1-main
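Putting the header and trailer directives together, a generator could wrap each function body as in this sketch. The function name emit_function and its parameters are hypothetical; label_no is whatever counter the compiler uses to keep the .Lfe labels unique.

```python
def emit_function(name, body_lines, label_no):
    """Wrap already-generated instructions in the directives for one function."""
    lines = [
        "\t.text",
        "\t.align 4",
        f"\t.globl {name}",
        f"\t.type {name},@function",
        f"{name}:",
    ]
    lines += ["\t" + line for line in body_lines]
    lines += [
        f".Lfe{label_no}:",                       # marks the end of the code
        f"\t.size {name},.Lfe{label_no}-{name}",  # size = end label - start label
    ]
    return "\n".join(lines)
```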
Parameters
Parameters all occupy space on the stack and are generally pushed from
back to front (last parameter first). Under the default C calling convention
on 32-bit X86, every parameter is passed on the stack; some alternative
conventions instead pass the first few parameters in registers if they fit
into 32 bits. Data smaller than 4 bytes is passed in a 4-byte position on the
stack; data larger than 4 bytes per parameter is longword aligned.
Accessing an actual parameter from within the called function consists
of accessing a register (e.g. %eax) or
loading memory k(%ebp), where k is 8 + the offset of the parameter
(8 + 4 * i for parameter i, counting from 0, if all parameters are passed as
4 bytes). The 8 skips the saved %ebp and the return address.
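The offset formula can be captured in a one-line helper. This is a sketch for stack-passed parameters only; the name param_operand is hypothetical.

```python
def param_operand(i, word=4):
    """Assembly operand for stack parameter i (0-based), all parameters word bytes."""
    # 4 bytes of saved %ebp plus 4 bytes of return address lie below the parameters.
    return f"{8 + word * i}(%ebp)"
```

For a function f(c, i), this gives 8(%ebp) for c and 12(%ebp) for i.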
Size Conversions
As on most CISC architectures, there are 8- and 16-bit versions of many of
the instructions.
The instruction "movsbl src, reg" loads the source byte from memory and
sign-extends it into a 32-bit register value.
In the other direction, the pair "movb i, %al" followed by
"movb %al,c" stores the low byte of
a 32-bit value i into a single byte of memory c.
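To check what these conversions compute, the arithmetic can be modeled in a few lines. This sketch only simulates the values involved; the function names mirror the instructions but are otherwise hypothetical.

```python
def movsbl(byte):
    """Sign-extend an 8-bit pattern to a signed 32-bit value, as movsbl does."""
    byte &= 0xFF
    return byte - 0x100 if byte & 0x80 else byte


def low_byte(value):
    """The 8-bit pattern that a movb store of value's low byte writes to memory."""
    return value & 0xFF
```

Storing an int as a char and loading it back preserves the value only when it fits in the signed range -128..127; for example, movsbl(low_byte(200)) is -56, not 200.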
Translating Assignment Statements
Note: for simplicity, the operations below are all translated in terms
of 32-bit operands. For character operations the "l" suffix on
the instruction is replaced by "b"; any C language mixed-type arithmetic
should result in explicit size conversions per the preceding section.
Most of these operations use temporary registers, usually %eax, %ecx, and
%edx.
Global operands
Global values have addresses computed as constants at compile time;
such 32-bit values are loaded into a register by naming the global directly.

x := y + z      movl y,%eax
                movl z,%edx
                leal (%edx,%eax),%ecx
                movl %ecx,x             (if x is a global)
Local operands
If in memory, locals and temporaries are accessed by displacing off of
%ebp with progressively smaller numbers, e.g. starting with -4(%ebp),
-8(%ebp), and so forth. In the code below, x, y, and z are constant offsets
(such as -4) that the compiler computes and defines for a given local.
Some locals may already be in registers and require no load instructions.

x := y + z      movl y(%ebp),%eax
                movl z(%ebp),%edx
                leal (%edx,%eax),%ecx
                movl %ecx,x(%ebp)       (or mark %ecx as holding x)
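The global and local cases differ only in how each operand is written, so one routine can cover both. A minimal sketch, assuming a hypothetical frame_offsets dictionary that maps local names to their %ebp offsets; any name not in it is treated as a global.

```python
def operand(name, frame_offsets):
    """Write a variable as an assembly operand: offset(%ebp) if local, else the symbol."""
    if name in frame_offsets:
        return f"{frame_offsets[name]}(%ebp)"
    return name


def gen_add(x, y, z, frame_offsets):
    """Translate the three-address instruction x := y + z."""
    return [
        f"movl {operand(y, frame_offsets)},%eax",
        f"movl {operand(z, frame_offsets)},%edx",
        "leal (%edx,%eax),%ecx",  # %ecx = %edx + %eax without disturbing either
        f"movl %ecx,{operand(x, frame_offsets)}",
    ]
```

This version always loads and stores through memory; a register allocator would skip the loads for operands already held in registers.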
Translating Jumps
A label is simply an identifier. Labels can be stored as integers in
intermediate code and written out prefixed by ".L". Gotos are simply
unconditional jump (jmp) instructions. Pipelining causes branches to be
expensive. Below, .Ln stands for label L written out with its integer n.

goto L              jmp .Ln

if x op y goto L    movl x,%eax
                    cmpl y(%ebp),%eax
                    jcc .Ln

where cc is a condition code, one of

e       equal
ne      not equal
l       less than
le      less than or equal
ge      greater than or equal
g       greater than
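Selecting the condition code is a table lookup on the relational operator. The sketch below assumes signed comparisons and operands that are already formatted as assembly operands; the names CC and gen_if_goto are hypothetical.

```python
# Relational operator -> X86 condition code suffix (signed comparisons).
CC = {"==": "e", "!=": "ne", "<": "l", "<=": "le", ">=": "ge", ">": "g"}


def gen_if_goto(x, op, y, label_no):
    """Translate 'if x op y goto L', where label_no is L's integer."""
    return [
        f"movl {x},%eax",
        f"cmpl {y},%eax",  # AT&T order: sets flags from %eax - y, i.e. x compared to y
        f"j{CC[op]} .L{label_no}",
    ]
```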
Arrays
Global Arrays
The value of the ith element x[i] of global array x, whose elements
are n bytes wide, is read or written by adding an offset to the base
address. For constant indices things are simple; for example, with 4-byte
elements x[2] is written as x+8. To do array subscripting with a variable
index, one multiplies the index by the element size and adds it to the base
address. If the element size is 2, 4, or 8, an addressing mode is
available to multiply it directly as part of the address calculation.
addr a          movl $a,%edx

x[i] = y        movl i,%edx
                leal 0(,%edx,4),%eax
                movl $x,%edx
                movl y,%esi
                movl %esi,(%eax,%edx)
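Both cases can be sketched in a generator as follows. Note that gen_store_word deliberately uses a shorter sequence than the one shown above: it folds the multiplication and the base address into a single scaled-index operand. The function names are hypothetical, and the store is limited to 4-byte elements.

```python
def const_element(array, index, elem_size):
    """Operand for array[index] of a global array when index is a constant."""
    return f"{array}+{index * elem_size}"


def gen_store_word(array, index_operand, value_operand):
    """Translate x[i] = y for a global array of 4-byte elements."""
    return [
        f"movl {index_operand},%edx",
        f"movl {value_operand},%eax",
        f"movl %eax,{array}(,%edx,4)",  # address = array + %edx * 4
    ]
```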
Local Arrays
Computing the address of a local array is done by adding an offset to
the frame pointer.
Procedures
Entering a Procedure
On entering a procedure, the first instructions generally save
the old frame pointer register %ebp, and then set the new %ebp
to the current top of stack. The stack pointer is then decremented
by the amount of local variable space needed in the procedure.
pushl %ebp
movl %esp,%ebp
subl $n,%esp
where n is the space needed by the stack frame, computed as 4
+ space for locals and temporaries (typically some number like 48).
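A generator might compute n and emit the entry sequence as in this sketch. The names frame_size and gen_prologue are hypothetical; the formula is taken directly from the description above.

```python
def frame_size(local_bytes, temp_bytes):
    """n = 4 + space for locals and temporaries, per the formula in these notes."""
    return 4 + local_bytes + temp_bytes


def gen_prologue(n):
    """Entry sequence for a procedure that needs n bytes of frame space."""
    return [
        "pushl %ebp",       # save the caller's frame pointer
        "movl %esp,%ebp",   # new frame pointer = current top of stack
        f"subl ${n},%esp",  # reserve space for locals and temporaries
    ]
```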
Calling a Procedure
First, the parameters are pushed, from right to left, onto the stack.
After the parameters are pushed, a call
instruction does the actual call. For a call such as
f(c, i) the code is:

param i         movl i,%eax
                pushl %eax
param c         movl c,%eax
                pushl %eax
call f          call f
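The call sequence generalizes to any number of arguments by walking the argument list in reverse. This sketch uses a hypothetical gen_call helper and assumes every argument is a 4-byte operand. The final addl, which pops the arguments after the call returns, is not shown in the table above; it is included on the assumption that the default C convention (caller cleans up the stack) is in use.

```python
def gen_call(func, args):
    """Push args right to left, call func, then pop the arguments."""
    lines = []
    for arg in reversed(args):
        lines += [f"movl {arg},%eax", "pushl %eax"]
    lines.append(f"call {func}")
    if args:
        # Assumption: caller-cleanup convention, 4 bytes per argument.
        lines.append(f"addl ${4 * len(args)},%esp")
    return lines
```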
Return From a Procedure
On the X86, the return value is stored in %eax. Function exit
consists of a leave instruction followed by a ret instruction. Compiler
output typically aligns the exit label with a .p2align pseudoinstruction;
this is a speed optimization for jump targets rather than a requirement of
leave. A label is included; if several return statements occur within a
single function body, each will set %eax and then jump to a single
block of exit code.

return          leave
                ret

return x        movl x,%eax
                jmp .L2
                .p2align 4,,7
        .L2:
                leave
                ret
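One way to arrange the shared exit code is to translate each return statement into a jump and emit the exit block once at the end of the function. A minimal sketch with hypothetical helper names; exit_label is whatever label the compiler reserved for the function's exit.

```python
def gen_return(value, exit_label):
    """Translate 'return' (value is None) or 'return x' (value is an operand)."""
    lines = []
    if value is not None:
        lines.append(f"movl {value},%eax")  # return value goes in %eax
    lines.append(f"jmp {exit_label}")
    return lines


def gen_epilogue(exit_label):
    """The single exit block, emitted once per function."""
    return [f"{exit_label}:", "leave", "ret"]
```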