AVR Stack Operations
Stacks and the SP "Register"
A stack is a data structure that holds data items. It is a special type of data structure, because it is very limited in the way you can access the elements. You can push a data item onto the stack, or you can pop an item off of the stack. Think of stacking blocks one on top of each other. When you push (or place) a block onto the stack, it is at the top of the stack. When you pop (or take) a block off of the stack, you always take the one at the top. So, the last block on is the first one off. This property is known as LIFO, or Last In, First Out. It is the basic characteristic of a stack.
The AVR, like all processors, has explicit support for a stack. This support is in the form of an SP (stack pointer) register, which keeps the address of the top of the stack, and special machine instructions to manipulate the stack. On the AVR the SP "register" is actually two I/O address locations (kind of like the SREG condition flag "register"). The SP-low-byte is at I/O address 0x3d and the SP-high-byte is at 0x3e.
You can use the IN and OUT instructions to read and write the SP value. To push and pop arbitrary registers on and off the stack, the instructions PUSH and POP are available. But there are other instructions that use the stack too, that we'll get to below.
An important thing to know about the AVR is that the stack grows toward lower addresses in memory -- that is, the address of the top item of the stack gets smaller as the stack gets fuller. For some, they might consider this downward! We normally think of stacks growing upward, and so we typically draw memory with the largest address on the bottom and the smallest on top.
The startup code in the Arduino environment initializes the SP to the largest possible address. For our Arduino board this is address 0x08FF, because we have 2KB of data memory (SRAM), and the memory starts at address 0x0100 since addresses 0x0000 to 0x00FF are used for memory-mapped I/O addresses.
The compiler allocates global variables and other static data from the lowest possible address, down at 0x0100. If the stack ever grows so full that it reaches the global data and overwrites it, that is a serious program failure!
Procedures, or Subroutines
The AVR support procedures with the special instructions CALL/RCALL and RET, which respectively call a procedure and return from it. These operations perform a jump in that they jump to someplace else in your code other than continuing sequental execution, but they also use the stack to coordinate those jumps. In particular, a CALL or RCALL instruction jumps to the procedure being called and pushes the return address on the stack. The return address is the address of the instruction below the call instruction, which is where the procedure should return to when it is done.
The RET instruction pops the return address off of the stack and places it into the PC register, thus effecting a jump back to the caller, at the location where the program should continue.
If a procedure (subroutine) has parameters, on the AVR the parameters are placed in registers r25-r8 (see the register usage conventions), and then if there are more parameters, these are placed on the stack before the call instruction is executed. It is your job as the programmer to put the correct number of parameters onto the stack, and also to remove them from the stack after the procedure is done.
The Stack Frame Pointer
If some of the arguments are passed on the stack, or if some local variables are created, then the procedure must have some mechanism for accessing those. The typical way to do this on most processors, and on the AVR, is to copy the SP register into an index register, and then use indexed addressing to access the necessary items.
On the AVR, the convention is to use the Y index register (r29:r28) for this purpose. We call the register used to hold a copy of SP the frame pointer, since it points to the stack frame (more formally called the activation record) of the procedure. The normal way to copy the SP to the Y index register is simple:
in r28, 0x3d
in r29, 0x3e
Note that you could also use the "LDS" instruction, but in that case you would have to use the equivalent memory addresses of 0x5d and 0x5e.
Once the SP is copied to the Y, then the key realization when thinking about how a procedure accesses things in its activation record is that it does not pop them off the stack but simply accesses them in place, where they are within the stack!
We can do this with the special indirect addressing instruction "LDD", which is "indirect with displacement". This allows a constant unsigned offset to be applied to the address in the index register. We need to simply count out how many memory locations away from Y are the arguments or local variables that we want to access, and then use "Y+q" where q is that constant offset.
For example, if I had a function with 11 1-byte arguments, the first nine would be passed in the even-numbered registers from r24 down to r8. The last two would be passed on the stack. If I am going to use the Y register then I have to save it first, so the beginning of my function would look like:
push r28
push r29
in r28, 0x3d
in r29, 0x3e
So for the rest of my function the stack would look like:
Y+Offset Contents
Y+0 --empty-- Y+1 saved r29 Y+2 saved r28 Y+3 return addr-low Y+4 return addr-high Y+5 argument 10 Y+6 argument 11
So using the LDD instruction we can access argument 10 as "Y+5" and argument 11 as "Y+6".
Call-by-Value and Call-by-Reference
In high level programming languages, there are generally two mechanisms for passing parameters to procedures/functions. These are call-by-value and call-by-reference. Other mechanisms do exist (such as call-by-name), but we will not discuss them here.
With call-by-value, the value of the argument is passed to the procedure. On the AVR this means that the value is either loaded into the correct parameter register or is pushed onto the stack. The procedure can then access the value of the argument from the register directly or from the stack using indexed addressing as mentioned above. Call-by-value is good because the procedure cannot alter any of the variables that were used as arguments, because all the procedure gets is a copy of the current value, and doesn't know where it came from. The procedure can use its parameter as a "local variable", and can assign to it and change it. But this only changes the copy on the stack, and not anywhere else.
Call-by-reference, as its name implies, is not so safe. With this mechanism, the procedure actually gets a reference to the variable that is being used as an argument in the procedure call. Now the procedure can actually change the variable outside of itself. To do this, we need to pass the address of the variable to the procedure, not its value!. Of course, addresses on the AVR are 16 bits, so call by reference will always use 2 bytes -- either 2 registers or 2 bytes on the stack.
To access the variable in the procedure itself, we need to copy the address into an index register (usually X or Z), and then use indexed addressing with that address to access the variable.
Saving Registers
An important decision to make when writing a procedure or routine is whether or not the caller of the routine can assume that the values they had in the registers are still there after the routine returns. Since the routine needs to use the same registers (it is the same CPU, after all), one can imagine just telling the caller that they need to reload the registers with values because the registers are corrupted, having been used by the routine to do whatever it needed.
This is generally a frowned-upon solution. What if the routine only uses one register? How does the caller know which registers are corrupted and which aren't? If you told the caller, then how do you keep all the differences between the routines straight while you are programming?
On the AVR we are following the register usage conventions of the gcc compiler, and it designates that some registers are freely available for use, while others are "protected" and are not allowed to be modified by the procedure. If we want to use a protected register we must first save its original value, and after we are done with it we must restore the original value.
Where do we save the registers? Well, on the stack, of course. The first thing to do in your routine is to push the protected registers you will use onto the stack. Then, just before your RET to return from the routine, you pop the values back into the registers. Important: the pop order is exactly reverse of the push order!
Local Variables
Procedures need local variables to really be of use, and they must have the same properties as local variables in high-level programming languages. That is, they are unique to the specific call to the procedure, and do not occupy space if the procedure isn't being executed. Each call needs its own copies of local variables, so that recursion works.
How to make this work? The answer is, of course, to put the local variables on the stack. We do this at the top of the procedure, right after we have saved some of the registers on the stack. To "create" local variables, we need to open space up on the stack; we don't really care what initial values are in that space. How to do this? Well, the answer is: whatever way is easiest!
Because the SP is out in the I/O address space, if we wanted to simply decrement the SP by a certain amount, say 4 for 4 bytes of local variable space, we could do:
in r24, 0x3d
in r25, 0x3e
subiw r24, 4
out 0x3d, r24
out 0x3e, r25
This grabs the current SP, subtracts 4 from it, and stores it back. But that's pretty difficult! We could also do:
push r0
push r0
push r0
push r0
Here we don't really care what value is in r0, we're just using it to complete the push instruction; 4 pushes creates 4 bytes of space. This is good for a few bytes of local variable space; obviously if we needed 50 bytes of local variables, the subtract method would be better. A really wierd mechanism that the compiler will generate to allocate even numbers of variable bytes is:
rcall .+1
rcall .+1
The above also allocates 4 bytes of local variable space -- how? Well remember that an RCALL pushes the return address on the stack, which is 2 bytes. The "." in the operand means "current instruction address", and so each of the instructions is "calling" the next instruction, which has no effect! Well, none except pushing 2 bytes on the stack!
Note that the "indirect plus displacement" addressing modes on the Y register (in the manual as LDD) only allow an unsigned displacement. This means that transferring the SP to the Y register must be done after local variable space is created, so that the offsets are positive.
Return Values
In computer science, we generally use the term "procedure" to refer to a sub-program that does not return a value. We use "function" to refer to a sub-program that does return a value to its caller. In C, almost all sub-programs are functions, even if all they return is an integer status code that indicates whether the computation succeeded or not.
On most CPUs, functions that return an integer or something like it (i.e., anything that fits into an integer-sized register) implement the return value by just leaving it in a register. The AVR does this as well, designating R24 as the 8-bit return register (R25 must be 0), and R25:R24 for a 16-bit return value.
If a function needs to return something else, like a large data structure or something bigger than an integer, several possibilities can be considered. If an address is the same size as an integer, it can leave the address of the actual data in the agreed-upon return value register, and then the caller can copy the data from that address. Or it can place the return data "on the stack", in the area just above the return address. Both of these can be combined as well.