RISC-V: An Open CPU Architecture • Jonathan Cook

RISC-V is an open CPU architecture and a neat development in the history of CPUs and computing. This page is a start at collecting information and links about it, and it also has direct content to help NMSU CS 370 students use RISC-V as a compiler target language.

The official RISC-V Home is always a good place to start, but its documentation wiki is probably better for technical info. For CS 370 we are using the RARS simulator. This is based on the long-standing MARS MIPS simulator, and is a good and solid tool. You can download a pre-built jar file from the Releases page. Other tools have moved to the Resources section at the bottom of this page.

Page Sections:

The RISC-V ISA (Instruction Set Architecture)
Pseudo-Instructions
Addressing Modes
Register Names and Calling Convention
Basic Assembly Program Format
Stack Operations
Defining a Function
Function Calls
Expressions
Simple Variable Assignments and Uses
Conditionals and Loops
Complex Conditions (Logical AND/OR)
Local Variables and Arguments
Arrays
Resources

The RISC-V ISA (Instruction Set Architecture)

To program and use a CPU, it must have a well-defined interface. This is known as its Instruction Set Architecture, because it is centered around the machine instructions that the CPU actually executes in its circuits. But there is a lot of other detail around the instructions that needs to be specified to actually program and use the CPU.

What is the native operand size? RISC-V comes in both a 32-bit and a 64-bit version. In CS 370 we will use the 32-bit version.

Does a CPU expect multi-byte values to be ordered most significant byte first (big endian), or least significant byte first (little endian)? The RISC-V CPU is little endian. (Intel x86 is also little endian, but some others are big endian).
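
As a small sketch of what little endian means in practice, the snippet below stores a 32-bit word and then loads individual bytes from it. The label value and the constant 0x11223344 are made up for this illustration.

	.data
value:	.word	0x11223344
	.text
	la	t1, value	# t1 = address of the word
	lbu	t0, 0(t1)	# lowest address holds the least significant byte: 0x44
	lbu	t2, 3(t1)	# highest address holds the most significant byte: 0x11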

CPUs use registers to efficiently store values and have instructions operate on them. RISC-V has 32 registers, but many have particular names and purposes, as seen in the table in the next section below.

How does the stack work in the CPU? In RISC-V the stack grows downward (towards smaller memory addresses), and the stack pointer register stores the address of the top item on the stack, not the next available location.

Pseudo-Instructions

RISC stands for Reduced Instruction Set Computer, meaning that the actual machine instructions are a very minimal set of instructions. RISC-V is just one example of a RISC CPU architecture.

This means that it can be annoyingly difficult for humans to program in assembly language, because it might take multiple instructions to do simple things. So, the RISC-V ISA also includes pseudo-instructions that are allowed to be used in assembly programming, but that the assembler translates to one or more actual machine instructions (often just one or two). Some common ones are la (load address), li (load immediate), and mv (move/copy).

In the RARS simulator, the help screen lists both actual and pseudo instructions; we can use both in our compiler project.
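
To make this concrete, here is a sketch of how a few pseudo-instructions might expand. Each pseudo-instruction is followed by an equivalent sequence of real instructions. The exact expansion an assembler chooses can differ, so treat these as illustrations rather than what RARS necessarily emits.

	mv	t0, t1			# pseudo: copy t1 into t0
	addi	t0, t1, 0		# real: t0 = t1 + 0

	li	t0, 42			# pseudo: small constant
	addi	t0, zero, 42		# real: t0 = 0 + 42

	li	t0, 0x12345678		# pseudo: constant too big for one instruction
	lui	t0, 0x12345		# real: load the upper 20 bits
	addi	t0, t0, 0x678		# real: add in the lower 12 bits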

Addressing Modes

Every ISA needs to support three basic memory addressing modes to load values from memory:

Immediate: this is where a constant is embedded in the instruction itself; RISC-V uses this in the li (load immediate) instruction; oddly enough, this is also used in the la (load address) instruction, because the address being loaded is a constant value.
Direct: this is where an address that is embedded in the instruction is used to fetch a value from memory, or store a value to memory; this is used to access a global variable by name, such as lw t0, myVar (in RARS this form is itself a pseudo-instruction that the assembler expands for us).
Indirect: this is where a register contains the address to use to access memory to fetch/store a value; this is used for array indexing, local variable access (on the stack), and argument access (after arguments are copied onto the stack). For example, the sp (stack pointer) register contains the address of the top of the stack, and so the instruction “lw t0, 0(sp)” uses indirect addressing to access the top of the stack. RISC-V and many other ISAs allow a small constant value in front of the register name to act as an offset; 0 is the most common value, but for local variables and arguments on the stack, where we know their offset from the frame pointer (fp register), we will often use something like “8(fp)” to indicate a particular variable.
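
The three modes can be seen side by side in the sketch below, which assumes a global word labeled myVar has been declared in the .data section. Both lw instructions end up loading the same value.

	li	t0, 42		# immediate: the constant 42 is part of the instruction
	la	t1, myVar	# immediate: the address of myVar is a constant
	lw	t0, myVar	# direct: fetch the value stored at myVar
	lw	t0, 0(t1)	# indirect: fetch the value at the address held in t1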

Register Names and Calling Convention

Programming using machine instructions involves a lot of instructions that simply move (copy, actually) values between memory and registers, and from one register to another. This is because different registers are used for different things. Code in functions also needs to know which registers it can freely use, and which registers have data in them that needs to be saved first if the function is to use the register.

The table below describes the registers available in RISC-V, how they are used, and who is responsible for saving them when needed.

Register	ASM Name	Description	Saver
x0	zero	Always zero (hard-wired)	n/a
x1	ra	Return Address	caller
x2	sp	Stack Pointer	callee
x3	gp	Global Pointer	n/a
x4	tp	Thread Pointer	n/a
x5-7	t0-t2	Temporary values	caller
x8	s0/fp	Frame Pointer (saved reg)	callee
x9	s1	Saved register	callee
x10-17	a0-a7	Argument values	caller
x18-27	s2-s11	Saved registers	callee
x28-31	t3-t6	Temporary values	caller
pc	n/a	Program Counter	n/a

A register marked as “caller” saved is freely available to a function to use however it wants; but if the function calls another function and expects the register to be unchanged after the call, then it must save it somewhere (usually on the stack). A register marked “callee” saved is not freely available for use; if a function wants to change one of these registers, it must first save it somewhere; however, a function can assume that these registers will not change value when it calls some other function.
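
As a sketch of the caller-saved case, suppose a function has a value in t0 that it still needs after calling another function. The callee name someFunc is hypothetical.

	addi	sp, sp, -4	# make room on the stack
	sw	t0, 0(sp)	# save t0, since someFunc is free to overwrite it
	jal	someFunc
	lw	t0, 0(sp)	# get our value back
	addi	sp, sp, 4	# release the stack space

No such save would be needed for a value kept in s1, because someFunc is required to hand s1 back unchanged.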

The stack pointer is callee-saved in the sense that when returning from a function, whatever the function did on the stack must be undone, and the stack pointer must be back to whatever it was when the function was called.

Function return values are passed in the argument registers a0 and a1.

If a function does not make any calls to any other function (i.e., it is a leaf function), then it does not need to save the return address register. However, any function that does make calls must save its own return address, and then restore it back into the ra register before executing its ret instruction.

A function that does make calls to other functions will probably need to save its own incoming arguments so that it can use the argument registers to set up its own calls.

Basic Assembly Program Format

Assembly language is basically the textual, human-readable form of a CPU’s native machine instructions. However, to create a complete program we need to also declare data, mark functions and branch targets, and create other organization of our code and data.

So, assembly language includes not only instructions, but also directives and labels. And of course comments.

In RISC-V, and many other assembly languages, comments begin with a hash/number sign (#) and end at the end of the line. Instructions and directives are single-tab-indented, forming a column. Labels (human-created names for things) are not indented and end with a colon (:).

Labels can be on the same line as an instruction or directive, or on their own line, in which case they apply to the next directive or instruction below them. Labels are addresses! A label is a name for the address where something is in memory.

Directives begin with a dot (.), as in most assembly languages, and tell the assembler to do something in the program organization. They are not machine instructions; rather, they are instructions to the assembler program!

A sample program is:

#
# Sample RISC-V assembly program
#

	.data
.SC0:	.string	"The value is: "
val:	.word 0

	.text
#
# main program instructions
#
program:
	li	t0, 42
	sw	t0, val, t1
	la	a0, .SC0
	jal	printStr
	lw	a0, val
	jal	printInt
	li	a0, 0
	li	a7, 93
	ecall   # syscall 93: exit

# Function printStr
printStr:
	li	a7, 4
	ecall   # RARS syscall: print string
	ret

# Function printInt
printInt:
	li	a7, 1
	ecall   # RARS syscall: print int
	ret

The .data directive tells the assembler that what follows is for the data section of the program, and the .text directive tells the assembler that what follows is machine instructions. The .string directive puts the given string into data memory, while the .word directive puts the given integer into memory as a 32-bit signed value (2’s complement).

The labels .SC0 and val are the names we can use in the program to refer to the string and the integer variable, respectively. Think of val as the programmer’s choice for that variable name, and think of .SC0 as a compiler-generated name for the string constant.

The labels program, printStr, and printInt are all function names, although program is more the label for the main program, and doesn’t expect to be called, just started.

Everything else in the .text section is an assembly instruction that the assembler turns into machine code.

Stack Operations

The RISC-V ISA does not have push and pop instructions; rather, you have to modify the sp (stack pointer) yourself, and then do normal memory load and store instructions.

The convention is that the sp register is pointing to (i.e., contains the address of) the item on the top of the stack, so a push operation involves first subtracting an offset from the sp register to make room on the stack and then storing the item onto the stack. This can look like:

	addi	sp, sp, -4   # make room for a 4-byte integer (32 bits)
	sw	a1, 0(sp)       # store value in a1 into memory at sp address

Similarly, a pop operation would first copy the item on the top of the stack into a register, and then add an offset to the stack pointer to remove the space of the item from the stack. This can look like:

	lw	a1, 0(sp)       # load value in memory at sp address into a1
	addi	sp, sp, 4    # remove room of a 4-byte integer (32 bits)

Defining a Function

The functions printStr and printInt in the above example are a bit simplistic, even though they work fine. For a generic function that you are compiling from source code of a programming language, you need to generate a code prologue that saves the return address on the stack, and then generate a code epilogue that pops the return address off the stack and then executes a return instruction. For a function named myFunc this looks like:

# Function myFunc
myFunc:
	addi	sp, sp, -4
	sw	ra, 0(sp)
	# code for actual function here
	lw	ra, 0(sp)
	addi	sp, sp, 4
	ret

In practice, more complex functions allocate (“push”) more space on the stack, for saving arguments and holding local variables. Whatever else happens in between, when the lw instruction at the end restores the return address to the ra register, the stack pointer (sp register) must again be pointing at the spot where the return address was stored.

Function Calls

A function call is, essentially, a single jal instruction, short for “jump and link”. The instruction jumps to the code where the function is, and the link part is that it saves the address where it came from – the return address – in the ra register. That way the function can return back to the caller when it is done.

However, the harder part of making a function call is setting up the arguments. In RISC-V, argument values are placed in the registers a0 through a7, and then on the stack if there are more than eight arguments. It can get more complicated than that, but for our compiler project we will only handle integer-type arguments (integers and addresses), and never more than eight arguments.

We basically have just two kinds of arguments: a string (address), and a numeric expression (perhaps as simple as a constant or a variable). For a string, we just load the address into the correct argument register, e.g.,:

	la	a2, .SC1

The example above loads the address of a string with the label .SC1 into the third argument register. Since all numeric expressions leave their result in the t0 register, for this we just need to move (copy) the value into the argument register, e.g.,:

	mv	a1, t0

The instruction above copies the value in t0 into the second argument register a1.

So, for example, the function call “myFunc(“hello world!”, 76+myVar)” might look like:

	la	a0, .SC5
	# code for integer expression here
	mv	a1, t0
	jal	myFunc

Assuming that the label .SC5 is the label assigned to that string constant.

NOTE: if we treat strings as a kind of expression, then we really only have one kind of argument, and all we ever need to do is take the result of the expression, in the t0 register, and move it to the correct argument register!

Expressions

For our RISC-V compiler project, we are not going to try to do anything fancy or complicated to optimize the code for the expressions. In particular, we are not even going to try to use all of the temporary value registers t0 to t6 that the RISC-V architecture has.

We will only use the registers t0 and t1, since for binary operators we need two registers to hold the two values that the operator instruction will use.

Every expression has this one simple rule: leave its resulting value in register t0. This way it does not matter if the expression requires just one instruction (e.g., loading a numeric constant into t0) or many instructions (a whole parenthesized subexpression); we always know where the resulting value is: register t0.

For a numeric constant, this is just one “load immediate” instruction, e.g.,:

	li	t0, 42

For a variable, see the next section. If we treat strings as a kind of expression, all we need to do is load t0 with the address of the string, using the “load address” instruction, e.g.,:

	la	t0, stringlabel

For an expression that must compute a binary operation (e.g., addition), the following code template will be used (with add as the example operation):

	# code for left subexpression here
	addi	sp, sp, -4
	sw	t0, 0(sp)
	# code for right subexpression here
	lw	t1, 0(sp)
	addi	sp, sp, 4
	add	t0, t1, t0

This code computes the left subexpression’s value and pushes it on the stack, then computes the right subexpression’s value, then pops the left’s value into the t1 register (because t0 contains the right’s value). Finally the add instruction performs the addition on the two values, with the left operand (t1) first and the right operand (t0) second, and leaves the result in t0, just like it is supposed to do. The operand order does not matter for add, but it does for non-commutative operations such as sub and div.

This code will work no matter how complicated the two subexpressions are, since the stack can hold many subexpression values.
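
As a worked sketch, here is the template applied to the expression 10 - (2 + 3), which is made up for illustration. The right operand is itself a binary expression, so the template appears twice, one copy nested inside the other. Note that the popped left value (t1) goes first in the sub instruction, since subtraction is not commutative.

	li	t0, 10		# left subexpression: 10
	addi	sp, sp, -4
	sw	t0, 0(sp)	# push 10
	li	t0, 2		# right subexpression begins: left side of (2 + 3)
	addi	sp, sp, -4
	sw	t0, 0(sp)	# push 2
	li	t0, 3		# right side of (2 + 3)
	lw	t1, 0(sp)	# pop 2 into t1
	addi	sp, sp, 4
	add	t0, t1, t0	# t0 = 2 + 3 = 5
	lw	t1, 0(sp)	# pop 10 into t1
	addi	sp, sp, 4
	sub	t0, t1, t0	# t0 = 10 - 5 = 5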

Simple Variable Assignments and Uses

The right-hand side of an assignment statement is an expression, so by the rule in the section above the value to be assigned will be in register t0. All we have to do to make the variable assignment is save that value into the variable’s memory location, using the “store word” instruction, like:

	sw	t0, myvariable, t1

When storing to a label, sw is a pseudo-instruction that needs a second register as a temporary to hold the variable’s address, and so we use t1 for this purpose. This one instruction, when it follows the right-hand side’s expression evaluation code, saves the value into the variable.

To use a variable’s value, we should just follow the rule for expressions, which is to leave the value in the t0 register. For a simple variable, this is just one “load word” instruction:

	lw	t0, myvariable
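
To see why the store needs a temporary register, here is a sketch of the assignment x = y written out two ways, assuming x and y are global words declared in the .data section. The first pair uses the forms above; the second does the same store with an explicit la, which shows the address passing through t1.

	lw	t0, y		# expression: value of y ends up in t0
	sw	t0, x, t1	# assignment: store t0 into x, using t1 as scratch

	lw	t0, y		# same expression
	la	t1, x		# put the address of x into t1
	sw	t0, 0(t1)	# store t0 at that address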

Conditionals and Loops

A simple condition with a single relational operator is pretty easy. The code below assumes that the register t0 contains a value we want to compare to the constant 42:

	# if (t0 > 42) then {ifpart} else {elsepart}
	li	t1, 42
	bgt	t0, t1, ifpart
else:
	la	a0, .LC1
	jal	printStr
	b	endif		# must skip the if part
ifpart:
	la	a0, .LC0
	jal	printStr
endif:
	# done with if-else

Note that the else part is above the if part! This is so that the branch instruction is the same as the relational operator in the condition: we branch on the true condition to the if part, and we fall through on the false condition to the else part. The if-part and else-part both just print out a string message by calling a function. Note also that the else part must end with a branch that jumps around the if part.

Loops are most easily done with the condition on the bottom, since this way, like the if-else above, the true condition matches the relational condition in the code. Without any help, this is a do-while loop; to make it a while or for loop, we need to jump down to the condition before entering the loop, as below:

	# while (t0 > 0) {loop-body; t0--;}
	b	loopcond
looptop:
	la	a0, .LC0
	jal	printStr
	mv	a0, t0
	jal	printInt
	la	a0, .LC1
	jal	printStr
	addi	t0, t0, -1
loopcond:
	li	t1, 0
	bgt	t0, t1, looptop
done:
	# done with loop
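
Putting the loop template together with global variables gives a small complete program. The sketch below sums the integers 1 through 10; the variable names sum and i are made up, and the syscall numbers are the RARS ones used in the sample program earlier on this page.

	.data
sum:	.word 0
i:	.word 1

	.text
	# for (i = 1; i <= 10; i++) sum = sum + i;
	b	loopcond
looptop:
	lw	t0, sum
	lw	t1, i
	add	t0, t0, t1
	sw	t0, sum, t1	# sum = sum + i
	lw	t0, i
	addi	t0, t0, 1
	sw	t0, i, t1	# i++
loopcond:
	lw	t0, i
	li	t1, 10
	ble	t0, t1, looptop	# true condition branches back to the top
	lw	a0, sum
	li	a7, 1
	ecall			# RARS syscall: print int (prints 55)
	li	a0, 0
	li	a7, 93
	ecall			# syscall 93: exit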

Complex Conditions (Logical AND/OR)

Recall that the logical operators AND and OR, often && and || in programming languages, are, in most languages, short-circuited operators. This means that if the left-hand side of the operator already determines the operator’s outcome, the right-hand side is not evaluated (and indeed is not allowed to be evaluated).

For an AND operator, if the left-hand side is false, the whole expression must be false; for an OR operator, if the left-hand side is true, the whole expression must be true; in these cases, the right-hand side is skipped.

There is no instruction that encodes a logical operator; short-circuiting is implemented in assembly language entirely with control flow, using conditional branches. How is this done? The left-hand side subexpression is evaluated, and then a conditional branch either branches to the short-circuit case or falls through (does not branch) to the right-hand side evaluation.

Below is an example:

	# assumes we have variables x, y, z
	# if (x < 42 && y == 7) then {
	# z = 10; } else { z = 20; }
	lw	t0, x
	li	t1, 42
	bge	t0, t1, else	# short-circuit branch
	lw	t0, y
	li	t1, 7
	bne	t0, t1, else
ifpart:
	li	t0, 10
	sw	t0, z, t1
	b	endif		# must skip the else part
else:
	li	t0, 20
	sw	t0, z, t1
endif:
	# done with if-else
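
The OR case follows the same pattern, except that the short-circuit branch goes to the if part when the left-hand side is true. A sketch, assuming the same variables x, y, and z:

	# if (x < 42 || y == 7) then {
	# z = 10; } else { z = 20; }
	lw	t0, x
	li	t1, 42
	blt	t0, t1, ifpart	# short-circuit branch: left side is true
	lw	t0, y
	li	t1, 7
	bne	t0, t1, else	# both sides false: go to the else part
ifpart:
	li	t0, 10
	sw	t0, z, t1
	b	endif		# must skip the else part
else:
	li	t0, 20
	sw	t0, z, t1
endif:
	# done with if-else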

Local Variables and Arguments

In RISC-V, arguments (actual parameter values when a function is called) are placed in the argument registers a0 through a7 (if more are needed, the stack is used). But as soon as our function has to make a function call inside itself, we need these same argument registers!

So, for non-leaf functions, we must save our own arguments somewhere else so that we can use the argument registers. Where to save them? On the stack, of course! So instead of just making space for saving the return address (ra register) on the stack, we must make room for our arguments and save them on the stack, too. Just like the return address, we use the sw (store word) instruction to store the register value into memory on the stack.

Local variables must be on the stack, too. This is for two reasons. One, local variables should not take up space when a function is not being used. Two, if a function is recursive, each invocation must have its own copies of the local variables. The stack is the natural place to do this.

As explained near the top of this page in the Addressing Modes section, accessing values on the stack is done using indirect addressing. Because expressions might use the stack inside the function body, we need another register to hold an address that is in a fixed place on the stack, so that we have a consistent and fixed reference point for our local variables and arguments. We use the fp (frame pointer) register for this; once we set up our stack space, we just copy (move) the stack pointer into the frame pointer, and then leave it this way until we leave the function. But the fp register needs to be saved first, because it is a callee-saved register.

Below is an example:

# Function myFunc (int arg1, int arg2)
# - and with two local vars: int x, int y
myFunc:
	addi	sp, sp, -24	# space for 6 integer (4-byte) values
	sw	ra, 0(sp)	# save the return address
	sw	fp, 4(sp)	# save the frame pointer
	mv	fp, sp		# copy sp to fp to set up our frame pointer
	sw	a0, 8(fp)	# save arg1 to stack space 8(fp)
	sw	a1, 12(fp)	# save arg2 to stack space 12(fp)
	# code for actual function body begins here
	# -- lots of code, depending on statements
	lw	t0, 16(fp)	# example read of local var x
	lw	t0, 20(fp)	# example read of local var y
	lw	t0, 8(fp)	# example read of arg1
	lw	t0, 12(fp)	# example read of arg2
	sw	t0, 16(fp)	# example write of local var x
	sw	t0, 20(fp)	# example write of local var y
	sw	t0, 8(fp)	# example write of arg1
	sw	t0, 12(fp)	# example write of arg2
	# code for actual function body ends here
	# now we restore everything and return from function
	mv	sp, fp		# restore sp to fp pos, just in case func body erred
	lw	ra, 0(fp)	# restore return address
	lw	fp, 4(fp)	# restore frame pointer
	addi	sp, sp, 24	# remove stack space
	ret

The example above creates just enough space on the stack to hold the return address, the frame pointer, the two arguments, and the two local variables. This is great, but to simplify your compiler I would recommend allocating enough space for everything we want to do in our assignments, and then using a constant starting position for local variables. In our assignments, you can just subtract 128 from the stack pointer – this gives you enough room for 32 4-byte values. The first six will be saved for (up to) six argument values, and the rest will be for local variables. So your first argument will be at “0(fp)”, your sixth argument will be at “20(fp)”, and your first local variable will always be at “24(fp)”. Your second local variable will be at “28(fp)”, and so on. Note that these offsets count from the start of the argument area; if you keep the saved ra and fp at “0(fp)” and “4(fp)” as in the example above, every offset shifts up by 8 (first argument at “8(fp)”, first local variable at “32(fp)”). We will never test with so many local variables that you need more than 128 bytes of stack space.

Arrays

We are only handling global arrays in our compiler assignments, so that we can use the array name to start the access. Unlike simple global variables that we can load and store by name, for arrays we must use the name as the starting address of the array, and then add the correct offset to this address for the index value being used.

Remember, an array index is an expression, and expressions always leave their value in register t0. But this is the index, not the memory offset. For integer (4-byte value) arrays, we need to multiply the index value by 4 in order to get the correct memory address offset. This is easily done by using a shift left instruction; each bit position shift multiplies a value by 2.

So an array element read can look like this:

	# code for index expression above here
	slli	t0, t0, 2	# shift left two positions (*4)
	la	t1, arrayName	# load t1 with starting address
	add	t1, t1, t0	# add index offset to starting address
	lw	t0, 0(t1)	# load array element value into t0

Saving a value into an array element is a bit more work, since the value we want to save is occupying register t0. The answer, of course, is to save it on the stack!

	# code for new value expression above here
	addi	sp, sp, -4	# save value on stack
	sw	t0, 0(sp)
	# code for index expression above here
	slli	t0, t0, 2	# shift left two positions (*4)
	la	t1, arrayName	# load t1 with starting address
	add	t1, t1, t0	# add index offset to starting address
	lw	t0, 0(sp)	# restore value to save from stack
	addi	sp, sp, 4
	sw	t0, 0(t1)	# save value in t0 into array element
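
For completeness, here is a sketch of a small program that declares a global array and reads one element. The array contents and the constant index are made up; a real compiler would generate code for an arbitrary index expression in place of the li.

	.data
arrayName:	.word	10, 20, 30, 40	# four 4-byte integers

	.text
	li	t0, 2		# index expression: 2
	slli	t0, t0, 2	# shift left two positions (*4), offset is 8
	la	t1, arrayName	# load t1 with starting address
	add	t1, t1, t0	# add index offset to starting address
	lw	t0, 0(t1)	# t0 = arrayName[2] = 30
	mv	a0, t0
	li	a7, 1
	ecall			# RARS syscall: print int (prints 30)
	li	a0, 0
	li	a7, 93
	ecall			# syscall 93: exit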

Resources

This Medium post is a RISC-V assembly tutorial. A RISC-V Assembly Manual is available; it looks to be unofficial but very comprehensive.

A RISC-V Assembly and Programming Textbook has some good online content. A decent RISC-V Assembly Reference. An open-access computer organization textbook using RISC-V has an appendix with example programs.

As stated at the top, for CS 370 we are using the RARS simulator. This is based on the long-standing MARS MIPS simulator, and is a good and solid tool. You can download a pre-built jar file from the Releases page.

Several online simulation environments exist, including: BRisc-V, Kite, and one at Cornell.

Many functional RISC-V instruction simulators take a compiled binary as input, not the textual assembly language, which means you need an entire RISC-V toolchain to compile or assemble code into a correct binary executable file. I just want a simple assembly simulator!

The main RISC-V site lists simulators (and other tools).

The project Masimulator may provide near-assembly simulation. It uses (and embeds?) an assembler to produce a binary, but this simple toolchain may not be a bad way to go.

PyRISC is a python-based RISC-V simulator, but I am pretty sure it only uses binary executables as input, and so requires a compiler/assembler toolchain.

TinyFive simulates RISC-V instructions in Python, but seems oriented to AI-type applications, focusing on numerical operations in neural nets.

MARSS-RISC-V may be an option. MARSS has long been a decent micro-architectural simulator (for x86, as far as I can tell, rather than MIPS), so this RISC-V version should be good.

This RISC-V ISA Model seems to be someone’s beginning attempt at some Python tool. It contains the defined components of the ISA, but that seems to be it. It might be something to build on.