Registers - ezasm-org/ezasm GitHub Wiki
In a register-based programming language, it is important to understand what each register is intended for.
The Temporary and Saved Register Convention
All registers in EzASM work the same, but when collaborating on a project, it is best to have a shared understanding about which register to use in any given circumstance. A great example of this is the distinction between temporary and saved registers. The basic idea is very simple: by convention, a saved register holds the same value after a function call as it did before, while a temporary register may have been overwritten by the function that was called.
Saved registers are the $s registers $s0 ... $s9 and $fs0 ... $fs9.
Temporary registers are the $t registers $t0 ... $t9 and $ft0 ... $ft9.
Example
```
main:
add $t0 0 4 # load 4 into $t0
add $s0 0 4 # load 4 into $s0
call someFunction
printi $t0 # could be 4, could be something else
printi $s0 # will be 4 unless someone violated the Temporary/Saved Register Convention
```
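
If a temporary value has to survive a call, one option is for the caller to save it on the stack first. The sketch below reuses the placeholder someFunction from the example above and assumes it follows the stack rules described under the $sp register, meaning it leaves $sp where it found it.

```
add $t0 0 4         # load 4 into $t0
push $t0            # save $t0 on the stack before the call
call someFunction   # placeholder function; it may overwrite $t0
pop $t0             # restore the saved value into $t0
printi $t0          # 4, provided someFunction left the stack balanced
```

The same pattern works in the other direction: a function that wants to use a saved register can push it on entry and pop it before returning, so the caller never sees the change.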
Floating Point vs. Integer Registers
Another way to keep things clean, especially when collaborating on a project, is to separate registers so that floats are stored in different registers than integers. It's just like the temporary/saved register convention, but for floats vs. integers. This is necessary because the way floats are represented as binary data is fundamentally different from the way integers are represented, so doing floating point arithmetic on an integer or vice versa will usually produce meaningless results.
The float registers are the $f registers $ft0 ... $ft9 and $fs0 ... $fs9.
The integer registers are the other normal registers $t0 ... $t9 and $s0 ... $s9.
The $zero Register
$zero is a special register that stores a constant 0. There's no good reason to change it, so don't. That way, if you ever need to use the constant 0 but can't use an immediate, you can instead use $zero.
The $pid Register
$pid represents the (P)rocess (ID)entifier of the program running. This is currently an unimplemented register with no uses, but it is intended to store a process ID for when process forking is implemented.
The $fid Register
$fid represents the (F)ile (ID)entifier: the file from which code is currently running. The main file which the program begins in will have a value of 0 and all new files read by imports will have increasing corresponding $fid values. The $fid value is updated automatically when a call/jal is executed and is stored onto the stack alongside the return address.
The $pc Register
$pc is another special register; it stands for Program Counter. The program counter tells the simulator (or a real machine, for that matter) which instruction is to be run. It is modified by any branch, jump, or function call instruction. It is rarely, if ever, necessary to directly modify the $pc register, and doing so carelessly will lead to unexpected program behavior.
The program counter is automatically increased by 1 after the execution of every instruction, including branch instructions. This means that for a jump instruction, for instance, the program counter is set to the line of the specified label and then increased by 1, so the line declaring the label is skipped and execution continues with the line after it.
When the program counter attempts to read a line of code directly after the last instruction, the program will exit normally because there are no longer instructions to execute. However, if the program counter is set to an invalid number, for instance 50 in a program which only has 30 instructions, the program will terminate in an error state.
The $sp Register
$sp is the Stack Pointer: a special register used to point to the current position of the stack. The stack is a programmer-controlled form of memory allocation. The stack "grows" downwards meaning that if you were to run sub $sp $sp 4, it would "grow" the stack by 4 bytes. The new location of the stack pointer has room to store 4 bytes of data. As a general rule, you should not access areas of the stack that your code did not allocate space for and you should always return the stack to its original state after your work is done. Every allocation to the stack (e.g., sub $sp $sp 4) must have an opposite deallocation (e.g., add $sp $sp 4) when you are finished (before the end of the function). A failure to properly return the stack pointer register to the state in which it was passed to your code will cause undefined behaviour, especially in function calls.
The stack pointer is often modified by the push and pop operations. These instructions allow for simpler access to data stored on the stack. The push instruction will decrease ("grow") the stack pointer by the number of bytes in a word (default 4 bytes) and write the given data to that new memory address. The pop instruction will read the data currently pointed to by the stack (often whatever was most recently pushed to it) and then increase ("shrink") the stack pointer by the number of bytes in a word (default 4 bytes).
Diagram
Imagine the stack is in the following state of memory. The left column represents the address space and the remaining columns represent the byte stored at that address plus the column's offset. For instance, the byte at 0x10_0007 is DC in hex, or 1101_1100 in binary. To find a byte, look for the row whose stack address is closest to but not greater than your target, then pick the column whose offset, added to that row's address, gives the target address. For 0x10_000B the row is 0x10_0008, and since 0x10_000B is 0x10_0008 + 0x3, you would look at the row of 0x10_0008 and the column of 3 to find the data stored in the target address.
| stack address | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0x10_0000 | 00 | 00 | 00 | 00 |
| 0x10_0004 | 00 | 00 | F0 | DC |
| 0x10_0008 | FF | FF | FF | FF |
Imagine the value stored in the stack pointer $sp is 0x10_0004 (underscore inserted for readability). That means that reading the value stored at this address (e.g., using ($sp) as an input) would retrieve the data stored in offsets 0 to 3 of the row 0x10_0004. The same applies to writing information to addresses. Subtracting 4 from the stack pointer would "grow" the stack so that it now points at address 0x10_0000 and its corresponding bytes. Now, reading from the stack pointer would get the data from the first row of the table, and writing to the stack pointer would write to that newly allocated area.
It is important to note that any newly allocated space on the stack can contain garbage data, not just zeroes. Accessing the stack pointer with an offset of 4 (e.g., 4($sp)) allows the program to access data at the address that the stack pointer just moved away from: if the current stack address is 0x10_0000 then 4($sp) would resolve to use the data at 0x10_0004. Offsets are simply added to the given pointer before operating at that address. In this way, the program can allocate space for however many variables it needs on the stack in one instruction, for instance using sub $sp $sp 40 to allocate 40 bytes or 10 words (assuming the default word size). Those new 40 bytes can now be accessed using offsets from 0($sp) for the nearest 4 bytes (note that the 0 outside of the parentheses is not strictly necessary when no offset is needed) to 36($sp) for the furthest 4 bytes. Reading from 40($sp) would read data starting outside of that initial allocation.
Example
```
push 13 # decreases ("grows") $sp by 4 (4 bytes, one word by default), then sets the data in that memory address to `13`
printi ($sp) # dereferences the stack pointer to retrieve the value of `13`, then prints `13`
pop $s0 # reads the value at the stack pointer `13` and stores it into $s0, then increases ("shrinks") $sp by 4 (4 bytes, one word)
```
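
The offset addressing described above can be illustrated with a short sketch that uses only instructions already shown on this page. It assumes the default word size of 4 bytes; the values 1, 2 and 3 are arbitrary.

```
push 1          # $sp decreases by 4; the word at ($sp) is now 1
push 2          # $sp decreases by 4 again; 1 is now at 4($sp)
push 3          # 3 is at ($sp), 2 is at 4($sp), 1 is at 8($sp)
printi 8($sp)   # reads the oldest value, 1
printi 4($sp)   # reads 2
printi ($sp)    # reads 3; equivalent to 0($sp)
add $sp $sp 12  # deallocate all three words (3 * 4 bytes)
```

Releasing the three words with a single add instead of three pop instructions restores $sp to its original position without overwriting any registers.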
The $ra Register
$ra is a special register used to manage the return address of any function call. If you nest function calls, $ra gets pushed to the stack and popped on the next return so that nested calls return correctly. Like $pc, it's probably best not to change it manually.
The argument $a Registers
$a0, $a1 and $a2 are the function argument registers, used as input for functions. Write to them if you need to pass something to a function; just be careful not to lose any important data when making nested function calls.
The return $r Registers
$r0, $r1 and $r2 are the function result registers. Write to them in a function if you want that function to return something.
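
As a rough sketch of how the argument and result registers fit together, the two fragments below pass two integers to a function and read back one result. The function name sum is hypothetical, and return is assumed here to be the instruction that hands control back to the caller; this page mentions returning from functions but does not show the instruction itself.

```
# caller fragment
add $a0 0 3      # first argument
add $a1 0 4      # second argument
call sum         # sum is a hypothetical function, defined elsewhere in the program
printi $r0       # prints the result left in $r0 (7 for these inputs)

# callee fragment, placed where normal execution does not fall into it
sum:
add $r0 $a0 $a1  # write the result to the first result register
return           # assumed instruction name for returning to the caller
```

If sum itself called another function, it would first need to save any $a values it still needs, for example by pushing them to the stack, since the inner call may overwrite them.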