Today, I will unofficially document the Nand2tetris VM language. Let’s back up a bit: a short course I’m currently working on is nand2tetris, and I’m on project 7. It’s a lot of fun, like a tour of the computer software and hardware stack. Don’t worry, you don’t need to know anything about it. In this article I’m going to exhaustively explain one particular niche of it, or at least try to!
You will need to know one piece of background information: not all programming languages are alike, and there is a particular group of programming languages often called bytecode. They are low level enough to control things like the stack directly but just high level enough to be portable. The more correct name is intermediate representation languages; the reason for this name is that they are not intended to be written by humans, but generated by code. It is something like portable assembly. The term bytecode refers specifically to the binaries these intermediate languages are directly assembled to. It’s all a bit wishy-washy, so never mind the specifics here, they’re not important!
So, nand2tetris features an intermediate representation (IR) language, and in project 7 I am tasked with creating a compiler for it. An IR compiler can also be called a VM translator. The VM stands for virtual machine, because the IR is close enough to assembly that translating it resembles emulation or virtualisation. The most notable feature of this IR is that the memory it works with is divided into virtual ‘segments’. Most instructions manipulate the stack. Here I will describe the most basic commands in a platform-agnostic way. I will skip over the more advanced constructs function, if-goto, goto, label, call, and return.
In a nutshell, here are the basics of the nand2tetris VM instruction set…
Push
Takes a value out of the segment given and places it on the top of the stack.
The local and argument segments are relative to the current function’s stack frame, while this and that are found through base pointers that pop pointer 0 and pop pointer 1 can change.
Command | Description |
push constant n | Adds the number n to the top of the stack. Increases stack pointer. |
push temp n | Takes the value from temporary register number n and places it on the top of the stack. Increases stack pointer. |
push static n | Takes the value from the static segment at offset n and places it on the top of the stack. Increases stack pointer. |
push local n | Takes the value from the local segment at offset n and places it on the top of the stack. Increases stack pointer. |
push argument n | Takes the value from the argument segment at offset n and places it on the top of the stack. Increases stack pointer. |
push that n | Takes the value from the that segment at offset n and places it on the top of the stack. Increases stack pointer. |
push this n | Takes the value from the this segment at offset n and places it on the top of the stack. Increases stack pointer. |
push pointer 0 | Takes the base address of the this segment (the this pointer) and places it on the top of the stack. Increases stack pointer. |
push pointer 1 | Takes the base address of the that segment (the that pointer) and places it on the top of the stack. Increases stack pointer. |
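To make the direction of push concrete, here is a minimal sketch in Python. It is a toy model, not how the course implements anything: the stack and the segments are plain lists, and the names `push`, `stack` and `segments` are made up for illustration.

```python
# Toy model of push: segments are plain lists, the stack is a list
# whose end is the top. All names here are illustrative assumptions.
def push(stack, segments, segment, index):
    """Copy a value from a segment onto the top of the stack."""
    if segment == "constant":
        value = index                      # constant is virtual: the value is n itself
    else:
        value = segments[segment][index]   # read segment[n]; the segment is unchanged
    stack.append(value)                    # stack grows by one (stack pointer increases)


segments = {"local": [10, 20, 30], "temp": [0] * 8}
stack = []
push(stack, segments, "constant", 7)
push(stack, segments, "local", 2)
print(stack)  # [7, 30]
```

Note that the source segment is only read, never modified, so pushing the same entry twice gives two copies on the stack.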
Pop
Takes the value off the top of the stack and places it into the segment given. There is no pop constant, because the constant segment is not real memory.
The local and argument segments are relative to the current function’s stack frame, while this and that are found through base pointers that pop pointer 0 and pop pointer 1 can change.
Command | Description |
pop temp n | Removes the item at the top of the stack and stores it in temporary register number n. Decreases stack pointer. |
pop static n | Removes the item at the top of the stack and stores it in the static segment at offset n. Decreases stack pointer. |
pop local n | Removes the item at the top of the stack and stores it in the local segment at offset n. Decreases stack pointer. The local segment belongs to the current stack frame. |
pop argument n | Removes the item at the top of the stack and stores it in the argument segment at offset n. Decreases stack pointer. |
pop this n | Removes the item at the top of the stack and stores it in the this segment at offset n. Decreases stack pointer. |
pop that n | Removes the item at the top of the stack and stores it in the that segment at offset n. Decreases stack pointer. |
pop pointer 0 | Removes the item at the top of the stack and uses it as the new base address of the this segment. Decreases stack pointer. |
pop pointer 1 | Removes the item at the top of the stack and uses it as the new base address of the that segment. Decreases stack pointer. |
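Pop is the mirror image: the value leaves the stack and lands in a segment. A minimal Python sketch under the same toy assumptions (segments as plain lists, hypothetical names `pop`, `stack`, `segments`):

```python
# Toy model of pop: segments are plain lists, the stack is a list
# whose end is the top. All names here are illustrative assumptions.
def pop(stack, segments, segment, index):
    """Remove the top of the stack and store it in a segment."""
    if segment == "constant":
        # constant is not real memory, so there is nowhere to store a value
        raise ValueError("cannot pop into the constant segment")
    segments[segment][index] = stack.pop()   # stack shrinks by one (stack pointer decreases)


segments = {"local": [0, 0, 0], "static": [0, 0]}
stack = [5, 9]
pop(stack, segments, "local", 1)    # local 1 becomes 9
pop(stack, segments, "static", 0)   # static 0 becomes 5
print(stack)               # []
print(segments["local"])   # [0, 9, 0]
print(segments["static"])  # [5, 0]
```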
Arithmetic
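The arithmetic and logic operations take one or two items off the stack and push the result back. For the two-operand commands the operands come off in reverse order: item 1 (taken off first) is the right-hand operand, and item 2 (taken off second) is the left-hand operand. The comparison commands push -1 for true and 0 for false.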
Command | Description |
neg | Short for negate. Takes one item off the stack, flips its sign and pushes the result on the stack. Stack pointer does not change. |
add | Takes 2 items off the stack, adds them and pushes the result on the stack. Decreases stack pointer. |
sub | Short for subtract. Takes item 1 off the stack and then item 2. Subtracts item 1 from item 2. Places the result on the top of the stack. Decreases stack pointer. |
eq | Short for equals. Takes 2 items off the stack, checks if the values are equal and places the result (-1 for true, 0 for false) on the top of the stack. Decreases stack pointer. |
gt | Short for greater than. Takes item 1 and item 2 off the stack, checks if item 2 is greater than item 1 and places the result (-1 for true, 0 for false) on the top of the stack. Decreases stack pointer. |
lt | Short for less than. Takes item 1 and item 2 off the stack, checks if item 2 is less than item 1 and places the result (-1 for true, 0 for false) on the top of the stack. Decreases stack pointer. |
and | Takes item 1 and item 2 off the stack, performs a bitwise and on item 1 and item 2 and places the result on the top of the stack. Decreases stack pointer. |
or | Takes item 1 and item 2 off the stack, performs a bitwise or on item 1 and item 2 and places the result on the top of the stack. Decreases stack pointer. |
not | Takes one item off the stack and flips the bits. Places the result on the top of the stack. Stack pointer does not change. |
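The operand order is the easiest thing to get wrong, so here is a small Python sketch of the nine commands. It assumes 16-bit signed words (as on the Hack platform) and the VM’s convention of -1 for true and 0 for false; the names `execute`, `to_int16`, `BINARY` and `UNARY` are my own.

```python
TRUE, FALSE = -1, 0   # true is all bits set, false is no bits set


def to_int16(value):
    """Wrap a Python int to a signed 16-bit value (assumed word size)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


# x is the left-hand operand (item 2), y is the right-hand operand (item 1).
BINARY = {
    "add": lambda x, y: x + y,
    "sub": lambda x, y: x - y,
    "eq": lambda x, y: TRUE if x == y else FALSE,
    "gt": lambda x, y: TRUE if x > y else FALSE,
    "lt": lambda x, y: TRUE if x < y else FALSE,
    "and": lambda x, y: x & y,
    "or": lambda x, y: x | y,
}
UNARY = {"neg": lambda y: -y, "not": lambda y: ~y}


def execute(stack, command):
    """Apply one arithmetic/logic command to the stack in place."""
    if command in UNARY:
        y = stack.pop()
        stack.append(to_int16(UNARY[command](y)))   # one off, one on: no net change
    else:
        y = stack.pop()   # item 1: comes off first
        x = stack.pop()   # item 2: comes off second
        stack.append(to_int16(BINARY[command](x, y)))   # two off, one on


stack = [8, 3]          # as if by: push constant 8, push constant 3
execute(stack, "sub")
print(stack)  # [5]
execute(stack, "neg")
print(stack)  # [-5]
```

So `push constant 8`, `push constant 3`, `sub` leaves 5 on the stack, not -5.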
Some commentary
Yes this blog may have just been a sneaky excuse to post notes but who’s gonna tell on me?
I find the use of segments in this language quite unusual; it feels like an anachronistic remnant of the 8-bit era. I could see most if not all of these virtual segments rolled into a handful that contain the stack pointer, stack base and stack start. Any statics could be put at the stack base, any locals could be relative to the stack base, and any arrays or objects could be pushed directly onto the stack. It also would have been nice to have some virtual registers reserved for the VM translator to use, since the temporary ones are reserved for the future compiler.
There’s some overlap too: if I understand correctly, there are 2 separate commands to update the this and that pointers.
Implementing this on the HACK chipset is also extremely tedious, because there are only 2 general purpose registers and no reserved scratch area for me to work in. It all has to be done in situ on the stack. It’s definitely possible, it’s just harder.
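To give a flavour of the juggling involved, here is a hedged sketch of what a translator might emit for two commands. It is written as Python functions returning Hack assembly lines, and assumes the assembler’s predefined SP symbol holds the address of the next free stack slot. The function names are hypothetical, and this is one possible encoding, not the course’s reference solution.

```python
def push_constant(n):
    """Hack assembly for `push constant n` (one possible encoding)."""
    return [
        f"@{n}",    # A = n
        "D=A",      # D = n
        "@SP",
        "A=M",      # A = address of the next free stack slot
        "M=D",      # store n there
        "@SP",
        "M=M+1",    # stack pointer moves up by one
    ]


def add():
    """Hack assembly for `add`, done in place on the stack."""
    return [
        "@SP",
        "AM=M-1",   # stack pointer moves down by one; A points at item 1
        "D=M",      # D = item 1
        "A=A-1",    # A points at item 2
        "M=D+M",    # overwrite item 2 with the sum; it is now the top of the stack
    ]


print("\n".join(push_constant(7) + add()))
```

With only A and D to work with, D holds one operand while A is constantly re-aimed, and the result overwrites the second operand where it sits.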
I am interested to see this justified in future, I’m sure there will be plenty more added context as I further climb the software stack. I am looking forward to completing this translator. Please reach out if you find any errors in this document.