CatCompiler.dev

The cat compiler.

Hi! I like cats, computers and compilers and I sometimes write blog posts.

The HACK architecture sucks


This title is clickbait slop, it’s actually quite neat but I have some critiques I’d like to share. Before that, I’ll give you some background.

The Some Background

The HACK architecture is an invention bNoam Nisan and Shimon Shocken. It is a part of the `Nand2Tetris` course series. It is a CPU specification which is both simple and intuitive to implement with combinational logic. Let’s look at its guts

The HACK specification describes exactly two registers A and D, A is primarily used for addressing and D is used for data, however they both can serve as general purpose registers. Here’s some common instruction examples

ExampleWhat it does
@345Sets the A register to the value 345, or any other 15-bit constant
D=MReads from the address placed in A and places it into D.
A=MReads from the address placed in A and places it into A.
M=M-1Adds one to the value at the address placed in A.
0;JMPStart running the code at the address placed in A
D;JGTJump to the address placed in A only if D is greater than 0
@SPSets the A register to the address of the stack pointer.
A few HACK instructions

I’ve kept this list very brief, there is many variations of each of these instructions that I have left out for conciseness, if you’d like to see them all see Chapter 6 of “The Elements of Computer Science” by Noam Nisan and Shimon Schocken (This chapter is Freely available through nand2tetris.org). This briefing should be enough context to backup my critiques.

The elephant in the room

There’s only two of them.

If you’ve done any assembly programming before, there’s one detail which may stand out you immediately – there is only 2 registers. That’s right – exactly two of them and no more. Please see the helpful diagram to the left to illustrate this.

To top it off, one must be used for addressing so in most practical circumstances, there’s only one real general purpose register. This might not be so bad if there were some other mechanism to rely on like a hardware stack or direct memory addressing, but this has none of that. The HACK architecture is clearly designed from the ground up with simplicity in mind, so only the bare essentials are included out of the box. However, I believe that one or two additional registers would be both convenient and still preserve the ‘je ne sais quoi‘ of this design.

Let’s swap two registers

In order to to swap two registers in the HACK architecture, we have to put on a clown outfit and go to the circus. Let’s say we had a 3rd registers called `E` (E for extra. It also happens to be the letter after D). It would be as simple as this

// Let's say D=2 and A=3
E=A    // E = 3
A=D    // A = 2
D=E    // D = 3
// done!

But we live in a sad world with damp socks and imperfect design, so we have to do it all on a software stack. Before you ask, we can’t swap them in place with multiplication or anything fancy, we don’t have a multiplication instruction in the hack chipset and there’s no way to assign 2 registers with different values at once.

// Let's say D=2 and A=3
@SP   // wuh-oh, we have to throw away one!
A=M
M=D   // Write the 1st value to the top of the stack
@SP
M=M+1

@3    // put A on the stack too
D=A
@SP
A=M
M=D   // write the 2nd value to the top of stack
@SP
M=M+1

// Read 2nd value off stack
@SP
M=M-1
D=M

// read 1st value off stack
@SP
M=M-1
A=M

Wow, that’s a lot of assembly for such a simple action isn’t is gross… Well I have some terrible news – in these 18 lines of spaghetti code, we didn’t actually swap them! We cheated! That’s right it’s impossible folks… The closest we can get requires one value be a constant, and that’s not cool! So to conclude this section, it is impossible to swap to registers in hack assembly without a 3rd temporary register. This single simple limitation bloats the code massively and constantly rubberbands the stack. It’s slow to run and error prone to write.

Swapping two registers happens to be extremely important. It has come up on several common instructions when writing a VM translator for part 7 of the nand2tetris course. Any time where I need to add 2 numbers to calculate a pointer requires this or similar workarounds. As it so happens, the majority of the VM instructions in part 7 of the course require calculating pointers. It is nauseating.

Minor gripes

Another issue with this arch is its very limited set of instructions. I want more toys to play with! My most desired new instruction would be for bit shifting. There is plenty of room in the machine code for my additional register and bit shift instruction. My particular use-case came up some time ago when I was working on assembly that modifies the screen. The screen is stored as a raw bitmap in RAM, It uses memory mapped IO. Each bit corresponds to one pixel, as such any operation done on one pixel at a time requires bitmasks. Generating such bitmasks is surprisingly tedious. Here is one I required.

    // bit_0 = b1000 0000 0000 0000 
        // (multiple steps due to 15-bit immediate value size limit)
        @30720              //A = b0111 1000 0000 0000 
        D=A                 //D = b0111 1000 0000 0000
        @4096               //A = b0000 1000 0000 0000 
        D=D+A               //D = b1000 0000 0000 0000
        @bit_0              //A = new memory address alaised as bit_0
        M=D                 //(*bit_0) = b1000 0000 0000 0000

I need 16 of these chunks to address every pixel in the screen… What a waste of variables. Let’s not mention that this could be near instant in hardware – bit shifting is an incredibly simple operation to pull off with combinational logic.

This is what the rest looks like.

  @bit_15                  // A = pointer to 0b1
    D=M                     // D = 0b1000 0000 0000 0000
    @color                  
    A=M                     
    D=A&D                   // A = Bitwise AND of color and bitmask

    // write to row 1
    @SCREEN
    M=M|D                   // Bitwise OR mem with bitfield

Shifting your bits

With your permission, let’s shift things up a bit and invent 2 instructions to solve my problem.

D=D<<A
D=D>>A

Deciding if there should be more instruction variations for each register combination can be left as an exercise to the reader.

These two goodies could bit shift by a count to the right or left as needed. Now my 180 lines of assembly could be compacted down to say 50. I can remove all those earlier variables and update the writing loop like so

    // A = pixel bit number
    D=1
    D=D<<A
    @color                  
    A=M                     
    D=A&D                   // A = Bitwise AND of color and bitmask

    // write to row 1
    @SCREEN
    M=M|D                   // Bitwise OR mem with bitfield

Unfortunately, this example doesn’t illustrate the benefits well. The only real benefit in my particular example is using less variables, however I am confident that these new instructions would serve an extremely useful purpose overall. Bit shifting is an essential feature of programming at high and low levels, it is wasteful to live without.

No multiplies?

The hack instruction set has nothing to multiply 2 values. It has no `mul` instruction or anything like it.

But that’s perfectly workable – In order to multiply 2 values, you must use a loop of additions. It works fine but it’s inelegant, if a mul instruction were gifted to me for Christmas I would not complain. It would save some precious clock cycles for a very common operation. Having said all this, it’s at the bottom of my Wishlist. It’s worth noting that there’s also no division instruction, however I wouldn’t even bother adding one if I had the choice. It’s much rarer than multiply and has much more complex edge cases which must be accounted for with lots of transistors.

Summing up

The HACK architecture is a fantastic creation. It is so simple that anyone can make it, however its simplicity is also a double edged sword – the lack of a third register or simple convenience instructions makes it more complex to implement good abstractions on top of. While I don’t expect a intricate permissions system to be implemented or a convoluted hardware stack, it would be nice to have a little bit more to work with. It would decrease the barrier of entry for students on later nand2tetris projects, and overall make it much more fun to engage with.


130 responses to “The HACK architecture sucks”

Leave a Reply

Your email address will not be published. Required fields are marked *