Jun 03, 2019

Making the HACK Assembler

A while back, I decided to spend some time learning computer science concepts that were still foreign to me. Since I am completely self-taught, there are quite a few 'holes' in my knowledge. One of these 'holes' was Computer Architecture, so I ended up taking the Nand to Tetris course, which is available for free online. The final project of part 1 of the course asks students to write an assembler that translates code written in the symbolic HACK assembly language into executable binary code for the HACK hardware platform. You can read more details about the requirements for this project here: #

Writing the Assembler

Before tackling the project, I had also been wanting to refamiliarize myself with PHP. I have written a few small programs in PHP, but never really spent much time learning the language beyond that. So I decided to write my implementation of the HACK assembler in PHP. Relearning PHP along the way definitely made the experience much more enjoyable, although as a consequence the project is not very well organized.

I likely won't spend too much time refactoring this project, but I still wanted to share my hard work because it was a very fun and rewarding project.

You can find my completed project files and instructions on how to get things up and running on GitHub here: #

How it Works

I won't dive too deep into the details, since you can review the project files on GitHub, but as a general overview, the assembler is broken up into three main parts:

  • symbols.php includes the logic for adding special symbols written in the target file to a symbol table to be referenced later when generating the final output. This file also contains logic for assigning the correct address to each symbol using the SymbolTable class.
  • parser.php includes the logic for parsing through the target file and generating a multidimensional array containing each of the commands to be translated in the target file.
  • translator.php is responsible for translating the commands contained in the elements of the multidimensional array generated by parser.php into their respective 16-bit counterparts.
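
To make the translation step more concrete, here is a minimal sketch of how a single HACK command maps to its 16-bit form. It is written in Python purely for illustration and is not the code in translator.php; the function name translate is hypothetical, and the COMP table is deliberately a small subset of the full table in the HACK specification. An A-instruction (@value) becomes a 0 followed by the 15-bit value, and a C-instruction (dest=comp;jump) becomes 111 followed by the comp, dest, and jump bits.

```python
# Illustrative subset of the HACK comp table (the full spec has 28 entries).
COMP = {
    "0": "0101010", "1": "0111111", "D": "0001100", "A": "0110000",
    "M": "1110000", "D+1": "0011111", "D+A": "0000010", "D+M": "1000010",
}
# Destination and jump fields; the empty string means "field omitted".
DEST = {"": "000", "M": "001", "D": "010", "MD": "011",
        "A": "100", "AM": "101", "AD": "110", "AMD": "111"}
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def translate(command):
    """Translate one symbol-free HACK command into a 16-bit binary string."""
    if command.startswith("@"):
        # A-instruction: leading 0 plus the 15-bit address or constant.
        return format(int(command[1:]), "016b")
    # C-instruction: split off the optional "dest=" and ";jump" parts.
    if "=" in command:
        dest, rest = command.split("=", 1)
    else:
        dest, rest = "", command
    comp, _, jump = rest.partition(";")
    return "111" + COMP[comp] + DEST[dest] + JUMP[jump]


print(translate("@2"))     # 0000000000000010
print(translate("D=A"))    # 1110110000010000
print(translate("D;JGT"))  # 1110001100000001
```

This sketch assumes symbols have already been replaced by numeric addresses, which is what the symbol table is for.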

Putting all of these together, assembler.php does the following:

  1. Opens a new file matching the filename of the target file with a .hack extension
  2. Parses the target file searching for special symbols to add to a symbol table
  3. Parses the target file again, this time assigning an address to each symbol based on file location
  4. Parses the file one final time to create a multidimensional array containing the type and text of each command
  5. Iterates through the array created in step 4 and writes to the output file created in step 1

What I Learned

While this project may not be very impressive to most CS students, I definitely had a lot of fun writing it. I also learned a ton about the inner workings of a computer, from logic gates to low-level machine language. I highly recommend this course to anyone who, like me, is self-taught and wants to go back and learn some of the more fundamental parts of working with computers.