asm
192
asm/asm01.py
Normal file
@@ -0,0 +1,192 @@
"""Assembler for the Hack computer

Usage:

    python asm01.py file

Loads an assembly file and translates it into machine language for the Hack
computer as specified in project 6 of the nand2tetris course.

# Assembler implementation details
This assembler works in 3 steps:
1. Load and clean the assembly file
2. Construct a symbol table referencing the user-defined labels and variables
3. Translate the asm file to binary code
The file follows this pattern:
    File parsing (line 147)
    Symbol table (line 207)
    Assembling (line 322)
    ??
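
As a rough illustration of step 1 (load and clean), the sketch below strips
comments, surrounding whitespace and blank lines from a list of source lines.
The helper name `clean_lines` is hypothetical and is not defined in this file;
treat it as one possible approach, not the implementation used here.

```python
def clean_lines(lines):
    # Hypothetical helper: keep only the code part of each line.
    cleaned = []
    for raw in lines:
        # Everything after "//" is a comment and is dropped.
        code = raw.split("//", 1)[0].strip()
        if code:  # skip lines that are blank once comments are removed
            cleaned.append(code)
    return cleaned


program = ["// demo program", "   @2   // load 2", "", "D=A"]
print(clean_lines(program))  # ['@2', 'D=A']
```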

# Assembly language specifications:

## Note on registers
The Hack computer exposes three 16-bit registers to the programmer: the D-, A-
and M-registers.
The D-register is used to store "data" that can be used as input for the ALU.
The A-register can be used in the same way but it also has a second role:
the RAM uses it as its address input. So any read/write instruction to the RAM
is done on the RAM register whose address is the value of the A-register.
The M-register represents the RAM register that is 'pointed' to by the
A-register. Reading/writing to it actually consists of reading/writing to
the RAM.
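
The relationship between A and M can be pictured with a toy model. This is only
a sketch for illustration: the `Registers` class and the 32768-word `ram` list
are assumptions made here, not part of the assembler.

```python
class Registers:
    # Toy model: D and A hold values, M is only a view on ram[A].
    def __init__(self):
        self.d = 0
        self.a = 0
        self.ram = [0] * 32768  # assumed size: the 15-bit address space

    @property
    def m(self):
        # Reading M reads the RAM word addressed by A.
        return self.ram[self.a]

    @m.setter
    def m(self, value):
        # Writing M writes the RAM word addressed by A (kept to 16 bits).
        self.ram[self.a] = value & 0xFFFF


regs = Registers()
regs.d = 7
regs.a = 16        # like "@16"
regs.m = regs.d    # like "M=D": stores 7 in ram[16]
regs.a = 17        # M now designates another RAM word
print(regs.ram[16], regs.m)  # 7 0
```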

## Assembly instructions
There are three types of assembly instructions: A-instructions, C-instructions
and labels. Indentation and blank lines are ignored. Comments are single-line
only: they start with "//" and are ignored.

## A-instructions
- `"@" integer` where integer is a number in the range 0 to 32767. Sets the A
  register to contain the specified integer. Ex: @42
- `"@" label` where label is a user-defined label. Sets the A register to
  contain the code address corresponding to the label.
  Labels are upper-cased by convention, with "_" as word separator. Ex: @MAIN
- `"@" variable` where variable is a user-defined variable. Sets the A register
  to contain the RAM address corresponding to the variable. If a variable is
  encountered for the first time, it is automatically assigned an address.
  The address assignment starts at RAM address 16 and increments.
  Variables are lowercased by convention, with "_" as word separator. Ex: @i
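
The three cases above could be handled along these lines. This is a minimal
sketch under assumptions: the function name `encode_a_instruction` and the
`next_free` counter are hypothetical, and the symbol table keys keep their "@"
prefix as in `create_symbol_table`.

```python
def encode_a_instruction(instruction, symbols, next_free=16):
    # Returns (16-bit binary string, next free variable address).
    operand = instruction[1:]  # drop the leading "@"
    if operand.isdecimal():
        value = int(operand)
    else:
        if instruction not in symbols:
            # First occurrence of a variable: give it the next RAM address.
            symbols[instruction] = next_free
            next_free += 1
        value = symbols[instruction]
    if not 0 <= value <= 32767:
        raise ValueError("A-instruction value out of range: " + instruction)
    # The value fits in 15 bits, so the leading bit is always 0.
    return format(value, "016b"), next_free


symbols = {"@MAIN": 2}
print(encode_a_instruction("@42", symbols))    # ('0000000000101010', 16)
print(encode_a_instruction("@MAIN", symbols))  # ('0000000000000010', 16)
print(encode_a_instruction("@i", symbols))     # ('0000000000010000', 17)
```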

## C-instructions
`(dest-code "=")? op-code (";" jump-code)?`
- op-code:
  Only the op-code is mandatory. It represents an instruction to be performed
  by the ALU. Available codes and their associated outputs are:
  - 0   -> the constant 0
  - 1   -> the constant 1
  - -1  -> the constant -1
  - D   -> the value contained in the D-register
  - A   -> the value contained in the A-register
  - M   -> the value contained in the M-register
  - !D  -> bit-wise negation of the D-register
  - !A  -> bit-wise negation of the A-register
  - !M  -> bit-wise negation of the M-register
  - -D  -> numerical negation of the D-register using 2's complement
  - -A  -> numerical negation of the A-register using 2's complement
  - -M  -> numerical negation of the M-register using 2's complement
  - D+1 -> 1 + value of the D-register
  - A+1 -> 1 + value of the A-register
  - M+1 -> 1 + value of the M-register
  - D-1 -> -1 + value of the D-register
  - A-1 -> -1 + value of the A-register
  - M-1 -> -1 + value of the M-register
  - D+A -> value of the D-register + value of the A-register
  - D+M -> value of the D-register + value of the M-register
  - D-A -> value of the D-register - value of the A-register
  - D-M -> value of the D-register - value of the M-register
  - A-D -> value of the A-register - value of the D-register
  - M-D -> value of the M-register - value of the D-register
  - D&A -> bit-wise AND of the values of the D and A registers
  - D&M -> bit-wise AND of the values of the D and M registers
  - D|A -> bit-wise OR of the values of the D and A registers
  - D|M -> bit-wise OR of the values of the D and M registers
- dest-code:
  If specified, should be followed by a "=" character. Available codes are:
  - D   -> write the ALU instruction's output to the D-register
  - A   -> write the ALU instruction's output to the A-register
  - M   -> write the ALU instruction's output to the M-register
  - AD  -> write the ALU instruction's output to the A- and D-registers
  - AM  -> write the ALU instruction's output to the A- and M-registers
  - MD  -> write the ALU instruction's output to the D- and M-registers
  - ADM -> write the ALU instruction's output to the A-, D- and M-registers
- jump-code:
  If specified, should be preceded by a ";" character. The computer is fed
  a program containing one binary instruction per line. Each of those
  instructions should be seen as having a number, starting at 0 and increasing
  by one. The jump-code makes the computer jump to the instruction whose
  address is contained in the A-register if the result of the current
  operation satisfies a certain condition. Available codes and corresponding
  conditions are:
  - JEQ -> jump if the output is equal to 0
  - JLT -> jump if the output is less than 0
  - JLE -> jump if the output is less than or equal to 0
  - JGT -> jump if the output is greater than 0
  - JGE -> jump if the output is greater than or equal to 0
  - JNE -> jump if the output is not 0
  - JMP -> jump unconditionally, whatever the output
- Examples:
    @3          // Set A to 3
    0;JMP       // Unconditional jump to code line 3
    @42         // Set A to 42
    D=D-A;JEQ   // Set D to D-A. If D-A == 0, jump to code line 42
    @i          // Point to var i, the real RAM address is handled by the assembler
    M=A         // Set the value of i to its own address
    A=A+1       // Point to the RAM address just after i
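
The C-instruction grammar above can be split into its three fields with plain
string operations, and the jump conditions map directly onto comparisons with
0. The names `split_c_instruction`, `JUMP_CONDITIONS` and `should_jump` are
hypothetical; this is a sketch of the idea, not code from this file, and it
does not validate the codes it extracts.

```python
def split_c_instruction(instruction):
    # Returns (dest, op, jump); dest and jump are None when absent.
    dest, sep, rest = instruction.partition("=")
    if not sep:
        dest, rest = None, instruction
    op, sep, jump = rest.partition(";")
    if not sep:
        jump = None
    return dest, op, jump


# Each jump-code as a test on the ALU output (seen as a signed integer).
JUMP_CONDITIONS = {
    "JEQ": lambda out: out == 0,
    "JLT": lambda out: out < 0,
    "JLE": lambda out: out <= 0,
    "JGT": lambda out: out > 0,
    "JGE": lambda out: out >= 0,
    "JNE": lambda out: out != 0,
    "JMP": lambda out: True,
}


def should_jump(jump, output):
    # Without a jump-code, execution continues with the next instruction.
    return jump is not None and JUMP_CONDITIONS[jump](output)


print(split_c_instruction("D=D-A;JEQ"))  # ('D', 'D-A', 'JEQ')
print(split_c_instruction("0;JMP"))      # (None, '0', 'JMP')
print(split_c_instruction("M=A"))        # ('M', 'A', None)
print(should_jump("JEQ", 0))             # True
```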

## Labels
`"(" LABEL_NAME ")"`
When performing a jump, the number of the target code line must first be put
in the A-register. Setting the line number directly with an `@integer`
instruction is error-prone since one has to figure out the line number while
ignoring comments, blank lines, etc. All the addresses also have to be updated
if the beginning of the assembly code is edited afterward.
So the assembly language lets you mark lines with a label using the `(LABEL)`
syntax. The assembler then automatically resolves any `@LABEL` instruction
to the matching code line at assembly time.
Example:
    // This code runs a loop 42 times and then stops in an infinite empty loop
    00 @MAIN        // @2
    01 0;JMP
       (MAIN)
    02 @42          // Set D to 42
    03 D=A
    04 @DECREMENT   // @6
    05 0;JMP
       (DECREMENT)
    06 D=D-1        // Decrement D
    07 @END         // @11
    08 D;JEQ        // Go there if D==0
    09 @DECREMENT   // Or continue the loop
    10 0;JMP
       (END)
    11 @END         // Infinite loop to end the program
    12 0;JMP
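
One way to resolve labels is a first pass that counts real instructions and
records the count each time a `(LABEL)` line is met. The sketch below assumes
the lines are already cleaned (no comments, no blanks) and uses a hypothetical
`collect_labels` helper; keys keep the "@" prefix as in `create_symbol_table`.

```python
def collect_labels(lines):
    # Returns (labels, instructions); label lines emit no instruction.
    labels = {}
    instructions = []
    for line in lines:
        if line.startswith("(") and line.endswith(")"):
            # The label designates the next real instruction.
            labels["@" + line[1:-1]] = len(instructions)
        else:
            instructions.append(line)
    return labels, instructions


program = [
    "@MAIN", "0;JMP",
    "(MAIN)", "@42", "D=A", "@DECREMENT", "0;JMP",
    "(DECREMENT)", "D=D-1", "@END", "D;JEQ", "@DECREMENT", "0;JMP",
    "(END)", "@END", "0;JMP",
]
labels, instructions = collect_labels(program)
print(labels)  # {'@MAIN': 2, '@DECREMENT': 6, '@END': 11}
```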
"""

import sys


def create_symbol_table():
    # Build a dict holding the predefined symbols
    return {
        '@SP': 0,
        '@LCL': 1,
        '@ARG': 2,
        '@THIS': 3,
        '@THAT': 4,
        '@SCREEN': 0x4000,
        '@KBD': 0x6000,
        # Virtual registers @R0 to @R15 map to RAM addresses 0 to 15
        **{f'@R{i}': i for i in range(16)},
    }


def asm(file):
    symbol_table = create_symbol_table()
    print(symbol_table)

    # Load the file; the context manager closes it when done
    with open(file, 'r') as asmfile:
        asmlines = asmfile.readlines()

    # Keep only code lines: skip blank lines and full-line comments
    cleaned = []
    for line in asmlines:
        stripped = line.strip()
        if not stripped or stripped.startswith('//'):
            continue
        cleaned.append(stripped)
    print(cleaned)
    return cleaned


if __name__ == '__main__':
    print(sys.argv)
    asm_file = asm(sys.argv[1])