This is a project, written to obtain a credit for a semester, in Formal Languages and Translation Techniques, a subject on V semester of Computer Science in Wrocław University of Science and Technology.
Using BISON and FLEX create a compiler from simple imperative language into machine code. Specification of both imperative language and virtual machine below.
Compiler should throw multiple types of errors (second declaration of variable, wrong usage of array, etc), as well as error location. In case of no errors, compiler should return code for attached virtual machine. Created code should be optimised to have the smallest possible time complexity (multiplication and division can be made in time logarythmic to value of arguments).
Language can be descibed by the graph below:
program -> DECLARE declarations BEGIN commands END
| BEGIN commands END
declarations -> declarations , pidentifier
| declarations , pidentifier (num :num )
| pidentifier
| pidentifier (num :num )
commands -> commands command
| command
command -> identifier := expression ;
| IF condition THEN commands ELSE commands ENDIF
| IF condition THEN commands ENDIF
| WHILE condition DO commands ENDWHILE
| REPEAT commands UNTIL condition ;
| FOR pidentifier FROM value TO value DO commands ENDFOR
| FOR pidentifier FROM value DOWNTO value DO commands ENDFOR
| READ identifier ;
| WRITE value ;
expression -> value
| value + value
| value - value
| value * value
| value / value
| value % value
condition -> value = value
| value != value
| value < value
| value > value
| value <= value
| value >= value
value -> num
| identifier
identifier -> pidentifier
| pidentifier ( pidentifier )
| pidentifier (num )
Additionally language should fullfill the rules below:
- Arithmetic equations use only Natural numbers (e.g. a - b = max{ a - b, 0 }, a / 0 = 0, a % 0 = 0).
+
-
*
/
%
mean respectively addition, subtraction, multiplication, division and modulo.=
!=
<
>
<=
>=
mean respectively relations equal, not equal, less, more, less or equal, more or equal.:=
means assignment.tab(10:100)
means declaration of array of 91 elements, indexed from 10 to 100. Identifiertab(11)
means 11-th element from arraytab
. Declaration with first number greater then second (starting index greater then ending) should throw an error.FOR
loop has local iterator, changing +1 or -1 each iteration (depending on used wordTO
orDOWNTO
).- Number of iterations in
FOR
loop is set at the beginning, and cannot be changed inside loop (even if values of beginning / end of loop change). - Iterator of
FOR
loop cannot be modified inside loop. REPEAT-UNTIL
loop ends, when the condition written afterUNTIL
is met (loop should run at least 1 time).- Instruction
READ
reads value from stdin and saves into variable.WRITE
shows value of variable into stdout. - The rest of functions are similar to most of programming languages.
pidentifier
can be described with regex[_a-z]+
.num
is a Natural number written in decimal[0-9]+
.- Code should be case sensitive.
- In the program, there can be used comments in format
[ comment ]
. Comments cannot be nested.
The errors found by compiler are as follow:
BadArrayScope
UndeclaredVar
UninitializedVar
AlreadyDeclaredVar
BadVarType
UnrecognizedText
which mostly are pretty self-explanatory.
Virtual machine consists of 6 registers (ra
, rb
, rc
, rd
, re
, rf
), counter of commands k
, and array of memmory slots pi
for i < 262.
Machine code follows these rules:
- Machine works on Natural numbers.
- Program consists of series of commands, numbered from 0. In each step next command is executed, until
HALT
. - At the beginning values of registers and memmory is undefined.
- In the program, there can be used comments in format
[ comment ]
. Comments cannot be nested. - White chars are omitted.
- Unrecognizable command is thrown as error.
These are possible commands (x,y ∈ {a, b, c, d, e, f} and j ∈ ℤ \ {0}):
Command | Description | Next Command | Cost |
---|---|---|---|
GET x |
reads value from stdin and saves it in memmory slot prx |
k = k + 1 | 100 |
PUT x |
shows to stdout a number saved in memmory slot prx |
k = k + 1 | 100 |
LOAD x y |
rx ← pry |
k = k + 1 | 20 |
STORE x y |
pry ← rx |
k = k + 1 | 50 |
ADD x y |
rx ← rx + ry |
k = k + 1 | 5 |
SUB x y |
rx ← max{ rx - ry , 0 } |
k = k + 1 | 5 |
RESET x |
rx ← 0 |
k = k + 1 | 1 |
INC x |
rx ← rx + 1 |
k = k + 1 | 1 |
DEC x |
rx ← max{ rx - 1 , 0 } |
k = k + 1 | 1 |
SHR x |
rx ← ⌊ rx / 2 ⌋ |
k = k + 1 | 1 |
SHL x |
rx ← rx * 2 |
k = k + 1 | 1 |
JUMP j |
jump to the j-th next command | k = k + j | 1 |
JZERO x j |
if rx = 0 , jump to the j-th next command |
k = k + j or k = k + 1 | 1 |
JODD x j |
if rx is odd, jump to the j-th next command |
k = k + j or k = k + 1 | 1 |
HALT |
end execution of program | 0 |
Compile program from infile.imp
:
$ make
$ ./kompilator <infile.imp> <outfile.mr> [--debug]
The flag --debug
is optional. When added, it shows the table of stored and used variables, as well as adds comments to created machine code.
Then run the program on virtual machine
$ make -C virtual_machine/
$ ./virtual_machine/virtual_machine <outfile.mr>
g++ 9.3.0
flex 2.6.4
bison 3.5.1
make 4.2.1