The Eight Bit Algorithmic Language for Apple II, Commodore 64 and VIC20
Includes:
- Interpreter
- Bytecode Compiler
- Virtual Machine
- Bytecode Disassembler
- Intro
- Getting Started
- Building the Code
- EightBall Language Reference and Tutorial
- Line Editor
- EightBall Compiler and Virtual Machine
- Data Types
- Code Examples
EightBall is an interpreter and bytecode compiler for a novel structured programming language. It runs on a number of 6502-based vintage systems and may also be compiled as a 32 bit Linux executable. The system also includes a simple line editor and the EightBall Virtual Machine, which runs the bytecode generated by the compiler.
EightBall tries to form a balance of the following qualities, in 20K or so of 6502 code:
- Statically typed
- Provides facilities which encourage structured programming ...
- ... Yet makes it easy to fiddle with hardware (PEEK and POKE and bit twiddling)
- Keep the language as simple and small as possible ...
- ... While providing powerful language primitives and encapsulation in subroutines, allowing the language to be extended by writing EightBall routines
- When in doubt, do it in a similar way to C!
The following 6502-based systems are currently supported:
- Apple II - EightBall runs under ProDOS and uses upper/lowercase text. It should run on 64K Apple IIe Enhanced, IIc or IIgs.
- Commodore 64 - EightBall should run on any C64.
- Commodore VIC-20 - EightBall runs on a VIC-20 with 32K of additional RAM.
EightBall also runs on Linux (built as a 32 bit process using gcc -m32).
With some small modifications, the code could also be built for any 6502-based system supported by the cc65 compiler. For the interpreter/compiler program, upper and lower case text support is required (so an Apple II/II+ would need an 80 column card). The virtual machine program does not necessarily require lower case (provided you do not use it in your EightBall code).
Free Software licenced under GPL v3.
EightBall is an ongoing development project. See the project roadmap here.
This is a free software / open source project and I invite anyone interested to participate via GitHub.
There are executables and disk images available to download for Apple II, Commodore 64 and VIC-20. These may be run on real hardware or one of the many emulators that are available.
The language itself is documented in this file. The best way to learn is to study example programs.
Disk images:
- eightball.dsk - ProDOS 2.4.1 bootable disk with EightBall for Apple IIe Enhanced, //c, IIgs.
- eightball.d64 - Commodore 1541 disk image with EightBall for VIC20 and C64.
I used ADTPro to copy eightball.dsk to a real Disk II 140K floppy. A solid state drive such as CFFA3000 should also work.
It is also possible to run the EightBall system using the MAME Apple II emulation under Linux.
To run the main EightBall executable, which includes the line editor, interpreter and bytecode compiler, choose to start EB.SYSTEM from within the ProDOS launcher.
You can then enter and run the test program below.
Once you have entered the test program and run it in the interpreter, you can compile it to bytecode as follows:
comp "test"
quit
The compiled code is written to the file test on the floppy diskette containing the EightBall system.
If you then invoke the EightBall Virtual Machine EBVM.SYSTEM, it will prompt you for the name of the bytecode file to load. Enter test at the prompt to run the code you just compiled. The VM is much faster than the interpreter.
For the Commodore 64, the file eightball.d64 can be written to a real C1541 floppy, or to a solid state drive such as SD2IEC.
It is also possible to run the EightBall system using the Vice C64 emulator under Linux.
To run the main EightBall executable, which includes the line editor, interpreter and bytecode compiler, run 8BALL64.PRG as follows:
LOAD"8BALL64.PRG",8
RUN
You can then enter and run the test program below.
Once you have entered the test program and run it in the interpreter, you can compile it to bytecode as follows:
comp "test"
quit
The compiled code is written to the file test on the floppy diskette containing the EightBall system. (Note that if this file already exists an error will occur. This is a known deficiency which I will address in due course.)
If you then invoke the EightBall Virtual Machine 8BALLVM64.PRG, it will prompt you for the name of the bytecode file to load. Enter test at the prompt to run the code you just compiled. The VM is much faster than the interpreter.
LOAD"8BALLVM64.PRG",8
RUN
For the Commodore VIC20 (plus 32K expansion RAM), the file eightball.d64 can be written to a real C1541 floppy, or to a solid state drive such as SD2IEC.
It is also possible to run the EightBall system using the Vice VIC20 emulator under Linux.
To run the main EightBall executable, which includes the line editor, interpreter and bytecode compiler, run 8BALL20.PRG as follows:
LOAD"8BALL20.PRG",8
RUN
You can then enter and run the test program below.
Once you have entered the test program and run it in the interpreter, you can compile it to bytecode as follows:
comp "bytecode"
quit
The compiled code is written to the file bytecode on the floppy diskette containing the EightBall system. (Note that if this file already exists an error will occur. This is a known deficiency which I will address in due course.)
If you then invoke the EightBall Virtual Machine 8BALLVM20.PRG, it will load and execute this bytecode. The VM is much faster than the interpreter.
LOAD"8BALLVM20.PRG",8
RUN
Here is a simple test program you can enter to play with EightBall when getting started:
:i0
byte b=0
for b=1:10
pr.msg "Hello world ..."; pr.dec b; pr.nl
endfor
end
.
I have included the line editor commands to begin inserting text (:i0) and to leave the editor and return to the interpreter (a single period on its own).
You can list the program using the :l command (the letter Ell, not the number 1!) and run it in the EightBall interpreter using the run command.
I am building EightBall using cc65 v2.15 on Ubuntu Linux.
The Linux version of EightBall is currently being built using gcc v7.3.0. It should build with whatever version of gcc you have to hand.
In order to build Apple diskette images I use the open source Apple Commander tool. ADTPro is an awesome tool for transferring disk images to a real Apple II via a serial (RS-232) cable.
In order to build Commodore 1541 diskette images, I use the c1541 tool that comes with the open source VICE emulator.
I find the VICE emulator useful for testing on the Commodore C64 and VIC20 platforms. MAME provides a useful Apple //e enhanced emulation.
Links to these projects:
I use Ubuntu Linux (18.04 at the current time.) It should also be possible to build the project using any relatively recent Linux distribution.
First clone the repository from GitHub.
$ git clone https://github.com/bobbimanners/EightBall.git
Then, edit the Makefile to adjust the paths to point to your local installation of the cc65 compiler. If you wish to build disk images for Apple and Commodore machines, you will need to adjust the paths to point to your local installation of Apple Commander or VICE (for the c1541 tool).
$ cd EightBall
$ vi Makefile
Once you are satisfied with the Makefile, building the software is simple:
$ make
This will build executables for Linux using gcc and for 6502 targets using cc65. The build targets are as follows:
- For Linux:
  - eightball - Editor/interpreter/compiler for Linux (32 bit).
  - eightballvm - Virtual machine runtime for Linux (32 bit).
  - disass - Bytecode disassembler for Linux.
- For Apple IIe Enhanced, IIc, IIgs:
  - eightball.dsk - Test diskette image for Apple II. Bootable ProDOS 2.4.1 disk.
  - eb.system (invokes eb) - Editor/interpreter/compiler for Apple IIe Enhanced.
  - ebvm.system (invokes ebvm) - Virtual machine runtime for Apple IIe Enhanced.
  - ebdiss.system (invokes ebdiss) - Bytecode disassembler for Apple IIe Enhanced.
- For Commodore VIC20 + 32K expansion:
  - eightball.d64 - Test diskette image for Commodore VIC20 and C64.
  - 8ball20.prg - Editor/interpreter/compiler for VIC20.
  - 8ballvm20.prg - Virtual machine runtime for VIC20.
  - disass20.prg - Bytecode disassembler for VIC20.
- For Commodore 64:
  - eightball.d64 - Test diskette image for Commodore VIC20 and C64.
  - 8ball64.prg - Editor/interpreter/compiler for C64.
  - 8ballvm64.prg - Virtual machine runtime for C64.
  - disass64.prg - Bytecode disassembler for C64.
First start the EightBall editor/interpreter/compiler:
$ ./eightball
To load and run the unit test script within the EightBall interpreter:
:r"unittest.8b"
run
Then to compile it, and save the bytecode to the file bytecode:
comp "bytecode"
quit
And finally, to run the bytecode under the VM:
$ ./eightballvm
The bytecode disassembler may be used to examine the bytecode in a human-readable format:
$ ./disass
Both the VM and the disassembler prompt for the name of the bytecode file to load (bytecode in this example).
You will have to find the Apple II ROMs online for use with MAME.
To start MAME and boot from eightball.dsk:
$ mame -w apple2ee -sl6 diskii -floppydisk1 eightball.dsk
Look here for further instructions.
To start the x64 emulator:
$ x64 -8 eightball.d64
Note that EightBall scripts on Commodore platforms must be encoded in PETSCII rather than ASCII. unittest.8bp is a PETSCII version of unittest.8b (created automatically using the Linux tr tool - see the Makefile for details of how this is done).
Look here for further instructions.
To start the xvic emulator:
$ xvic -mem all -drive8type 1541 -8 eightball.d64
Note that EightBall scripts on Commodore platforms must be encoded in PETSCII rather than ASCII.
Look here for further instructions.
There is a unit test script unittest.8b written in EightBall.
It is quite large, so it does not load on all 8-bit platforms. Deleting the comments would help! However, I usually test using the Linux EightBall environment, so large scripts are less of a problem. Currently the script loads and runs on C64, but not on Apple II or VIC20 (due to lack of memory for the source code).
EightBall allows the programmer to define constant values as follows:
const size = 10
Constant values are represented as 16 bit words internally.
EightBall has two basic types: byte (8 bits) and word (16 bits).
word counter = 1000
byte xx = 0
Variables must be declared before use. Variables must be initialized. A constant may be used as an initializer:
const size = 10*10
word mysize = size+3
The first four letters of the variable name are significant (this may be increased by changing VARNUMCHARS in eightball.c). Any letters after that are simply ignored by the parser.
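As a rough illustration of the four-letter rule, the Python sketch below models how two longer names can end up referring to the same variable. The helpers significant and collide are hypothetical names made up for this example; they only mirror the behaviour described above and are not taken from eightball.c.

```python
VARNUMCHARS = 4  # number of significant characters, as described above


def significant(name: str) -> str:
    """Return the part of a variable name that would be compared."""
    return name[:VARNUMCHARS]


def collide(a: str, b: str) -> bool:
    """True if two different names would be treated as the same variable."""
    return a != b and significant(a) == significant(b)


print(significant("counter"))          # the part of the name that matters
print(collide("counter", "count2"))    # both reduce to "coun"
print(collide("row", "col"))           # short, distinct names are safe
```

In practice this suggests keeping the first four characters of every name in a scope distinct.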
Variables of type word are also used to store pointers (there is no pointer type in EightBall).
At present, only 1D arrays are supported, but this will be expanded in future releases.
Arrays of byte and word may be declared as follows. The mandatory initializer is used to initialize the elements:
word myArray[100] = {1, 2, 3}; ' 1, 2, 3, 0, 0, 0 ...
byte storage[10] = {10, 20, 30, 20+20}; ' 10, 20, 30, 40, 0, 0, 0 ...
Initializer lists must be no longer than the number of elements in the array. The following is an error:
word bad[3] = {1, 2, 3, 4}; ' INITIALIZER LIST TOO LONG!
If the initializer list is shorter than the number of elements in the array then the remaining elements are set to zero. The empty list initializes all elements to zero:
word allzero[10] = {}
It is also possible to use string literals as array initializers. This is usually used with arrays of byte to initialize strings, for example:
byte msg[100] = "Please try again!"
The array msg will be initialized to the character values of the string literal, and a null terminator will be appended. Because strings are null-terminated, the string initializer can be no longer than the array size minus one:
byte aa[4] = "ABC"; ' Okay
byte aa[4] = "ABCD"; ' TOO LONG!
Note that string literals may also be used to initialize word arrays:
word vals[10] = "ABCABCABC"
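The initializer rules above (zero padding, the length limit, and the appended null terminator) can be summarised in a short Python model. This is only a sketch of the documented behaviour: the function names are hypothetical, and ord() gives ASCII codes, whereas a Commodore build would store PETSCII codes.

```python
def init_from_list(size, values):
    """Model `word a[size] = {values...}`: pad with zeros, reject long lists."""
    if len(values) > size:
        raise ValueError("initializer list too long")
    return list(values) + [0] * (size - len(values))


def init_from_string(size, text):
    """Model `byte a[size] = "text"`: character codes plus a null terminator."""
    if len(text) > size - 1:
        raise ValueError("string initializer too long")
    return init_from_list(size, [ord(ch) for ch in text] + [0])


print(init_from_list(6, [1, 2, 3]))   # remaining elements are zero
print(init_from_list(4, []))          # the empty list zeroes everything
print(init_from_string(4, "ABC"))     # three characters plus the terminator
```

Calling init_from_string(4, "ABCD") raises an error in this model, matching the "TOO LONG" case shown earlier.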
Since the Commodore VIC20 and C64 lack the { and } symbols, [ and ] are used in their place, for example:
word commodore[10] = [10, 9, 8 ]
Array elements begin from 0, so the array storage above has elements from 0 to 9.
storage[0] = 0; ' First element
storage[9] = 99; ' Last element
Array dimensions must be known at compile time, but expressions made up of constants (both defined constants and literal constants) are allowed for array dimensions and for the members of the initializer list (if any). This is allowed:
word knownsize[10*10+5] = {}
And so is this:
const width = 20
const margin = 4
word knownsize[10*width+margin] = {margin, margin*2, margin*3}
But this is illegal because myvar is a regular variable, not a const:
word myvar = 10
word knownsize[10*myvar] = {1, 2, 3}
Constants may be decimal:
byte a = 10
word w = 65535
word q = -1
or hex:
byte a = $0a
word w = $face
or character:
byte c = 'a'
word w = 'Z'
Character literals assume the ASCII value of the character in the single quotes.
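The three literal forms can be modelled in a few lines of Python. This is a hedged sketch rather than the parser's actual logic: parse_literal is a hypothetical name, and the wrap to 16 bits (so that -1 becomes 65535) is an assumption based on word values being 16 bits wide.

```python
def parse_literal(text: str) -> int:
    """Return the 16 bit value of a decimal, $hex or 'c' literal."""
    text = text.strip()
    if text.startswith("$"):
        value = int(text[1:], 16)            # hex, e.g. $face
    elif len(text) == 3 and text[0] == "'" and text[2] == "'":
        value = ord(text[1])                 # character, e.g. 'a'
    else:
        value = int(text, 10)                # decimal, e.g. 65535 or -1
    return value & 0xFFFF                    # assumption: wrap to 16 bits


for literal in ["10", "65535", "-1", "$0a", "$face", "'a'", "'Z'"]:
    print(literal, "->", parse_literal(literal))
```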
EightBall supports most of C's arithmetic, logical and bitwise operators. They have the same precedence as in C as well. Since the Commodore machines do not have all the ASCII characters, some substitutions have been made (shown in parentheses below).
EightBall also implements 'star operators' for pointer dereferencing which will also be familiar to C programmers.
- Addition: binary +
- Subtraction: binary -
- Multiplication: binary *
- Division: binary /
- Modulus: binary %
- Power: binary ^
- Negation: unary prefix -
- Logical equality: binary ==
- Logical inequality: binary !=
- Logical greater-than: binary >
- Logical greater-than-or-equal: binary >=
- Logical less-than: binary <
- Logical less-than-or-equal: binary <=
- Logical and: binary &&
- Logical or: binary || (binary ## on CBM)
- Logical not: unary !
- Bitwise and: binary &
- Bitwise or: binary | (binary # on CBM)
- Bitwise xor: binary !
- Left shift: binary <<
- Right shift: binary >>
- Bitwise not: unary prefix ~ (unary prefix . on CBM)
The & prefix operator returns a pointer to a variable which may be used to read and write the variable's contents. The operator may be applied to scalar variables, whole arrays and individual elements of arrays.
word w = 123
word A[10] = 0
pr.dec &w; ' Address of scalar w
pr.dec &A; ' Address of start of array A
pr.dec &A[2] ' Address of third element of array A
Note also that for arrays, evaluating just the array name with no index gives the address of the start of the array. (This trick enables the array pass-by-reference feature to work.)
The following code will print "ALL THE SAME" on the console:
word A[10] = 0
word a1 = A
word a2 = &A
word a3 = &A[0]
if ((a1 == a2) && (a1 == a3))
pr.msg "ALL THE SAME"; pr.nl
endif
EightBall provides two 'star operators' which dereference pointers in a manner similar to the C star operator. One of these (*) operates on word values, the other (^) operates on byte values. Each of the operators may be used both for reading and writing through pointers.
Here is an example of a pointer to a word value:
word val = 0; ' Real value stored here
word addr = &val; ' Now addr points to val
*addr = 123; ' Now val is 123
pr.dec *addr; ' Recover the value via the pointer
pr.nl
Here is an example using a pointer to byte. This is similar to PEEK and POKE in BASIC.
word addr = $c000; ' addr points to hex $c000
byte val = ^addr; ' Read value from $c000 (PEEK)
^addr = 0; ' Set value at $c000 to zero (POKE)
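To make the difference between the two dereference operators concrete, here is a small Python model of a 64K memory with byte and word access. It is a sketch under stated assumptions: the function names are hypothetical, words are assumed to be stored low byte first (the usual 6502 convention), and byte writes are assumed to keep only the low 8 bits.

```python
memory = bytearray(65536)  # 64K address space, as seen by a 6502


def peek_byte(addr: int) -> int:
    """Model reading through ^addr: fetch one byte."""
    return memory[addr & 0xFFFF]


def poke_byte(addr: int, value: int) -> None:
    """Model writing through ^addr: store one byte (low 8 bits only)."""
    memory[addr & 0xFFFF] = value & 0xFF


def peek_word(addr: int) -> int:
    """Model reading through *addr. Assumes low byte first."""
    return peek_byte(addr) | (peek_byte(addr + 1) << 8)


def poke_word(addr: int, value: int) -> None:
    """Model writing through *addr. Assumes low byte first."""
    poke_byte(addr, value)
    poke_byte(addr + 1, value >> 8)


poke_word(0x2000, 123)        # word write through a pointer
print(peek_word(0x2000))      # word read through the same pointer
poke_byte(0xC000, 0)          # byte write, like POKE
print(peek_byte(0xC000))      # byte read, like PEEK
```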
Parentheses may be used to control the order of evaluation, for example:
pr.dec (10+2)*3; ' Prints 36
pr.dec 10+2*3; ' Prints 16
Precedence Level | Operators | Example | Example CBM |
---|---|---|---|
11 (Highest) | Prefix Plus | +a | |
11 | Prefix Minus | -a | |
11 | Prefix Star | *a | |
11 | Prefix Caret | ^a | |
11 | Prefix Logical Not | !a | |
11 | Prefix Bitwise Not | ~a | .a |
10 | Power of | a ^ b | |
10 | Divide | a / b | |
10 | Multiply | a * b | |
10 | Modulus | a % b | |
9 | Add | a + b | |
9 | Subtract | a - b | |
8 | Left Shift | a << b | |
8 | Right Shift | a >> b | |
7 | Greater Than | a > b | |
7 | Greater Than Equal | a >= b | |
7 | Less Than | a < b | |
7 | Less Than Equal | a <= b | |
6 | Equality | a == b | |
6 | Inequality | a != b | |
5 | Bitwise And | a & b | |
4 | Bitwise Xor | a ! b | |
3 | Bitwise Or | a \| b | a # b |
2 | Logical And | a && b | |
1 (Lowest) | Logical Or | a \|\| b | a ## b |
EightBall supports a 'structured' programming style by providing multi-line if/then/else conditionals, for loops and while loops.
Note that the goto statement is not supported!
Syntax is as follows:
if z == 16
pr.msg "Sweet sixteen!"
pr.nl
endif
Or, with the optional else clause:
if x < 2
pr.msg "okay"
pr.nl
else
pr.msg "too many"; pr.nl
toomany = toomany + 1;
endif
Syntax is as per the following example:
for count = 1 : 10
pr.dec count
pr.nl
endfor
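The examples in this document (for instance a loop from 0 to 9 over a ten element array) suggest that both bounds of a for loop are inclusive. Assuming that reading is right, the loop above visits the same values as the following Python sketch. The helper name is hypothetical, and the case where the start is greater than the end is not modelled here.

```python
def eightball_for(start: int, stop: int):
    """Values taken by `for v = start : stop`, assuming inclusive bounds."""
    return list(range(start, stop + 1))


for count in eightball_for(1, 10):
    print(count)
```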
These are quite flexible, for example:
while bytes < 255
call getbyte()
bytes = bytes + 1
endwhile
EightBall allows named subroutines to be defined, for example:
sub myFirstSubroutine()
pr.msg "Hello"; pr.nl
endsub
All subroutines must end with an endsub statement.
A subroutine may return a word value to the caller using the return statement.
sub mySecondSubroutine()
return 2
endsub
If the flow of execution hits the endsub (without first encountering a return statement) then 0 is returned to the caller.
The subroutine above can be called as follows:
call myFirstSubroutine()
When myFirstSubroutine hits a return or endsub statement, the flow of execution will return to the statement immediately following the call.
Each subroutine has its own local variable scope. If a local variable is declared with the same name as a global variable, the global will not be available within the scope of the subroutine. When the subroutine returns, the local variables are destroyed.
byte val = 10; ' Global byte variable
sub myThirdSubroutine()
byte w[10] = 0; ' Local array
byte i = 0; ' Local byte iterator
for i=0 : 9
w[i] = val; ' Using both local and global variables
endfor
endsub
Just like in C, a local variable can 'hide' a global of the same name:
word hideme = 10;
call obscuredByClouds()
end
sub obscuredByClouds()
word hideme = 100;
pr.dec hideme; pr.nl; ' Prints 100 (val of local), not 10 (val of global)
endsub
Subroutines may take byte or word arguments, using the following syntax:
sub withArgs(byte flag, word val1, word val2)
' Do stuff
return 0
endsub
This could be called as follows:
word ww = 0; byte b = 0;
call withArgs(b, ww, ww+10)
When withArgs runs, the expression passed as the first argument (b) will be evaluated and the value assigned to the first formal argument flag, which will be created in the subroutine's local scope. Similarly, the second argument (ww) will be evaluated and the result assigned to val1. Finally, ww+10 will be evaluated and assigned to val2.
Argument passing is by value, which means that withArgs can modify flag, val1 or val2 freely without the changes being visible to the caller.
Subroutines may be invoked within an expression. In this case, the subroutine is executed and the value returned is evaluated within the expression in which it appears.
For example, the following subroutine:
sub adder(word a, word b)
return a+b
endsub
Could be used in an expression like this:
pr.dec adder(10, 5); ' Prints 15
or like this:
word res = adder(2, 3);
pr.dec res; ' Prints 5
Functions may invoke themselves recursively.
Passing by reference allows a subroutine to modify a value passed to it. EightBall does this using pointers, in a manner that will be familiar to C programmers. Here is adder implemented using this pattern:
sub adder(word a, word b, word resptr)
*resptr = a+b
endsub
Then to call it:
word result = 0
call adder(10, 20, &result)
This code takes the address of variable result using the ampersand operator and passes it to subroutine adder as resptr. The subroutine then uses the star operator to write the result of the addition of the first two arguments (10 + 20 in this example) to the word pointed to by resptr.
Unlike C, there are no special pointer types. Pointers must be stored in a word variable, since they do not fit in a byte. Pointers are dereferenced using the * operator to reference words or the ^ operator to reference bytes.
Here is an example of using a pointer to byte:
word xx = 0
call poke(&xx, 10)
pr.dec xx; pr.nl; ' Should print 10
end
sub poke(word addr, byte val)
^addr = val
endsub
It is frequently useful to pass an array into a subroutine. It is not very useful to use pass by value for arrays, since this may mean copying a large object onto the stack. For these reasons, EightBall implements a special pass by reference mode for array variables, which operates in a manner similar to C.
Here is an example of a function which takes a regular variable and an array:
sub clearArray(byte arr[], word sz)
word i = 0
for i = 0 : sz-1
arr[i] = 0
endfor
endsub
This may be invoked like this:
const n = 10
byte A[n] = 99
call clearArray(A, n)
Note that the size of the array is not specified in the subroutine definition - any size array may be passed. Note also that the corresponding argument in the call is simply the array name (no [] or other annotation is permitted).
This mechanism effectively passes a pointer to the array contents 'behind the scenes'.
The end statement marks the normal end of execution. This is often used to stop the flow of execution running off the end of the main program and into the subroutines (which causes an error):
call foo()
pr.msg "Done!"; pr.nl
end
sub foo()
pr.msg "foo"; pr.nl
endsub
EightBall code can be arranged however you wish. For example, this:
word w = 0; for w = 1 : 10; pr.dec w; pr.nl; endfor
is identical to this:
word w = 0
for w = 1 : 10
pr.dec w; pr.nl
endfor
Semicolons must be used to separate multiple statements on a line (even loop constructs, as seen in the first example above).
Indentation of the code (as shown in the examples in this manual) is optional, but encouraged.
Comments are introduced by the single quote character. A full line comment may be entered as follows:
' This is a comment
If you wish to comment after a statement, note that a semicolon is required to separate the statement and the comment:
pr.msg "Hello there"; ' Say hello!!!
Simple:
run
The program runs until it hits an end statement, an error occurs or it is interrupted by the user.
comp "bytecodefile"
The program in memory is compiled to EightBall VM bytecode, which is written to the specified file.
The bytecode file may be executed using the EightBall Virtual Machine that is part of this package.
quit
Returns to ProDOS on Apple II, or to CBM BASIC on C64/VIC20.
new
clear
vars
Variables are shown in tabular form. The letter 'b' indicates byte type, while 'w' indicates word type. For scalar variables, the value is shown. For arrays, the dimension(s) are shown.
free
The free space available for variables and for program text is shown on the console.
Only console I/O is supported at present. File I/O is planned for a later release.
Prints a literal string to the console:
pr.msg "Hello world"
Prints an unsigned decimal value to the console:
pr.dec 123/10
Prints a signed decimal value to the console:
pr.dec.s 12-101
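The difference between the unsigned and signed print statements only shows up for values with the top bit set. The Python sketch below illustrates this for the expressions used above, assuming 16 bit two's complement arithmetic and integer division. The helper names are hypothetical and the exact console formatting is not modelled.

```python
def as_unsigned(value: int) -> int:
    """Interpret a result as an unsigned 16 bit word."""
    return value & 0xFFFF


def as_signed(value: int) -> int:
    """Interpret a result as a signed 16 bit word (two's complement)."""
    word = value & 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


print(as_unsigned(123 // 10))   # 12: integer division discards the remainder
print(as_signed(12 - 101))      # -89: the signed view of the result
print(as_unsigned(12 - 101))    # 65447: the same bits viewed as unsigned
```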
Prints a hexadecimal value to the console (prefixed with '$'):
pr.hex 1234
Prints a newline to the console:
pr.nl
Prints a character to the console:
pr.ch 'A'
pr.ch 65; ' Same as above
Prints a byte array as a string to the console. The string is null terminated (so printing stops at the first 0 character):
pr.str A; ' A is a byte array
This is for setting the text video mode on the Apple II only. It only works in the interpreter at present.
mode 40; ' Set 40 column mode
mode 80; ' Set 80 column mode
Allows a single character to be read from the keyboard. Be careful - this function assumes the argument passed to it is a pointer to a byte value into which the character may be stored.
We can print a character obtained from the keyboard as follows:
byte c = 0
while 1
kbd.ch &c
pr.ch c
endwhile
Allows a line of input to be read from the keyboard and to be stored to an array of byte values. This statement takes two arguments - the first is an array of byte values into which to write the string, the second is the maximum number of bytes to write.
byte buffer[100] = 0;
kbd.ln buffer, 100
pr.msg "You typed> "
pr.str buffer
pr.nl
EightBall includes a simple line editor for editing program text. Programs are saved to disk in plain text format (ASCII on Apple II, PETSCII on CBM).
Be warned that the line editor is rather primitive. However we are trying to save memory.
Editor commands start with the colon character (:).
To load a new source file from disk, use the :r 'read' command:
:r "myfile.8b"
To save the current editor buffer to disk, use the :w 'write' command:
:w "myfile.8b"
On Commodore systems, this must be a new (non-existing) file, or a drive error will result.
Start inserting text before the specified line. The editor switches to insert mode, indicated by the '>' character (in inverse green on CBM). The following command will start inserting text at the beginning of an empty buffer:
:i0
>
One or more lines of code may then be entered. When you are done, enter a period '.' on a line on its own to return to EightBall immediate mode prompt.
Append is identical to the insert command described above, except that it starts inserting after the specified line. This is often useful for adding lines following the end of an existing program.
This command allows one or more lines to be deleted. To delete one line:
:d33
or to delete a range of lines:
:d10,12
This command allows an individual line to be replaced (like inserting a new line then deleting the old line). It differs from the insert and append commands in that the text is entered immediately following the command (not on a new line). For example:
:c21:word var1=12
will replace line 21 with word var1=12. Note the colon terminator following the line number.
Note that the syntax of this command is contrived to allow the CBM screen editor to work on listed output in a similar way to CBM BASIC. Code may be listed using the :l command and the screen may then be interactively edited using the cursor keys and return, just as in BASIC.
This allows the program text to be listed to the console. Either the whole program may be displayed or just a range of lines. To show everything:
:l
To show a range of lines:
:l0-20
(The command is the letter Ell, not the number 1!)
The EightBall Virtual Machine is a simple runtime VM for executing the bytecode produced by the EightBall compiler. The EightBall VM can run on 6502 systems (Apple II, Commodore VIC20, C64) or as a Linux process.
The EightBall system is split into two separate executables:
- EightBall editor, interpreter and compiler
- EightBall VM, which runs the code built by the compiler
On Linux, the editor/interpreter/compiler is eightball and the Virtual Machine is eightballvm.
On Apple II ProDOS, the editor/interpreter/compiler is eb.system and the VM is ebvm.system.
On Commodore VIC20, the editor/interpreter/compiler is 8ball20.prg and the VM is 8ballvm20.prg.
On Commodore C64, the editor/interpreter/compiler is 8ball64.prg and the VM is 8ballvm64.prg.
Here is how to use the compiler:
- Start the main EightBall editor/interpreter/compiler program.
- Write your program in the editor.
- Debug using the interpreter (run command).
- When it seems to work okay, you can compile with the comp command.
The compiler will dump an assembly-style listing to the console and also write the VM bytecode to the binary file named in the comp command (for example, bytecode). If all goes well, no inscrutable error messages will be displayed.
Then you can run the VM program for your platform. It will load the bytecode from that file and execute it. Running compiled code under the Virtual Machine is much faster than using the interpreter (and also more memory efficient).
The EightBall Virtual machine has the following features:
- 16 level evaluation stack. Each cell on the evaluation stack is 16 bits.
- Call stack. This stack is byte-orientated (rather than word-orientated like the evaluation stack). It occupies most of system memory.
- Program counter (PC) - 16 bits
- Stack pointer (SP) - 16 bits - used to address the call stack
- Frame pointer (FP) - 16 bits - makes addressing locals and parameters easier for subroutine code
The evaluation stack is used for all computations. The VM offers a variety of instructions for manipulating the evaluation stack. All calculations, regardless of the type of the variables involved, are performed using 16 bit arithmetic.
For shorthand, we define the names X, Y, Z, T for the top four slots in the evaluation stack. This notation is stolen from the world of HP RPN calculators.
The call stack is used for all memory allocation within the virtual machine, as follows:
- Global variables
- Subroutine parameters
- Local variables
- Return address when calling subroutine
- Parent frame pointer - used for unwinding the stack on return/endsub
Note that all the instructions with names ending in 'I' are so-called 'immediate mode' instructions. This means that the operand is the 16 bit word following the opcode, rather than the topmost element of the evaluation stack. The 'immediate mode' operand may be a data value or an address.
Relative mode instructions allow addressing relative to the frame pointer. This is helpful for easy access to local variables.
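The relative instructions listed in the table that follows compute their effective address as operand + FP + 1. A minimal Python sketch of that calculation is shown here. The function names are hypothetical, the wrap to 16 bits is an assumption, and it is also an assumption that ATOR and RTOA use this same offset.

```python
def rel_to_abs(offset: int, fp: int) -> int:
    """Effective address for a relative access: offset + FP + 1 (16 bit wrap)."""
    return (offset + fp + 1) & 0xFFFF


def abs_to_rel(addr: int, fp: int) -> int:
    """Inverse of rel_to_abs, assuming the same +1 offset applies."""
    return (addr - fp - 1) & 0xFFFF


fp = 0xBFF0
print(hex(rel_to_abs(0, fp)))                  # first byte above the frame pointer
print(hex(rel_to_abs(4, fp)))                  # four bytes further up
print(abs_to_rel(rel_to_abs(4, fp), fp))       # round trip recovers the offset
```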
Instruction | Description | Imm? | Rel? |
---|---|---|---|
END | Terminate execution | ||
LDI | Pushes the following 16 bit word to the evaluation stack | * | |
LDAW | Replaces X with 16 bit value pointed to by X. | ||
LDAWI | Pushes the 16 bit value pointed to by following 16 bit word to evaluation stack. | * | |
LDAB | Replaces X with 8 bit value pointed to by X. | ||
LDABI | Pushes the 8 bit value pointed to by following 16 bit word to evaluation stack. | * | |
STAW | Stores 16 bit value Y in addr pointed to by X. Drops X and Y. | ||
STAWI | Stores 16 bit value X in addr pointed to by following 16 bit word. Drops X. | * | |
STAB | Stores 8 bit value Y in addr pointed to by X. Drops X and Y. | ||
STABI | Stores 8 bit value X in addr pointed to by following 16 bit word. Drops X. | * | |
LDRW | Replaces X with 16 bit value pointed to by X+FP+1 . |
* | |
LDRWI | Pushes the 16 bit value pointed to by following 16 bit word +FP+1 to evaluation stack. |
* | * |
LDRB | Replaces X with 8 bit value pointed to by X+FP+1 . |
* | |
LDRBI | Pushes the 8 bit value pointed to by following 16 bit word +FP+1 to evaluation stack. |
* | * |
STRW | Stores 16 bit value Y in addr pointed to by X+FP+1 . Drops X and Y. |
* | |
STRWI | Stores 16 bit value X in addr pointed to by following 16 bit word +FP+1 . Drops X. |
* | * |
STRB | Stores 8 bit value Y in addr pointed to by X+FP+1 . Drops X and Y. |
* | |
STRBI | Stores 8 bit value X in addr pointed to by following 16 bit word +FP+1 . Drops X. |
* | * |
SWP | Swaps X and Y | ||
DUP | Duplicates X -> X, Y | ||
DUP2 | Duplicates X -> X,Z; Y -> Y,T | ||
DROP | Drops X | ||
OVER | Duplicates Y -> X,Z | ||
PICK | Duplicates stack level specified in X+1 -> X |
||
POPW | Pop 16 bit value from call stack, push onto eval stack [X] | ||
POPB | Pop 8 bit value from call stack, push onto eval stack [X] | ||
PSHW | Push 16 bit value in X onto call stack. Drop X. | ||
PSHB | Push 8 bit value in X onto call stack. Drop X . | ||
DISC | Discard X bytes from call stack. Drop X. | ||
SPTOFP | Copy stack pointer to frame pointer. (Enter function scope) | ||
FPTOSP | Copy frame pointer to stack pointer. (Release local vars) | ||
ATOR | Convert absolute address in X to FP-relative address | ||
RTOA | Convert FP-relative address in X to absolute address | ||
INC | X = X+1. | ||
DEC | X = X-1. | ||
ADD | X = Y+X. Y is dropped. | ||
SUB | X = Y-X. Y is dropped. | ||
MUL | X = Y*X. Y is dropped. | ||
DIV | X = Y/X. Y is dropped. | ||
MOD | X = Y%X. Y is dropped. | ||
NEG | X = -X | ||
GT | X = Y>X. Y is dropped. | ||
GTE | X = Y>=X. Y is dropped. | ||
LT | X = Y<X. Y is dropped. | ||
LTE | X = Y<=X. Y is dropped. | ||
EQL | X = Y==X. Y is dropped. | ||
NEQL | X = Y!=X. Y is dropped. | ||
AND | X = Y&&X. Y is dropped. | ||
OR | X = Y||X. Y is dropped. | ||
NOT | X = !X | ||
BAND | X = Y&X. Y is dropped. | ||
BITOR | X = Y|X. Y is dropped. | ||
BITXOR | X = Y^X. Y is dropped. | ||
BITNOT | X = ~X. | ||
LSH | X = Y<<X. Y is dropped. | ||
RSH | X = Y>>X. Y is dropped. | ||
JMP | Jump to address X. Drop X. | ||
JMPI | Jump to 16 bit word following opcode. | * | |
BRC | If Y != 0, jump to address X. Drop X, Y. | ||
BRCI | If X != 0, jump to 16 bit word following opcode. Drop X. | * | |
JSR | Push PC to call stack. Jump to address X. Drop X. | ||
JSRI | Push PC to call stack. Jump to 16 bit word following opcode. Drop X. | * | |
RTS | Pop call stack, jump to the address popped. | ||
PRDEC | Print 16 bit value in X in decimal. Drop X. | ||
PRHEX | Print 16 bit value in X in hexadecimal. Drop X. | ||
PRCH | Print character in X. Drop X. | ||
PRSTR | Print null terminated string pointed to by X. Drop X. | ||
PRMSG | Print literal string at PC (null terminated) | ||
KBDCH | Push character from keyboard onto eval stack | ||
KBDLN | Obtain line from keyboard and write to memory pointed to by Y. X contains the max number of bytes in buf. Drop X, Y. |
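
The X, Y, Z, T names used in the table refer to positions on the evaluation stack, with X the topmost slot. As a rough illustration, the Python sketch below models a few of the instructions on a plain list. It is a toy model only, not the VM source: the class and method names are invented for this example, and representing a true comparison as 1 is an assumption made in the C spirit of the language.

```python
class EvalStack:
    """Toy model of the VM evaluation stack. The last list item is X."""

    def __init__(self):
        self.items = []

    def push(self, value):
        # Words in the VM are 16 bit, so wrap every result.
        self.items.append(value & 0xFFFF)

    def drop(self):
        # DROP: drops X (returned here for convenience).
        return self.items.pop()

    def swp(self):
        # SWP: swaps X and Y.
        self.items[-1], self.items[-2] = self.items[-2], self.items[-1]

    def dup(self):
        # DUP: the old X becomes Y, with a copy of it in X.
        self.push(self.items[-1])

    def over(self):
        # OVER: a copy of Y lands in X; the old X and Y move to Y and Z.
        self.push(self.items[-2])

    def sub(self):
        # SUB: X = Y-X, Y is dropped.
        x = self.drop()
        y = self.drop()
        self.push(y - x)

    def gt(self):
        # GT: X = Y>X, Y is dropped (true is modelled as 1, an assumption).
        x = self.drop()
        y = self.drop()
        self.push(1 if y > x else 0)


# Evaluate 10 - 3 > 5 in the order a stack machine would see it.
stack = EvalStack()
stack.push(10)
stack.push(3)
stack.sub()        # X is now 7
stack.push(5)
stack.gt()         # X is now 1, because 7 > 5
print(stack.items)
```

Note the operand order: for the two-operand instructions the value pushed first ends up in Y and the value pushed second in X, which is why `SUB` is defined as Y-X rather than X-Y.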
cc65 places the VM executable code and static evaluation stack (32 bytes) in low memory. In an optimized virtual machine implementation, this would be placed in zero page.
Virtual machine addresses correspond to physical machine addresses on 6502 systems.
Under Linux, the virtual machine uses a 64K byte array as workspace, and addresses point into this space.
The call stack grows down from top of memory.
The bytecode is loaded at the start of memory. This location differs depending on the platform:
- Apple II - 0x5000
- Commodore 64 - 0x3000
- Commodore VIC20 - 0x4000
These addresses are chosen to allow space for the EightBall VM executable, which loads below these addresses. These values can be tuned by inspecting the map files generated by cc65.
EightBall was first implemented as an interpreted language (although the language design was always intended to permit compilation.) The bytecode compiler and virtual machine were added with v0.5 in April 2018.
In order to use the least code possible, the compiler uses the same data structures as the interpreter, but in a different way.
cc65 places the executable code of the EightBall line editor / interpreter / compiler in low memory.
There are two storage areas (or 'arenas') which are denoted as `HEAP1` and `HEAP2` in the `eightball.c` code. This organization is historical: EightBall began as a language targeting the VIC20 with 32K expansion. In this configuration, there is an 8K memory block (starting at address $A000, referred to as BLK5 in the VIC20 design) which is not contiguous with the rest of RAM. For the VIC20, BLK5 was designated as `HEAP1` and the remainder of RAM (above the executable code) was designated `HEAP2`. For other 6502 architectures (Apple II, Commodore 64), the `HEAP1` / `HEAP2` arenas are maintained, but since there is no 'gap' in the memory map, the boundary between them may be adjusted to any arbitrary address.
The division of interpreter memory into two distinct blocks turns out to be quite useful, as we shall see below.
The source code of the program is stored as plain ASCII (or PETSCII on Commodore systems) text at the bottom of `HEAP2`, immediately above the EightBall executable code (using routine `alloc2bttm()`). As more lines of source code are added, they are appended to the heap, growing upwards to higher addresses.
Note that the lower bound of arena `HEAP2` has to be adjusted by hand in `eightball.c` when the code changes size. The size of the code segments generated by cc65 can be determined by inspecting the map file created by the compiler.
Global and local variables are allocated at the top of `HEAP1`, from the highest available memory address down. For each variable a small `var_t` header is stored, consisting of the first four characters of the name and a byte which records whether it is a `byte` or `word` variable and also the number of dimensions. If the number of dimensions is zero then this indicates a scalar variable, otherwise it is an array of the specified number of elements. The `var_t` header also includes a two byte pointer to the next header, allowing the headers to be assembled into a linked list.
Following the `var_t` header the actual variable data is stored:
- One byte for a `byte` scalar
- Two bytes for a `word` scalar
- Two byte pointer to a block of `sz` bytes for a `byte[sz]` array
- Two byte pointer to a block of `2*sz` bytes for a `word[sz]` array
Normally when a global or local array is allocated, the data block immediately follows. However, the pointer to the data block is exploited to implement the 'array pass by reference' feature. In this case, the `var_t` header and the two byte data block pointer are copied into the local frame (the pointer still refers to the original data block of the array passed by reference.)
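
To get a feel for the storage cost this implies, the sizes described above can be added up. The Python sketch below does only that arithmetic. The helper name `footprint` is hypothetical, and it assumes the header is packed with no padding (four name bytes, one type/dimension byte, two byte next pointer); the actual field order and encoding are defined by `var_t` in the source.

```python
# Sizes taken from the description above. The header is assumed to be packed
# with no padding: 4 name bytes + 1 type/dimension byte + 2 byte next pointer.
HEADER_SIZE = 4 + 1 + 2
POINTER_SIZE = 2
ELEMENT_SIZE = {"byte": 1, "word": 2}


def footprint(kind, elements=0, by_reference=False):
    """Bytes of HEAP1 used by one interpreter variable (hypothetical helper)."""
    if elements == 0:
        # Scalar: the value follows the header directly.
        return HEADER_SIZE + ELEMENT_SIZE[kind]
    if by_reference:
        # Array parameter: only header and pointer are copied into the frame.
        return HEADER_SIZE + POINTER_SIZE
    # Ordinary array: header, pointer, then the data block itself.
    return HEADER_SIZE + POINTER_SIZE + ELEMENT_SIZE[kind] * elements


print(footprint("byte"))                          # 8
print(footprint("word"))                          # 9
print(footprint("byte", 400))                     # 409
print(footprint("byte", 400, by_reference=True))  # 9
```

Under these assumptions, passing a 400 element array by reference costs the callee's frame the same as a single `word` scalar, because only the header and pointer are copied.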
The interpreter maintains a pointer to the beginning of the local stack frame (`varslocal`) as well as to the beginning of the list (`varsbegin`) which allows the global variables to be located. When operating at the global scope (ie: not within a subroutine) `varslocal` points to `varsbegin`.
When entering a subroutine a special `var_t` entry is made for a `word` variable using the otherwise illegal name `"----"` to mark the stack frame and this is pushed to the call stack. The value of this variable is used to store the current value of `varslocal` (ie: the previous stack frame). This is used to unwind the stack when a subroutine exits.
Local variables are allocated on `HEAP1` in exactly the same way as globals. The variable search routine `getintvar()` knows to search the local variables and then (if within a subroutine) the globals also. The stack frame marks allow `getintvar()` to know where the globals end and the stack frame of the first subroutine begins.
The interpreter creates a local variable for each parameter, copying the value provided by the caller. Parameters behave exactly like local variables, because they are local variables like any other.
When leaving a subroutine with `return` or `endsub`, the interpreter uses the innermost stack frame (which, remember, records the stack frame of its calling subroutine) to unwind the stack. The local variables and the innermost stack frame are released and `varslocal` is set to point to the caller stack frame. Finally, the flow of control returns to the statement following the `call` (or the evaluation of the expression including the function continues, in the case of function invocation.)
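
The frame marking scheme described above can be modelled in a few lines. The Python sketch below is a simplified illustration rather than the interpreter's code: a Python list stands in for the linked list in `HEAP1`, list indices stand in for pointers, and the class `Interp` with its method names is invented for this example. Searching the newest local first is also an assumption.

```python
FRAME_MARK = "----"   # otherwise illegal name used to mark a stack frame


class Interp:
    """Toy model of interpreter variable storage with frame markers."""

    def __init__(self):
        self.vars = []        # oldest entry first; each entry is [name, value]
        self.varslocal = 0    # index where the current scope begins

    def declare(self, name, value):
        # Only the first four characters of a name are stored.
        self.vars.append([name[:4], value])

    def enter_sub(self, params):
        # The marker's value remembers the caller's varslocal.
        self.vars.append([FRAME_MARK, self.varslocal])
        self.varslocal = len(self.vars) - 1
        for name, value in params:
            self.declare(name, value)   # parameters are ordinary locals

    def leave_sub(self):
        caller_frame = self.vars[self.varslocal][1]
        del self.vars[self.varslocal:]  # release locals and the marker
        self.varslocal = caller_frame

    def lookup(self, name):
        key = name[:4]
        # First the current scope, newest entry first.
        for entry in reversed(self.vars[self.varslocal:]):
            if entry[0] == key:
                return entry
        # Then the globals, which end at the first frame marker.
        for entry in self.vars:
            if entry[0] == FRAME_MARK:
                break
            if entry[0] == key:
                return entry
        raise KeyError(name)


interp = Interp()
interp.declare("total", 10)               # a global
interp.enter_sub([("val", 3)])            # call a subroutine with one parameter
print(interp.lookup("val")[1], interp.lookup("total")[1])   # 3 10
interp.leave_sub()
print(len(interp.vars))                   # 1, only the global remains
```

In this model a subroutine sees its own locals and the globals, but not the locals of its caller, because the lookup skips everything between the first frame marker and the current one.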
`HEAP1`
- Global and local variables, growing down from the top of the arena.

`HEAP2`
- Source code, growing up from the bottom of the arena.
The compiler shares most of the infrastructure with the interpreter. The source code of the program is still stored at the bottom of `HEAP2`.
The compiled bytecode is written to the beginning of `HEAP1`, starting from the lowest address and working up. Since no actual data is stored in `HEAP1` when compiling (only `var_t` headers and addresses), it is hoped that there will be enough space for the compiled code without having it collide with the symbol tables (which are stored from the top of `HEAP1` going down).
The main difference is that instead of storing global and local variables in `HEAP1`, the compiler uses the `var_t` data structures to keep track of the variables during compilation only - they serve as temporary symbol tables so the compiler can keep track of the address of all the variables in scope. Instead of the payload described above, the entries created by the compiler contain a pointer to the address of the variable in the virtual machine's address space.
Within the VM there is no 'management overhead' for storing variables - a `word` is always two bytes, a `byte` always one byte. All of the housekeeping takes place within the compiler (which has to keep track of the address of every variable in scope.)
The compiler has a simple allocator (managed by `rt_push_callstack()` and `rt_pop_callstack()`) that mimics the behaviour of the virtual machine, keeping track of the value of the stack pointer (SP). In the same way that the interpreter allocates all variables (global and local) on the call stack, the compiler allocates all variables on the call stack of the virtual machine ("VM call stack" from now on.) Since these allocator functions keep track of the VM SP register, the compiler is able to push values to the call stack and still know their addresses in order to access them later. This can make the compiler output hard for humans to read, however!
The EightBall Virtual Machine has a number of features which are intended to make it easier to implement subroutine call and return, argument passing etc. In particular, there is a special frame pointer (FP) register which is useful for easily accessing parameters and locals.
Before generating code to enter a subroutine, the compiler ensures code has been generated to evaluate any parameters and push the result to the call stack. Then the compiler emits a `JSR` instruction to call the subroutine entry point. The virtual machine will automatically store the return address on the VM call stack and the VM program counter will be set to the entry point.
On entry to the subroutine, the compiler will emit VM instruction `SPTOFP` which pushes the current value of the frame pointer (FP) to the VM call stack and copies the stack pointer (SP) to the frame pointer (FP). This sets up the call frame allowing us to easily refer to the parameters and the local variables.
The virtual machine makes this simple by providing special instructions `LDRW`, `LDRB`, `STRW` and `STRB` which load and store `word` and `byte` values to memory using addressing relative to the frame pointer FP. In this relative addressing mode, the parameters which were pushed to the call stack before entry have small positive valued addresses (FP + offset). Local variables are pushed to the call stack, which grows down as usual. As a result, the local variables will have small negative addresses relative to the frame pointer (FP - offset).
At the same time, absolute addressing via instructions `LDAW`, `LDAB`, `STAW` and `STAB` can be used to access the global variables.
On exit from the subroutine, the compiler emits code to evaluate the return value and leave it on the evaluation stack in the topmost slot (X). It then emits an `FPTOSP` instruction which copies the frame pointer (FP) to the stack pointer (SP) and restores the previous value of the frame pointer by popping a word from the call stack. Copying FP to SP immediately releases all of the space (local variables) allocated in the topmost stack frame. This leaves the saved frame pointer topmost on the call stack, so it can be popped and restored to FP. The overall effect is to unwind the stack back to the calling stack frame.
The return value is left on the evaluation stack. If the calling code does not use it, the compiler must issue a `DROP` instruction to discard it.
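
The call and return sequence can be traced with a small model of the VM call stack. The Python sketch below is illustrative only. It assumes that SP points at the next free byte (which is one way to account for the `+1` in the `X+FP+1` addressing of the relative load and store instructions), that words are stored little-endian, and that the frame setup instruction saves FP as the text above describes. The class `CallStack` and the concrete offsets that fall out of it are products of those assumptions, not values taken from the VM source.

```python
class CallStack:
    """Toy model of the VM call stack, which grows down from top of memory."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.sp = 0xFFFF  # assumption: SP points at the next free byte
        self.fp = 0xFFFF

    def push_word(self, value):
        # High byte first, so the word sits little-endian in memory.
        self.mem[self.sp] = (value >> 8) & 0xFF
        self.mem[self.sp - 1] = value & 0xFF
        self.sp -= 2

    def pop_word(self):
        self.sp += 2
        return self.mem[self.sp - 1] | (self.mem[self.sp] << 8)

    def sptofp(self):
        # Subroutine entry: save the caller's FP, then FP = SP.
        self.push_word(self.fp)
        self.fp = self.sp

    def fptosp(self):
        # Subroutine exit: SP = FP releases the locals, then FP is restored.
        self.sp = self.fp
        self.fp = self.pop_word()

    def ldrw(self, rel):
        # Frame-relative word load from address rel+FP+1.
        addr = (rel + self.fp + 1) & 0xFFFF
        return self.mem[addr] | (self.mem[addr + 1] << 8)


vm = CallStack()
vm.push_word(42)        # caller pushes one word parameter
vm.push_word(0x5010)    # JSR pushes the return address (made-up value)
vm.sptofp()             # callee sets up its frame
vm.push_word(7)         # callee allocates one word local

print(vm.ldrw(4))       # parameter at a positive offset: 42
print(vm.ldrw(-2))      # local at a negative offset: 7

vm.fptosp()             # release locals, restore FP
return_address = vm.pop_word()   # what RTS would pop
print(hex(return_address), hex(vm.fp))
```

After the return sequence the parameter is still on the call stack in this model; how and when it is discarded is outside what the sketch covers.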
The compiler also maintains a linked list of subroutine calls and a linked list of subroutine entry points which are used for the final step of compilation - internal linkage. Subroutine calls and entry points are both represented using records of type `sub_t`, each of which contains the first eight characters of the subroutine name, a two byte address pointer and a two byte pointer to the next record.
The compiler allocates these linked lists (anchored by `callsbegin` and `subsbegin`) at the end of `HEAP2`, growing down towards the source code, which grows up from the bottom of this same arena. The linked list of subroutine calls is freed as soon as compilation is completed.
HEAP1
- Generated bytecode, growing up from the bottom of the arena. Discarded after compilation.
- Global and local variables, growing down from the top of the arena.
HEAP2
- Source code, growing up from the bottom of the arena.
- Compiler linkage tables, growing down from the top of the arena. Discarded after compilation.
When compiling EightBall code, there are instances where the generated code needs to jump or branch ahead, to some location within code that has yet to be generated. In this case, the compiler will emit the dummy address `$ffff` and will come back later to insert the correct address, once it is known. This is referred to as an "address fixup."
When compiling `if` / `endif` or `if` / `else` / `endif` conditionals, the compiler needs to generate code to branch forward to jump over the `if` or `else` code blocks. Similarly, for `while` / `endwhile` loops, the compiler needs to branch forward to jump over the loop body if the condition is false. In all these cases, the address fixup is resolved when the destination code is generated.
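
A forward branch fixup can be sketched as follows. This Python model is a minimal illustration, not the compiler's code: the opcode numbers are made up, the `Emitter` class is hypothetical, the little-endian operand order is an assumption, and the `NOT` followed by `BRCI` sequence is just one plausible way to skip a block when a condition is false.

```python
OP_NOT, OP_BRCI, OP_PRCH = 0x01, 0x02, 0x03   # hypothetical opcode numbers


class Emitter:
    """Toy bytecode emitter that supports forward branch fixups."""

    def __init__(self, base):
        self.base = base          # VM address where the bytecode will load
        self.code = bytearray()

    def here(self):
        # VM address of the next byte to be emitted.
        return self.base + len(self.code)

    def emit(self, opcode):
        self.code.append(opcode)

    def emit_forward_branch(self, opcode):
        # Emit the opcode with a $ffff placeholder and remember where it is.
        self.emit(opcode)
        fixup = len(self.code)
        self.code += bytes((0xFF, 0xFF))
        return fixup

    def patch(self, fixup, target):
        # Overwrite the placeholder with the real address (little-endian).
        self.code[fixup] = target & 0xFF
        self.code[fixup + 1] = target >> 8


# Shape of "if cond ... endif": skip the body when the condition is false.
e = Emitter(0x5000)                      # Apple II load address
e.emit(OP_NOT)                           # invert the condition already in X
fix = e.emit_forward_branch(OP_BRCI)     # destination not yet known
e.emit(OP_PRCH)                          # stands in for the body of the if
e.patch(fix, e.here())                   # endif reached: fill in the address
print(e.code.hex())                      # 0102055003
```

The placeholder is only ever written by the emitter and overwritten by the patch, so any `$ffff` operand left in the output indicates a fixup that was never resolved.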
Another situation where address fixups are required is subroutine calls. When a subroutine is called, a new entry is recorded in the `callsbegin` linked list, containing the beginning of the subroutine name and a pointer to the VM address of the call address to be fixed up. When a subroutine definition is encountered, a new entry is recorded in the `subsbegin` linked list, again containing the subroutine name but this time with the address of the entry point.
The final step of compilation involves iterating through the `callsbegin` list, looking up each subroutine name in the `subsbegin` list. If the name is found, then the dummy `$ffff` at the fixup address is replaced with the entry point of the subroutine. Otherwise a linkage error is (cryptically) reported.
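
The linkage pass amounts to a lookup and patch loop. The Python sketch below illustrates the idea with ordinary lists standing in for the two linked lists. The function name `link`, the error text and the little-endian operand order are assumptions made for this example.

```python
def link(code, base, calls, subs):
    """Patch every recorded call site with its subroutine entry point.

    code  -- bytearray of generated bytecode
    base  -- VM address at which code[0] will be loaded
    calls -- list of (name, fixup_address) pairs, one per call site
    subs  -- list of (name, entry_address) pairs, one per definition
    """
    # Only the first eight characters of a name are kept in the records.
    entries = {name[:8]: address for name, address in subs}
    for name, fixup in calls:
        key = name[:8]
        if key not in entries:
            raise LookupError("linkage error: " + key)
        offset = fixup - base
        target = entries[key]
        # Overwrite the $ffff placeholder (little-endian is an assumption).
        code[offset] = target & 0xFF
        code[offset + 1] = target >> 8
    return code


# Bytes 1-2 hold a placeholder; 0x10 and 0x20 stand in for arbitrary opcodes.
code = bytearray([0x10, 0xFF, 0xFF, 0x20])
link(code, 0x5000, [("printresults", 0x5001)], [("printresults", 0x5003)])
print(code.hex())   # 10035020
```

Because only the first eight characters of each name are recorded, two subroutines whose names share the same first eight characters would be indistinguishable in a scheme like this one.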
A `byte` variable is one byte everywhere. A `word` variable is two bytes everywhere, except in the Linux interpreter (where it is a 32 bit word, 4 bytes.)
Platform | Size in Bytes |
---|---|
6502 Interpreter | word 2, byte 1 |
6502 VM | word 2, byte 1 |
Linux Interpreter | word 4, byte 1 |
Linux VM | word 2, byte 1 |
This one is obligatory:

    pr.msg "Hello world!"; pr.nl
    end

You can omit the `end` statement if you like.
This example shows how EightBall can support recursion. I should point out that it is much better to do this kind of thing using iteration, but this is a fun simple example:

    pr.dec fact(3); pr.nl
    end

    sub fact(word val)
      pr.msg "fact("; pr.dec val; pr.msg ")"; pr.nl
      if val == 0
        return 1
      else
        return val * fact(val-1)
      endif
    endsub

`fact(3)` calls `fact(2)`, which calls `fact(1)`, then finally `fact(0)`.
See `eightballvm.h` for technical details.
Here is the well-known Sieve of Eratosthenes algorithm for finding prime numbers, written in EightBall:

    ' Sieve of Eratosthenes
    const sz=20
    byte A[sz*sz] = {}
    word i=0

    pr.msg "Initializing array ..."; pr.nl
    ' Mark every element of the array as a candidate prime
    for i = 0 : (sz * sz - 1)
      A[i]=1
    endfor
    call doall(sz, A)
    end

    sub doall(word nr, byte array[])
      word n = nr * nr
      pr.msg "Sieve of Eratosthenes ..."
      pr.msg "nr is "; pr.dec nr; pr.nl
      call sieve(n, nr, array)
      call printresults(n, array)
      return 0
    endsub

    sub sieve(word n, word nr, byte AA[])
      pr.msg "Sieve"
      word i = 0; word j = 0
      for i = 2 : (nr - 1)
        if AA[i]
          j = i * i
          while (j < n)
            AA[j] = 0
            j = j + i
          endwhile
        endif
      endfor
      return 0
    endsub

    sub printresults(word n, byte AA[])
      word i = 0
      for i = 2 : (n - 1)
        if AA[i]
          if i > 2
            pr.msg ", "
          endif
          pr.dec i
        endif
      endfor
      pr.msg "."
      pr.nl
      return 0
    endsub

(See the Wiki for more code examples.)