- Getting Started
- Introduction to Programming With uCISC <-- you are here
- uCISC Syntax Quick Reference
- Standard Libraries
- Instruction Set Details
All turing-complete processors fundamentally do one thing:
Processors conditionally move data between locations while modifying it in transit
That is all they do. Let's start with a simple representation of this logic:
if last value is zero then compute 'points' + 20 and store it in 'score'
Every computation, disk read, screen draw, everything can be reduced to this at the processor level (with different variables and mathematical operators).
|----------------| |----------------| |----------------|
|- *Input* -| -> |- *transform* -| -> |- *Output* -|
|- -| -> |- -| -> |- -|
|- Device/ -| -> |- CPU -| -> |- Device/ -|
|- Memory -| -> |- -| -> |- Memory -|
|----------------| |----------------| |----------------|
The processor simply inputs, transforms and outputs data.
uCISC embraces this principle fully. If you turn the computation into a programming statement, it might look like this:
# If previous value is 0, compute 'points' + 20, store in 'score'
score <0 points + 20
Note: '#' is the start of a comment. Everything on the line starting with the '#' is ignored. I'll just use it for descriptive explanation as we go.
Since that's all it needs to do, that's all uCISC does. However, in order to fit into a 16-bit architecture, we can only support 2 variables in a single statement:
# We only have 16-bits, so make the destination one of the arguments.
score <0 copy 20
score <0 score + points
It turns out that this doesn't limit your options, you just may need to make a copy of one of the values first.
Let's simplify the syntax a bit so that we don't have to type 'score' twice in one instruction:
# If previous value is zero, set score to points + 20
score <0 copy 20
score <0 add points
You are now reading real uCISC code. With one addition we will get to later, this is structure for every single uCISC statement. However, we need to have better control over the source and destination to be able to operate as a general purpose processor.
uCISC processors have registers just like most processors. Registers are just a circuit inside the processor that can hold the location of a value (or they can just hold the value itself). uCISC has 6 general purpose registers named r1, r2, r3, r4, r5, and r6.
# If previous value is zero, set score(r1) to points(r2) + 20
r1 <0 copy 20
r1 <0 add r2
In this example, r1 doesn't hold the score itself. It points to where the score is located in memory. This is a step backwards from using human-readable names, so let's add that back:
def score/r1
def points/r2
score <0 copy 20
score <0 add points
Now we are using the processor registers to locate data and move it around in memory arbitrarily, but with nice names that we can remember what they are.
Using registers to point directly to things is useful, but sometimes the data is more complicated than that. Let's say you wanted to define a reference to a 'player' and a 'trap':
def player/r4
var player.number/0
var player.score/1
var player.health/2
def trap/r5
var trap.type/0
var trap.damage/1
var trap.points/2
The number after the forward slash '/' is called the offset. The 'player' register will point to some memory location. I can now keep track of where the parts of the player data are relative to the base player address. In this case, the health is offset from the base address by 2 locations.
This allows me to do fun things like:
# Add points to player after defeating the trap
player.score <0 add trap.points
Now we are getting somewhere!
The operations you can use to transform source and destination values in uCISC are:
Non-modifying:
- copy
Bitwise operations: 2. and 3. or 4. xor 5. inv (invert) 6. shl (shift left) 7. shr (shift right)
Btye operation: 8. swap (swaps the bytes of a 16-bit word) 9. msb (gets the most significant byte only) 10. lsb (gets the least significant byte only)
Arithmetic: 11. add 12. addc (add with carry in) 13. sub 14. mult 15. multsw (gets the most significant word of a multiplication)
The exact details of what these do are described in the instruction set reference. Since uCISC is a 16-bit platform the byte operations (8 - 10) were included to allow you to manipulate 8-bit bytes easily.
For example, we can do a subtract rather than add:
# Subtract points from player after tripping the trap
player.score <0 sub trap.points
Often in programming, you just need a way to add or refer to a constant number value. uCISC supports that:
# Add 250 points to the player score
player.score <0 add val/250
# % designates a hexidecimal number, also adds 250 points
player.score <0 add val/%FA
The 'val' syntax allows you to include values directly in the instruction without having to refer to them from somewhere else.
If you are going to use registers to refer to things, you will need to initialize them with an address. You can refer to the contents of a register (rather than what it points to) using the '&' symbol.
If we stored the player data at memory address 4096, we could do the following:
def player/r3
var player.score/1
&player <- copy val/%1000 # player now refers to memory address 4096
player.score <- add val/250 # adds 250 to the score at address 4097
Since you often need to point a register to a specific memory location when you define it, uCISC supports a more concise syntax for this:
# define player and point it to address %1000
def player/r3 <- copy val/%1000
var player.score/1
# adds 250 to the value at address 4097
player.score <- add val/250
It can be useful sometimes to limit the impact of some of your labels. From our example earlier, if I want to add 'a' and 'b' I may want to avoid making every other part of the code have to be careful not to use 'a' and 'b' elsewhere.
def sum/r4 <- copy val/%1000
{
def a/r3
def b/r2
sum <- copy a
sum <- add b
}
# 'a' and 'b' are no longer in scope, and are meaningless outside the { } block
# ... but, I can do other things with 'sum' here, it is still in scope
This is a powerful feature. I can now start to limit the number of widely used 'def' statements to just the most common I need. I can contain my 'def' statements inside a scoping block to avoid polluting the rest of the program with my names.
I can now include reusable libraries in my code that hide all their internal variable names from me. We won't mess each other up.
Of course uCISC needs a program counter to keep track of where it is in the execution. This is just another register that happens to control the execution flow of the program.
def count/r2 <- copy val/%1000
count <- add val/1
pc <- sub val/1
Since the program counter always holds the location of the currently executing instruction, subtracting one from it will cause the program to jump back one instruction. In this case, this code will just infinitely count up with an increment of one on each loop.
You can also achieve the same effect with an offset. The 'pc' is just a register after all:
def count/r2 <- copy val/%1000
count <- add val/1
pc <- copy pc/-1
Note: due to technical limitations of 16-bits, negative offsets are not allowed when
referring to memory locations. Since pc
is a register value, rather than a memory
location, it works here. More on those details are in the instruction set reference.
Counting numbers of instructions to figure out offsets will become very tedious, very quickly. Thankfully, uCISC supports labels:
def count/r2 <- copy val/%1000
beginning:
count <- add val/1
pc <- copy pc/beginning
We don't have to count the instructions, uCISC will do it for us. The relative
difference between the current instruction and beginning
will be calculated
automatically (-1 in this example). The beginning
reference in the instruction
will be replaced in the instruction.
We are now free to change our code without having to recalculate all the offsets.
Without conditionals, we are severely limited in the programs we can write. Conditionals test the result of the previous math operation and do something different based on the result.
uCISC supports the following conditionals:
- <-, <| (store always)
- <
, <|(don't store, just set the flags for the operation result) - <0, <|0, (store if zero)
- <1, <|1, (store if !zero)
- <+, <p, <|+, <|p (store if positive)
- <n, <|n, <|- (store if negative)
- <&, <|&, <o (store if overflow)
- <#, <|# (store if error)
Side Bar on Readability:
You can use any of the varieties listed. Some are easier/harder to type, so it depends on how much effort you want to put in for readability and your preferences on that. The <| versions look great in a font with ligatures (e.g. Fira Code). I haven't settled on these and may add text only versions (e.g. neg, pos, err, etc) in the future. Personally, I think assembly programs are significantly impacted with poor readability, so I don't mind a bit of extra typing to keep it clean. On the other hand without ligatures, some of these don't look great, so idk. ¯_(ツ)_/¯
We can incorporate this into our previous example:
def count/r2 <- copy val/%1000
count <- copy val/0 # init count to zero
nextCount:
count <- add val/1 # add 1 to count
count <~ sub val/100 # subtract 100 from count, but don't store it
pc <|0 copy pc/nextCount # if the subtract result was not zero, jumpt to nextCount
This program will start at zero and count to 100.
Most programs will need some sort of data. Rather than hand coding these values into
instructions, you can create blocks of space to hold data values. You can then use
labels to refer to that data. The line must begin with a %
symbol.
my_data:
% 001F 0A48 656C 6C6f 2C20 776F 726C 6421 2075 4349 5343 2069 7320 6865 7265 210A 0A00
# Refer to my data
def myData/r2 <- copy pc/my_data
# Read/write data as before
A common type of data is strings. Instead of having to manually encode strings into ascii values, you can just directly add them. They behave just like data blocks otherwise. Strings are in quotes and automatically converted into ascii values. Strings are encoded 1 character per word and are zero terminated.
For example:
hello_world: "\nHello, world! uCISC is here!\n\n"
# Refer to the string
def helloString/r2 <- copy pc/hello_world
Scopes come with a neat trick:
def count/r2 <- copy val/%1000
count <- copy val/0 # init count to zero
{
count <- add val/1
count <~ sub val/100
pc <!? copy pc/loop
}
When entering a scope, uCISC automatically defines the 'loop' and 'break' labels. Loop refers to the first instruction in the scope, break refers to the first instruction after the scope.
Same code, using a break instead of loop:
def count/r2 <- copy val/%1000
count <- copy val/0 # init count to zero
{
count <~ sub val/100
pc <0? copy pc/break
count <- add val/1
pc <- copy pc/loop
}
This isn't a great example for 'break' since the 'loop' version is more concise and more clear. That's not always the case though, so you have both to work with. You can use whichever makes your code easier to read.
Despite the power we have so far, we still have to explicitly point the registers to memory locations. We don't have a great way of keeping track of temporary values that we may need during our programs.
To support that, we need a stack. A stack is just what it sounds like, a stack of values. You can 'push' things on top of the stack or 'pop' them off the top of the stack. The uCISC processor natively supports these operations. We can use them as follows:
def stack/r1 <- copy val/0 # Define the stack pointer to address %0
var stack.count push <- copy val/0 # push count on the stack, initialized with 0
{
stack.count <- add val/1
stack.count <~ sub val/100
pc <!? copy pc/loop
}
def result/r2 <- copy stack.count pop # store the result, stack is now empty
Once we initialize the stack pointer, we can just push and pop values on and off the stack (this is that extra bit of statement syntax I referred to earlier). You can push multiple values on the stack:
def stack/r1 <- copy val/0
var stack.health push <- copy val/100
var stack.damage push <- copy val/10
var stack.x push <- copy val/4
var stack.y push <- copy val/5
stack.health <- sub stack.damage
Under the hood, uCISC is using offsets to keep track of where things are on the stack and refer to them directly. In this case, for example, that last operation resolves to:
r1/3 <- sub r1/2
Letting uCISC keep track of your offsets is way easier than manually managing them and updating them every time you decide to add or remove a variable to the stack.
You now know everything you need to know to build some simple uCISC programs. You should now be able to understand examples/knight.ucisc.
There is, however, one critically useful syntax feature that really steps up the complexity of programs you can write.
We can already manually define functions, they are just really, really cumbersome:
def stack/r1 <- copy val/0
# setup our function variables according to some calling convention:
var stack.fibResult push <- copy val/0 # push a spot for the result on the stack
var stack.return push <- copy pc/fib_return # push the return address on the stack
var stack.argument <- copy val/24 # push the argument 'n' onto the stack
pc <- copy pc/fib # jump to the 'fib' label
fib_return:
# fibonacci result is in stack.fibResult now
# do other stuff...
fib:
{
# define our variables to refer the values passed in
var stack.result\2 # result is on the stack, 2 positions down
var stack.return\1 # return address is on the stack, 1 position down
var stack.n\0 # the argument 'n' is on the top of the stack
# fibonacci code here ...
copy stack.result <- ... # copy the result to the expected location
copy pc <- stack.return pop # jump return, pop everything including stack.return
}
The code above uses a commonly defined stack to pass variables back and forth. At a minimum, the caller and the callee need to agree on a way to pass the arguments, the return value and the return address to come back to. This is quite bit of code to write every time you want to define or call a function. Worse, this code is always the same, except with different arguments.
Functions in uCISC let the uCISC translator write this boilerplate code for you:
def stack/r1 <- copy val/0 # Setup a common stack reference
stack.fib(val/24) -> fibAnswer # Call fib
# fibonacci result is in stack.fibResult now
# do other stuff...
fun stack.fib(n) -> result {
# fibonacci code here ...
copy stack.result <- ... # copy the result to the expected location
copy pc <- stack.return pop
}
Breaking this down:
stack.fib(val/24) -> fibAnswer
stack
: For this call, thestack
will be the common reference between the caller and the callee. The variables will be push/popped from this stack.fib
: The label of the function to call.(...)
: Zero or more comma separated args to pass to the function. Any uCISC source can be used.-> fibAnswer
: definestack.fibAnswer
and use it to store the result
fun stack.fib(n) -> result {
}
fun
: Start a function definitionstack
: For this function, thestack
will be the common reference between the caller and the callee. The variables will be push/popped from this stack.fib
: The label to use for the function(...)
: A comma separated list of args. A variable reference will be created for each one (e.g.stack.n
in this example).-> result
: definestack.result
and use it to store the result{ ... }
: define a new scope. All function stack variables will be invisible outside this scope. It behaves just like any other scope.copy pc <- stack.return pop
: Copies the return address to the pc register, pops all values off the stack up-to-and-including the return address, leaving only the result on the stack.
You can see how this simplifies things a lot, even more with large numbers of arguments. You can now write much more complex programs that are much easier to follow and understand rather than just writing machine instructions directly.
uCISC is nothing more than a thin translation layer on top of the machine instructions. There is no true compiling going on here. The instructions are laid out in the order you specify and most lines have a 1-to-1 correlation with a machine instruction.
At this point, you know the core syntax for uCISC. There isn't anything else. However, there are some really powerful things you can do to make developing complex software easier to do. See the Advanced uCISC Techiques guide for an exploration of those techniques.
In addition to the instructions themselves, you also need to know how to manipulate the hardware to do interesting things like send data to other devices like the video card, I2C or UART connections. Accessing Other Hardware might be a great place to go next.