Pass on stack frames
guillep committed Jul 25, 2024
1 parent 837daf9 commit 4b21782
Showing 2 changed files with 98 additions and 3 deletions.
36 changes: 36 additions & 0 deletions Chapters/3-MethodsAndBytecode/methodsbytecode.md
@@ -321,6 +321,42 @@ The Sista bytecode set is a stack-based bytecode using common optimization as de

This section explains some particularities of the bytecode set and finishes with a table showing the bytecodes, their encoding and summarizing their optimizations.

#### Bytecode and Variable Indexes are 0-based, Primitives 1-based

While the Pharo programming language uses 1-based indexes for its indexed accesses, the underlying implementation does not.
In the implementation, only primitives are numbered starting from 1.
In contrast:

- bytecode instructions are encoded in a 0-based fashion, making the value `0` a valid encoded instruction.
- all variables, both temporary and instance variables, use 0-based indexing. Thus, the bytecode to read the first instance variable is `push instance variable 0`. Similarly, the bytecode to read the first temporary variable is `push temporary variable 0`.
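
As a quick check of this contrast, the following Playground sketch compares the 1-based reflective API of the language with the bytecode listing of an accessor. It assumes a standard Pharo image in which `Point` declares `x` as its first instance variable. The exact text answered by `symbolic` depends on the Pharo version and the bytecode set in use, so no output is shown for it.

```caption=A sketch contrasting 1-based language indexes with 0-based bytecode indexes.
"Language level: reflective access to instance variables is 1-based."
(3 @ 4) instVarAt: 1. "3, the value of x"

"Implementation level: inspect the bytecode listing of the accessor.
The listing is expected to refer to x with index 0."
(Point >> #x) symbolic.
```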


#### Temporary Variables vs Arguments

The Sista bytecode set inherits, mostly for historical reasons, several traits from the previous bytecode design.
One particularly interesting trait is that method arguments are modelled as the first (read-only) temporary variables in a method.
For example, while the method that follows has syntactically one argument and one temporary variable, the underlying implementation has two temporary variables, of which the first is the argument.

```caption=Arguments are the first temporaries in a method.
MyClass >> methodWithOneArgAndOneTemp: arg
    | temp |
    ...

(MyClass >> #methodWithOneArgAndOneTemp: ) numArgs. "1"
(MyClass >> #methodWithOneArgAndOneTemp: ) numTemps. "2"
```

This decision impacts the bytecode design in different ways.

1. First, to get the real number of temporaries in a method we need to subtract the number of arguments from the value answered by `numTemps`.

```caption=Obtaining the real number of temporaries from a method.
realNumberOfTemporaries := aMethod numTemps - aMethod numArgs
```

2. We need to know the number of arguments of a method to index its temporaries. For example, to read the nth real temporary variable of a method, we need to read the temporary at offset `numArgs + nth - 1` (remember that we need to subtract 1 because variable indexes are 0-based).
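
The two computations above can be worked through with concrete numbers. The sketch below is plain arithmetic for a Playground; the values (a method with 2 arguments and 3 real temporaries) and the variable names are made up for illustration.

```caption=A worked example of temporary counting and indexing, with made-up values.
| numArgs numTemps realNumberOfTemporaries indexOfFirstTemp indexOfThirdTemp |
numArgs := 2.    "hypothetical method with two arguments"
numTemps := 5.   "arguments plus real temporaries, as numTemps would answer"

realNumberOfTemporaries := numTemps - numArgs.   "3"

"0-based index of the nth real temporary: numArgs + nth - 1"
indexOfFirstTemp := numArgs + 1 - 1.   "2"
indexOfThirdTemp := numArgs + 3 - 1.   "4"
```

Indexes 0 and 1 are taken by the two arguments, so the first real temporary lands at index 2.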

#### Bytecode Extension Prefixes

Some bytecode instructions are limited by the encoding: for example, 2-byte instructions usually use one byte as opcode and one byte as argument, limiting the argument to 256 values (0 to 255).
65 changes: 62 additions & 3 deletions Chapters/4-Interpreter/theInterpreter.md
@@ -163,6 +163,10 @@ A frame is suspended by pushing its instruction pointer to the stack before crea
Thus, the stack can be reconstructed by iterating from the top frame up to its caller's frame start until the end of the stack.
Notice that the stack pointer need not be stored: a suspended frame's stack pointer is the slot that precedes its suspended instruction pointer, which is found relative to its following frame.

#### Stack Frame Flags

EXPLAIN THE FLAGS and ENCODING

#### Setting up Stack Frames

The code that follows illustrates how a new frame is created.
@@ -200,13 +204,68 @@ Interpreter >> setUpFrameForMethod: aMethod receiver: rcvr

#### Bytecodes Accessing the Stack Frame

Given the structure of a stack, we can see that all the fields in the fixed part of a frame can be found relative to the start of the frame, which for the first frame is the `framePointer`.
This includes in particular the receiver and the temporary variables.
Thus, the bytecodes that access these values are defined to read/write at an offset from the frame pointer.

The code that follows shows the bytecode that pushes `self` to the stack:

```caption=The push receiver bytecode reads the receiver relative to the framePointer
Interpreter >> receiver
    ^ memory readWordAt: framePointer + FoxReceiver

Interpreter >> pushReceiverBytecode
    self fetchNextBytecode.
    self push: self receiver.
```

Moreover, to read/write the nth real temporary variable we need to read/write the nth slot _above_ the fixed fields of the frame.
The code below shows the bytecode that stores the top of the stack into a temporary variable and pops it.
This bytecode uses the `itemporary:in:put:` method, which writes into one of a method's temporaries.
A similar method, `itemporary:in:`, exists to read temporary variables.

Remember that the bytecode set is designed so arguments are treated as temporaries.
The code below shows that there are two execution paths: one for arguments, one for temporaries.
The interpreter decides whether the requested offset refers to a temporary or an argument by comparing it to the number of arguments.
The number of arguments is obtained from the frame's flags field.
In this section we focus on the path for temporaries.
We will explain the path for arguments in the next section.

The interpreter indexes temporary variables using as base the field that follows the receiver field,
and as offset the offset of the temporary without taking arguments into account.
The address computation shows clearly that the stack grows down: to get the field _after_ the receiver we need to subtract from its position.
Moreover, since all accesses are written as direct memory addresses, all offsets are computed in bytes, hence the multiplication by the number of bytes in a word, `objectMemory wordSize`.


```caption=Storing the top of the stack into a temporary variable
Interpreter >> iframeNumArgs: theFP
    "The number of arguments is read from a byte of the frame's flags field"
    ^ memory readByteAt: theFP + FoxFrameFlags + 1

Interpreter >> iframeReceiverLocation: theFP
    ^ theFP + FoxReceiver

Interpreter >> itemporary: offset in: theFP put: valueOop
    | frameNumArgs |
    ^ offset < (frameNumArgs := self iframeNumArgs: theFP)
        ifTrue: [ "Write an argument" ... ]
        ifFalse: [
            "Write a real temporary, located after the receiver field"
            memory
                writeWordAt:
                    (self iframeReceiverLocation: theFP) - objectMemory wordSize
                    + ((frameNumArgs - offset) * objectMemory wordSize)
                put: valueOop ]

Interpreter >> storeAndPopTemporaryVariableBytecode
    self fetchNextBytecode.
    "The low three bits of the bytecode encode the temporary index"
    self
        itemporary: (currentBytecode bitAnd: 7)
        in: framePointer
        put: self stackTop.
    self pop: 1
```
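
To make the address arithmetic of the temporaries path concrete, here is a small Playground sketch that replays it with plain integers. All values are assumptions chosen for illustration: an 8-byte word, a made-up receiver field address of 1000, a frame with 2 arguments, and an opcode byte of 210. The block `addressOf` is a hypothetical stand-in for the expression used in `itemporary:in:put:`, not part of the interpreter.

```caption=A worked example of the temporary address computation, with made-up values.
| wordSize receiverLocation frameNumArgs opcode tempIndex addressOf |
wordSize := 8.              "assumed 64-bit words"
receiverLocation := 1000.   "made-up address of the receiver field"
frameNumArgs := 2.          "assumed number of arguments of the frame"

"The temporary index is taken from the low three bits of the opcode byte."
opcode := 210.              "assumed opcode byte, for illustration only"
tempIndex := opcode bitAnd: 7.   "2"

"Same arithmetic as the ifFalse: branch of itemporary:in:put:"
addressOf := [ :offset |
    receiverLocation - wordSize + ((frameNumArgs - offset) * wordSize) ].

addressOf value: tempIndex.   "992: the first real temporary, one word after the receiver"
addressOf value: 3.           "984: the second real temporary, one more word down"
```

Since index 2 equals the number of arguments, it designates the first real temporary, and each following temporary sits one word lower in memory, which matches a stack that grows down.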

### Interpreting Message sends

#### Calling Convention