diff --git a/Part1-InterpreterAndBytecode/4-Interpreter/theInterpreter.md b/Part1-InterpreterAndBytecode/4-Interpreter/theInterpreter.md index 8c34ac7..c839368 100644 --- a/Part1-InterpreterAndBytecode/4-Interpreter/theInterpreter.md +++ b/Part1-InterpreterAndBytecode/4-Interpreter/theInterpreter.md @@ -2,7 +2,7 @@ At the heart of the Pharo virtual machine there is the interpreter, Pharo's main execution engine. Folklore says that interpreters are slow. Truth is they are. But they are also easy to implement and maintain. -They play a very important role in a multi-tiered architecture: optimized execution engines such as JIT compiler execute usually the commonly-executed code and fast execution paths, falling back to slower layers such as interpreters for the rest of the execution. +They play a very important role in a multi-tiered architecture: optimized execution engines (such as JIT compilers) usually execute the commonly-executed code and fast execution paths, falling back to slower layers such as interpreters for the rest of the execution. Thus, the Pharo interpreter implements the entire semantics of Pharo: all bytecodes and primitives. This makes it a good target to study and understand how Pharo works. @@ -10,9 +10,9 @@ This chapter visits the interpreter from a high-level point of view, using code This includes how the interpreter manages its execution state and the call stack, how it dispatches instructions, some illustrative instructions and some of its platform independent optimizations: instruction lookahead and static type predictions. The chapter on Slang VM generation presents machine dependent optimizations such as interpreter threading and autolocalization. -A section of this chapter is dedicated to message sends: the most important instruction in the Pharo programming language and in many object oriented languages. 
+A section of this chapter is dedicated to message sends: the most important instruction in the Pharo programming language and in many object-oriented languages. Message sends, as similar as they may look to statically-bound function calls, must implement late-bound execution. -Sending a message requires looking up methods up in the receiver class hierarchy, which is one of the most common operations in Pharo, and one of the most complex and expensive too. +Sending a message requires looking up the corresponding method in the receiver class hierarchy, which is one of the most common operations in Pharo, and one of the most complex and expensive too. One simple way to cope with such costs are global lookup caches. @@ -20,13 +20,14 @@ One simple way to cope with such costs are global lookup caches. Pharo's execution engine is based on the execution of bytecode instructions. Bytecode instructions are organized as a list of instructions, where one instruction is executed after the other. -For each instruction, an interpreter typically follows two main steps: fetching the next instruction and dispatching to the code implementing the instruction. +An interpreter typically follows two main steps for each instruction: it fetches the next instruction and dispatches to the code that implements the instruction. + Typically, interpreters are implemented using a `while` and a `switch` statement. The `while` statement executes indefinitely (or until the program exits). The `switch` statement contains a case per instruction and dispatches to the corresponding code depending on the read instruction. -The following code illustrates a simple interpreter written a the C programming language. +The following code (Listing *@Cinterp@*) illustrates a simple interpreter written in the C programming language. 
-```caption=Sketching an interpreter in C +```caption=Sketching an interpreter in C&anchor=Cinterp void interpret(){ while(true){ int instruction = fetchNextInstruction(); @@ -48,9 +49,12 @@ void interpret(){ Pharo's bytecode interpreter is not written as a case statement as shown before. Instead, it uses a table dispatch mechanism, where a table, namely the _bytecode table_ associates an instruction with the code to execute for that instruction. The bytecode table is an array ranging from 0 to 255, thus covering the entire bytecode set, and containing unary selectors. -Instructions are dispatched by looking up the selector for an instruction, then using the selector to send a message to the interpreter using `#perform:`. +Instructions are dispatched by looking up the selector for an instruction, then using the selector to send a message to the interpreter using the message `perform:` (as shown in Listing *@dispatch@*). + +Note that while we write the message `perform:`, the C transpiler will generate a call to the corresponding function implementing the bytecode semantics. + -```caption=Pharo interpreter uses a table dispatch with symbols +```caption=Pharo interpreter uses a table dispatch with symbols&anchor=dispatch Interpreter >> interpret ... self fetchNextBytecode. @@ -60,27 +64,30 @@ Interpreter >> interpret ... Interpreter >> dispatchOn: anInteger in: selectorArray - self perform: (selectorArray at: (anInteger + 1)). ```` -#### Instruction Fetching and the Instruction Pointer +### Instruction Fetching and the Instruction Pointer -Mimicking how the CPU handles its own program counter, the Pharo interpreter manages its instruction stream using a variable called `ìnstructionPointer`, which is a pointer to the current instruction in the method under execution. +Mimicking how the CPU handles its own program counter, the Pharo interpreter manages its instruction stream using a variable called `instructionPointer`, which is a pointer to the current instruction in the method under execution. -Using the instruction pointer, the interpreter fetches instructions one-by-one sequentially from a method's bytecode list. 
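The table dispatch just described can be sketched in C, with a function-pointer table standing in for the bytecode table of selectors. This is a minimal sketch under invented names (`bytecodeTable`, `pushOneBytecode`, `interpretTop`, and so on), not the VM's actual code; in the real VM the table holds selectors and the Slang transpiler turns the `perform:` into a direct function call.

```c
/* A tiny stack machine dispatching through a 256-entry table,
   analogous to Pharo's bytecode table. All names are illustrative. */

static unsigned char code[] = {1, 1, 2, 0};  /* pushOne, pushOne, add, halt */
static int ip;                               /* instruction pointer (an index here) */
static int stack[16], sp;
static int running;

static int fetchNextBytecode(void) { return code[++ip]; }

static void haltBytecode(void)    { running = 0; }
static void pushOneBytecode(void) { stack[sp++] = 1; }
static void addBytecode(void)     { int top = stack[--sp]; stack[sp - 1] += top; }

/* One handler per opcode; unused entries stay NULL in this sketch. */
static void (*bytecodeTable[256])(void) = {
    haltBytecode,     /* opcode 0 */
    pushOneBytecode,  /* opcode 1 */
    addBytecode       /* opcode 2 */
};

int interpretTop(void) {
    ip = -1; sp = 0; running = 1;
    while (running)
        bytecodeTable[fetchNextBytecode()]();  /* table dispatch */
    return stack[sp - 1];                      /* result left on top: 1 + 1 */
}
```

Running `interpretTop` executes the four-bytecode program and leaves `2` on top of the stack.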
-```caption=Low-level stack operations in the interpreter +Using the instruction pointer, the interpreter fetches instructions one-by-one sequentially from a method's bytecode list as shown in Listing *@fetchnext@*. Here, the variable `objectMemory` gives access to the complete memory of the interpreter, and `instructionPointer` is just an address within the range of the currently executed compiled method. + +```caption=Fetching the next bytecode&anchor=fetchnext Interpreter >> fetchNextBytecode - ^objectMemory byteAt: (instructionPointer := instructionPointer + 1) + ^ objectMemory byteAt: (instructionPointer := instructionPointer + 1) ``` +>[! note ] +> Each bytecode instruction fetches the next instruction. We will observe in the methods defining bytecodes that, differently from the interpreter C sketch at the beginning of the chapter, each instruction is responsible for fetching its next instruction. This implementation detail duplicates the instruction fetching code around the interpreter, while simplifying its translation to a threaded interpreter as will be explained in the Slang chapter. + The instruction pointer is not only manipulated to fetch instructions. -Later in this chapter we will see how bytecode instructions manipulate it to implement control flow operations: jumps, message sends and returns. +Later in this chapter we will see how bytecode instructions manipulate it to implement control flow operations: jumps, message sends, and returns. Moreover, later chapters will show the role of the instruction pointer to implement green-thread based concurrency. -**Each bytecode instruction fetches the next instruction.** We will observe in the methods defining bytecodes that, differently from the interpreter C sketch at the beginning of the chapter, each instruction is responsible for fetching its next instruction. 
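The fetch itself can be mimicked in C. The sketch below, built on an invented `method` byte array, mirrors the pre-increment-then-read shape of `fetchNextBytecode`: the pointer is advanced first and the byte is read second, so execution starts with the pointer sitting one byte before the first bytecode.

```c
/* Sketch of the pre-increment fetch, as in
   `^objectMemory byteAt: (instructionPointer := instructionPointer + 1)`.
   The method layout (one header byte, then bytecodes) is invented. */

static unsigned char method[] = {0xFF, 0x11, 0x22, 0x33};  /* header, then 3 bytecodes */
static unsigned char *instructionPointer;

/* Point at the header byte, i.e. one byte before the first bytecode,
   so the first fetch lands on bytecode 0x11. */
void startMethod(void) { instructionPointer = method; }

int fetchNextBytecode(void) {
    return *++instructionPointer;  /* advance first, then read */
}
```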
-This implementation detail duplicates the instruction fetching code around the interpreter, while simplifying its translation to a threaded interpreter as it will be explained in the Slang chapter. + -#### Pushing and Popping, the Stack and the Stack Pointer + +### Pushing and Popping, the Stack and the Stack Pointer The execution of Pharo code is supported mainly by a stack supporting operations such as push, pop and indexed access from the top. The stack is a contiguous region of memory of fixed size organized in slots of one word each. @@ -91,9 +98,9 @@ The stack is manipulated through a variable called `stackPointer`, which is a po **The stack grows down:** in Pharo, as in many other language implementations, the stack base is in a higher address than the stack top. Pushing a value to the stack moves the stack pointer towards lower addresses. Popping moves it towards higher addresses. -Using the `stackPointer` variable the interpreter implements the following methods: +Using the `stackPointer` variable, the interpreter implements the methods shown in Listing *@stackOps@*. -```caption=Low-level stack operations in the interpreter +```caption=Low-level stack operations in the interpreter&anchor=stackOps Interpreter >> stackValue: offset ^memory readWordAt: stackPointer + (offset*objectMemory wordSize) @@ -152,7 +159,7 @@ The fixed fields in a frame are the following: - **Flags:** Only available in interpreter frames (as opposed to compiled code frames). It contains a bit mask with the number of arguments, a flag indicating if the context object is set, and a flag indicating if the frame is a method or a closure execution. - **Receiver:** The message receiver _i.e.,_ the object referenced by `self` in the current method execution. -![The structure of a stack frame and the interpreter pointers. %anchor=interpreterVariables](figures/interpreter_variables.pdf) +![The structure of a stack frame and the interpreter pointers. 
%width=70&anchor=interpreterVariables](figures/interpreter_variables.pdf) The frame on the top of the stack is said to be _active_. The active frame is delimited by the variables `stackPointer` at its top and `framePointer` at its base. @@ -165,7 +172,7 @@ A frame is suspended by pushing its instruction pointer to the stack before crea Thus, the stack can be reconstructed by iterating from the top frame up to its caller's frame start until the end of the stack. Notice that the stack pointer needs not to be stored: a suspended frame's stack pointer is the slot that precedes its suspended instruction pointer, which is found relative to its following frame. -![The call stack as a linked lists of frames in the stack. %anchor=interpreter_stack](figures/interpreter_call_stack.pdf) +![The call stack as a linked list of frames in the stack. %width=70&anchor=interpreter_stack](figures/interpreter_call_stack.pdf) -#### Stack Frame Flags +### Stack Frame Flags @@ -180,17 +187,20 @@ The stack frame flags store, from lower to higher bits in the bit field: ![Structure of the frame flags in 64 bits. %anchor=interpreterFlags](figures/interpreter_flags.pdf) -Notice that the frame flags have a first constant field in the range 0-7 with value 1. + +>[! Design Tips ] +> Notice that the frame flags have a first constant field in the range 0-7 with value 1. -This value is set to make the bitfield look as a tagged small integer, thus guarding the GC walking the stack from interpreting this value as a pointer. +> This value is set to make the bitfield look like a tagged small integer, preventing the GC, while walking the stack, from interpreting this value as a pointer. -#### Setting up Stack Frames +### Setting up Stack Frames + +Listing *@settingStackFrames@* shows how a new frame is created. -The code that follows illustrates how a new frame is created. -First the current frame is suspended by pushing the instruction pointer and frame pointer. -Once pushed, the `framePointer` can be overridden to mark the start of a new frame at the position of `stackPointer`. -Then the method, `nil` for context, flags, receiver and temps are pushed. 
+- First the current frame is suspended by pushing the instruction pointer and frame pointer. +- Once pushed, the `framePointer` can be overridden to mark the start of a new frame at the position of `stackPointer`. +- Then the method, `nil` for context, flags, receiver, and temps are pushed. -```caption=Setting up a frame +```caption=Setting up a frame&anchor=settingStackFrames Interpreter >> setUpFrameForMethod: aMethod receiver: rcvr | methodHeader numArgs numTemps rcvr | @@ -284,7 +294,7 @@ Interpreter >> storeAndPopTemporaryVariableBytecode ### Interpreting Message sends -The key operation in object oriented programs is the message send. +The key operation in object-oriented programs is the message send. A message send is a late bound method activation on a receiver with a collection of arguments. We say a message send is late bound because the decision of the method to execute is done at runtime, when the message send is executed. Thus, message sends are implemented as two-step operations: first the interpreter finds the method to execute, then the method found is activated. @@ -306,7 +316,7 @@ send #between:and: When a message send instruction is executed, the stack looks as follows: -![Preparing the Stack to send a message. %anchor=interpreterBeforeSend](figures/interpreter_send.pdf) +![Preparing the Stack to send a message. %width=60&anchor=interpreterBeforeSend](figures/interpreter_send.pdf) #### Message Send Bytecodes @@ -351,17 +361,21 @@ Interpreter >> normalSend self commonSendOrdinary ``` -#### Method Lookup Overview +### Method Lookup Overview The first step of interpreting a message send is finding the method to execute. -Starting from the class of the receiver, the method lookup algorithm linearly searches on each class in the hierarchy, as shown in the code below. 
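The hierarchy walk just introduced can be sketched in C as a loop over superclasses with a linear scan of each class's selectors. The struct layout and integer selectors are invented for illustration only; the actual method dictionary organization is described in the following sections.

```c
#include <stddef.h>

/* Sketch of the lookup: walk the superclass chain and scan each class's
   selector/method pairs linearly. Layout and names are illustrative. */

typedef struct Method { int selector; } Method;  /* stand-in for a compiled method */

typedef struct Class {
    struct Class *superclass;   /* NULL at the top of the hierarchy */
    int           selectors[4]; /* selector keys */
    Method       *methods[4];   /* method at the matching index */
    int           size;         /* number of occupied slots */
} Class;

/* Returns NULL when no class defines the selector; that is the point
   where doesNotUnderstand: handling would take over. */
Method *lookupMethodInClass(Class *startClass, int selector) {
    for (Class *c = startClass; c != NULL; c = c->superclass)
        for (int i = 0; i < c->size; i++)
            if (c->selectors[i] == selector)
                return c->methods[i];
    return NULL;
}
```

A lookup that misses in the receiver's class falls through to the superclass, which is exactly what makes inherited methods work.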
-For each class, it takes the method dictionary from the second slot of the class object (`MethodDictionaryIndex` has value 1, second starting from 0), and looks up in the method dictionary for the method with the . +Starting from the class of the receiver, the method lookup algorithm linearly searches each class in the hierarchy, as shown in Listing *@methodlookup@*. + +For each class, it takes the method dictionary from the second slot of the class object (`MethodDictionaryIndex` has value 1, second starting from 0), and searches the method dictionary for the method with the message selector. + Here, the method `lookupMethodInDictionary:` has two possible outcomes: - - If the method is found, it sets the interpreter variable `newMethod` with the method found and returns `true`. - - if the method is not found, it returns `false`. -If the method is found in the current method dictionary, the method `lookupMethodInClass:` returns -```caption=The lookup method +- If the method is found, it sets the interpreter variable `newMethod` with the method found and returns `true`. +- If the method is not found, it returns `false`. + +If the method is found in the current method dictionary, the method `lookupMethodInClass:` returns; otherwise, the search continues in the superclass. + +```caption=The lookup method&anchor=methodlookup Interpreter >> findNewMethodInClassTag: classTagArg ifFound: aBlock "Entry was not found in the cache; look it up the hard way " lkupClass := objectMemory classForClassTag: classTag. @@ -385,26 +399,30 @@ Interpreter >> lookupMethodInClass: class ... ``` -#### Hashing and Structure of Method Dictionaries -Classes store method in a method dictionary. +### Hashing and Structure of Method Dictionaries + +Classes store methods in a method dictionary. A method dictionary is a hash table indexing methods by selector. -Method dictionaries are implemented using two arrays, an array of selector keys and an array of corresponding methods, associated by their index. 
+Method dictionaries are implemented using two arrays: an array of selector keys and an array of corresponding methods, associated by their index. If a method dictionary has a given selector at index `n` of the selector array, then the corresponding method is at index `n` of the method array. A note on the implementation is that method dictionaries do not actually employ two arrays. The method dictionary itself _is_ the selector array. -For this, a method dictionary uses a hibrid layout with fixed instance variables and a variable indexable part used as the selector array. +For this, a method dictionary uses a hybrid layout with fixed instance variables and a variable indexable part used as the selector array (see Figure *@methoddict@*). The fixed instance variables are: + - `tally`: the number of occupied slots in the dictionary - `array`: the method array -![Method dictionaries implement hash table using a double array. %anchor=methoddict](figures/interpreter_method_dictionary.pdf) +![Method dictionaries implement a hash table using a double array. %width=60&anchor=methoddict](figures/interpreter_method_dictionary.pdf) -Although commonly method dictionaries use selector objects as keys, any object is accepted. +Although method dictionaries commonly use selector objects as keys, any object is accepted. The search algorithm uses a simple hash function over the keys: + - a mask over the integer value of immediate keys. - a mask over the selector's identity hash. + The mask used is the variable size of the method dictionary, ensuring that the index is always within bounds of the dictionary without extra checks. @@ -417,7 +435,7 @@ Interpreter >> methodDictionaryHash: oop mask: mask ``` Small dictionaries, of size 8 or less by default, use a linear search and no hashing. -Larger dictionaries, instead, use a hash based lookup using open addressing with linear probing and wrap around. 
+Larger dictionaries, instead, use a hash-based lookup using open addressing with linear probing and wrap around. That is, the hashed index is probed to check if it is either the searched element, an empty slot indicated by `nil`, or another element. If the element is found or an empty slot is found, the search stops. -However, if another element is found in the hashed index, the next index if probed. +However, if another element is found in the hashed index, the next index is probed. @@ -461,7 +479,7 @@ Interpreter >> lookupMethodInDictionary: dictionary index := SelectorStart]]. ```` -#### Method Lookup and Super Sends +### Method Lookup and Super Sends Super sends only differ from normal sends in the way the method lookup starts. Instead of starting from the receiver's class, super sends start from the superclass of the current method's owner class, as shown in the expression `self superclassOf: (self methodClassOf: method).` @@ -470,7 +488,6 @@ The method's owner class is stored in an association in the last literal of the ``` Interpreter >> extSendSuperBytecode | byte selectorIndex numArgs | - byte := self fetchByte. selectorIndex := byte >> 3 + (extA << 5). numArgs := (byte bitAnd: 7) + (extB << 3). @@ -488,14 +505,12 @@ Interpreter >> sendSuper: selectorIndex numArgs: numArgs Interpreter >> superclassSend | superclass | - superclass := self superclassOf: (self methodClassOf: method). lkupClassTag := objectMemory classTagForClass: superclass. self commonSendOrdinary Interpreter >> methodClassOf: methodPointer | literal | - literal := self followLiteral: (objectMemory literalCountOf: methodPointer) - 1 ofMethod: methodPointer. @@ -504,7 +519,7 @@ Interpreter >> methodClassOf: methodPointer ifFalse: [ objectMemory nilObject ] ``` -#### Method Activation +### Method Activation Once the method to execute is found, the following step is to activate the method in the stack. -Activating a method implies creating a stack frame for it and prepare the interpreter to execute the instructions in that method. +Activating a method implies creating a stack frame for it and preparing the interpreter to execute the instructions in that method. 
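The open addressing with linear probing described earlier in this section can be sketched in C. Integer keys, a fixed power-of-two table, and `0` as the empty marker are simplifications of this sketch; the VM masks a selector's identity hash and marks empty slots with `nil`.

```c
/* Sketch of the dictionary probe: mask the hash into the table, then
   probe linearly with wraparound until the key or an empty slot is hit. */

#define DICT_SIZE 8  /* power of two, so DICT_SIZE - 1 doubles as the mask */

/* Returns the slot index holding key, or -1 when an empty slot (0)
   proves the key is absent. */
int probe(const int selectors[DICT_SIZE], int key) {
    int mask = DICT_SIZE - 1;
    int index = key & mask;                /* the masking hash function */
    for (int i = 0; i < DICT_SIZE; i++) {
        if (selectors[index] == key) return index;
        if (selectors[index] == 0)   return -1;   /* empty slot: stop searching */
        index = (index + 1) & mask;        /* linear probing, wraps at the end */
    }
    return -1;  /* table full and key absent */
}
```

With keys 9 and 17 both hashing to slot 1, a probe for 17 visits slot 1 first and then finds 17 spilled into slot 2 — the collision-resolution behavior described above.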
@@ -523,7 +538,7 @@ Interpreter >> activateNewMethod method: newMethod) - 1. ``` -#### Handling Return instructions +### Handling Return instructions Related to send instructions are return instructions which return the control to the caller method. A return instruction has to _undo_ what a send instruction does. @@ -580,10 +595,11 @@ Now that we have seen in detail how message sends and the call stack work, let u We will slightly revisit this calling convention when introducing just-in-time compilation. + ### Interpreting Primitives So far we have concentrated on interpreting bytecodes. -In this chapter section we concentrate on the interpretation of primitive methods. +In this section we concentrate on the interpretation of primitive methods. Recall that primitive methods are normal methods that contain a reference to a primitive function. For example, the next code snippet shows the method `SmallInteger>>+` implementing the addition of two tagged integers. @@ -647,13 +663,13 @@ Primitives that load/store from/into object slots type check the receiver and in -Notice that such a design guide primitives to provide a clear separation between the _fast and slow paths_. +Notice that such a design guides primitives to provide a clear separation between the _fast and slow paths_. Primitives generally provide a safe implementation for a fast path, leaving the slow path to the fallback code. -#### Decoding a Method's Primitive Index +### Decoding a Method's Primitive Index We distinguish primitive methods from non-primitive methods using a bit in their first literal. The primitive index of a primitive method is stored in its first bytecode instruction: a call primitive bytecode. The call primitive bytecode is a 3-byte bytecode instruction with opcode 248. The two subsequent bytes represent the primitive index in little endian: `byte1 + (byte2 << 8)`. -The following code shows how the runtime extracts the primitive index of a method. +The following code shows how the runtime extracts a method's primitive index. 
``` Interpreter >> primitiveIndexOf: methodPointer @@ -677,7 +693,7 @@ Interpreter >> literalCountOfHeader: headerPointer bitAnd: HeaderNumLiteralsMask ```` -#### Primitive Dispatch and Failure Fallback +### Primitive Dispatch and Failure Fallback When a method is executed, the interpreter checks first if the method is a primitive method _i.e.,_ if it has its flag turned on. If that is the case, the interpreter invokes the primitive instead of activating the method. @@ -743,7 +759,7 @@ Then the original message reification is set up by: - popping the arguments from the stack and store them in the new array - allocating a message object and storing in it the original selector, argument array and lookup class -![Reifying the message for does not understand. %anchor=dnu_message](figures/interpreter_dnu.pdf) +![Reifying the message for `doesNotUnderstand:`. %width=80&anchor=dnu_message](figures/interpreter_dnu.pdf) Then, the `messageSelector` interpreter variable is overwritten with the `doesNotUnderstand:` selector found in the special objects array, and the lookup is restarted. @@ -768,7 +784,7 @@ Interpreter >> lookupMethodInClass: class ^self lookupMethodInClass: class ``` -#### Cannot Interpret +### Cannot Interpret -The VM implements also the `cannotInterpret:` hook: a hook that gets activated when the VM finds `nil` in the place of a class' method dictionary. +The VM also implements the `cannotInterpret:` hook: a hook that gets activated when the VM finds `nil` in the place of a class' method dictionary. Similar to `doesNotUnderstand:`, `cannotInterpret:` requires massaging the stack and setting up the message reification.
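The decoding described in the section on a method's primitive index can be sketched in C. The raw byte array below stands in for a real compiled method, which stores its bytecodes after a header and literal frame; only the little-endian arithmetic `byte1 + (byte2 << 8)` is the point here.

```c
/* Sketch of decoding the 3-byte call-primitive bytecode: opcode 248,
   then the primitive index in little endian (`byte1 + (byte2 << 8)`). */

#define CallPrimitiveBytecode 248

/* Returns the primitive index, or 0 when the method's first bytecode
   is not a call-primitive bytecode (i.e. a non-primitive method). */
int primitiveIndexOf(const unsigned char *bytecodes) {
    if (bytecodes[0] != CallPrimitiveBytecode)
        return 0;
    return bytecodes[1] + (bytecodes[2] << 8);
}
```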