Skip to content

Interpreter optimization resources

Mamy Ratsimbazafy edited this page Jun 11, 2018 · 14 revisions

This is a collection of links for interpreter optimizations. Target audience is Nimbus developers.

Pure interpreter

Description Link
Design of a bytecode interpreter, including Stack vs Register, how to represent values (single type, tagged unions, untagged union, interface/virtual function) http://gameprogrammingpatterns.com/bytecode.html
Writing a fast interpreter: control-flow graph optimization from LuaJIT author http://lua-users.org/lists/lua-l/2011-02/msg00742.html
In-depth dive on how to write an emulator http://fms.komkon.org/EMUL8/HOWTO.html
Review of interpreter dispatch strategies to limit branch mispredictions: direct threaded code vs indirect threaded code vs token threaded code vs switch based dispatching vs replicated switch dispatching + Bibliography http://realityforge.org/code/virtual-machines/2011/05/19/interpreters.html
Fast VMs without assembly - speeding up the interpreter loop: threaded interpreter, duff's device, JIT, Nostradamus distributor http://www.emulators.com/docs/nx25_nostradamus.htm
Switch case vs Table vs Function caching/dynarec http://ngemu.com/threads/switch-case-vs-function-table.137562/
Jump tables vs Switch http://www.cipht.net/2017/10/03/are-jump-tables-always-fastest.html
Paper: branch prediction and the performance of Interpreters - Don't trust the folklore https://hal.inria.fr/hal-01100647/document
Paper by author of ANTLR: The Structure and Performance of Efficient Interpreters https://www.jilp.org/vol5/v5paper12.pdf
Paper by author of ANTLR introducing dynamic replication: Optimizing Indirect Branch Prediction Accuracy in Virtual Machine Interpreter https://www.scss.tcd.ie/David.Gregg/papers/toplas05.pdf
Benchmarking VM Dispatch strategies in Rust: Switch vs unrolled switch vs tail call dispatch vs Computed Gotos https://pliniker.github.io/post/dispatchers/
Computed Gotos for fast dispatching in Python https://github.com/python/cpython/blob/9d6171ded5c56679bc295bacffc718472bcb706b/Python/ceval.c#L571-L608

JIT / Dynamic recompilation

Description Link
Dynamic recompilation introduction http://ngemu.com/threads/dynamic-recompilation-an-introduction.20491/
Dynamic recompilation guide with Chip8 https://github.com/marco9999/Dynarec_Guide/blob/master/Introduction%20to%20Dynamic%20Recompilation%20in%20Emulation.pdf
Dynamic recompilation - accompanying source code https://github.com/marco9999/Super8_jitcore/
Presentation: Interpretation (basic indirect and direct threaded) vs binary translation http://www.ittc.ku.edu/~kulkarni/teaching/EECS768/slides/chapter2.pdf
Threaded interpretation vs Dynarec http://www.emutalk.net/threads/55275-Threaded-interpretation-vs-Dynamic-Binary-Translation
Dynamic recompilation wiki http://emulation.gametechwiki.com/index.php/Dynamic_recompilation

Context Threading

Context threading is a promising alternative to Direct/Indirect/Call/Token/Subroutine/Switch threading that makes interpretation nice with the hardware branch predictor. Practical implementation wanted:

Codebases (WIP)

Clone this wiki locally