diff --git a/.gitignore b/.gitignore
new file mode 100644
index 000000000..7585238ef
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+book
diff --git a/README.md b/README.md
index d37111a97..1ce2759a7 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,5 @@
 # ykdocs
+
 Documentation for the Yorick Metatracer.
+
+Use [mdbook](https://crates.io/crates/mdbook) to generate the docs.
diff --git a/book.toml b/book.toml
new file mode 100644
index 000000000..9cf9b3caf
--- /dev/null
+++ b/book.toml
@@ -0,0 +1,6 @@
+[book]
+authors = ["The Software Development Team"]
+language = "en"
+multilingual = false
+src = "src"
+title = "Yorick"
diff --git a/src/SUMMARY.md b/src/SUMMARY.md
new file mode 100644
index 000000000..2e522564a
--- /dev/null
+++ b/src/SUMMARY.md
@@ -0,0 +1,8 @@
+# Summary
+
+- [Introduction](./intro.md)
+- [Technical Details](./tech/index.md)
+  - [Intermediate Representations](./tech/irs.md)
+  - [Software and Hardware Tracing](./tech/sw_hw.md)
+- [Miscellaneous Topics](./misc/index.md)
+  - [Continuous Integration Cycles](./misc/ci-cycles.md)
diff --git a/src/intro.md b/src/intro.md
new file mode 100644
index 000000000..a265803ea
--- /dev/null
+++ b/src/intro.md
@@ -0,0 +1,14 @@
+# Introduction
+
+Yorick is a fork of the Rust programming language which aims to build a
+meta-tracing system using Rust as a meta-language. Imagine PyPy, but where your
+interpreter is written in Rust instead of RPython.
+
+Note that the system is in the early stages of development: don't expect it to
+work yet.
+
+Yorick is developed by [The Software Development Team](https://soft-dev.org/)
+at King's College London.
+
+The source for this documentation is
+[available on GitHub](https://github.com/softdevteam/ykdocs).
diff --git a/src/misc/ci-cycles.md b/src/misc/ci-cycles.md
new file mode 100644
index 000000000..dcbab5e11
--- /dev/null
+++ b/src/misc/ci-cycles.md
@@ -0,0 +1,69 @@
+# Continuous Integration (CI) Cycles
+
+When working on Yorick, it's possible to create "CI cycles", which require
+special handling. This chapter outlines the problem scenario and how we work
+around it.
+
+## Dependency Architecture of ykrustc
+
+There are two main repos for Yorick:
+
+ * `ykrustc`: the compiler.
+ * `yk`: everything else, including `ykpack`.
+
+`ykpack` is the library that deals with SIR serialisation and de-serialisation.
+The compiler uses it to encode SIR into an ELF section, and the JIT runtime
+uses it to decode the section.
+
+This leads to a problem: if you change the format of the SIR (change the
+serialised types in any way that would change the binary representation once
+serialised), then CI cannot succeed. This is because the `yk` repo needs to be
+built with a `ykrustc` which uses the new `ykpack`, but the new `ykpack` lives
+in `yk` itself, so neither change can be merged first.
+
+## How do we Break the Cycle?
+
+We break the cycle using a 3-PR model as follows:
+
+1. **Raise the first two PRs**
+
+The change author raises two PRs: one for `yk` and one for `ykrustc`.
+
+2. **Review yk**
+
+The reviewer conducts a review of the changes to the `yk` repo first. We get
+the branch squashed and ready, but we don't merge it just yet. If we tried, CI
+would fail. Remember: we need the compiler change in first.
+
+3. **Review ykrustc**
+
+Next the reviewer switches to the `ykrustc` pull request and conducts a review
+there. We get this branch squashed and ready to merge, but stop short of
+invoking bors.
+
+4. **Cycle breaker commit**
+
+Next the reviewer should ask the PR author to push a "cycle breaker" commit to
+`ykrustc`. This commit overrides the compiler's dependency on `ykpack`, making
+it instead use the commit hash for the head of the branch we reviewed in step
+2.
+
+[Here's an example cycle breaker commit](https://github.com/softdevteam/ykrustc/commit/abd1c2b7669c4ab6be8f9a9e6c1704a7e70c2088)
+
+5. **Merge ykrustc**
+
+Once the cycle breaker commit is added, the reviewer can invoke bors to have
+the compiler change merged.
+
+6. **Merge the ykpack change**
+
+Assuming the previous step worked, the CI server has now cached a version of
+`ykrustc` which uses the new `ykpack`. Now we can invoke bors on the branch we
+reviewed in step 2, and CI will use the newly cached compiler to build it
+before merging the branch.
+
+7. **Undo the cycle breaker commit**
+
+Now that the `ykpack` changes are on the master branch of `yk`, we can revert
+the cycle breaker commit we introduced to `ykrustc` in step 4. The change
+author raises a new PR for this.
diff --git a/src/misc/index.md b/src/misc/index.md
new file mode 100644
index 000000000..202b6b11e
--- /dev/null
+++ b/src/misc/index.md
@@ -0,0 +1,3 @@
+# Miscellaneous Topics
+
+Topics which don't belong anywhere else live here.
diff --git a/src/tech/index.md b/src/tech/index.md
new file mode 100644
index 000000000..cb932e2bd
--- /dev/null
+++ b/src/tech/index.md
@@ -0,0 +1,3 @@
+# Technical Details
+
+This chapter contains the nitty-gritty details of the system.
diff --git a/src/tech/irs.md b/src/tech/irs.md
new file mode 100644
index 000000000..8eed014c1
--- /dev/null
+++ b/src/tech/irs.md
@@ -0,0 +1,34 @@
+# Intermediate Representations
+
+Yorick uses two additional intermediate representations (IRs) on top of those
+already found in rustc:
+
+ * Serialised IR (SIR)
+ * Tracing IR (TIR)
+
+## Serialised IR (SIR)
+
+During compilation, Rust's Mid-level Intermediate Representation (MIR) is
+traversed and serialised into a simpler
+representation called SIR. SIR is a flow-graph IR very similar to MIR. It
+mostly exists so that high-level program information can be reconstructed at
+runtime without a need for an instance of the compiler (and its `tcx` struct).
+
+The resulting SIR is serialised using serde and linked into the `.yk_sir` ELF
+section of binaries compiled with `ykrustc`. At runtime, the tracer will
+collect SIR traces, which can then be mapped back to the serialised SIR
+information.
+
+The SIR data structures are in an
+[externally maintained crate](https://github.com/softdevteam/yk/tree/master/ykpack)
+so that they can be shared by the compiler and the JIT runtime.
+
+SIR lowering
+[is performed here](https://github.com/softdevteam/ykrustc/blob/master/src/librustc_yk_sections/emit_sir.rs).
+
+## Tracing IR (TIR)
+
+TIR is basically SIR with guards instead of branches. TIR is the basis for a
+compiled trace.
+
+TIR [lives in yktrace](https://github.com/softdevteam/yk/tree/master/yktrace).
diff --git a/src/tech/sw_hw.md b/src/tech/sw_hw.md
new file mode 100644
index 000000000..492647e91
--- /dev/null
+++ b/src/tech/sw_hw.md
@@ -0,0 +1,69 @@
+# Software and Hardware Tracing
+
+Yorick has two tracing modes:
+
+ * Software tracing.
+ * Hardware tracing.
+
+## Software Tracing
+
+In software tracing mode, `ykrustc` inserts calls to a trace recorder at the
+beginning of each basic block. The arguments to the call identify the
+(statically known) location of the call-site. At runtime, when tracing is
+enabled, the trace recorder stores each location into a trace buffer.
+
+### Further Reading
+
+ * A
+   [MIR pass](https://github.com/softdevteam/ykrustc/blob/master/src/librustc_mir/transform/add_yk_swt_calls.rs)
+   adds calls to the trace recorder.
+
+ * The
+   [trace recorder](https://github.com/softdevteam/ykrustc/blob/master/src/libcore/yk_swt.rs)
+   lives in `libcore` and is implemented in C so that its contents are not
+   recursively traced.
+
+## Hardware Tracing
+
+In hardware tracing mode, we use
+[Intel Processor Trace](https://software.intel.com/en-us/blogs/2013/09/18/processor-tracing)
+to do trace collection. The chip gives us a trace of virtual addresses which we
+then map back to SIR locations using DWARF labels (`DILabel`).
+
+### Further Reading
+
+* The LLVM code for the insertion of the Yorick debug labels can be found
+  [here](https://github.com/softdevteam/ykrustc/blob/master/src/rustllvm/RustWrapper.cpp).
+  Those functions can be accessed from within rustc's code generator using
+  helper functions in the
+  [codegen builder](https://github.com/softdevteam/ykrustc/blob/master/src/librustc_codegen_llvm/builder.rs).
+
+* The actual label generation happens during the code generation of
+  [blocks](https://github.com/softdevteam/ykrustc/blob/master/src/librustc_codegen_ssa/mir/block.rs).
+  Labels are inserted at the beginning of each block, as well as when returning
+  from function calls.
+
+## Selecting a Tracing Mode
+
+When you build a Rust program that you want to trace, both `ykrustc` (for the
+standard library) and your code must be built for a specific tracing backend.
+
+To choose a backend, you pass `-C tracer=T` to `rustc`, where `T` is one of
+`hw`, `sw`, or `off`. Passing `off` is the same as omitting the option
+altogether.
+
+If you are using `cargo`, you will need to add this flag to the `RUSTFLAGS`
+environment variable.
+
+`-C tracer` is a tracked flag: changing it will trigger a rebuild (but bear in
+mind that your standard library will not be rebuilt; see below).
+
+Note that hardware tracing currently doesn't work together with optimisations,
+and is thus automatically disabled whenever optimisations are enabled.
+
+## Considerations when Building ykrustc Itself
+
+When you build `ykrustc` using `x.py`, you will need to decide what tracing
+support the standard library should be built with. You must set
+`STD_TRACER_MODE` to `hw`, `sw`, or `off`. If you fail to set this variable,
+the bootstrap will refuse to run.
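
Reviewer note: the rules in "Selecting a Tracing Mode" can be summarised as a small decision function. The sketch below is purely illustrative and is not taken from `ykrustc`; the names `TracerMode` and `effective_mode` are hypothetical, and treating "automatically disabled" as equivalent to `off` is an assumption rather than something the docs state.

```rust
use std::str::FromStr;

/// Hypothetical model of the value accepted by `-C tracer=T`.
/// This type is an illustration only, not part of `ykrustc`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TracerMode {
    Hw,
    Sw,
    Off,
}

impl FromStr for TracerMode {
    type Err = String;

    /// Accept exactly the three values the docs list: `hw`, `sw`, `off`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "hw" => Ok(TracerMode::Hw),
            "sw" => Ok(TracerMode::Sw),
            "off" => Ok(TracerMode::Off),
            other => Err(format!("unknown tracer mode: {}", other)),
        }
    }
}

/// Hypothetical helper: work out which mode is in effect, given the
/// optional flag value and whether optimisations are enabled.
fn effective_mode(flag: Option<&str>, optimised: bool) -> Result<TracerMode, String> {
    // Omitting the option is the same as passing `off`.
    let requested = match flag {
        Some(value) => value.parse::<TracerMode>()?,
        None => TracerMode::Off,
    };
    // Hardware tracing does not work with optimisations, so it is disabled.
    // Assumption: "disabled" is modelled here as falling back to `Off`.
    if requested == TracerMode::Hw && optimised {
        Ok(TracerMode::Off)
    } else {
        Ok(requested)
    }
}
```

Under these assumptions, `effective_mode(Some("hw"), true)` yields `Off`, while `effective_mode(Some("sw"), true)` stays `Sw`, since the docs only mention the optimisation restriction for hardware tracing.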