[WIP] Fix and clean up the terminology for stages #807

Closed
wants to merge 3 commits into from
57 changes: 49 additions & 8 deletions src/building/bootstrapping.md
@@ -57,13 +57,13 @@ When you use the bootstrap system, you'll call it through `x.py`.
However, most of the code lives in `src/bootstrap`.
`bootstrap` has a difficult problem: it is written in Rust, yet it is run
before the Rust compiler is built! To work around this, there are two
components of bootstrap: the main one written in rust, and `bootstrap.py`.
components of bootstrap: the main one written in Rust, called `rustbuild`,
and `bootstrap.py`.
`bootstrap.py` is what gets run by x.py. It takes care of downloading the
`stage0` compiler, which will then build the bootstrap binary written in
Rust.
bootstrap compiler, which will then compile `rustbuild`.

Because there are two separate codebases behind `x.py`, they need to
be kept in sync. In particular, both `bootstrap.py` and the bootstrap binary
be kept in sync. In particular, both `bootstrap.py` and `rustbuild`
parse `config.toml` and read the same command line arguments. `bootstrap.py`
keeps these in sync by setting various environment variables, and the
programs sometimes have to add arguments that are explicitly ignored, to be
@@ -78,6 +78,47 @@ contribution [here][bootstrap-build].

## Stages of bootstrap

Like most other bootstrapping compilers, `rustc` is compiled in stages.
_Unlike_ most other compilers, where `stage0` refers to the bootstrap compiler,
Review comment (Member):

This is... well, not quite true. `stage0/bin/rustc` is the bootstrap compiler. Really, we should rename it, but then the question is "to what?"

`stage0` refers to the first compiler built by bootstrap. So the following command:

```sh
x.py build --stage 0 src/rustc
```
Comment on lines +82 to +87
Review comment (Contributor):

fwiw, I feel like this doesn't seem correct. I call the stage0 compiler the beta compiler which appears in the stage0 directory which is used during stage0. It creates artifacts in the stage0-* directories, which during stage1 uses Assemble to create the stage1 compiler. There's a phrase in the bootstrapping guide (which I may have written) which I think sums it up succinctly:

> Stage 0 uses the stage0 compiler to create stage0 artifacts which will later be uplifted to stage1.

It seems to me a lot of confusion stems from the --stage command-line flag, and that the terminology for "the stage" and "the compiler used in the stage" and "the compiler created by the stage" are all sort of the same. Another problem I think is that a stage's output is written to directories with the same stage, but not actually used from there (they get copied out into stageN+1), so it straddles the boundary. Also, people have different preconceived notions, and the numbering often seems off-by-one (rust-lang/rust#57963 and rust-lang/rust#59864). But just changing that in the documentation I think will make things worse.


will actually perform a full build of rustc. Confusingly, the `build/$target/stageN` directories are named after the compiler they are *used to build*, not the commands you need to build them.

- **Stage 0:** The stage0 compiler is built by the bootstrap compiler.
The bootstrap compiler is usually (you can configure `x.py` to use
something else) the current beta `rustc` compiler and its associated dynamic
libraries (which `x.py` will download for you). This bootstrap compiler is then
used to compile `rustbuild`, `std`, and `rustc` (plus a few other tools, like `tidy`). When compiling
`rustc`, this bootstrap compiler uses the freshly compiled `std`.
There are two concepts at play here: a compiler (with its set of dependencies)
and its 'target' or 'object' libraries (`std` and `rustc`).
Both are staged, but in a staggered manner.
The `stage0` standard library is the one _linked_ to `stage0` rustc
(allowing you to `use std::vec::Vec` from within the compiler itself),
while `stage1 libstd` is the library _built_ by stage1. `libstd` also includes `libcore`,
so without it there are very few programs that rustc can compile.

- **Stage 1:** In theory, the stage0 compiler is functionally identical to the
stage1 compiler, but in practice there are subtle differences. In
particular, the stage0 compiler was built by bootstrap and
hence not by the source in your working directory: this means that
the symbol names used in the compiler source may not match the
symbol names that would have been made by the stage1 compiler. This is
important when using dynamic linking because Rust does not have ABI compatibility
between versions. This primarily manifests when tests try to link with any
of the `rustc_*` crates or use the (now deprecated) plugin infrastructure.
These tests are marked with `ignore-stage1`. The `stage1` is because
these plugins *link* to the stage1 compiler, even though they are being
*built* by stage0. Rebuilding again also gives us the benefit of the latest optimizations (i.e. those added since the beta fork).

- _(Optional)_ **Stage 2**: to sanity check our new compiler, we
can build the libraries with the stage1 compiler. The result ought
to be identical to before, unless something has broken.
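
The sanity check described in the last bullet amounts to comparing two sets of build outputs. The following is only a rough sketch of that idea, not how `x.py` performs the check: the helper names `digest_tree` and `differing_files` are hypothetical, and the directories passed in are whatever two output trees you want to compare.

```python
import hashlib
from pathlib import Path


def digest_tree(root: Path) -> dict:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def differing_files(before: Path, after: Path) -> list:
    """Return relative paths that are missing on one side or differ in content."""
    a, b = digest_tree(before), digest_tree(after)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `differing_files` would correspond to "identical to before"; any entry points at a library that changed between the two builds.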

This is a detailed look into the separate bootstrap stages. When running
`x.py` you will see output such as:

@@ -163,11 +204,11 @@ The following tables indicate the outputs of various stage actions:
`--stage=2` stops here.

Note that the convention `x.py` uses is that:
- A "stage N artifact" is an artifact that is _produced_ by the stage N compiler.
- The "stage (N+1) compiler" is assembled from "stage N artifacts".
- A `--stage N` flag means build _with_ stage N.
- A "stage N artifact" is an artifact that is part of stage N.
- The compiler in `build/$target/stage(N+1)` is assembled from "stage N artifacts".
- A `--stage N` flag means build a stage N artifact.

In short, _stage 0 uses the stage0 compiler to create stage0 artifacts which
In short, _stage 0 uses the bootstrap compiler to create stage0 artifacts which
will later be uplifted to stage1_.
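
As a reading aid, the numbering convention can be modelled in a few lines of Python. This is a toy sketch, not code from `src/bootstrap`: the `Compiler` and `Artifacts` classes and the `build`, `assemble`, and `bootstrap_chain` functions are hypothetical names chosen for illustration. It assumes only what the text states: the compiler found in `build/$target/stage0` produces stage0 artifacts, and stage N artifacts are assembled into the compiler in `build/$target/stage(N+1)`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compiler:
    stage: int      # N, as in build/$target/stageN
    directory: str  # where the assembled compiler lives


@dataclass(frozen=True)
class Artifacts:
    stage: int      # stage N artifacts are produced during stage N
    built_by: Compiler
    names: tuple = ("std", "rustc")


def build(compiler: Compiler) -> Artifacts:
    """Stage N: the compiler in stageN produces stage N artifacts."""
    return Artifacts(stage=compiler.stage, built_by=compiler)


def assemble(artifacts: Artifacts, target: str) -> Compiler:
    """Uplift stage N artifacts into the compiler in stage(N+1)."""
    next_stage = artifacts.stage + 1
    return Compiler(stage=next_stage, directory=f"build/{target}/stage{next_stage}")


def bootstrap_chain(target: str, last_stage: int) -> list:
    """Return the compilers for stage 0 through last_stage, in order."""
    compiler = Compiler(stage=0, directory=f"build/{target}/stage0")
    chain = [compiler]
    while compiler.stage < last_stage:
        compiler = assemble(build(compiler), target)
        chain.append(compiler)
    return chain
```

For example, `bootstrap_chain("x86_64-unknown-linux-gnu", 2)` yields three compilers whose directories end in `stage0`, `stage1`, and `stage2`; the artifacts that make up the `stage1` compiler carry `stage == 0`, which is the off-by-one the reviewers discuss.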

Every time any of the main artifacts (`std` and `rustc`) are compiled, two
32 changes: 0 additions & 32 deletions src/building/how-to-build-and-run.md
@@ -109,38 +109,6 @@ to build it, such as `libstd` and other tooling, may use some unstable features
internally, requiring a specific version which understands these unstable
features.

The result is that compiling `rustc` is done in stages:

- **Stage 0:** the stage0 compiler is usually (you can configure `x.py` to use
something else) the current _beta_ `rustc` compiler and its associated dynamic
libraries (which `x.py` will download for you). This stage0 compiler is then
used only to compile `rustbuild`, `std`, and `rustc`. When compiling
`rustc`, this stage0 compiler uses the freshly compiled `std`.
There are two concepts at play here: a compiler (with its set of dependencies)
and its 'target' or 'object' libraries (`std` and `rustc`).
Both are staged, but in a staggered manner.
- **Stage 1:** the code in your clone (for new version) is then
compiled with the stage0 compiler to produce the stage1 compiler.
However, it was built with an older compiler (stage0), so to
optimize the stage1 compiler we go to the next stage.
- In theory, the stage1 compiler is functionally identical to the
stage2 compiler, but in practice there are subtle differences. In
particular, the stage1 compiler itself was built by stage0 and
hence not by the source in your working directory: this means that
the symbol names used in the compiler source may not match the
symbol names that would have been made by the stage1 compiler. This is
important when using dynamic linking and the lack of ABI compatibility
between versions. This primarily manifests when tests try to link with any
of the `rustc_*` crates or use the (now deprecated) plugin infrastructure.
These tests are marked with `ignore-stage1`.
- **Stage 2:** we rebuild our stage1 compiler with itself to produce
the stage2 compiler (i.e. it builds itself) to have all the _latest
optimizations_. (By default, we copy the stage1 libraries for use by
the stage2 compiler, since they ought to be identical.)
- _(Optional)_ **Stage 3**: to sanity check our new compiler, we
can build the libraries with the stage2 compiler. The result ought
to be identical to before, unless something has broken.

To read more about the bootstrap process, [read this chapter][bootstrap].

[bootstrap]: ./bootstrapping.md