Comparing changes

base repository: rust-lang/rustc-dev-guide
base: ae323960c3bf28071942eee4e457ce9d7f4d76a3
..
head repository: rust-lang/rustc-dev-guide
compare: 1a83967f375accf2b89ab238d2040887c1d1c272
Showing with 29 additions and 68 deletions.
  1. +2 −1 .travis.yml
  2. +1 −1 src/backend/backend-agnostic.md
  3. +1 −1 src/backend/codegen.md
  4. +21 −61 src/backend/monomorph.md
  5. +4 −4 src/overview.md
3 changes: 2 additions & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
@@ -19,7 +19,8 @@ script:
 - git log --oneline | head -n 10
 - mdbook build
 notifications:
-email: false
+email:
+on_success: never
 env:
 global:
secure: YQX/AWq5KsvAFYqcCK6c1DmOZX9EMrecBM5qnc4uE2HvEBS+x0l8xatI2Nv8U9eiasZYfsqmHn0ANvxu6e4oqL15m4cVsdliCzdkrPsDapxTnwwJvMQg+yHZiEd5BPlaDQt/wYvP8QBXgQsXoAJKrfAS+BFsowBFHt/LOFOunbAQrtQZqwqrnI6+xh+2TRMckws/VcTLRqwl3pyEyfacJhbbv1V3gJh7Y17hELsgsP7+7cMXT0bK6dtf7a9vne9Hsm5fw7VeMKBn1/dJ82fyEK6HHjkjdw1/OoY35YVyNZ/9ZxP2u1ClEXzCRJQ2CvKr8Tuoh/AuoL0pwrfhOTaOuWU0QZT4QBqjTimsgBLqiJicMiSndgsXinLWvlDqrMS1XfleqCKqAQy9AJTCR1LnwR90/HRxfE5YDAL/mbc0Su4jj+l5Zv3UE8vUqFE34E/jzip17JkDT5aMkl4bgW65lqJE7SLWl7gXT7eYbPEtQZoucR1hkSsBu/4YTvcxSlD98spWZ68mWwYyjLJSQDES+GefUnHJ/RbBVl9pW+sL7jXJ+kZ/NBCtCIgrkGchudEMDEvS6rcOzwCejxqL1of0jYHGopkBXSVHOPneWIdNeKXwBZA9hp0yKh0sWwrKHrA3wYhS/kF9uO19l/RnSTXAfApYR/yJUbYliuMJYCgNeKE=
2 changes: 1 addition & 1 deletion src/backend/backend-agnostic.md
@@ -4,7 +4,7 @@ In the future, it would be nice to allow other codegen backends (e.g.
 [Cranelift]). To this end, `librustc_codegen_ssa` provides an
 abstract interface for all backends to implement.

-[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/HEAD/cranelift
+[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/master/cranelift

 > The following is a copy/paste of a README from the rust-lang/rust repo.
 > Please submit a PR if it needs updating.
2 changes: 1 addition & 1 deletion src/backend/codegen.md
@@ -7,7 +7,7 @@ codegen itself. It's worth noting, though, that in the rust source code, many
 parts of the backend have `codegen` in their names (there are no hard
 boundaries).

-[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/HEAD/cranelift
+[Cranelift]: https://github.com/bytecodealliance/wasmtime/tree/master/cranelift

 > NOTE: If you are looking for hints on how to debug code generation bugs,
 > please see [this section of the debugging chapter][debugging].
82 changes: 21 additions & 61 deletions src/backend/monomorph.md
@@ -61,73 +61,33 @@ units](../appendix/glossary.md#codegen-unit).
 ## Polymorphization

 As mentioned above, monomorphization produces fast code, but it comes at the
-cost of compile time and binary size. [MIR optimizations][miropt] can help a
-bit with this.
-
-In addition to MIR optimizations, rustc attempts to determine when fewer
-copies of functions are necessary and avoid making those copies - known
-as "polymorphization". When a function-like item is found during
-monomorphization collection, the
-[`rustc_mir::monomorphize::polymorphize::unused_generic_params`][polymorph]
-query is invoked, which traverses the MIR of the item to determine on which
-generic parameters the item might not need duplicated.
-
-Currently, polymorphization only looks for unused generic parameters. These
-are relatively rare in functions, but closures inherit the generic
-parameters of their parent function and it is common for closures to not
-use those inherited parameters. Without polymorphization, a copy of these
-closures would be created for each copy of the parent function. By
-creating fewer copies, less LLVM IR is generated and needs processed.
-
-`unused_generic_params` returns a `FiniteBitSet<u64>` where a bit is set if
-the generic parameter of the corresponding index is unused. Any parameters
-after the first sixty-four are considered used.
-
-The results of polymorphization analysis are used in the
-[`Instance::polymorphize`][inst_polymorph] function to replace the
-[`Instance`][inst]'s substitutions for the unused generic parameters with their
-identity substitutions.
-
-Consider the example below:
+cost of compile time and binary size. [MIR
+optimizations](../mir/optimizations.md) can help a bit with this. Another
+optimization currently under development is called _polymorphization_.
+
+The general idea is that often we can share some code between monomorphized
+copies of code. More precisely, if a MIR block is not dependent on a type
+parameter, it may not need to be monomorphized into many copies. Consider the
+following example:

 ```rust
-fn foo<A, B>() {
-let x: Option<B> = None;
+pub fn f() {
+g::<bool>();
+g::<usize>();
 }

-fn main() {
-foo::<u16, u32>();
-foo::<u64, u32>();
+fn g<T>() -> usize {
+let n = 1;
+let closure = || n;
+closure()
 }
 ```

-During monomorphization collection, `foo` will be collected with the
-substitutions `[u16, u32]` and `[u64, u32]` (from its invocations in `main`).
-`foo` has the identity substitutions `[A, B]` (or
-`[ty::Param(0), ty::Param(1)]`).
-
-Polymorphization will identify `A` as being unused and it will be replaced in
-the substitutions with the identity parameter before being added to the set
-of collected items - thereby reducing the copies from two (`[u16, u32]` and
-`[u64, u32]`) to one (`[A, u32]`).
-
-`unused_generic_params` will also invoked during code generation when the
-symbol name for `foo` is being computed for use in the callsites of `foo`
-(which have the regular substitutions present, otherwise there would be a
-symbol mismatch between the caller and the function).
-
-As a result of polymorphization, items collected during monomorphization
-cannot be assumed to be monomorphic.
-
-It is intended that polymorphization be extended to more advanced cases,
-such as where only the size/alignment of a generic parameter are required.
+In this case, we would currently collect `[f, g::<bool>, g::<usize>,
+g::<bool>::{{closure}}, g::<usize>::{{closure}}]`, but notice that the two
+closures would be identical -- they don't depend on the type parameter `T` of
+function `g`. So we only need to emit one copy of the closure.

-More details on polymorphization are available in the
-[master's thesis][thesis] associated with polymorphization's initial
-implementation.
+For more information, see [this thread on github][polymorph].

-[miropt]: ../mir/optimizations.md
-[polymorph]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/monomorphize/polymorphize/fn.unused_generic_params.html
-[inst]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/instance/struct.Instance.html
-[inst_polymorph]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/instance/struct.Instance.html#method.polymorphize
-[thesis]: https://davidtw.co/media/masters_dissertation.pdf
+[polymorph]: https://github.com/rust-lang/rust/issues/46477
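
Note on the `src/backend/monomorph.md` change: the `foo<A, B>` walkthrough in that file describes a bit set in which bit `i` marks generic parameter `i` as unused, with parameters past the first sixty-four treated as used, and a substitution step that swaps unused arguments for their identity parameters. The Rust sketch below only illustrates that idea with plain strings. `UnusedParams` and `polymorphize` are hypothetical names chosen for this note, not rustc APIs, and the real `FiniteBitSet<u64>` and `Instance::polymorphize` operate on compiler-internal types.

```rust
/// Hypothetical stand-in for the bit set described in the text:
/// bit `i` is set when generic parameter `i` is unused.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct UnusedParams(u64);

impl UnusedParams {
    /// Build the set from the indices of parameters found to be unused.
    fn from_unused_indices(indices: &[u32]) -> Self {
        let mut bits = 0u64;
        for &i in indices {
            // Only the first sixty-four parameters fit in the set;
            // anything beyond that is left unmarked, i.e. treated as used.
            if i < 64 {
                bits |= 1u64 << i;
            }
        }
        UnusedParams(bits)
    }

    fn is_unused(self, index: u32) -> bool {
        index < 64 && (self.0 >> index) & 1 == 1
    }
}

/// Replace every argument whose parameter is unused with the parameter's
/// own name (its "identity substitution"); keep the other arguments.
fn polymorphize(params: &[&str], args: &[&str], unused: UnusedParams) -> Vec<String> {
    params
        .iter()
        .zip(args.iter())
        .enumerate()
        .map(|(i, (param, arg))| {
            if unused.is_unused(i as u32) {
                (*param).to_string()
            } else {
                (*arg).to_string()
            }
        })
        .collect()
}

fn main() {
    // In `fn foo<A, B>() { let x: Option<B> = None; }` only `A` (index 0) is unused.
    let unused = UnusedParams::from_unused_indices(&[0]);

    let first = polymorphize(&["A", "B"], &["u16", "u32"], unused);
    let second = polymorphize(&["A", "B"], &["u64", "u32"], unused);

    // Both instantiations collapse to the same substitutions.
    assert_eq!(first, second);
    println!("{:?}", first); // prints ["A", "u32"]
}
```

Under these assumptions, `foo::<u16, u32>` and `foo::<u64, u32>` map to the same `[A, u32]` entry, which is the reduction from two copies to one that the file's text describes.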
8 changes: 4 additions & 4 deletions src/overview.md
@@ -87,10 +87,10 @@ we'll talk about that later.
 - We then begin what is vaguely called _code generation_ or _codegen_.
 - The [code generation stage (codegen)][codegen] is when higher level
 representations of source are turned into an executable binary. `rustc`
-uses LLVM for code generation. The first step is to convert the MIR
-to LLVM Intermediate Representation (LLVM IR). This is where the MIR
-is actually monomorphized, according to the list we created in the
-previous step.
+uses LLVM for code generation. The first step is the MIR is then
+converted to LLVM Intermediate Representation (LLVM IR). This is where
+the MIR is actually monomorphized, according to the list we created in
+the previous step.
 - The LLVM IR is passed to LLVM, which does a lot more optimizations on it.
 It then emits machine code. It is basically assembly code with additional
 low-level types and annotations added. (e.g. an ELF object or wasm).