Update memory.md #1907

Merged 1 commit on Sep 24, 2024
52 changes: 33 additions & 19 deletions src/memory.md
@@ -1,8 +1,8 @@
# Memory Management in Rustc

Rustc tries to be pretty careful how it manages memory. The compiler allocates
_a lot_ of data structures throughout compilation, and if we are not careful,
it will take a lot of time and space to do so.
Generally, rustc tries to be pretty careful about how it manages memory.
The compiler allocates _a lot_ of data structures throughout compilation,
and if we are not careful, it will take a lot of time and space to do so.

One of the main ways the compiler manages this is using [arena]s and [interning].

@@ -11,16 +11,18 @@ One of the main ways the compiler manages this is using [arena]s and [interning].

## Arenas and Interning

We create a LOT of data structures during compilation. For performance reasons,
we allocate them from a global memory pool; they are each allocated once from a
long-lived *arena*. This is called _arena allocation_. This system reduces
allocations/deallocations of memory. It also allows for easy comparison of
types for equality: for each interned type `X`, we implemented [`PartialEq for
X`][peqimpl], so we can just compare pointers. The [`CtxtInterners`] type
contains a bunch of maps of interned types and the arena itself.
Because A LOT of data structures are created during compilation, we allocate
them from a global memory pool for performance reasons.
Each one is allocated once from a long-lived *arena*.
This is called _arena allocation_.
This system reduces allocations/deallocations of memory.
It also allows for easy comparison of types (more on types [here](./ty.md)) for equality:
for each interned type `X`, we implemented [`PartialEq` for X][peqimpl],
so we can just compare pointers.
The [`CtxtInterners`] type contains a bunch of maps of interned types and the arena itself.
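
As a rough illustration of arena allocation, here is a minimal sketch in safe, standard-library-only Rust. It is not how rustc implements its arenas: `ToyArena` and `Id` are hypothetical names, and the sketch hands out index handles where rustc hands out `&'tcx` references, purely to avoid `unsafe` code. The point it shows is that every value is placed in the pool once, nothing is freed individually, and the whole pool is released together.

```rust
/// Hypothetical handle into a `ToyArena`; as cheap to copy as a pointer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Id(usize);

/// Hypothetical toy arena: values are added once and never freed one by one.
struct ToyArena<T> {
    items: Vec<T>,
}

impl<T> ToyArena<T> {
    fn new() -> Self {
        ToyArena { items: Vec::new() }
    }

    /// Store `value` in the pool and return a handle to it.
    fn alloc(&mut self, value: T) -> Id {
        let id = Id(self.items.len());
        self.items.push(value);
        id
    }

    /// Look up a previously allocated value.
    fn get(&self, id: Id) -> &T {
        &self.items[id.0]
    }

    fn len(&self) -> usize {
        self.items.len()
    }
}

fn main() {
    let mut arena = ToyArena::new();
    let a = arena.alloc(String::from("i32"));
    let b = arena.alloc(String::from("u32"));

    assert_eq!(arena.get(a).as_str(), "i32");
    assert_eq!(arena.get(b).as_str(), "u32");
    assert_eq!(arena.len(), 2);
    // Dropping `arena` at the end of `main` releases every value at once.
}
```

Note that this toy arena does not deduplicate values; that is the job of interning, which is layered on top of the arena.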

[peqimpl]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html#implementations
[`CtxtInterners`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.CtxtInterners.html#structfield.arena
[peqimpl]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html#implementations

### Example: `ty::TyKind`

@@ -30,7 +32,7 @@ compiler doesn’t naively allocate from the buffer. Instead, we check if that
type was already constructed. If it was, we just get the same pointer we had
before, otherwise we make a fresh pointer. With this scheme, if we want to know
if two types are the same, all we need to do is compare the pointers, which is
efficient. `TyKind` should never be constructed on the stack, and it would be unusable
efficient. [`TyKind`] should never be constructed on the stack, and it would be unusable
if done so.
You always allocate them from this arena and you always intern them so they are
unique.
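
The check-then-allocate behaviour described above can be sketched as follows. This is a toy under stated assumptions, not rustc's code: `ToyKind`, `ToyTy`, and `ToyInterner` are hypothetical names, and `Box::leak` stands in for an arena that lives for the whole program. What it shows is that interning the same kind twice yields the same pointer, so equality can be a plain address comparison.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical structural description of a type.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum ToyKind {
    Int,
    Bool,
    Ref(ToyTy),
}

/// Hypothetical handle to an interned `ToyKind`.
#[derive(Clone, Copy, Debug)]
struct ToyTy(&'static ToyKind);

// Equality and hashing look only at the address, never at the contents.
impl PartialEq for ToyTy {
    fn eq(&self, other: &Self) -> bool {
        std::ptr::eq(self.0, other.0)
    }
}
impl Eq for ToyTy {}
impl Hash for ToyTy {
    fn hash<H: Hasher>(&self, state: &mut H) {
        std::ptr::hash(self.0, state)
    }
}

#[derive(Default)]
struct ToyInterner {
    map: HashMap<ToyKind, ToyTy>,
}

impl ToyInterner {
    /// Return the existing handle if `kind` was seen before,
    /// otherwise allocate it once and remember it.
    fn intern(&mut self, kind: ToyKind) -> ToyTy {
        if let Some(&ty) = self.map.get(&kind) {
            return ty;
        }
        // Assumption: leaking the box imitates a program-long arena.
        let ty = ToyTy(Box::leak(Box::new(kind.clone())));
        self.map.insert(kind, ty);
        ty
    }
}

fn main() {
    let mut interner = ToyInterner::default();
    let a = interner.intern(ToyKind::Int);
    let b = interner.intern(ToyKind::Int);
    let c = interner.intern(ToyKind::Bool);
    assert!(a == b); // same pointer, so the same type
    assert!(a != c);

    let r1 = interner.intern(ToyKind::Ref(a));
    let r2 = interner.intern(ToyKind::Ref(b));
    assert!(r1 == r2); // nested types are deduplicated too
}
```

A handle built by hand on the stack would have a different address from the interned one, which is why uninterned values would be unusable in such a scheme.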
@@ -43,26 +45,33 @@ to that buffer is freed and our `'tcx` references would be invalid.
In addition to types, there are a number of other arena-allocated data structures that you can
allocate, and which are found in this module. Here are a few examples:

- [`GenericArgs`], allocated with `mk_args` – this will intern a slice of types, often used
- [`GenericArgs`], allocated with [`mk_args`] – this will intern a slice of types, often used
to specify the values to be substituted for generic args (e.g. `HashMap<i32, u32>` would be
represented as a slice `&'tcx [tcx.types.i32, tcx.types.u32]`).
- [`TraitRef`], typically passed by value – a **trait reference** consists of a reference to a trait
along with its various type parameters (including `Self`), like `i32: Display` (here, the def-id
would reference the `Display` trait, and the args would contain `i32`). Note that `def-id` is
defined and discussed in depth in the [`AdtDef and DefId`][adtdefid] section.
- [`Predicate`] defines something the trait system has to prove (see `traits` module).
- [`Predicate`] defines something the trait system has to prove (see [traits] module).
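
To make the shapes in the list above concrete, here is a small sketch of generic args as a slice of types and of a trait reference as a trait id plus args. All names (`ToyDefId`, `ArgTy`, `ToyTraitRef`, `self_ty`) are hypothetical stand-ins, and storing `Self` as the first arg is an assumption of this sketch, not a statement about rustc's definitions.

```rust
/// Hypothetical stand-in for a definition id.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyDefId(u32);

/// Hypothetical stand-in for an interned type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ArgTy {
    I32,
    U32,
}

/// A trait plus its type arguments.
/// Assumption of this sketch: `Self` is stored as the first argument.
/// The struct is small and `Copy`, so it is cheap to pass by value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyTraitRef<'a> {
    def_id: ToyDefId,
    args: &'a [ArgTy],
}

impl<'a> ToyTraitRef<'a> {
    fn self_ty(self) -> ArgTy {
        self.args[0]
    }
}

fn main() {
    // `HashMap<i32, u32>`: the generic args as a slice of types.
    let hash_map_args: &[ArgTy] = &[ArgTy::I32, ArgTy::U32];
    assert_eq!(hash_map_args.len(), 2);

    // `i32: Display`: the id names the trait, the args contain `i32`.
    let display = ToyDefId(7); // arbitrary id chosen for the example
    let trait_ref = ToyTraitRef { def_id: display, args: &[ArgTy::I32] };
    let copy = trait_ref; // passed by value; the original stays usable
    assert_eq!(copy.self_ty(), ArgTy::I32);
    assert_eq!(trait_ref.def_id, display);
}
```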

[`GenericArgs`]: ./ty_module/generic_arguments.md#the-genericargs-type
[adtdefid]: ./ty_module/generic_arguments.md#adtdef-and-defid
[`TraitRef`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/type.TraitRef.html
[`AdtDef` and `DefId`]: ./ty.md#adts-representation
[`def-id`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/def_id/struct.DefId.html
[`GenericArgs`]: ./generic_arguments.html#GenericArgs
[`mk_args`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/context/struct.TyCtxt.html#method.mk_args
[adtdefid]: ./ty_module/generic_arguments.md#adtdef-and-defid
[`Predicate`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Predicate.html

[`TraitRef`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TraitRef.html
[`ty::TyKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/sty/type.TyKind.html
[`TyKind`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/ty_kind/enum.TyKind.html
[traits]: ./traits/resolution.md

## The tcx and how it uses lifetimes
## The `tcx` and how it uses lifetimes

The `tcx` ("typing context") is the central data structure in the compiler. It is the context that
you use to perform all manner of queries. The struct `TyCtxt` defines a reference to this shared
The typing context (`tcx`) is the central data structure in the compiler. It is the context that
you use to perform all manner of queries. The `struct` [`TyCtxt`] defines a reference to this shared
context:

```rust,ignore
@@ -76,10 +85,13 @@ As you can see, the `TyCtxt` type takes a lifetime parameter. When you see a ref
lifetime like `'tcx`, you know that it refers to arena-allocated data (or data that lives as long as
the arenas, anyhow).

[`TyCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html

### A Note On Lifetimes

The Rust compiler is a fairly large program containing lots of big data
structures (e.g. the AST, HIR, and the type system) and as such, arenas and
structures (e.g. the [Abstract Syntax Tree (AST)][ast], [High-Level Intermediate
Representation (`HIR`)][hir], and the type system) and as such, arenas and
references are heavily relied upon to minimize unnecessary memory use. This
manifests itself in the way people can plug into the compiler (i.e. the
[driver](./rustc-driver.md)), preferring a "push"-style API (callbacks) instead
@@ -90,4 +102,6 @@ duplication while also preventing a lot of the ergonomic issues due to many
pervasive lifetimes. The [`rustc_middle::ty::tls`][tls] module is used to access these
thread-locals, although you should rarely need to touch it.

[ast]: ./ast-validation.md
[hir]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html
[tls]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/tls/index.html