Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ensure_sufficient_stack() non-generic, using cargo-llvm-lines #76680

Merged

Conversation

Julian-Wollersberger
Copy link
Contributor

Inspired by this blog post from @nnethercote, I used cargo-llvm-lines on the rust compiler itself, to improve it's compile time. This PR contains only one low-hanging fruit, but I also want to share some measurements.

The function ensure_sufficient_stack() was monomorphized 1500 times, and with it the stacker and psm crates, for a total of 1.5% of all llvm IR lines. With some trickery I convert the generic closure into a dynamic one, and thus all that code is only monomorphized once.

Measurements

Getting these numbers took some fiddling with CLI flags and I modified cargo-llvm-lines to read from a folder instead of invoking cargo. Commands I used:

./x.py clean
RUSTFLAGS="--emit=llvm-ir -C link-args=-fuse-ld=lld -Z self-profile=profile" CARGOFLAGS_BOOTSTRAP="-Ztimings" RUSTC_BOOTSTRAP=1 ./x.py build -i --stage 1 library/std

# Then manually copy all .ll files into a folder I hardcoded in cargo-llvm-lines in main.rs#L115
cd ../cargo-llvm-lines
cargo run llvm-lines

The result is this list (see first 500 lines ), before the change:

  Lines            Copies        Function name
  -----            ------        -------------
  16894211 (100%)  58417 (100%)  (TOTAL)
   2223855 (13.2%)   502 (0.9%)  rustc_query_system::query::plumbing::get_query_impl::{{closure}}
   1331918 (7.9%)   1287 (2.2%)  hashbrown::raw::RawTable<T>::reserve_rehash
    774434 (4.6%)  12043 (20.6%) core::ptr::drop_in_place
    294170 (1.7%)    499 (0.9%)  rustc_query_system::dep_graph::graph::DepGraph<K>::with_task_impl
    245410 (1.5%)   1552 (2.7%)  psm::on_stack::with_on_stack
    210311 (1.2%)      1 (0.0%)  rustc_target::spec::load_specific
    200962 (1.2%)    513 (0.9%)  rustc_query_system::query::plumbing::get_query_impl
    190704 (1.1%)      1 (0.0%)  rustc_middle::ty::query::<impl rustc_middle::ty::context::TyCtxt>::alloc_self_profile_query_strings
    180272 (1.1%)    468 (0.8%)  rustc_query_system::query::plumbing::load_from_disk_and_cache_in_memory
    177396 (1.1%)    114 (0.2%)  rustc_query_system::query::plumbing::force_query_impl
    161134 (1.0%)    445 (0.8%)  rustc_query_system::dep_graph::graph::DepGraph<K>::with_anon_task
    141551 (0.8%)    186 (0.3%)  rustc_query_system::query::plumbing::incremental_verify_ich
    110191 (0.7%)      7 (0.0%)  rustc_middle::ty::context::_DERIVE_rustc_serialize_Decodable_D_FOR_TypeckResults::<impl rustc_serialize::serialize::Decodable<__D> for rustc_middle::ty::context::TypeckResults>::decode::{{closure}}
    108590 (0.6%)    420 (0.7%)  core::ops::function::FnOnce::call_once
     88488 (0.5%)     21 (0.0%)  rustc_query_system::dep_graph::graph::DepGraph<K>::try_mark_previous_green
     86368 (0.5%)      1 (0.0%)  rustc_middle::ty::query::stats::query_stats
     85654 (0.5%)   3973 (6.8%)  <&T as core::fmt::Debug>::fmt
     84475 (0.5%)      1 (0.0%)  rustc_middle::ty::query::Queries::try_collect_active_jobs
     81220 (0.5%)    862 (1.5%)  <hashbrown::raw::RawIterHash<T> as core::iter::traits::iterator::Iterator>::next
     77636 (0.5%)     54 (0.1%)  core::slice::sort::recurse
     66484 (0.4%)    461 (0.8%)  <hashbrown::raw::RawIter<T> as core::iter::traits::iterator::Iterator>::next

All .ll files together had 4.4GB. After my change they had 4.2GB. So a few percent less code LLVM has to process. Hurray!
Sadly, I couldn't measure an actual wall-time improvement. Watching YouTube while compiling added to much noise...

Here is the top of the list after the change:

  16460866 (100%)  58341 (100%)  (TOTAL)
   1903085 (11.6%)   504 (0.9%)  rustc_query_system::query::plumbing::get_query_impl::{{closure}}
   1331918 (8.1%)   1287 (2.2%)  hashbrown::raw::RawTable<T>::reserve_rehash
    777796 (4.7%)  12031 (20.6%) core::ptr::drop_in_place
    551462 (3.4%)   1519 (2.6%)  rustc_data_structures::stack::ensure_sufficient_stack::{{closure}}

Note that the total was reduced by 430 000 lines and psm::on_stack::with_on_stack has disappeared. Instead rustc_data_structures::stack::ensure_sufficient_stack::{{closure}} appeared. I'm confused about that one, but it seems to consist of inlined calls to rustc_query_system::* stuff.

Further note the other two big culprits in this list: rustc_query_system and hashbrown. These two are monomorphized many times, the query system summing to more than 20% of all lines, not even counting code that's probably inlined elsewhere.
Assuming compile times scale linearly with llvm-lines, that means a possible 20% compile time reduction.

Reducing eg. get_query_impl would probably need a major refactoring of the qery system though. Everything in there is generic over multiple types, has associated types and passes generic Self arguments by value. Which means you can't simply make things dyn.


This PR is a small step to make rustc compile faster and thus make contributing to rustc less painful. Nonetheless I love Rust and I find the work around rustc fascinating :)

@rust-highfive
Copy link
Collaborator

r? @estebank

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 13, 2020
@jyn514
Copy link
Member

jyn514 commented Sep 13, 2020

@bors try @rust-timer queue

I wish perf measured compile times for the rust compiler itself :(

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Sep 13, 2020

⌛ Trying commit d744dc857ba132dfc9016717864aec7f5d2f2a24 with merge f242334c1e2e719cf1cba923923ad8ec62affb71...

@jyn514 jyn514 added I-compiletime Issue: Problems and improvements with respect to compile times. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 13, 2020
/// Measuring with cargo-llvm-lines revealed that `psm::on_stack::with_on_stack` was
/// monomorphized 1552 times and was responsible for 1.5% of rustc's total llvm-lines.
/// Making this wrapper without a generic bound removes all of that duplication.
#[inline(never)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inlining is done by llvm, right? Is inline(never) necessary here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. It has no effect on llvm-lines, but I was afraid that it would all be inlined again and thus negating the benefit of this change.
Now thinking about it, that probably won't happen though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the #[inline(never)].

@bors
Copy link
Contributor

bors commented Sep 13, 2020

☀️ Try build successful - checks-actions, checks-azure
Build commit: f242334c1e2e719cf1cba923923ad8ec62affb71 (f242334c1e2e719cf1cba923923ad8ec62affb71)

@rust-timer
Copy link
Collaborator

Queued f242334c1e2e719cf1cba923923ad8ec62affb71 with parent 7402a39, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (f242334c1e2e719cf1cba923923ad8ec62affb71): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never

@jyn514
Copy link
Member

jyn514 commented Sep 13, 2020

Oh wow! -2.5% on hello world, -1.5-2% on several other benchmarks.

@jyn514
Copy link
Member

jyn514 commented Sep 13, 2020

I'm curious to see if it would be improved by removing the inline(never) ...

@Marwes
Copy link
Contributor

Marwes commented Sep 14, 2020

It might be better to apply this change in stacker itself.

It's internal _grow function could be changed to take a &mut dyn FnMut() https://github.com/rust-lang/stacker/blob/7a1d421a91c3f83d307bbdd86237dc6c9121b130/src/lib.rs#L64-L66 . That way this is only a dynamic call if the stack actually needs to grow (when the overhead is unlikely to matter) but keeps the common path fast (no dynamic call) when the stack does not grow.

@jyn514
Copy link
Member

jyn514 commented Sep 14, 2020

@Marwes won't that prevent using FnOnce with stacker?

@Marwes
Copy link
Contributor

Marwes commented Sep 14, 2020

No, because it would be the exact same trick as you do here, wrap the FnOnce in an Option, then take and unwrap in the FnMut.

@Julian-Wollersberger
Copy link
Contributor Author

It might be better to apply this change in stacker itself.

I'll try that too and we can compare it.

And wow, I'm surprised this is a perf improvement ^^

@Julian-Wollersberger
Copy link
Contributor Author

Julian-Wollersberger commented Sep 14, 2020

I've now moved the conversion to a dyn closure into stacker.
For that I temporarily set the stacker dependency to my fork on GitHub. (I have no idea if CI allows such a hack.)
I also don't have time today to measure llvm-lines again.

If this version turns out to be better, I will open a PR to stacker, and probably change this PR to be just a version bump.

With that, can I have another perf run?

@jyn514
Copy link
Member

jyn514 commented Sep 14, 2020

Relevant commit: Julian-Wollersberger/stacker@71993c7

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion

@bors
Copy link
Contributor

bors commented Sep 14, 2020

⌛ Trying commit 2b1badd692a5f92acd49ac01bdccb0a92b0020ae with merge 3a9349835672afaadc58a1327d44e3453b34a415...

@Julian-Wollersberger
Copy link
Contributor Author

(I have no idea if CI allows such a hack.)

No, Tidy doesn't. But bors is still going? I guess it doesn't run tidy?
Anyway, I'm going to bed now.

@jyn514
Copy link
Member

jyn514 commented Sep 14, 2020

No, bors doesn't run tidy (otherwise you'd have to wait to run perf runs until the author ran x.py fmt, which is annoying).

@rust-log-analyzer
Copy link
Collaborator

The job dist-x86_64-linux of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
   Compiling num-traits v0.2.12
   Compiling num-integer v0.1.43
   Compiling rustc_lexer v0.1.0 (/checkout/compiler/rustc_lexer)
   Compiling unicode-normalization v0.1.13
   Compiling psm v0.1.11 (https://github.com/Julian-Wollersberger/stacker#71993c7a)
   Compiling stacker v0.1.11 (https://github.com/Julian-Wollersberger/stacker#71993c7a)
   Compiling jemalloc-sys v0.3.2
   Compiling crossbeam-queue v0.1.2
   Compiling rustc_version v0.2.3
   Compiling ena v0.14.0
---
   Compiling num-traits v0.2.12
   Compiling num-integer v0.1.43
   Compiling unicode-normalization v0.1.13
   Compiling crossbeam-queue v0.1.2
   Compiling psm v0.1.11 (https://github.com/Julian-Wollersberger/stacker#71993c7a)
   Compiling stacker v0.1.11 (https://github.com/Julian-Wollersberger/stacker#71993c7a)
   Compiling jemalloc-sys v0.3.2
   Compiling rustc_version v0.2.3
   Compiling smallvec v0.6.13
   Compiling ena v0.14.0
---
   Compiling rustc_version v0.2.3
    Checking smallvec v0.6.13
    Checking rustc_apfloat v0.0.0 (/checkout/compiler/rustc_apfloat)
 Documenting rustc_apfloat v0.0.0 (/checkout/compiler/rustc_apfloat)
   Compiling psm v0.1.11 (https://github.com/Julian-Wollersberger/stacker#71993c7a)
   Compiling stacker v0.1.11 (https://github.com/Julian-Wollersberger/stacker#71993c7a)
    Checking ena v0.14.0
    Checking polonius-engine v0.12.1
    Checking tracing-log v0.1.1
    Checking aho-corasick v0.7.13

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @rust-lang/infra. (Feature Requests)

@jyn514
Copy link
Member

jyn514 commented Sep 21, 2020

On a side note, I have three weeks of free time in October coming up, and plan to spend it programming on rustc. I enjoy optimising things, like in this PR.

I'm just reading Discuss results of 2020 Contributor Survey, where rustc's compile times are mentioned multiple times. And with the llvm-lines measurements in the top comment in mind, and cargo-timings saying that 95% of the compile time of rustc_mir, rustc_middle and others are spent in LLVM, I think optimising llvm-lines has great potential.

More specific, rustc_query_system has way to many generics, but I'd need help tackling that beast. Do you know who I could ping, or were to ask (GitHub, Zulip, Discord)?

Some more places to look at after this:

Let me know if any of those sound interesting, I don't know how to help with all of them but I know who does ;)

@ecstatic-morse
Copy link
Contributor

Hi! This PR showed up in the weekly perf triage report. It resulted in a moderate improvement in instruction counts (up to -3.2% on incr-full builds of coercions-debug).

A really nice result @Julian-Wollersberger, especially since the main goal of this PR seems to have been to speed up rustc. Interestingly, rustdoc seems to have slowed down somewhat, but historically we don't care very much about its performance compared to actual builds.

Also, thank you for iterating on this, since the first version was indeed a small regression on most "real-world" benchmarks. @jyn514, here's an example of a perf run for a PR that doesn't affect perf (i.e. it's just noise). It's important to look past the first few numbers and see if there's a larger trend, especially if those results aren't for real-world benchmarks.

wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this pull request Nov 24, 2020
Clean up some of the pkgsrc Makefile, there's still lots in here that
should just be deleted though.  Switch SunOS to the illumos bootstrap
by default.

Version 1.48.0 (2020-11-19)
==========================

Language
--------

- [The `unsafe` keyword is now syntactically permitted on modules.][75857] This
  is still rejected *semantically*, but can now be parsed by procedural macros.

Compiler
--------
- [Stabilised the `-C link-self-contained=<yes|no>` compiler flag.][76158] This tells
  `rustc` whether to link its own C runtime and libraries or to rely on a external
  linker to find them. (Supported only on `windows-gnu`, `linux-musl`, and `wasi` platforms.)
- [You can now use `-C target-feature=+crt-static` on `linux-gnu` targets.][77386]
  Note: If you're using cargo you must explicitly pass the `--target` flag.
- [Added tier 2\* support for `aarch64-unknown-linux-musl`.][76420]

\* Refer to Rust's [platform support page][forge-platform-support] for more
information on Rust's tiered platform support.

Libraries
---------
- [`io::Write` is now implemented for `&ChildStdin` `&Sink`, `&Stdout`,
  and `&Stderr`.][76275]
- [All arrays of any length now implement `TryFrom<Vec<T>>`.][76310]
- [The `matches!` macro now supports having a trailing comma.][74880]
- [`Vec<A>` now implements `PartialEq<[B]>` where `A: PartialEq<B>`.][74194]
- [The `RefCell::{replace, replace_with, clone}` methods now all use `#[track_caller]`.][77055]

Stabilized APIs
---------------
- [`slice::as_ptr_range`]
- [`slice::as_mut_ptr_range`]
- [`VecDeque::make_contiguous`]
- [`future::pending`]
- [`future::ready`]

The following previously stable methods are now `const fn`'s:

- [`Option::is_some`]
- [`Option::is_none`]
- [`Option::as_ref`]
- [`Result::is_ok`]
- [`Result::is_err`]
- [`Result::as_ref`]
- [`Ordering::reverse`]
- [`Ordering::then`]

Cargo
-----

Rustdoc
-------
- [You can now link to items in `rustdoc` using the intra-doc link
  syntax.][74430] E.g. ``/// Uses [`std::future`]`` will automatically generate
  a link to `std::future`'s documentation. See ["Linking to items by
  name"][intradoc-links] for more information.
- [You can now specify `#[doc(alias = "<alias>")]` on items to add search aliases
  when searching through `rustdoc`'s UI.][75740]

Compatibility Notes
-------------------
- [Promotion of references to `'static` lifetime inside `const fn` now follows the
  same rules as inside a `fn` body.][75502] In particular, `&foo()` will not be
  promoted to `'static` lifetime any more inside `const fn`s.
- [Associated type bindings on trait objects are now verified to meet the bounds
  declared on the trait when checking that they implement the trait.][27675]
- [When trait bounds on associated types or opaque types are ambiguous, the
  compiler no longer makes an arbitrary choice on which bound to use.][54121]
- [Fixed recursive nonterminals not being expanded in macros during
  pretty-print/reparse check.][77153] This may cause errors if your macro wasn't
  correctly handling recursive nonterminal tokens.
- [`&mut` references to non zero-sized types are no longer promoted.][75585]
- [`rustc` will now warn if you use attributes like `#[link_name]` or `#[cold]`
  in places where they have no effect.][73461]
- [Updated `_mm256_extract_epi8` and `_mm256_extract_epi16` signatures in
  `arch::{x86, x86_64}` to return `i32` to match the vendor signatures.][73166]
- [`mem::uninitialized` will now panic if any inner types inside a struct or enum
  disallow zero-initialization.][71274]
- [`#[target_feature]` will now error if used in a place where it has no effect.][78143]
- [Foreign exceptions are now caught by `catch_unwind` and will cause an abort.][70212]
  Note: This behaviour is not guaranteed and is still considered undefined behaviour,
  see the [`catch_unwind`] documentation for further information.

Internal Only
-------------
These changes provide no direct user facing benefits, but represent significant
improvements to the internals and overall performance of rustc and
related tools.
- [Building `rustc` from source now uses `ninja` by default over `make`.][74922]
  You can continue building with `make` by setting `ninja=false` in
  your `config.toml`.
- [cg_llvm: `fewer_names` in `uncached_llvm_type`][76030]
- [Made `ensure_sufficient_stack()` non-generic][76680]

[78143]: rust-lang/rust#78143
[76680]: rust-lang/rust#76680
[76030]: rust-lang/rust#76030
[70212]: rust-lang/rust#70212
[27675]: rust-lang/rust#27675
[54121]: rust-lang/rust#54121
[71274]: rust-lang/rust#71274
[77386]: rust-lang/rust#77386
[77153]: rust-lang/rust#77153
[77055]: rust-lang/rust#77055
[76275]: rust-lang/rust#76275
[76310]: rust-lang/rust#76310
[76420]: rust-lang/rust#76420
[76158]: rust-lang/rust#76158
[75857]: rust-lang/rust#75857
[75585]: rust-lang/rust#75585
[75740]: rust-lang/rust#75740
[75502]: rust-lang/rust#75502
[74880]: rust-lang/rust#74880
[74922]: rust-lang/rust#74922
[74430]: rust-lang/rust#74430
[74194]: rust-lang/rust#74194
[73461]: rust-lang/rust#73461
[73166]: rust-lang/rust#73166
[intradoc-links]: https://doc.rust-lang.org/rustdoc/linking-to-items-by-name.html
[`catch_unwind`]: https://doc.rust-lang.org/std/panic/fn.catch_unwind.html
[`Option::is_some`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.is_some
[`Option::is_none`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.is_none
[`Option::as_ref`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.as_ref
[`Result::is_ok`]: https://doc.rust-lang.org/std/result/enum.Result.html#method.is_ok
[`Result::is_err`]: https://doc.rust-lang.org/std/result/enum.Result.html#method.is_err
[`Result::as_ref`]: https://doc.rust-lang.org/std/result/enum.Result.html#method.as_ref
[`Ordering::reverse`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.reverse
[`Ordering::then`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.then
[`slice::as_ptr_range`]: https://doc.rust-lang.org/std/primitive.slice.html#method.as_ptr_range
[`slice::as_mut_ptr_range`]: https://doc.rust-lang.org/std/primitive.slice.html#method.as_mut_ptr_range
[`VecDeque::make_contiguous`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.make_contiguous
[`future::pending`]: https://doc.rust-lang.org/std/future/fn.pending.html
[`future::ready`]: https://doc.rust-lang.org/std/future/fn.ready.html
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jan 1, 2021
Pkgsrc changes:
 * Compensate for files being moved around upstream.
 * Introduce optional, on-by-default semi-static building of cargo,
   using the internal curl and openssl sources.  This reduces the dynamic
   dependencies of cargo and therefore the rust package itself.
   Ref. options.mk.
 * The 1.47.0 bootstrap kits have been re-built with the above option
   turned on, so no longer depends on curl or openssl from pkgsrc and/or
   from earlier OS or pkgsrc versions.  This should hopefully fix
   installation of rust with non-default PREFIX, ref. PR#54453.


Upstream changes:

Version 1.48.0 (2020-11-19)
==========================

Language
--------
- [The `unsafe` keyword is now syntactically permitted on modules.][75857] This
  is still rejected *semantically*, but can now be parsed by procedural macros.

Compiler
--------
- [Stabilised the `-C link-self-contained=<yes|no>` compiler flag.][76158]
  This tells `rustc` whether to link its own C runtime and libraries
  or to rely on a external linker to find them. (Supported only on
  `windows-gnu`, `linux-musl`, and `wasi` platforms.)
- [You can now use `-C target-feature=+crt-static` on `linux-gnu` targets.]
  [77386]
  Note: If you're using cargo you must explicitly pass the `--target` flag.
- [Added tier 2\* support for `aarch64-unknown-linux-musl`.][76420]

\* Refer to Rust's [platform support page][forge-platform-support] for more
information on Rust's tiered platform support.

Libraries
---------
- [`io::Write` is now implemented for `&ChildStdin` `&Sink`, `&Stdout`,
  and `&Stderr`.][76275]
- [All arrays of any length now implement `TryFrom<Vec<T>>`.][76310]
- [The `matches!` macro now supports having a trailing comma.][74880]
- [`Vec<A>` now implements `PartialEq<[B]>` where `A: PartialEq<B>`.][74194]
- [The `RefCell::{replace, replace_with, clone}` methods now all use
  `#[track_caller]`.][77055]

Stabilized APIs
---------------
- [`slice::as_ptr_range`]
- [`slice::as_mut_ptr_range`]
- [`VecDeque::make_contiguous`]
- [`future::pending`]
- [`future::ready`]

The following previously stable methods are now `const fn`'s:

- [`Option::is_some`]
- [`Option::is_none`]
- [`Option::as_ref`]
- [`Result::is_ok`]
- [`Result::is_err`]
- [`Result::as_ref`]
- [`Ordering::reverse`]
- [`Ordering::then`]

Cargo
-----

Rustdoc
-------
- [You can now link to items in `rustdoc` using the intra-doc link
  syntax.][74430] E.g. ``/// Uses [`std::future`]`` will automatically generate
  a link to `std::future`'s documentation. See ["Linking to items by
  name"][intradoc-links] for more information.
- [You can now specify `#[doc(alias = "<alias>")]` on items to add
  search aliases when searching through `rustdoc`'s UI.][75740]

Compatibility Notes
-------------------
- [Promotion of references to `'static` lifetime inside `const fn`
  now follows the same rules as inside a `fn` body.][75502] In
  particular, `&foo()` will not be promoted to `'static` lifetime
  any more inside `const fn`s.
- [Associated type bindings on trait objects are now verified to meet the bounds
  declared on the trait when checking that they implement the trait.][27675]
- [When trait bounds on associated types or opaque types are ambiguous, the
  compiler no longer makes an arbitrary choice on which bound to use.][54121]
- [Fixed recursive nonterminals not being expanded in macros during
  pretty-print/reparse check.][77153] This may cause errors if your macro wasn't
  correctly handling recursive nonterminal tokens.
- [`&mut` references to non zero-sized types are no longer promoted.][75585]
- [`rustc` will now warn if you use attributes like `#[link_name]` or `#[cold]`
  in places where they have no effect.][73461]
- [Updated `_mm256_extract_epi8` and `_mm256_extract_epi16` signatures in
  `arch::{x86, x86_64}` to return `i32` to match the vendor signatures.][73166]
- [`mem::uninitialized` will now panic if any inner types inside
  a struct or enum disallow zero-initialization.][71274]
- [`#[target_feature]` will now error if used in a place where it
  has no effect.][78143]
- [Foreign exceptions are now caught by `catch_unwind` and will
  cause an abort.][70212] Note: This behaviour is not guaranteed
  and is still considered undefined behaviour, see the [`catch_unwind`]
  documentation for further information.

Internal Only
-------------
These changes provide no direct user facing benefits, but represent significant
improvements to the internals and overall performance of rustc and
related tools.

- [Building `rustc` from source now uses `ninja` by default over `make`.][74922]
  You can continue building with `make` by setting `ninja=false` in
  your `config.toml`.
- [cg_llvm: `fewer_names` in `uncached_llvm_type`][76030]
- [Made `ensure_sufficient_stack()` non-generic][76680]

[78143]: rust-lang/rust#78143
[76680]: rust-lang/rust#76680
[76030]: rust-lang/rust#76030
[70212]: rust-lang/rust#70212
[27675]: rust-lang/rust#27675
[54121]: rust-lang/rust#54121
[71274]: rust-lang/rust#71274
[77386]: rust-lang/rust#77386
[77153]: rust-lang/rust#77153
[77055]: rust-lang/rust#77055
[76275]: rust-lang/rust#76275
[76310]: rust-lang/rust#76310
[76420]: rust-lang/rust#76420
[76158]: rust-lang/rust#76158
[75857]: rust-lang/rust#75857
[75585]: rust-lang/rust#75585
[75740]: rust-lang/rust#75740
[75502]: rust-lang/rust#75502
[74880]: rust-lang/rust#74880
[74922]: rust-lang/rust#74922
[74430]: rust-lang/rust#74430
[74194]: rust-lang/rust#74194
[73461]: rust-lang/rust#73461
[73166]: rust-lang/rust#73166
[intradoc-links]: https://doc.rust-lang.org/rustdoc/linking-to-items-by-name.html
[`catch_unwind`]: https://doc.rust-lang.org/std/panic/fn.catch_unwind.html
[`Option::is_some`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.is_some
[`Option::is_none`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.is_none
[`Option::as_ref`]: https://doc.rust-lang.org/std/option/enum.Option.html#method.as_ref
[`Result::is_ok`]: https://doc.rust-lang.org/std/result/enum.Result.html#method.is_ok
[`Result::is_err`]: https://doc.rust-lang.org/std/result/enum.Result.html#method.is_err
[`Result::as_ref`]: https://doc.rust-lang.org/std/result/enum.Result.html#method.as_ref
[`Ordering::reverse`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.reverse
[`Ordering::then`]: https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.then
[`slice::as_ptr_range`]: https://doc.rust-lang.org/std/primitive.slice.html#method.as_ptr_range
[`slice::as_mut_ptr_range`]: https://doc.rust-lang.org/std/primitive.slice.html#method.as_mut_ptr_range
[`VecDeque::make_contiguous`]: https://doc.rust-lang.org/std/collections/struct.VecDeque.html#method.make_contiguous
[`future::pending`]: https://doc.rust-lang.org/std/future/fn.pending.html
[`future::ready`]: https://doc.rust-lang.org/std/future/fn.ready.html
bors pushed a commit to rust-lang-ci/rust that referenced this pull request Mar 16, 2021
Includes rust-lang/hashbrown#204 and rust-lang/hashbrown#205 (not yet merged) which both server to reduce the amount of IR generated for hashmaps.

Inspired by the llvm-lines data gathered in rust-lang#76680
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 18, 2021
feat: Update hashbrown to instantiate less llvm IR

Includes rust-lang/hashbrown#204 and rust-lang/hashbrown#205 (not yet merged) which both serve to reduce the amount of IR generated for hashmaps.

Inspired by the llvm-lines data gathered in rust-lang#76680 (cc `@Julian-Wollersberger)`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-compiletime Issue: Problems and improvements with respect to compile times. merged-by-bors This PR was explicitly merged by bors. relnotes-perf Performance improvements that should be mentioned in the release notes. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants