Downloading source #11

Open
ehuss opened this issue Jul 20, 2019 · 8 comments
Labels
implementation Implementation exploration and tracking issues S-needs-design Status: needs design work stabilization blocker This needs a resolution before stabilization

Comments

@ehuss
Contributor

ehuss commented Jul 20, 2019

This issue is for working through the implementation issues with downloading the standard library source.

Rustup is currently capable of downloading the rust-src component which contains the source for the standard library crates. It does not include dependencies (like libc) but it does include Cargo.lock from which dependencies can be inferred.

There are multiple questions on how Cargo will handle acquiring the standard library source:

  • I would prefer if it didn't solely rely on rustup being installed, though what the alternative is I do not know.
  • Acquiring dependencies may be tricky. Likely the Cargo.lock file will need to be parsed to find the versions, and the dependencies then downloaded from crates.io. Some possible alternatives:
    • Include all dependencies in the original source download.
    • Record dependency versions in a different format than Cargo.lock.
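
To make the "parse Cargo.lock to find the versions" option concrete, here is a minimal sketch that scans lock-file text for `[[package]]` entries and collects their `name` and `version` fields. The `LockedPackage` type and both function names are hypothetical, and the hand-rolled line scan is only an illustration; a real implementation would presumably reuse Cargo's existing lock-file handling or a TOML parser.

```rust
/// A (name, version) pair read from one `[[package]]` entry.
/// Hypothetical type, not part of Cargo.
#[derive(Debug, PartialEq)]
pub struct LockedPackage {
    pub name: String,
    pub version: String,
}

/// Extracts `value` from a line of the form `key = "value"`.
fn quoted_value<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let rest = line.trim().strip_prefix(key)?.trim_start();
    let rest = rest.strip_prefix('=')?.trim();
    rest.strip_prefix('"')?.strip_suffix('"')
}

/// Collects every `[[package]]` entry that has both a name and a version.
/// Assumption: one `key = "value"` pair per line, as Cargo writes it.
pub fn locked_packages(lock: &str) -> Vec<LockedPackage> {
    let mut out = Vec::new();
    let mut in_package = false;
    let mut name: Option<String> = None;
    let mut version: Option<String> = None;

    for line in lock.lines() {
        let line = line.trim();
        if line.starts_with('[') {
            // A new table header ends the previous entry, so flush it.
            if let (Some(n), Some(v)) = (name.take(), version.take()) {
                out.push(LockedPackage { name: n, version: v });
            }
            name = None;
            version = None;
            in_package = line == "[[package]]";
            continue;
        }
        if !in_package {
            continue;
        }
        if let Some(v) = quoted_value(line, "name") {
            name = Some(v.to_string());
        } else if let Some(v) = quoted_value(line, "version") {
            version = Some(v.to_string());
        }
    }
    // Flush the last entry if the file ends inside a package table.
    if let (Some(n), Some(v)) = (name, version) {
        out.push(LockedPackage { name: n, version: v });
    }
    out
}
```

The resulting list would still need to be filtered down to registry packages (the standard library crates themselves appear in the lock file without a crates.io source) before anything is downloaded.
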
@ehuss ehuss added the implementation Implementation exploration and tracking issues label Jul 20, 2019
@alexcrichton
Member

For this to be a truly first class feature I think that this download needs to be handled transparently. Cargo already has tons of support for downloading things but the major trickiness will be finding where the source code is hosted. Rustup knows the toolchain and commit (especially for things like rustup toolchain install master), but cargo generally doesn't know this.

I think it would be a reasonable first shot, though, to say that Cargo has a default probing location (for Linux distros) and then attempts to query rustup. All of that could have a manual override as well, and I think that would roughly cover most setups at the start.
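
The lookup order described here (manual override, then a default probing location, then rustup) could be sketched roughly as follows. The function name and its parameters are hypothetical, and the `exists` callback stands in for a real filesystem check so that the ordering can be exercised without touching disk. Treating the override as unconditional is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Picks a standard library source directory.
///
/// Order (hypothetical, following the comment above):
/// 1. a manual override, used unconditionally when set;
/// 2. the default probing location, if it exists;
/// 3. whatever rustup reported, if it exists.
pub fn pick_source_dir(
    manual_override: Option<PathBuf>,
    default_location: Option<PathBuf>,
    rustup_answer: Option<PathBuf>,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    if let Some(path) = manual_override {
        return Some(path);
    }
    default_location
        .into_iter()
        .chain(rustup_answer)
        .find(|candidate| exists(candidate))
}
```

In real use the callback would be something like `|p| p.is_dir()`, and the rustup candidate would only be computed lazily, since querying rustup means spawning a process.
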

@SimonSapin

Two random ideas for specific mechanisms:

  • If the argv[0] for Cargo (not rustup’s proxy) is $PREFIX/bin/cargo, look for source code at $PREFIX/lib/rustlib/src/rust. This is where rustup installs its rust-src components. Maybe it’s reasonable to declare that non-rustup installers / packaging should have a similar directory layout? (Possibly with tweaks for Windows or other platforms where bin and lib directories are not usual.)

  • Rustup has a ~/.cargo/bin/cargo proxy executable that decides which toolchain to use, then calls the "real" cargo executable. It could at that time set a RUST_SRC_PATH environment variable that Cargo could read without making a query back to rustup.
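
Both ideas can be combined in one small lookup, sketched below under stated assumptions: `RUST_SRC_PATH` is only the variable name proposed above (nothing sets it today as far as this thread says), and `source_dir_from_layout` is a hypothetical helper. The environment value and the executable path are passed in as arguments so the logic stays independent of the process environment.

```rust
use std::ffi::{OsStr, OsString};
use std::path::{Path, PathBuf};

/// Resolves the standard library source directory.
///
/// `rust_src_env` is the value of the proposed `RUST_SRC_PATH` variable, if any.
/// `cargo_exe` is the path of the real cargo binary (not rustup's proxy).
pub fn source_dir_from_layout(
    rust_src_env: Option<OsString>,
    cargo_exe: &Path,
) -> Option<PathBuf> {
    // Idea 2: trust the variable set by the proxy, if present.
    if let Some(dir) = rust_src_env {
        return Some(PathBuf::from(dir));
    }
    // Idea 1: $PREFIX/bin/cargo implies $PREFIX/lib/rustlib/src/rust.
    let bin_dir = cargo_exe.parent()?;
    if bin_dir.file_name() != Some(OsStr::new("bin")) {
        return None;
    }
    let prefix = bin_dir.parent()?;
    Some(prefix.join("lib").join("rustlib").join("src").join("rust"))
}
```

A caller might feed it `std::env::var_os("RUST_SRC_PATH")` and `std::env::current_exe()`, then check that the returned directory actually exists before using it.
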

@Ericson2314

In https://github.com/rust-lang/cargo/pull/2768/files I just put in a config setting for a path source/registry for the stdlib sources. Simple and stupid, but fine for MVP I'd hope.

@crlf0710
Member

A (maybe stupid) approach after MVP is to reserve core, std, etc... on crates.io, and publish the source of every version of libcore and libstd on it as 0.0 for rustc 1.0.0, 1.0 for rustc 1.1.0, ... 37.0 for rustc 1.37.0, etc.
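
As a quick illustration of that numbering, the sketch below maps a stable rustc version `1.N.P` to the pair `(N, P)`, which reproduces the three data points given (1.0.0 to 0.0, 1.1.0 to 1.0, 1.37.0 to 37.0). Carrying the rustc patch number into the second component is an assumption; the proposal only covers `.0` releases. The function name is hypothetical.

```rust
/// Maps a stable rustc version string `1.N.P` to the first two
/// components `(N, P)` of the published crate version.
///
/// Returns `None` for anything that is not a plain `1.x.y` version,
/// which includes nightly and beta version strings.
pub fn std_crate_version(rustc: &str) -> Option<(u64, u64)> {
    let mut parts = rustc.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if major != 1 || parts.next().is_some() {
        return None;
    }
    Some((minor, patch))
}
```
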

@Ericson2314

I wouldn't mind that at all. Nightly and alternate implementations can use [patch].

bors added a commit to rust-lang/cargo that referenced this issue Sep 3, 2019
Basic standard library support.

This is not intended to be useful to anyone. If people want to try it, that's great, but do not rely on this. This is only for experimenting and setting up for future work.

This adds a flag `-Zbuild-std` to build the standard library with a project. The flag can also take optional comma-separated crate names, like `-Zbuild-std=core`. Default is `std,core,panic_unwind,compiler_builtins`.
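
For illustration only, the flag's value handling could look roughly like the sketch below. This is not Cargo's actual parser; `build_std_crates` and `DEFAULT_CRATES` are hypothetical names, and the default list is copied from the sentence above.

```rust
/// Default crate set, as quoted in the description above.
const DEFAULT_CRATES: [&str; 4] = ["std", "core", "panic_unwind", "compiler_builtins"];

/// Turns the optional value of `-Zbuild-std[=a,b,c]` into a crate list.
///
/// `value` is the text after `=`, or `None` for a bare `-Zbuild-std`.
/// An empty or missing value falls back to the default set.
pub fn build_std_crates(value: Option<&str>) -> Vec<String> {
    let names: Vec<String> = value
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(String::from)
        .collect();
    if names.is_empty() {
        DEFAULT_CRATES.iter().map(|name| name.to_string()).collect()
    } else {
        names
    }
}
```
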

Closes rust-lang/wg-cargo-std-aware#10.

Note: I can probably break some of the refactoring into smaller PRs if necessary.

## Overview
The general concept here is to use two resolvers, and to combine everything in the Unit graph. There are a number of changes to support this:

- A synthetic workspace for the standard library is created to set up the patches and members correctly.
- Decouple `unit_dependencies` from `Context` to make it easier to manage.
- Add `features` to `Unit` to keep it unique and to remove the need to query a resolver.
- Add a `UnitDep` struct which encodes the edges between `Unit`s. This removes the need to query a resolver for `extern_crate_name` and `public`.
- Remove `Resolver` from `BuildContext` to avoid any confusion and to keep the complexity focused in `unit_dependencies`.
- Remove `Links` from `Context` since it used the resolver. Adjusted so that instead of checking links at runtime, they are all checked at once in the beginning. Note that it does not check links for the standard lib, but it should be safe? I think `compiler-rt` is the only `links`?

I currently went with a strategy of linking the standard library dependencies using `--extern` (instead of `--sysroot` or `-L`). This has some benefits but some significant drawbacks. See below for some questions.
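
A minimal sketch of what the `--extern` strategy means on the rustc command line: each standard library crate is passed explicitly as `--extern name=path`. The helper name and the artifact paths are hypothetical; only the `--extern name=path` form is rustc's own.

```rust
use std::path::Path;

/// Builds the `--extern name=path` argument pairs for one rustc invocation.
/// Each tuple is a crate name and the path of its compiled artifact.
pub fn extern_args(deps: &[(&str, &Path)]) -> Vec<String> {
    let mut args = Vec::new();
    for (name, path) in deps {
        args.push("--extern".to_string());
        args.push(format!("{}={}", name, path.display()));
    }
    args
}
```

With the `--sysroot` alternative, the per-crate arguments would disappear in favor of a single `--sysroot <dir>` pointing at a directory into which the built crates had been copied.
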

## For future PRs
- Add Cargo.toml support. See rust-lang/wg-cargo-std-aware#5
- Source is not downloaded. It assumes you have run `rustup component add rust-src`. See rust-lang/wg-cargo-std-aware#11
- `cargo metadata` does not include any information about std. I don't know how this should work.
- `cargo clean` is not std-aware.
- `cargo fetch` does not fetch std dependencies.
- `cargo vendor` does not vendor std dependencies.
- `cargo pkgid` is not std-aware.
- `--target` is required on the command-line. This should default to host-as-target.
- `-p` is not std aware.
- A synthetic `Cargo.toml` workspace is created which has to know about things like `rustc-std-workspace-core`. Perhaps rust-lang/rust should publish the source with this `Cargo.toml` already created?
- `compiler_builtins` uses default features (pure Rust implementation, etc.). See rust-lang/wg-cargo-std-aware#15
    - `compiler_builtins` may need to be built without debug assertions, see [this](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L210-L214). Could maybe use profile overrides.
- Panic issues:
    - `panic_abort` is not yet supported, though it should probably be easy. It could maybe look at the profile to determine which panic implementation to use? This requires more hard-coding in Cargo to know about rustc implementation details.
    - [This](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L186-L201) should probably be handled where `panic` is set for `panic_abort` and `compiler_builtins`. I would like to get a test case for it. This can maybe be done with profile overrides?
- Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future. See rust-lang/wg-cargo-std-aware#12
- Features are hard-coded. See rust-lang/wg-cargo-std-aware#13
- Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.).
- Default `backtrace` is used with C compiler. See rust-lang/wg-cargo-std-aware#16
- Sanitizers are not built. See rust-lang/wg-cargo-std-aware#17
- proc_macro has some hacky code to synthesize its dependencies. See rust-lang/wg-cargo-std-aware#18. This may not be necessary if this uses `--sysroot` instead.
- Profile overrides cause weird linker errors.
  That is:
  ```toml
  [profile.dev.overrides.std]
  opt-level = 2
  ```
  Using `[profile.dev.overrides."*"]` works. I tried fiddling with it, but couldn't figure it out.
  We may also want to consider altering the syntax for profile overrides. Having to repeat the same profile for `std` and `core` and `alloc` and everything else would not be ideal.
- ~~`Context::unit_deps` does not handle build overrides, see #7215.~~ FIXED

## Questions for this PR
- I went with the strategy of using `--extern` to link the standard lib. This seems to work, and I haven't found any problems, but it seems risky. It also forces Cargo to know about certain implicit dependencies like `compiler_builtins` and `panic_*`. The alternative is to create a sysroot and copy all the crates to that directory and pass `--sysroot`. However, this is complicated by pipelining, which would require special support to copy `.rmeta` files when they are generated. Let me know if you think I should use a different strategy. I'm on the fence here, and I think using `--sysroot` may be safer, but adds more complexity.
    - As an aside, if rustc ever tries to grab a crate from sysroot that was not passed in via `--extern`, then it results in duplicate lang items. For example, saying `extern crate proc_macro;` without specifying `proc_macro` as a dependency. We could prevent rustc from ever trying by passing `--sysroot=/nonexistent` to prevent it from trying. Or add an equivalent flag to rustc.
- How should this be tested? I added a janky integration test, but it has some drawbacks. It requires internet access. It is slow. Since it is slow, it reuses the same target directory for multiple tests which makes it awkward to work with.
    - What interesting things are there to test?
    - We may want to disable the test before merging if it seems too annoying to make it the default. It requires rust-src to be downloaded, takes several minutes to run, and is somewhat platform-dependent.
- How to test that it is actually linking the correct standard library? I did tests locally with a modified libcore, but I can't think of a good way to do that in the test suite.
- I did not add `__CARGO_DEFAULT_LIB_METADATA` to the hash. I had a hard time coming up with a test case where it would matter.
    - My only thought is that it is a problem because libstd includes a dylib, which prevents the hash from being added to the filename. It does cause recompiles when switching between compilers, for example, when it normally wouldn't.
    - Very dumb question: Why exactly does libstd include a dylib? This can cause issues (see rust-lang/rust#56443).
    - This should probably change, but I want to better understand it first.
- The `bin_nostd` test needs to link libc on linux, and I'm not sure I understand why. I'm concerned there is something wrong there. libstd does not do that AFAIK.
@ehuss
Contributor Author

ehuss commented Sep 6, 2019

If the rustup-downloaded component is used long-term, then we need to use the cached value of rustc --print=sysroot. The problem is that TargetInfo is created too late, so BuildConfig will need to be restructured so that it can be created earlier.
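
The sketch below shows the general shape of "query the sysroot once, then derive the source directory from it". It is not how Cargo structures this: the in-process `OnceLock` cache and all three function names are stand-ins chosen for the example, and the `lib/rustlib/src/rust` suffix is the rustup layout mentioned earlier in this thread.

```rust
use std::path::{Path, PathBuf};
use std::process::Command;
use std::sync::OnceLock;

/// Runs `rustc --print=sysroot` and returns the trimmed output.
fn query_sysroot() -> Option<PathBuf> {
    let output = Command::new("rustc").arg("--print=sysroot").output().ok()?;
    if !output.status.success() {
        return None;
    }
    let text = String::from_utf8(output.stdout).ok()?;
    let trimmed = text.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(PathBuf::from(trimmed))
    }
}

/// Returns the sysroot, spawning rustc at most once per process.
pub fn cached_sysroot() -> Option<&'static Path> {
    static SYSROOT: OnceLock<Option<PathBuf>> = OnceLock::new();
    SYSROOT.get_or_init(query_sysroot).as_deref()
}

/// Location of the rust-src component inside a given sysroot.
pub fn rust_src_dir(sysroot: &Path) -> PathBuf {
    sysroot.join("lib").join("rustlib").join("src").join("rust")
}
```

A caller would combine the two with `cached_sysroot().map(rust_src_dir)`. The ordering problem described above still applies: the value has to be available before the point where the standard library workspace is set up.
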

@SimonSapin

Some names are already reserved on crates.io. However I don’t think this is a good strategy: we should aim to support not only official releases and Nightlies from rust-lang.org, but also toolchains compiled elsewhere for various reasons.

@ehuss ehuss added S-needs-design Status: needs design work stabilization blocker This needs a resolution before stabilization labels May 3, 2023
@madsmtm

madsmtm commented Feb 28, 2024

> we should aim to support not only official releases and Nightlies from rust-lang.org, but also toolchains compiled elsewhere for various reasons.

As a user, I'd expect that to happen using [crates-io.patch], though of course that's probably difficult to actually implement.

6 participants