rustPlatform.fetchCargoVendor: init #349360
Conversation
@ofborg build granian popsicle
I wonder why ofborg is failing...
@ofborg build granian popsicle
I don't think we should do that, btw. Handling plain files is simpler and faster. If you care about the space usage of intermediate build files etc., you should just use a filesystem that supports transparent compression. It's a >50% reduction in space usage on my Nix store with only zstd:1.
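(For context, a minimal sketch of how such transparent compression can be enabled on NixOS, assuming /nix lives on btrfs; the device label is a placeholder, not something from this thread:)

```nix
{
  # NixOS fileSystems entry enabling transparent zstd compression for the Nix
  # store; the btrfs filesystem and the device label are assumptions.
  fileSystems."/nix" = {
    device = "/dev/disk/by-label/nix";
    fsType = "btrfs";
    options = [ "compress=zstd:1" ];
  };
}
```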
It's actually slightly worse for hydra/cache.nixos.org as it has to cache this for every system separately, so take everything here x4. And of course rebuilds, so multiply by ~20-30 per year. For "home users" the same argument w.r.t. space use applies here though. This and the hydra issue could be remedied by unpacking the tarballs in the stable FOD format and then simply symlinking the actual sources into the unstable format. That way the "heavy-weight" ground truth data is in the FOD while the system- and python-specific derivations are just light-weight translation layers. This would also protect us against crates.io possibly changing their gzip implementation or version for downloaded tarballs. I don't know whether there is any "contract" around that. This has bitten us with GitHub tarballs before, for instance.
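(A rough sketch of what such a light-weight translation layer could look like; the `vendorStaging` name and the flat layout are assumptions, not code from this PR:)

```nix
# Non-FOD derivation that only symlinks unpacked sources out of the staging FOD,
# so the per-system/per-format layer stays cheap while the heavy ground-truth
# data lives in the fixed-output derivation.
{ runCommand, vendorStaging }:
runCommand "cargo-vendor-dir" { } ''
  mkdir -p $out
  for crate in ${vendorStaging}/*; do
    ln -s "$crate" "$out/$(basename "$crate")"
  done
''
```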
That's not possible with the current implementation of nix remote builds AFAIK.
The cargo tarballs downloaded are checksummed, so it should be fine. I did not think of the symlinking option; it seems like a promising idea. The unmodified stuff would be symlinks (non-git) and the normalized stuff would be just a copy (git with workspace stuff). Though, yes, that would require unpacking in the FOD, which I was afraid to do, but it should be viable.
Is there any particular reason you were "afraid" to do it?
I wanted to do the minimum possible processing in the staging phase, to lower the possibility of messing something up in the implementation.
`importCargoLock` should also "suffer" from the same storage problem, since it doesn't use its internal FODs directly either. This means that `fetchCargoVendor` should not be worsening the storage situation by much compared to `importCargoLock`.
I see. That's reasonable, but I also think unpacking a tarball is hard to get wrong (if you remember the command).
Indeed. Given that it has more pressing issues, I don't think we should place much importance on it for sustainable use within Nixpkgs.
You are right of course. This is merely a low-hanging-fruit optimisation beyond the status quo. I'll open a separate issue to discuss optimisations for …
Successfully created backport PR for …
Making a backport, as this is going to prevent a lot of other backports from going through, and shouldn't be a breaking change in any way. It can also lead to some nasty surprises on stable, as only setting …
Related: #327063
This PR adds an alternative method to vendor the dependencies of a cargo project. It's currently implemented in Python, but could be changed later, if need be. I converted some packages to use this fetcher as an example. If you can come up with a good name for this fetcher, please share it below.
If we use this fetcher we won't have to worry about cargo changing its vendoring implementation (since this is completely custom), thus we can replace all `importCargoLock` usages where it was only required because of the presence of git dependencies.

The main idea is that we try to create a minimal staging FOD. We fetch the tarballs into `$out/tarballs/${name}-${version}.tar.gz` and we fetch the git repos to `$out/git/${git_sha_rev}`. This is the most important part to get right, since we will only be able to minimally change it in the future. We would still be able to add some extra behaviours in cases that currently just throw an error (e.g. fetching other registries).
The staging FOD will then be used to create the actual vendor dir. Since this part no longer requires network access, we can freely change its implementation without worrying about that hash. The cargo workspace inheritance logic was already reimplemented by a script written for `importCargoLock`, so I just reused that.

Since this is a two-step process and only the first step is a FOD, we won't have `outputHash` and friends on the result. If we want to override the `outputHash`, we would have to do so through the `vendorStaging` attribute.

TODO: it might be useful to add an easier way to override the `outputHash` of `vendorStaging`.
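(To illustrate where the hash ends up, a sketch; the call shape and `hash` argument are assumptions based on the description, not this PR's exact API:)

```nix
{ rustPlatform, src, lib }:
let
  deps = rustPlatform.fetchCargoVendor {
    inherit src;
    hash = lib.fakeHash;  # placeholder; the real value pins the staging FOD
  };
in
# `deps` itself is not fixed-output, so an override has to target the inner
# staging derivation; wiring the result back into `deps` is the awkward part
# the TODO above refers to.
deps.vendorStaging.overrideAttrs (_: {
  outputHash = lib.fakeHash;  # placeholder
})
```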
I added a commit that makes `buildCargoPackage` swap over to this new fetcher implementation. This is currently done through a boolean flag. Should it maybe be an enum instead? e.g. `cargoFetcher = "custom"` vs `cargoFetcher = "default"`.
Sidenote: the current vendoring script only requires the `Cargo.lock` file and nothing else. This would mean that `fetchCargoVendor` may easily be changed to not take in `src` and friends, but only a path to the cargo lock, like `${src}/Cargo.lock`. This is not IFD, since we never import the path from nix, only from a script. This would be similar to how it's done with `fetchYarnDeps`. Having it as a normal mkDerivation, like it is now, we can also apply patches and use hooks, which might be preferred.
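(A sketch of the two calling conventions being weighed here; the `lockFile` parameter name is hypothetical, only the `${src}/Cargo.lock` idea comes from the text above:)

```nix
{ rustPlatform, src, lib }:
{
  # Current shape: a normal mkDerivation-style fetcher that takes src,
  # so patches and hooks can still be applied to it.
  fromSrc = rustPlatform.fetchCargoVendor {
    inherit src;
    hash = lib.fakeHash;  # placeholder
  };

  # Possible alternative, similar to fetchYarnDeps: pass only the lock file.
  # Interpolating the path hands a store path string to the vendoring script
  # without importing anything back into Nix, so this is not IFD.
  fromLock = rustPlatform.fetchCargoVendor {
    lockFile = "${src}/Cargo.lock";  # hypothetical parameter name
    hash = lib.fakeHash;             # placeholder
  };
}
```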
It would be possible to skip the staging phase and make the vendor dir the FOD; however, this would lock in our output format so it could never change again.
The current implementation uses `nix-prefetch-git` to fetch a reproducible git tree, as I wanted to make sure I didn't make any implementation mistakes. I wonder if there's a better method than this...

Nothing is stopping us from also compressing the final vendor directory into a tarball, just like with `fetchCargoTarball`. Tarballs obviously take up less space, so ideally we would do this, though a better fetcher name might be needed in that case. I'm keeping the output as a directory for now, until the testing is done.
I also worry about the fact that non-FOD derivations will rebuild whenever their inputs change. This would mean that whenever `python` needs to be rebuilt, every vendor directory created by this would also need to be rebuilt. The rebuilds of the vendor directories would not take much time, but the used-up space would be a concern. It would be great if we could tell hydra not to cache certain derivations.