Impure derivations #520
Comments
To make things even weirder, hydra could use this for its job specification with nondeterministic calls to |
Does nobody have any comments? I can flesh out the idea more if it would help. I think it could be a pretty cool way to manage the (limited but often necessary) pieces of mutable state in a Nix-based system. |
To sum up, these derivations would:
Do I get this right? Current status of code generators? I'm certain there are already general tools that prefetch the latest source and update hashes in *.nix files; currently I don't see a distinct advantage in having this built in. For example, @MarcWeber has these |
I talked about some similar things in my somewhat-recent |
Given these issues with my new fetchgitlocal (NixOS/nixpkgs#10873), I am starting to think we need non-deterministic packages that run under the current user, to generalize things like putting private directories in the store. |
I'll probably see if I can drum up some interest about this (and flesh out my proposal) at NixCon in Berlin. @Ericson2314, will you be there? |
That would be great! Unfortunately, school will keep me away from NixCon, but let me know how it goes. |
I've been tinkering with this recently, and might be able to put up a PR for a hypothetical implementation (subject to lots of implementation and design feedback) in the next week or so, if I get some time. Edit: turned out to be more complicated than expected :( |
Tagging #904 for posterity. |
@edolstra I'm considering working on this. Is there any chance I can get some assurance of a timely review and/or permission to merge myself before I put a large amount of work in? |
I posted this in another ticket:
|
Impure derivations are derivations that can produce a different result every time they're built. Example:

  stdenv.mkDerivation {
    name = "impure";
    __impure = true; # marks this derivation as impure
    buildCommand = "date > $out";
  };

Some important characteristics:

* Impure derivations are not "cached". Thus, running "nix-build" on the example above multiple times will cause a rebuild every time. In the future, we could implement some mechanism for reusing impure builds across invocations.

* The outputs of impure derivations are moved to a content-addressed location after the build (i.e., the resulting store path will correspond to the hash of the contents of the path). This way, multiple builds of the same impure derivation do not collide.

* Because of content-addressability, the output paths of an impure derivation recorded in its .drv file are "virtual" placeholders for the actual outputs which are not known in advance. This also means that "nix-store -q bla.drv" gives a meaningless path.

* Pure derivations are not allowed to depend on impure derivations. The only exception is fixed-output derivations. Because the latter always produce a known output, they can depend on impure shenanigans just fine. Also, repeatedly running "nix-build" on such a fixed-output derivation will *not* cause a rebuild of the impure dependency. After all, if the fixed output exists, its dependencies are no longer relevant. Thus, fixed-output derivations form an "impurity barrier" in the dependency graph.

* When sandboxing is enabled, impure derivations can access the network in the same way as fixed-output derivations. In relaxed sandboxing mode, they can access the local filesystem.

* Currently, the output of an impure derivation must have no references. This is because the content-addressing scheme must be extended to handle references, in particular self-references (as described in the ASE-2005 paper.)

* Currently, impure derivations can only have a single output. No real reason for this.

* "nix-build" on an impure derivation currently creates a result symlink to the incorrect, virtual output.

A motivating example is the problem of using "fetchurl" on a dynamically generated tarball whose contents are deterministic, but where the tarball does not have a canonical form. Previously, this required "fetchurl" to do the unpacking in the same derivation. (That's what "fetchzip" does.) But now we can say:

  tarball = stdenv.mkDerivation {
    __impure = true;
    name = "tarball";
    buildInputs = [ curl ];
    buildCommand =
      "curl --fail -Lk https://github.com/NixOS/patchelf/tarball/c1f89c077e44a495c62ed0dcfaeca21510df93ef > $out";
  };

  unpacked = stdenv.mkDerivation {
    name = "unpacked";
    outputHashAlgo = "sha256";
    outputHashMode = "recursive";
    outputHash = "1jl8n1n36w63wffkm56slcfa7vj9fxkv4ax0fr0mcfah55qj5l8s";
    buildCommand = "mkdir $out; tar xvf ${tarball} -C $out";
  };

I needed this because <nix/fetchurl.nix> does not support unpacking, and adding untar/unzip functionality would be annoying (especially since we can't just call "tar" or "unzip" in a sandbox).

#520
So Shea told me about |
Yeah, there's |
Internally at Target we expose fetchGit through an interface that enforces specifying either a revision or a tag (we map tags to |
The motivation why |
@edolstra does |
Is there any hope of seeing __impure merged into the main branch any time soon? |
@deliciouslytyped ca derivations make |
And now we have them! (#4087) So let's resurrect this. Should be quite easy, actually. |
Looking at edolstra@690e06b, here are some notes:
So let's just wait for #4056 to land, and then we basically "do it again" for this! CC @regnat |
I marked this as stale due to inactivity. → More info |
still interested |
I marked this as stale due to inactivity. → More info |
Still interested |
Does #6227 resolve your use-case? |
Wow that was a fast response 😆 and yes it does! I want to use them in Hydra actually. Thanks! |
Let's repurpose this to be a tracking issue for the now-merged unstable feature! |
Let's! I'll start playing with impure drvs soon enough. If I hit any issues I'll report back here. |
I don't have perms to edit the issue or change its title, but |
I'm not able to use impure derivations at all:
{ pkgs ? import <nixpkgs> {}, ... }:
pkgs.stdenv.mkDerivation {
name = "impure";
__impure = true; # marks this derivation as impure
#outputHashAlgo = "sha256"; # optional, default is sha256
#outputHashMode = "recursive"; # optional, default is recursive
buildCommand = "date > $out";
} |
@melvyn2 I also get that error when running /tmp/tmp.TMbuOx5fGy
❯ cat default.nix
{ pkgs ? import <nixpkgs> { } }:
pkgs.stdenv.mkDerivation {
name = "impure";
__impure = true;
buildCommand = "date > $out";
}
/tmp/tmp.TMbuOx5fGy
❯ nix build --impure --file default.nix
/tmp/tmp.TMbuOx5fGy
❯ cat result
Wed Aug 3 10:03:03 UTC 2022
/tmp/tmp.TMbuOx5fGy
❯ nix-build
this derivation will be built:
/nix/store/2ylp1hynhl3902kjzii9ynvby9ljizwp-impure.drv
resolved derivation: '/nix/store/2ylp1hynhl3902kjzii9ynvby9ljizwp-impure.drv' -> '/nix/store/sm5kqqpsr9v7hk7hdxmhl4kxnd2mc3a6-impure.drv'...
building '/nix/store/sm5kqqpsr9v7hk7hdxmhl4kxnd2mc3a6-impure.drv'...
nix-build: src/nix-build/nix-build.cc:594: void main_nix_build(int, char**): Assertion `maybeOutputPath' failed.
Aborted (core dumped) |
Would these kinds of impure derivations be permitted in flakes pure eval mode? |
Yes, this would be safe because the outPath is not deterministic and the eval itself is not impure, only the build-phase. |
This might reveal a deep misunderstanding on my part, but as far as I can tell, nix fundamentally divides its derivations into "fixed-output" and "deterministic build", based on the presence/absence of outputHash. I'm wondering if there could be a third type of fundamental building block which could allow limited but trackable nondeterministic behavior. The main example I can think of right now is the new fetchTarball builtin, which has its own magic caching strategy, but you could imagine wanting to pull the latest git revision of something using fetchgit and the like. If you use fetchgit as a fixed-output derivation, you can't always get the latest version. If you have it "lie" and pretend not to be a fixed-output derivation, nix will only ever do the work once and not bother refreshing itself.
If nix supported this third type of derivation, I could imagine something like:
Of course, it should be possible for you to take an expression and figure out all sources of nondeterminism in it (much like how this source downloader works) so as to better trust the evaluation.
Another possible feature of interest could be the notion of a nondetDerivation optionally (it's not possible with all sources of nondeterminism, but is obviously desirable) emitting some sort of an "anchor" allowing one to tie the nondeterministic evaluation down to something deterministic. Think how ruby's Gemfile ties itself down to Gemfile.lock (but we'd obviously provide hashes), and how when you fetch a git ref you can "lock it down" by resolving that ref to a hash. Another example is how the NixOS channel mechanism resolves the top-level redirect to a precise channel revision. Such an anchor file could then be maintained as a way to lock down nondeterminism to get reproducible system states, but you could also selectively (or in bulk) update the locked things (much like nix-channel --update) to get newer versions.
A last example is just how magic path references in nix copy things into the store for you. We could retain the built-in syntax, but translate the syntax into implicit invocations of the same nondetDerivation primitive.
Is this too weird? I'm just trying to think of a principled way to track my nondeterminism, and possibly to unify the channel world into pure nix.
TBC: I'm not proposing adding more nondeterminism to the system. Just want to be able to track/unify the existing stuff better.
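For illustration only, the "anchor" idea can be approximated with builtins that already exist. This is a minimal sketch, not the proposed nondetDerivation primitive. It assumes a hypothetical anchor.json file next to the expression that records an already-resolved git revision. The file name and its two fields (url, rev) are made up for this example; the repository and revision are the ones used in the patchelf example elsewhere in this thread.

```nix
# Sketch under stated assumptions: ./anchor.json is a hypothetical lock file
# that some external "update" step is assumed to rewrite, for example:
#   {
#     "url": "https://github.com/NixOS/patchelf",
#     "rev": "c1f89c077e44a495c62ed0dcfaeca21510df93ef"
#   }
let
  # Read the pinned revision from the anchor file at evaluation time.
  anchor = builtins.fromJSON (builtins.readFile ./anchor.json);
in
# With an explicit rev, the fetch is tied to one commit instead of "latest".
# Depending on the Nix version, a commit outside the default branch may also
# need a `ref` argument.
builtins.fetchGit {
  inherit (anchor) url rev;
}
```

Updating the locked revision then means rewriting anchor.json, roughly the way Gemfile.lock is regenerated; the expression itself stays unchanged.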