
Redefine outputs in terms of language-level "package", not necessarily store-level derivation (RFC-92, and multi-drv packages, docs) #6507

Open
2 tasks
roberth opened this issue May 9, 2022 · 33 comments
Labels
bug significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc.

Comments

@roberth
Member

roberth commented May 9, 2022

Describe the problem

Goals

  • make outputs defined in the Nix language the source of truth, instead of derivation outputs
    • necessary for effective use of RFC 92 Computed Derivations.
    • allow even non-RFC-92 packages to reduce the build closure by splitting outputs such as doc into a separate derivation
  • document what Nix expects from a package
    • define the interface between Nix and Nixpkgs
    • have a definition of "package"

Currently, packages and derivations are often the same thing, but the lack of a definition and of a distinction between the two cannot continue now that RFC 92 (computed derivations, outputOf) exists. The conflation has also unnecessarily made the concept of a multi-derivation package ill-formed / "unthinkable".

Nix doesn't really have a notion of "package". The term is only mentioned in a few places in the code, and only defined in the context of buildenv (i.e. legacy nix-env). That definition only relates to the usage of derivations in a profile, and therefore does not conflict with a definition of "package".

Nixpkgs on the other hand is all about packages, but it does not define precisely what a package is.

I propose the following definition:

A package is an attribute set with the following attributes:

  • outputs: list of strings
  • ${output} for each output in outputs: store path string
  • name: string
  • version: string
  • meta: attribute set with specific optional(?) attributes
  • tests: optional attrset tree of derivations and/or packages
  • devShell: see #7501 (Allow to get rid of nix develop "shell" logic)
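
As a rough illustration of the definition above, a package whose outputs come from two separate derivations could look like the sketch below. The helper `mkDrv`, the names `hello-1.0` and `hello-1.0-doc`, and the trivial `/bin/sh` builders are assumptions made up for this example; nothing here is an existing Nix or Nixpkgs interface.

```nix
let
  # Hypothetical helper: a minimal raw derivation that writes one file to $out.
  mkDrv = name: script: derivation {
    inherit name;
    system = builtins.currentSystem;
    builder = "/bin/sh";
    args = [ "-c" script ];
  };

  # Two independent derivations; building `main` never requires `docs`.
  main = mkDrv "hello-1.0" "echo hello > $out";
  docs = mkDrv "hello-1.0-doc" "echo documentation > $out";
in
{
  name = "hello-1.0";
  version = "1.0";

  # Outputs are declared at the language level...
  outputs = [ "out" "doc" ];
  # ...and each one is a store path string, regardless of which
  # derivation produces it.
  out = main.outPath;
  doc = docs.outPath;

  meta = { description = "Example package"; };
  tests = { };

  # Deliberately absent: drvPath, overrideAttrs, passthru.
}
```

The expression can be inspected with `nix-instantiate --eval --strict` on a file containing it; a consumer only ever looks at `outputs` and the attributes named there.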

Notably absent from the definition of a package:

  • drvPath: this is an implementation detail. A package consumer really only cares about outputs, which don't need to be provided by a single derivation. Derivation path(s) can be recovered from the outputs.
  • overrideAttrs, buildInputs: attributes related to the construction of a derivation. These are implementation details
  • passthru: not related to the construction of a derivation, but an implementation detail for setting package attrs as opposed to derivation attrs. This attribute owes its entire existence to overrideAttrs. Without overrideAttrs, // is sufficient to set package attributes.
  • all: not used by Nix itself, rarely used in Nixpkgs (regex: (?<!platforms)(?<!lib)(?<!builtins)[.]all; still mostly false positives. Only one clear usage, which has a TODO on it)

Steps To Reproduce

  1. Define a package where one output comes from a different derivation. You may want to do this to keep derivation dependencies to a minimum (e.g. the doc output in nixpkgs#172103, "autoconf: build offline html documentation", where it would be more desirable for the non-doc outputs not to depend on texinfo).
  2. Be confused about what drvPath should be.
  3. Install the package and note that the output from the non-drvPath output wasn't included.
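
Step 1 can be approximated today by grafting an attribute from a second derivation onto the first with `//`. This is only a sketch of one way to run into the problem; `mkDrv` and both derivations are hypothetical and exist purely for illustration.

```nix
let
  # Hypothetical helper: a minimal raw derivation.
  mkDrv = name: script: derivation {
    inherit name;
    system = builtins.currentSystem;
    builder = "/bin/sh";
    args = [ "-c" script ];
  };

  main = mkDrv "demo-1.0" "echo main > $out";
  docs = mkDrv "demo-1.0-doc" "echo docs > $out";

  # A "package" whose doc output is provided by a different derivation.
  pkg = main // {
    outputs = [ "out" "doc" ];
    doc = docs;
  };
in
{
  # drvPath is inherited from `main`, so it says nothing about `doc`.
  drvPathIsMain = pkg.drvPath == main.drvPath;
  # The doc output belongs to another derivation entirely.
  docHasOtherDrv = pkg.doc.drvPath != main.drvPath;
}
```

Both attributes evaluate to `true`, which is the ambiguity of step 2: a single `drvPath` cannot describe a package whose outputs are built by more than one derivation.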

Expected behavior

Nix represents a package by its outputs and metadata, not by the drvPath implementation detail.

nix-env --version output

2.8. Or 2.x really. I would appreciate a major version increase for the (subtle) change in behavior.

Additional context

I came across this problem again in Nixpkgs today and figured I had to share my thoughts. I guess I should turn it into an RFC? I can do that later if y'all agree that we need something like this.

There's also NixOS/nixpkgs#172008, which is really a different problem but depends on this issue, as this issue defines the interface that current and future versions of Nixpkgs must implement.

I can't change the bug label on this issue. It's really a design issue rather than a bug, so another label would be more fitting. Can I have more permissions on this repo?

@vcunat
Member

vcunat commented Jun 3, 2022

Documentation is quite a frequent problem, I think. Well, texinfo above is quite cheap, but you commonly have bigger tools like pandoc. If these docs don't get split into separate derivations, we become more prone to huge "unexpected" rebuilds. So far we often just don't build expensive docs, as many people are used to online resources.

@roberth
Member Author

roberth commented Oct 29, 2022

Found this note at DerivedPathBuilt

/*
[...]
 * Note that does mean a derived store paths evaluates to multiple
 * opaque paths, which is sort of icky as expressions are supposed to
 * evaluate to single values. Perhaps this should have just a single
 * output name.
 */
struct DerivedPathBuilt {

Seems like a change worth implementing for this issue. The notion of a single derivation with its outputs seems appropriate at the store level, but up from there, built paths are what matter and there's no reason to tie them to their derivation. By making the suggested improvement, it seems that we get a bit closer to multi-drv packages.

@Ericson2314
Member

@roberth in the later RFC 92 patches I do indeed make a SingleDerivedPath so only the last step (baz) of a foo^bar^baz chain is potentially multiple paths. So we can consider using that more. On the other hand the wildcard ^* means we cannot get rid of the multiple one completely (if the DRV is unbuilt yet we cannot resolve the *).

@roberth roberth changed the title from 'A definition of "package" to clarify RFC-92 packages, and multi-drv packages that follow from the new definition' to 'Redefine outputs in terms of language-level package, not necessarily store-level derivation (RFC-92, and multi-drv packages, docs)' Nov 8, 2022
@roberth
Member Author

roberth commented Nov 8, 2022

@Ericson2314 I would consider ^ to be more of a "power user" thing, because packages should use outputOf to hide it.
I don't think foo^* is useful within the Nix language; at least not with the way we currently represent outputs in the package attrset. This doesn't seem to be a great loss, because the package expression could presumably force the inner derivation to provide all requested outputs, returning an empty directory for output if the inner derivation determines that some output isn't useful.
So I don't think ^ affects usability too much. It just makes the CLI-level logic slightly more complicated in a few cases, but nothing too crazy.

I've updated the issue title and description to clarify the goal of the issue.

@Ericson2314
Member

Ericson2314 commented Nov 8, 2022

@roberth Oh sure, what I just mean is that the comment you referenced above is quite likely one I wrote! :) And the SingleDerivedPath is close to its resolution.

You might want to take a look at #7261. I agree ^ should not be needed by regular users / we should make sure computed derivations do not need to be used differently. I have just been trying to wrap my head around the plumbing (which is subtle enough!) before we get to the porcelain.

In a way, this issue here could be a joint effort between the Nixpkgs Architecture team and the Nix team because of the cross-cutting concerns involved.

@Ericson2314
Member

#7467 somewhat relates to this.

@Ericson2314
Member

So currently in the docs we have "store derivation" and "derivation", and what I like about this is it completely decouples the logic:

  • derivation: the store layer concept

    • single build step
    • multiple results
  • package: the eval layer concept

    • a collection of "things" (let's not call them outputs; they might not have a single shared derivation they are built from),
    • has meta.<things>ToInstall, so by default one gets some subset, but different subsets can be selected instead

@blaggacao
Contributor

blaggacao commented Jan 16, 2023

Notably, the proposal [to define a "package" data type that nests derivation(s)] would solve laziness of meta (and passthru), and would make Nixpkgs's recently added lazyDerivation obsolete.

That means you can then peek, for example, at meta.description without almost certainly risking evaluation of drvPath as well. In a heavy IFD case, and short of any proposed solutions to IFD, that's a heck of a cost.

That means recovering metadata from packages finally gets the competitive pricing it deserves that is more closely related with its true production cost.

@roberth
Member Author

roberth commented Jan 16, 2023

meta.outputsToInstall has already set a precedent for the expression-level package to be different from the derivation, but to a "lesser" degree. This issue can be seen as a suggestion to lean into that distinction and make it more useful.

@fricklerhandwerk
Copy link
Contributor

I can't tell why this would have to be a Nix language concept. What precludes Nixpkgs from making that abstraction on top of derivation?

@roberth
Member Author

roberth commented Jan 16, 2023

What precludes Nixpkgs from making that abstraction on top of derivation?

The CLI uses drvPath instead of outputs and its corresponding attributes.

I can't tell why this would have to be a Nix language concept.

The language itself remains unchanged. derivation, or even better derivationStrict will keep supporting the output-related parts of these attribute sets.

@Ericson2314
Member

@fricklerhandwerk Also check out https://github.com/NixOS/nix/issues/ There is a tension between these two things:

  1. The low level store path installables ought to be as explicit as possible. Stuff like .drv punning is bad for programmatic usage like cat paths | xargs nix blah where we want the same behavior on every store object.

  2. The high level installables we want to be ergonomic for typing, even if it makes their usage more complex.

The easiest way to resolve this tension is probably to cut the cord between them: different idioms for different levels of abstraction, disjoint terminology (package vs derivation).

@nrdxp

nrdxp commented Jan 17, 2023

I can't tell why this would have to be a Nix language concept. What precludes Nixpkgs from making that abstraction on top of derivation?

Probably not strictly necessary, but it may be useful for this to be a language level concept. It would essentially encode the same knowledge as a profile, but a posteriori. Since a "useful package" is really the final aim of why we use Nix in the first place, it makes sense for it to be at the root.

It may also have the effect of making our current terminology more intuitive. We have derivations, but what are they derived from exactly? A language level package construct makes the answer tangibly obvious.

@roberth
Member Author

roberth commented Oct 29, 2023

meta and "passthru" strictness, data model, what should be top level?

For what it's worth, meta is wrong from a strictness perspective.
pkg.meta makes the value of meta strict in pkg, where pkg is a non-trivial computation that may not even work.

It could be argued that a better representation of the (human) package concept is to start with meta, and perhaps have an instantiatable derivation inside of it, inside an attribute.
However, this would be unpleasant to use. Perhaps the strictness issue is better to be fixed by making sure that the package attrset is always cheap to compute. This means recognizing typical use cases, providing a standard attribute for each use case, which is always allowed to exist, but may be null. This way, a package can always be defined with a function body that's an attribute set literal, without // and without dynamic attributes.
Although it's not pretty, it works around the strictness problem rather well, solving such issues as

  • Lazy attribute names #4090 (comment) without the requirement that they don't use ? (they won't be motivated to, and they can do a != null instead).
    • note that they define a non-standard attribute. These would have to be added somewhere in the package attrset; probably behind a .extra attribute or something ugly like that. Or whoever came up with the custom attribute has to solemnly swear that they don't make the existence of the attribute (attribute name) depend on anything non-trivial.
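
The strictness point can be shown with a small sketch. `strictPkg`, `lazyPkg`, `supported` and the `updateScript` attribute are hypothetical names chosen for this example, and `throw` stands in for any expensive or failing computation (IFD, an unsupported system, and so on).

```nix
let
  # Hypothetical platform check.
  supported = system: system == "x86_64-linux";

  # The whole attrset is the result of a computation that may fail,
  # so even reading meta forces that computation.
  strictPkg = system:
    if supported system
    then { meta.description = "demo"; }
    else throw "unsupported system: ${system}";

  # Function body is an attribute set literal with a fixed set of names:
  # only the attribute values are computed, and only when demanded.
  lazyPkg = system: {
    meta.description = "demo";
    drv =
      if supported system
      then "(derivation would go here)"
      else throw "unsupported system: ${system}";
    # Standard attribute that always exists but may be null,
    # so consumers test `!= null` rather than `?`.
    updateScript = null;
  };
in
{
  # Fails: { success = false; value = false; }
  strict = builtins.tryEval (strictPkg "riscv64-linux").meta.description;
  # Succeeds: "demo"
  lazy = (lazyPkg "riscv64-linux").meta.description;
  # false, without having to ask whether the attribute name exists
  hasUpdateScript = (lazyPkg "riscv64-linux").updateScript != null;
}
```

In the second form, metadata stays readable even when the package cannot be instantiated, at the cost of fixing the set of attribute names up front.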

@fricklerhandwerk
Contributor

fricklerhandwerk commented Oct 29, 2023

An attribute set for a package is reasonable, and that's also what the flake schema goes for. But all this is an issue for Nixpkgs, because that's where Nix-language-level metadata is currently lumped together with build configuration, and where a more scalable convention should be established first. There is little Nix can (or should, IMO) do about that.

The recently renewed design discussion around lockfiles could offer a place for statically displaying outputs (including metadata) as well, so they don't require computation to determine on the consumer's end. Then we don't have to artificially restrict expressive power in the package declaration itself.

@roberth
Member Author

roberth commented Oct 29, 2023

Is this discussion in scope for NixOS/nix?

But all this is an issue for Nixpkgs, because that's where Nix-language-level metadata is currently lumped together with build configuration

Currently there's an implicit interface between Nix and Nixpkgs, and that's the topic of this issue. If I were to put this issue in Nixpkgs, I'd get the exact opposite response because there maintainers expect Nix to lead in such changes. Iirc if you ask Eelco he would say Nix defines the DSL and Nixpkgs just implements that, which would be consistent with my assumption about Nixpkgs maintainers.

Unless you create a repo nixpk for this issue - between nix and nixpkgs - I will keep collecting thoughts here.
If you disagree, I kindly ask that you just ignore this issue. I have zero appetite for a meta-discussion that's going to be even less relevant than my notes, interspersed between here.

@roberth
Member Author

roberth commented Oct 29, 2023

The recently renewed design discussion around lockfiles could offer a place for statically displaying outputs (including metadata) as well, so they don't require computation to determine on the consumer's end. Then we don't have to artificially restrict expressive power in the package declaration itself.

Wouldn't this create a need to update the lock file whenever the expressions change? I don't think using the lockfile as a cache is a good idea, and I'd be happy to explain that in a suitable issue/thread. IFD is one reason.

Another problem with this is a data model problem that I didn't illustrate at all. It's the problem of the linked comment. What if your update tooling needs to read meta while the package cannot be evaluated yet?
Similarly, what about a package that can only be evaluated on a certain system? Or any system? Much of meta is not dependent on system whereas all instantiation is completely dependent on system. Nonetheless you have to pass a system (explicitly or implicitly by accessing an attr, if you're lucky to have one) in order to get meta. A lockfile based solution would still suffer from this problem.

@Ericson2314
Member

Yeah sounds like a unit in RFC-140 speak is closer to a pair of a package function and a meta; much/all of the meta should not depend on parameters of the package function.

I do agree that even if some things are Nixpkgs only, and we have a Nixpkgs CLI to deal with those, we do need something else for Nix<--->Nixpkgs and that is what this issue deals with.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/flakey-profile-declarative-nix-profiles-as-flakes/35163/3

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-12-08-nix-team-meeting-minutes-110/36721/1

roberth added a commit to hercules-ci/nix that referenced this issue Jan 13, 2024
A small step towards NixOS#6507

I believe this incomplete definition is one that can be agreed on.
It would be nice to define more, but considering that the issue
also proposes changes to the design, I believe we should hold off
on those.

As for the wording, we're dealing with some very general and vague
terms, that have to be treated with exactly the right amount of
vagueness to be effective.

I start out with a fairly abstract definition of package.
1. to establish a baseline so we know what we're talking about
2. so that we can go in and clarify that we have an extra, Nix-specific
   definition.

"Software" is notoriously ill-defined, so it makes a great qualifier
for package, which we don't really want to pin down either, because
that would just get us lost in discussion.
We can come back to this after we've done 6057 and a few years in a
desert cave.

Then comes the "package attribute set" definition.
I can already hear Valentin say "That's not even Nix's responsibility!"
and on some days I might even agree.
However, in our current reality, we have `nix-env`, `nix-build` and
`nix profile`, which query the `outputName` attribute - among others -
which just don't exist in the derivation.

For those who can't believe what they're reading:

    $ nix-build --expr 'with import ./. {}; bind // {outputName = "lib";}' --no-out-link
    this path will be fetched (1.16 MiB download, 3.72 MiB unpacked):
      /nix/store/rfk6klfx3z972gavxlw6iypnj6j806ma-bind-9.18.21-lib
    copying path '/nix/store/rfk6klfx3z972gavxlw6iypnj6j806ma-bind-9.18.21-lib' from 'https://cache.nixos.org'...
    /nix/store/rfk6klfx3z972gavxlw6iypnj6j806ma-bind-9.18.21-lib

and let me tell you that bind is not a library.

So anyway, that's also proof of why calling this a "derivation attrset" would be wrong, despite the type attribute.
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-02-26-nix-team-meeting-minutes-128/40496/1

roberth added a commit that referenced this issue Nov 12, 2024
The new package output attributes are somewhat experimental, and
provided for compatibility most of all.

We'll see how well this goes before the changes proposed in
#6507

10 participants