add nix-prefetch-source #21734
Conversation
I'm very much 👍 for a

```json
{
  "type" : "fetchurl",
  "args" : {
    "url" : "http://...",
    "sha256" : "fsf45"
  }
}
```

A logical improvement might then be making
Also of interest might be the update-nix-fetchgit tool https://github.com/expipiplus1/update-nix-fetchgit |
i'm 👎 for this PR for 2 reasons:

- there is very little value if there is a script that just calculates the hash of the changed revision. you still need to figure out manually which is the new revision.
- none of the update scripts are as generic as described here. example: when `fetchFromGitHub` is used you also need to know which update policy you would follow. should you follow the master branch, some other branch, etc...
~20 days ago we added an `updateScript` mechanism to nixpkgs (56cb5b7) which lets you run `updateScript` for any expression that has `updateScript` defined. and i understand why you don't know about it, since it isn't yet used in many places (only 7 packages use it).
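to illustrate that mechanism: a package opts in simply by exposing an executable under `passthru.updateScript`. this is only a minimal sketch — the package name and the script body are invented, not taken from nixpkgs:

```nix
stdenv.mkDerivation rec {
  name = "example-1.0";  # hypothetical package

  # maintainers/scripts/update.nix collects and runs this script for
  # every package that defines one
  passthru.updateScript = writeScript "update-example" ''
    # fetch the latest upstream version and rewrite this package's
    # source information here
  '';
}
```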
here is what i would propose.
```nix
let
  src = fromGitHub { # <- this is the helper method that needs to be written to make this magic happen.
    owner = "NixOS";
    repo = "nixpkgs";
    branch = "master"; # <- this is the branch we follow for updates
    path = "src.json"; # <- this is the file which we write/read source information to
  };
in stdenv.mkDerivation {
  name = "blabla";
  inherit src;
  passthru.updateScript = src.updateScript; # <- we can make this line obsolete, but i thought it would be nice to show how everything works together with the current updateScript solution
}
```
of course everything could be reduced to (if we would make use of some default values):

```nix
stdenv.mkDerivation {
  name = "blabla";
  src = fetchFromGitHubWithUpdate {
    owner = "NixOS";
    repo = "nixpkgs";
    branch = "master";
  };
}
```

in the above example we would change
as a proof of concept I tried to implement it here: 45dcfb3. to run the update for pdf2odt from the above commit you need to run:
Fair point, the tool probably could now edit a file in-place. Actually the code should already work, we might just want to make it the default mode of operation. As for where the arguments live, you're right that they're basically the same, but I don't want to force them to be the same (as your proposal would do). In fact one of the first things I tried involved extra arguments which mean nothing to the fetcher. e.g.:
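The original example is not preserved in this page, so the following is only a hypothetical reconstruction of the idea: a `src.json` carrying a `version` field that only the nix expression cares about, and which the fetcher itself ignores. The field names and values here are assumptions, not the actual format used by the tool:

```json
{
  "type": "fetchFromGitHub",
  "version": "1.2.3",
  "fetchArgs": {
    "owner": "timbertson",
    "repo": "some-project",
    "rev": "version-1.2.3",
    "sha256": "..."
  }
}
```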
Here the fetcher doesn't know or care about the extra arguments. Ultimately the format of `src.json` is just an implementation detail, since the logic for generation and consumption is all in one place. So I don't mind what it looks like, as long as it works ;)
Well, that depends. If you do have an automated way of knowing what the latest release is, you could always generate src.in.json (or just pass along that information as a command line argument). Or if it's general enough, add it to the relevant nix-prefetch-source handler.
For this example you would specify which branch to track using the

I did know about the

As far as I can see, you'd more or less be implementing the same logic whether you do it in

If we put the logic in nix-update-source,

Personally, I think it's cleaner to use JSON or command line arguments over env vars, and I think that implementing this in each fetcher would lead to inconsistent behaviour and ad-hoc argument parsing (as we already have with the various prefetch scripts), but I really don't mind where this functionality lives.

I didn't mention it earlier because my initial description was already long enough, but one more thing I'd like to have is a minimal way of specifying some arbitrary source code from the internet. One neat thing that putting everything in
Mostly I use this for development versions, and don't expect it to appear in nixpkgs proper. But it's still a very useful thing to enable. Keeping the logic in fetcher-specific nix expressions wouldn't easily allow this, as the information about which fetcher to use wouldn't be part of the JSON. You'd have to write:
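Presumably something along these lines — a hypothetical sketch of the alternative, not code from the PR; it assumes the `fetchArgs` attribute described in the PR description:

```nix
# the choice of fetcher has to be hard-coded in the nix expression,
# because the machine-generated JSON no longer records it:
src = fetchFromGitHub (lib.importJSON ./src.json).fetchArgs;
```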
Basically I want to make sure that I can blindly import a piece of code from somewhere without knowing anything about it, using only the machine-generated |
@timbertson i have updated an example from before a little bit (45dcfb3). if you ignore the details of

from

```nix
stdenv.mkDerivation rec {
  version = "2014-12-17";
  name = "pdf2odt-${version}";
  src = fetchFromGitHub {
    owner = "gutschke";
    repo = "pdf2odt";
    rev = "master";
    sha256 = "14f9r5f0g6jzanl54jv86ls0frvspka1p9c8dy3fnriqpm584j0r";
  };
  ...
}
```

to

```nix
stdenv.mkDerivation rec {
  version = "2014-12-17";
  name = "pdf2odt-${version}";
  src = fetchFromGitHubWithUpdate {
    owner = "gutschke";
    repo = "pdf2odt";
    branch = "master";
  };
  ...
}
```

Maybe we could actually make changes directly to
Yep, I definitely like getting this for free in

I still think it should be accessible from an external tool you can run as well though, since I will want to use this on packages which aren't in nixpkgs (i.e. development versions). As far as I can tell the update mechanism can only work for official packages.
I’m very much against ad-hoc editing of files in place. That just screams unmaintainable.

I have created a simple POC a few days ago that explores the design space for a general update API. It does not yet contain the code for making fetchers updateable. Also I haven’t had the time to compare it to @garbas

It also depends on the nixpkgs testing structure I am very close to creating a PR for, since one definitely doesn’t want automatic updates without being able to test if something breaks.
Agreed, although you need to update an expression before you can test it - I think they're independent concerns. |
FWIW, I worked on a similar thing in #21766. |
I'm very much against |
@edolstra the purpose is not to update src.json/src.nix manually but to have the update process scripted. instead of updating sha/version manually you would run

i personally prefer this to be in a separate file, but in the end - as it is already the case - it is up to the maintainer and the package you are talking about. eg. this makes little sense for simple packages, but makes a lot of sense for thunderbird-bin/firefox-bin.

the most important thing is that there is one way to run the update scripts and that we go away from manually updating versions.
@edolstra as Rok said, the intent is to remove the need for manual editing of source information at all (in practice, this means copy/pasting from a

There's certainly nothing forcing maintainers to adopt this process, but if you want to be able to automate the updating of sources, I don't see any way to allow that other than:
Except if @edolstra gifts us a full-blown semantic introspection framework. :P |
Editing the

This tool could greatly benefit from having

There's also not much stopping it from being used for non-git sources.
@expipiplus1 that's indeed quite impressive. I worry that having this system be implemented in haskell might limit the ability of people (myself included) to contribute though - I've written some haskell and liked it in the past, but it's not necessarily easy.

Don't get me wrong, if this tool could do everything I want[1] then I'd be all for it, but I think it would take quite a lot of work to get to that point, and I don't know who'd be willing to do that work. And I feel like there's quite a lot to be said for using a simple, ubiquitous file format for machine-editable data rather than taking on the complexity of editing .nix files in-place. In fact, when I added nix syntax output to

[1]: off the top of my head: fetchFromGitHub, fetchurl, and allowing me to direct the update algorithm either via a config file or arguments
nixpkgs is mostly written in perl, for Stallman’s sake |
```nix
import = path:
  let
    json = lib.importJSON path;
    fetchFunction = builtins.getAttr json.type pkgs;
```
Do you think it's safe to use any function from the pkgs as a fetcher? In my version I was thinking of classifying them into their own attrset.
You mean just a subset of allowed functions as fetchers? That should be fine. I didn't expect to need to protect against malicious input, but it probably can't hurt to limit the options to only what we need.
Yeah I can see how the generated file will probably come from an untrusted source in the future. Anyways, it's an improvement and shouldn't be a blocker for this PR.
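A minimal sketch of the whitelisting idea discussed here — the attribute names and the set of allowed fetchers are assumptions, not code from the PR:

```nix
# hypothetical variant of the helper above: only accept a known set of
# fetchers instead of looking up arbitrary attributes of pkgs
import = path:
  let
    json = lib.importJSON path;
    fetchers = { inherit (pkgs) fetchurl fetchgit fetchzip fetchFromGitHub; };
    fetchFunction = builtins.getAttr json.type fetchers;
  in json // { src = fetchFunction json.fetchArgs; };
```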
@timbertson mentions the "complexity of editing .nix files in place". As someone who has made a tool to edit Nix files in place (update-nix-fetchgit, along with @expipiplus1), I'd say it is actually not that complex. The only thing you need to do when updating to the latest source version is to change certain strings in the file; you don't need to reorganize/refactor/reindent the Nix expression. So you just need a parser that can parse Nix files and retain source location information (like hnix), then you need a thing that scans the parse tree for known patterns to find places where source code is being fetched (Haskell is great at pattern matching), and then you just need to replace the strings you found with their updated versions.

The advantage of updating things in place is that you don't need to have JSON files or restructure all the Nix expressions; you can mostly just work with the expressions we already have, which are more readable. I agree with @edolstra's comment, though that was mostly about editing. I think that it's nice to have fewer files to look at when you are trying to understand a Nix expression, regardless of whether you are editing it or not. It's nice to just have one file that has both the version number of the program you are compiling and the instructions for compiling it. Like most code, Nix expressions are probably read more than they are written, no? So splitting up every Nix package into two separate pieces just to make life easier for developers of automated updater tools at the expense of readability doesn't seem like a good tradeoff.

Regarding this PR: I have not looked too carefully at it, but there are already a bunch of tools in nixpkgs for updating or prefetching source code and they don't have much cohesion, so adding another one to the mix does not seem too bad. Just putting it in the repository does not force people to use it, or tell people that it's the recommended tool.

In summary, I am in favor of tools that provide in-place editing of Nix expressions and are written in high-level languages. I don't think such a tool would need a huge development effort, so a difficult language like Haskell is OK. And update-nix-fetchgit is a good example.
My concern with updating nix files programmatically is that the space of possible nix files is far larger than anything we can reason about statically.

```nix
fetchgit {
  url = "blah";
  sha256 = "omg";
}
```

is pretty easy to track and update programmatically, but what about this?

```nix
let
  apply = x: y: x y; # or `x: x` :)
in apply fetchgit { url = "blah"; sha256 = "zomg"; }
```

Should it update that? Take for example what I used to have in

My fear with something like

Edit: to be clear, this is a more general concern than just

Edit 2: some previous thoughts on related topics: #19582 (comment), #19582 (comment), #19582 (comment)

Edit 3: NixOS/nix#520 almost feels like the real crux of the matter, but that's even more speculative
There's certainly a balance to be found here. Perhaps the tool could have a "lenient" mode where it updates every attribute set with

Alternatively the updater could be taught about particular idioms being used, such as the one in

The argument could certainly be made that if one has automatic updating of these sections in the source, does it matter as much that they aren't as unrepetitive as possible? I suspect that there are cases where the repetition is worth getting automatic updates and other times when it reduces the clarity of the source.
Holy crap you speak from my heart. What good is a code generation tool if the programmer has to constrain hirself to make it work?
There won’t be a single tool (because that’s stupid). There will be a bunch of scripts written as nix functions that should allow for composition. |
Well, it writes to the file you told it to (via

Do you agree that
Sounds good to me. |
(force-pushed from d140b0e to 7fdad9c)
OK, https://github.com/timbertson/nix-update-source it is! I've pushed new code to this branch to reflect the rename. |
```nix
python3Packages.buildPythonApplication rec {
  version = "0.2.1";
  name = "nix-update-source-${version}";
  src = fetchurl {
```
fetchFromGitHub
Good point - done.
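The updated snippet is not shown in this thread; presumably it ended up looking roughly like the sketch below. The owner, tag format and hash here are placeholders/assumptions, not values from the actual commit:

```nix
python3Packages.buildPythonApplication rec {
  version = "0.2.1";
  name = "nix-update-source-${version}";
  src = fetchFromGitHub {
    owner = "timbertson";        # assumption: the author's GitHub account
    repo = "nix-update-source";
    rev = "version-${version}";  # hypothetical tag name
    sha256 = "...";              # placeholder
  };
}
```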
(force-pushed from 7fdad9c to fd92dd0)
@garbas I think we can merge it at this stage.
Sorry, but I've reverted this for two reasons:
|
This reverts commit ca38ef7 due to its use of importJSON and external source info files, which is non-idiomatic.
I don't agree that JSON is just about tool writers: it's about the fundamental data model underlying it. A language with lambdas/abstraction like Nix, even if we had easy AST manipulation tools, is fundamentally harder/impossible to programmatically modify because people can write something equivalent to a given expression in a bajillion ways, some of which don't even obviously terminate. JSON, ugly as it may be, represents a smaller universe of values, and generally has a fairly canonical representation. So I see preferring JSON as a data modeling decision more than a tool-writer's convenience decision.
… On Jan 30, 2017, at 11:44, Eelco Dolstra wrote:
Sorry, but I've reverted this for two reasons:

1. It relies on importJSON, which I think is unnecessary. JSON doesn't add anything over Nix expressions and requires users to constantly switch between two syntaxes when editing a Nix package.
2. It enshrines a style of having external src.json (or src.nix) files, rather than having the version info in the main package expression. IMHO having one file is more readable (and doesn't pollute the repo with lots of tiny files). I realize that this is subjective, but it is the standard structure for Nix packages which shouldn't be changed without good reason. (And "a separate src.json is easier to generate" is not a great reason, because it prioritizes the convenience of tool builders over the convenience of human readers.)
@edolstra Hi, I think this commit became a tool which some want to use, and it is not imposing any changes that would include

As for the separate

Will my contributions be rejected if I continue like this? Just wondering if I should spend my efforts in

Please reconsider your stand on this and allow some maintainers to choose their way of updating packages.
@edolstra sorry for piling on, but I'm quite shocked to see this PR reverted based on, as you say yourself, subjective reasons. Please consider the effect it might have on the contributors when doing something like this.
Is there no way to automatically update |
+1 on this |
There is work for that in #21766. |
I'd be 👍 on this if it updated version and hash in the nix file in place, but I don't like the src.json approach either, and would agree with @edolstra here.
I might be an outlier, but when I update packages, the time spent on bumping the version and hash is typically a very insignificant fraction of the whole process. (Just my 2 cents.) |
@edolstra You might consider removing the upstream-updater tool that @7c6f434c made from nixpkgs, which is designed to have two separate files for every Nix package:

This is partially a response to what @garbas said: I would prefer to have style guides and rules that are adhered to throughout nixpkgs rather than leaving big decisions up to individual package maintainers. That way, it's easier to work on any part of the project without learning new patterns. A senior member of the project can dictate what those rules are. It would also be great to get some official rules about how to write automatic Nix expression updater tools, like what languages and libraries to use, how the user interface should work, etc. (My two cents is to prefer Haskell and hnix but I can see the advantage of C++ and tolerate other languages too.)

@garbas: Wouldn't

@jgeerds: If the
I must admit I haven't bothered to update a minor release of zsh-completion, mostly because it's kind of a hassle to get the hash and update the file manually (+ create the commit and PR). This is partially because I don't do it very often and have fumbled a bit in the past getting the correct hash. (esp. with fetchFromGitHub) Of course one should spend some time testing a new package version and make sure the dependencies haven't changed too. But if you actually use the package testing is simply getting the new version and using it as you normally would for a while. I think a script that handles these simple updates would result in more up to date packages. Ideally the script would create a branch and commit too :) |
@copumpkin For the sake of human-readability, I would say that you should be structuring your Nix expressions in a way where it is somewhat easy to tell what sources could be downloaded. And if you do that, I don't think it's too hard to have a computer program parse it and update it. I don't think you should be doing something too complex to calculate the URL of some source code or its hash. |
But even today they are very rarely literal strings, as people very regularly splice the package name and version into various parts of the URL. Furthermore, I'm not saying we should obfuscate it, but in many cases there are very repetitive patterns between sets of related packages (some releases are unnecessarily granular, or an organization has a very regular pattern), for example with LLVM or the Apple source releases, and it clutters up expressions to write the same fetchurl/fetchFromGitHub boilerplate over and over again. We also have no good way to prevent people from abstracting that sort of thing (and they have demonstrated repeatedly that they will do that) so I'd much rather restrict ourselves to a domain that we know ahead of time will always work. Nix is good at specifying arbitrary structures, but when there's a very clear domain model for external dependencies, I see very little downside to specifying those dependencies explicitly in terms of that model. Keeping everything "in the same language" is great until you want a machine to touch it, and that's where we are today.
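To illustrate the point about spliced URLs, here is a perfectly ordinary nixpkgs-style source where the URL is not a literal string. The package and hash below are placeholders chosen for illustration, not an example from this thread:

```nix
src = fetchurl {
  # the version is spliced into the URL, so there is no literal URL
  # string for a naive updater to find and rewrite
  url = "mirror://gnu/hello/hello-${version}.tar.gz";
  sha256 = "...";  # placeholder
};
```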
…On Mon, Jan 30, 2017 at 23:43 David Grayson wrote:

A language with lambdas/abstraction like Nix, even if we had easy AST manipulation tools, is fundamentally harder/impossible to programmatically modify because people can write something equivalent to a given expression in a bajillion ways, some of which don't even obviously terminate

@copumpkin For the sake of human-readability, I would say that you should be structuring your Nix expressions in a way where it is somewhat easy to tell what sources could be downloaded. And if you do that, I don't think it's too hard to have a computer program parse it and update it.

I don't think you should be doing something too complex to calculate the URL of some source code or its hash. I think it is not an unbearable restriction if all source URLs have to be literal strings that are directly passed to arguments of a fetchgit, fetchurl, or a similar function. You can still store those fetchgit calls in some higher-level data structure and manipulate them in a billion ways.
@edolstra I would say that most of NixPkgs is pollution with little files (when compared with |
@edolstra I'm sorry to hear that. I'd like to address a few of your comments:
I find it very hard to believe this would be a problem in practice. JSON is just as readable as nix. And you will never be writing JSON source information wholesale, you'll be:
There's been no agreement or encouragement that this is the one true way to automate updates or to structure derivations, it's just one option available to maintainers to choose if they want to. I hope a smarter tool will appear which has the same features but can update inline nix expressions. But that doesn't exist, and I doubt it ever will - it's simply too much trouble to bother with. Those that disagree are welcome to build it, and I'll gladly use it. But until then, I'm saddened to see a useful option being denied when there are no viable alternatives.
I've got two responses to this: First, the information in a

Second, there are a lot of packages in nixpkgs, and they're mostly (all?) maintained by humans. Forgoing easy automation in favour of an inline syntax which must be updated manually[1] increases the effort required from every maintainer, for every package, for every version bump. Each instance is a small inconvenience, which some people aren't bothered by. But some are, and that can add up to a big deterrent. It's personally kept me from contributing multiple packages to nixpkgs - not because it's hard, but because it's tedious and error prone, and I already have enough work to do maintaining software.

[1]: Yes, I'm ignoring
Motivation:

I've always found updating sources to be somewhat awkward. I don't know how others do it, but for me it's usually a matter of running `nix-prefetch-$whatever`, then copy-pasting the relevant fields into my nix expression. But tediously, I need to keep the source information updated in multiple locations (`nix-build` or `nix-shell` from a checkout).

In fact, this friction has prevented me from bothering to add packages to `nixpkgs` in the past, because the other requirements are already inconvenient enough. Which is sad, and I want to fix that.

Goals:
- Make it painless (and automatable) to update a nix expression to build the latest upstream source code. I was particularly inspired by Rok's recent blog post on automating updates, and by discovering the existing support in `maintainers/scripts/update.nix` for scripted updates.
- Make it less tedious to keep upstream `.nix` expressions in sync with the nixpkgs repository.

Approach:
I've created a tool called `nix-prefetch-source`. It's a wrapper around the other `nix-prefetch-*` scripts, with the aim of making source updates as automatable as possible.

So for example currently you might:

- run `nix-prefetch-git <repo> HEAD`
- copy the resulting `rev` and `sha256` into your nix expression

It's not hard, but it's not automatable. With nix-prefetch-source, you store what you want to fetch as JSON:
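(The original example file is not preserved in this page; the snippet below is only a hypothetical illustration of such an input file, and the exact field names used by the tool may differ. The repository is borrowed from the pdf2odt example discussed earlier in the thread.)

```json
{
  "type": "fetchgit",
  "url": "https://github.com/gutschke/pdf2odt.git",
  "rev": "master"
}
```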
(you can pass them directly as command line arguments if you don't want to store this as JSON)

Then you run `nix-prefetch-source -i src.in.json -o src.json`, and out pops the latest source for those inputs. This can be re-run at any time to get the latest sources. It contains whatever inputs were given, plus a set of attributes in `fetchArgs` suitable for passing directly to `fetchgit`.
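(The generated file from the original description is likewise not preserved; hypothetically, the output for the input above might look something like the following. The top-level field names besides `type` and `fetchArgs` are assumptions, and the resolved values are placeholders.)

```json
{
  "type": "fetchgit",
  "url": "https://github.com/gutschke/pdf2odt.git",
  "rev": "master",
  "fetchArgs": {
    "url": "https://github.com/gutschke/pdf2odt.git",
    "rev": "<resolved commit hash>",
    "sha256": "<computed hash>"
  }
}
```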
And indeed my PR adds a convenience function, `importSource`, which takes a path to this kind of file, and returns an object with all the same properties plus `src`, which is really just `(getAttribute input.type) input.fetchArgs`.

It's generic enough that it ought to be able to support every fetcher, by simply adding a handler to `nix-prefetch-source` which knows how to perform the prefetch and then generate the arguments to give to the fetching function.

Here's how I've used it to package a simple python package that I happened to be packaging today:
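(The actual packaging example is not preserved in this page; the sketch below is a hypothetical reconstruction of the shape such a `nix/default.nix` could take. The package name is invented, and the assumption that `importSource` is reachable via `pkgs` is noted in the comments.)

```nix
# hypothetical nix/default.nix; only the importSource call reflects the
# mechanism described above, everything else is invented
{ pkgs ? import <nixpkgs> {} }:
with pkgs;
python3Packages.buildPythonPackage {
  name = "some-python-package";          # hypothetical package
  src = (importSource ./src.json).src;   # assumption: importSource exposed via pkgs
}
```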
This separation conveniently means that `nix/default.nix` can be copied wholesale into `nixpkgs`, as long as `src.json` comes along with it. You can even use a different `src.json` while keeping the same `nix` expression, if for example you want to reference tagged releases in `nixpkgs` using `fetchFromGitHub` but need to reference a specific git commit via `fetchgit` in development.

Next steps:
Any thoughts? Objections? I believe the PR should be mergeable and useable as-is, although there's opportunity for a bit more integration if we want to include `nix-prefetch-source` in nix-prefetch-scripts. I haven't done this yet because I wanted to validate the approach first, and I'd need to invert a dependency (currently it depends on nix-prefetch-scripts).