autodetect setuptools/missing pyproject.toml plus many overrides #1711

Closed
wants to merge 16 commits into from

TyberiusPrime
Contributor

This is an attempt to get rid of the thousands of setuptools overrides, by detecting if there is no pyproject.toml present in the source.

It is meant to tackle #1591 and #771, but I'm not convinced it's doable yet.

I'll try to build 'em all, and in parallel see what the CI says about this hack.

@TyberiusPrime
Contributor Author

At the very least this is going to identify a bunch of overrides for non-existing packages in the overrides file.

@TyberiusPrime
Contributor Author

So, I built the 8241 packages that had 'setuptools*' annotations in build-systems.
Changes: removed all setuptools entries from build-systems and augmented mk-python-dep.nix:

  • before 1540 packages failing.
  • after 1364.

A cursory glance suggests the improvement comes from some of those ~176 packages having upstream packages that were missing the setuptools annotation.
Of note: no regressions, it's strictly a superset of packages now building.
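
The "no regressions" claim boils down to a set comparison between the packages failing before and after the change. A minimal sketch of that check, assuming both runs attempted the same packages; the package names are made up for illustration and are not taken from the actual build results:

```python
def compare_failures(before: set[str], after: set[str]) -> dict[str, set[str]]:
    """Compare the sets of failing packages from two build runs."""
    return {
        "fixed": before - after,          # failed before, builds now
        "regressed": after - before,      # did not fail before, fails now
        "still_failing": before & after,  # fails in both runs
    }


# Hypothetical package names, for illustration only.
failing_before = {"pkg-a", "pkg-b", "pkg-c"}
failing_after = {"pkg-b"}

result = compare_failures(failing_before, failing_after)
# "No regressions" means nothing fails now that did not already fail before.
no_regressions = not result["regressed"]
```

With an empty "regressed" set, the packages building after the change are a superset of those building before.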

Looking at the top 15 reasons why packages still fail:

  1. hatchling 214

  2. flit_core 112

  3. pytest-runner 71

  4. compilation failure 61

  5. 3.12 compatibility: SafeConfigParser 57

  6. poetry 52

  7. maturin 50

  8. setuptools_rust 38

  9. buildInput: missing cmake 38

  10. poetry: missing masonry 29

  11. nix: error: value is a set while a string was expected 28

  12. source broken: missing file 24

  13. pdm 23

  14. meson: mesonpy 21

  15. setuptools_scm 21

  16. smells like some kind of poetry2nix bug.

@TyberiusPrime
Contributor Author

I built the 8000 most downloaded packages from PyPI; here are the stats:

total 8000
done 7927
poetry fail 69
both success 3773
both failed 2279
now succeeding (=gains) 1874
now failing (=losses) 1

(this is after reintroducing a few setuptools-scm buildSystems).

The one failure is in xtcocotools, where pip tries to download python-dateutils, and I currently have no idea why. That's not in that package's source...

I'm aware that there are 4 packages unaccounted for. I'll investigate in the morning.
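
The "4 packages unaccounted for" can be rederived from the stats above. A quick arithmetic check, using only the figures quoted in this comment (the variable names are mine):

```python
# Figures copied from the stats above.
total = 8000
done = 7927
poetry_fail = 69
outcomes = {
    "both success": 3773,
    "both failed": 2279,
    "now succeeding": 1874,
    "now failing": 1,
}

# The four outcome buckets should add up to the number of finished builds.
finished = sum(outcomes.values())

# Packages that neither finished nor failed at the poetry step.
unaccounted = total - done - poetry_fail
print(finished, unaccounted)
```

The four buckets sum to the "done" count, so the gap sits entirely between "total" and "done + poetry fail".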

@cpcloud
Collaborator

cpcloud commented Jul 5, 2024

@TyberiusPrime Friendly ping for this PR. Is the expectation that you'll get existing CI to pass?

@TyberiusPrime
Contributor Author

TyberiusPrime commented Jul 5, 2024

@TyberiusPrime Friendly ping for this PR. Is the expectation that you'll get existing CI to pass?

Absolutely, this must pass the existing CI, and not lose a single package that was building previously.

It has turned into a bit of a quest and a mania.

I have defined a Python package set drawn from

  • poetry2nix overrides
  • nixpkgs
  • top 8k downloads
  • pypi-deps-db with more than 5 direct dependencies

which gives me 17271 packages.

Each of which I stick into a flake whose pyproject.toml just requests that one package.
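
As a rough illustration of the "one package per project" setup, here is a sketch that renders a minimal Poetry pyproject.toml requesting a single package. The project name, Python constraint, and author string are placeholders I made up; the actual files used for this experiment are not shown in the thread.

```python
def single_package_pyproject(package: str, python: str = "^3.11") -> str:
    """Render a minimal Poetry pyproject.toml that depends on one package."""
    # The dependency key is quoted so names containing dots stay valid TOML.
    return f"""\
[tool.poetry]
name = "probe-{package}"
version = "0.1.0"
description = "Build probe for {package}"
authors = ["nobody <nobody@example.com>"]

[tool.poetry.dependencies]
python = "{python}"
"{package}" = "*"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
"""


text = single_package_pyproject("requests")
```

Writing one such file per package, locking it, and building the result gives an independent pass/fail signal for each package.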

After a few losses where poetry doesn't generate a lock file,
that ends up being 16924 packages, of which upstream poetry2nix (rev = master when I started this...)
only builds 41%.

Or rather, it fails to build 59%, or exactly 9999 packages :(.

With the setuptools auto-detection and a ton of override-tuning,
I'm now at 80% of my package set building.

I'll push again in a minute, to see if I'm making the CI happy,
and spend some more cycles this weekend, to get even more packages building.

Let me know how you want this structured,
one giant PR, or one for the setuptools-stuff, followed by one for all the other overrides, or...
(I'll squash the commits eventually, for now it's a mess)

PS:
These are the numbers for 'how many different packages do we actually build', with the dependencies of the
17k set included:
pkgs * versions
Counter({'with_patch': 138676, 'without_patch': 48592})
pkgs
Counter({'with_patch': 14925, 'without_patch': 7286})

I guess poetry does select very different versions depending on what package I request.

@TyberiusPrime
Contributor Author

It just occurred to me that if we have to download and check for pyproject.toml anyway, we might as well open it up and read the build systems if present.

@TyberiusPrime TyberiusPrime force-pushed the no_setuptools branch 3 times, most recently from bd538e1 to 2de9144 Compare July 9, 2024 10:27
@TyberiusPrime
Contributor Author

Ok, I got the CI to run just one job locally.
All you need to do is comment out all other tests in tests/default.nix (and the formatting test in flake.nix), and then run 'act push --job nix-build'. That runs just the one job, which at least doesn't take until the heat death of the universe.

This should allow me to get the last 7 failing CI tests fixed.

@TyberiusPrime
Contributor Author

TyberiusPrime commented Jul 10, 2024

Lol, that CI setup is messing with me. A 4+ hour job, and I can't even get the log of the job that failed after 2h, because the log is too long and GitHub says 'wait till they're all done, then download the log'.

(It was scipy. Here's hoping this finishes now.)

@TyberiusPrime TyberiusPrime changed the title from "WP: No setuptools" to "autodetect setuptools/missing pyproject.toml plus many overrides" Jul 11, 2024
@TyberiusPrime
Contributor Author

yeah, the CI passes :)

All right, @cpcloud, I'm ready to move this from 'work in progress' to 'please judge this puppy, thank you'.

I can squash the CI-induced commits down if you want, but I didn't want to trigger another 4.5h CI run.
(We really should consider splitting up those macOS jobs.)

I still think we can extend this, in combination with the existing auto-detection for 'local' packages, to cover more than just setuptools/(no pyproject.toml) auto detection, but for now, I think we should strive to land this just for the sheer amount of building packages gained.

That is, if there's no fundamental issue with digging around in the $src to get our buildInputs that I'm missing.

srcRoot + "/${source.subdirectory}"
else
srcRoot;
missing_pypproject_toml = ! builtins.pathExists "${src}/pyproject.toml";
Member

This is IFD (import-from-derivation), and not permitted in restricted eval.
This would break poetry2nix for such setups (e.g. users of Hydra).

As well as just being terrible for eval performance:
https://fzakaria.com/2020/10/20/nix-parallelism-import-from-derivation.html

@TyberiusPrime
Contributor Author

Fair enough @adisbladis, I had misunderstood the importance of the 'import' part of import-from-derivation. Nix eval being single-threaded and blocking is :(.

This leaves me seeing only two ways forward to increase the set of pypi packages poetry2nix can build, without burdening the users:

  1. either I extend my 'try to build, add overrides' approach to the whole 500k pypi packages... it's a quest, but probably just about doable once.
  2. or we supply all the usual suspects to the derivations, then sort through them and hide the unused ones in a hook.

No. 2 would save a ton of overrides, at the cost of rebuilding whenever any build-system changes.

I guess a third option would be to provide a tool that finds out the necessary build-systems given a pyproject.toml and a poetry.nix, but that's just a tad less user hostile than the current 'and then you need to figure out overrides' approach.

Either way, I'll re-add the setuptools overrides for now (and add the new ones that the 17k package set needs), and resubmit as a new PR, if that's cool?

@TyberiusPrime
Contributor Author

Oops, the cargo stuff is also IFD.

All right, then it's options 1 and 3 together, I suppose.

@heimalne

With more and more "magic" happening under the hood, it would be great to address #1730 and help users get some tracing insights, e.g. for debugging which package was added in what phase(?).

4 participants