-
Notifications
You must be signed in to change notification settings - Fork 264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove submodules from checkout actions #1332
Conversation
sdist are not a good choice for packagers because maintaining MANIFEST.in is too error prone, that's why the Debian package uses the GitHub tar archives. Vendoring the bits of nc-complex that are required to build is a better choice. |
In a perfect GitHub world I agree with you but I've seeing so many bad practices on both PyPI sdist and GH tarballs that I don't think one is better than the other. I'm one of the core maintainers for conda-forge and we deal with tons of Python scientific packages that, due to some "magic" are only possible to build from PyPI or GH depending on the level of acrobatics the maintainers are doing. Regardless, the use of submodules here broke both and we need to fix it.
Agreed. I'm not a main maintainer here. Let's see what they have to say. Maybe an alternative solution is to make that code optional. Not sure... |
Vendor the three required files, e.g. apply this patch: |
If you use setuptools_scm, you don't need to worry about |
by 'vendoring' you mean just including a fork of the needed source files directly in this repo? |
Yes, just copy the files needed for netcdf4-python into its source tree. And copy them again when you would have updated the submodule for a newer version. |
done in PR #1337 |
I would recommend continuing to use the submodule over a naive vendoring approach. You lose provenance, version, and licensing information (the last not so applicable to this project). While submodules can be a pain, updating vendored code is far more error prone and will be more difficult maintain in the long term. It will also be more difficult to determine where and when breaking changes occurred. |
Can you elaborate on why you think that is the case? I will also note, vendoring code in this way is also an increased security risk. It's much easier to verify where code came using submodules. A naive vendoring approach is much more susceptible to for example, a social engineering attack. From other issues I've read, the main justification has been it's burdensome when building wheels in CI. This is remedied by initializing the submodules after the checkout step in the gh action and including the necessary files in |
It also makes it impossible for distributions to build packages using the GitHub tarball. |
Thanks for the added context, @sebastic! That is really unfortunate... it looks like there have been several unsuccessful requests to add such a feature:
I the spirit of trying to find a compromise it seems like there are two main options:
I lean slightly to the second option b.c. fewer / no upstream changes should be required by for example, package managers. Not to mention it avoids the whole |
If nc-complex installs its headers and code as a header-only library it doesn't need to be vendored, nor when it provides the functions via a shared library. |
If I am following correctly, you are talking about the case when |
Yes. I haven't dug into which bits of nc-complex are being used by netcdf4-python, and whether or not it's feasible to expect those to be exposed by public headers and functions, but in theory it would remove the need to vendor this code in netcdf4-python. |
I see what you are saying. It could be treated like the |
This one will pass only after #1337 is merged. |
You have access to every sdist that was ever uploaded to PyPI, provided they weren't deleted. Also, sdists on PyPI are static and never change (stable sha256) whereas the hashes for github tarballs have been known to change, which creates its own problems. I avoid using the github tarballs whenever I can for packaging. |
@jswhit I'm not a big fan of submodules b/c they add an extra layer of complexity, specially with code tracking/versioning. This change will ship the latest submodule code with the sdist. We are already using the same when building the wheels. Which, BTW, stomped me for a while until I realized they were missing.
Closes #1331 and #1329.