Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ansible 6] Improve setup.py to exclude some unnecessary files #342

Merged
merged 4 commits into from
Mar 23, 2022

Conversation

felixfontein
Copy link
Collaborator

@felixfontein felixfontein commented Nov 24, 2021

This is currently depending on the Ansible major version, so it will only be used for Ansible 6 and its prereleases, but not for any Ansible 5 (or less) release.

This PR has two big changes:

  1. The code in setup.py which checks whether Ansible 2.9 or ansible-base 2.10 is installed is no longer there. This allows us to build a wheel for faster installation! (moved to [Ansible 6] Disable collision checks with Ansible 2.9 / ansible-base 2.10 in setup.py #394; for wheels see [Ansible 6] Build wheels for Ansible 6+ #395)
  2. In setup.py, exclude_package_data is passed to setup() to list some files and directories that are excluded from the bdists (wheels are such distributions), while they are still present in the sdist. This should make the lawyers happy, since the source distribution (the tarball on PyPi) still contains everything, while when you install that source distribution, or install a binary distribution built from it, you will not have these extra files.

Locally, after removing the __pycache__ files that the installation process creates, ansible_collections/ shrinks from 373 MB to 205 MB, and the number of files goes down from 31942 to 13682.

A side-effect is that setup.py is now larger, with 763 lines. The concrete instance I got looks like this: https://gist.github.com/felixfontein/cf70e8c52fae9dc98a5c9d058f54505f

(Note: this PR includes #341, I will rebase once that's through the review pipeline :) )

@felixfontein
Copy link
Collaborator Author

(Docs on exclude_package_data: https://setuptools.pypa.io/en/latest/userguide/datafiles.html)

Copy link

@jamescassell jamescassell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to see us finally able to ship a wheel! the smaller bdist with full sdist is also a good thing.

'{{ name }}/{{ namespace }}/.github/*',
'{{ name }}/{{ namespace }}/.azure-pipelines/*',
'{{ name }}/{{ namespace }}/changelogs/*',
'{{ name }}/{{ namespace }}/docs/*',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is docs/ really unused by ansible? This is the only one I think might be good to keep shipping.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not used by ansible. It still might be useful though; I guess this is something we have to discuss. I do see both arguments for keeping them (having offline docs can be useful) and for removing them (often these contain only redundant docs that ansible-doc can also show, for example community.aws has 180 generated RST files of 6.1 MB which contain the same information that ansible-doc community.aws.<module-name> will also show).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree that if there is anything of value to keep, it's probably the docs -- they are also not as problematic in terms of volume and size as the other bits.

This list is otherwise a good start, although not exhaustive and would miss quite a few hits from https://gist.github.com/dmsimard/4d53b6578f6a18f86013ffa94214af82

The pattern I had used was: find ansible_collections | grep -E "\.gitignore|\.github|\.azure-pipelines|\.swp|\.orig|\.DS_Store|\.keep|\.gitkeep|\.vscode|\.pytest_cache|\.yamllint|\.pre-commit-config\.yaml|\.plugin-cache\.yaml|\.pytest_cache|\.mypy_cache|\.zuul\.yaml|\.flake8|\.circleci|\.gitattributes|\.ansible-lint|\.travis\.yml|\.gitlab-ci\.yml|\.idea|\.settings" | sort

It would feel a bit "dirty" to include these patterns in the setup.py but I figure they must be defined somewhere...

@webknjaz
Copy link
Member

webknjaz commented Nov 25, 2021

@felixfontein what platforms are you targeting? Who do you expect to build from sdist? Who do you expect to install in editable mode? What are the restrictions for pip and setuptools that you have in mind?

The reason I'm asking is because today it's possible to seamlessly move away from the dynamic setup.py in favor of setup.cfg and even editable installs will keep working. If you expect that the end-users will be able to consume wheels and you don't expect to support building in ancient environments with a 10-year-old pip, with my PyPA hat on I'd recommend using setup.cfg+pyproject.toml: I've recently added a note about this at https://setuptools.pypa.io/en/latest/setuptools.html#setup-cfg-only-projects.

@webknjaz
Copy link
Member

@felixfontein by the way, do you realize that invoking setup.py is strongly discouraged and is being deprecated over time? It's already started (slowly): https://github.com/pypa/setuptools/blob/391bb5d4d09c9eb8d6b2b98968e623455ae0a384/setuptools/command/install.py#L35.

Here's some motivation behind that: https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html.

I noticed that the current code for some reason still calls setup.py bypassing modern standards (by "modern" I mean accepted about 5 years ago and mostly implemented across the Python and downstream packaging ecosystems today).

@felixfontein
Copy link
Collaborator Author

(In case someone wonders how long setuptool has supported exclude_package_data for setup(): pypa/setuptools@d989230. That's definitely long enough!)

@dmsimard
Copy link
Contributor

dmsimard commented Dec 3, 2021

@webknjaz I think those are good questions that we should find an answer for in the not too distant future.
I would consider a rework (like switching to setup.cfg+pyproject.toml) out of scope for the time being since our priority is to stop installing files we don't care about but that doesn't mean we shouldn't transition to something better soon™.

Should we create an issue so we don't forget about it ?

@felixfontein felixfontein force-pushed the improve-setup.py branch 2 times, most recently from 6e52cf6 to 7db6717 Compare December 7, 2021 20:07
@felixfontein
Copy link
Collaborator Author

Rebased after #341 has been merged.

@felixfontein
Copy link
Collaborator Author

I split out the collision check removal to #394.

@felixfontein felixfontein changed the title [WIP] Improve setup.py to exclude some unnecessary files, and get rid of Ansible 2.9 / ansible-base 2.10 checks [WIP] Improve setup.py to exclude some unnecessary files Jan 26, 2022
@felixfontein felixfontein changed the title [WIP] Improve setup.py to exclude some unnecessary files [WIP] [Ansible 6] Improve setup.py to exclude some unnecessary files Jan 26, 2022
@felixfontein
Copy link
Collaborator Author

This PR now implements the proposal from ansible-community/community-topics#76. This proposal is safe once ansible-community/community-topics#77 passes.

@felixfontein felixfontein changed the title [WIP] [Ansible 6] Improve setup.py to exclude some unnecessary files [Ansible 6] Improve setup.py to exclude some unnecessary files Mar 23, 2022
@felixfontein felixfontein merged commit 7e05104 into ansible-community:main Mar 23, 2022
@felixfontein
Copy link
Collaborator Author

@jamescassell @Andersson007 @webknjaz @dmsimard thanks for all your comments!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants