Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Old wheels culling. #4

Closed
Carreau opened this issue May 25, 2023 · 19 comments · Fixed by #12
Closed

Old wheels culling. #4

Carreau opened this issue May 25, 2023 · 19 comments · Fixed by #12
Assignees
Milestone

Comments

@Carreau
Copy link
Collaborator

Carreau commented May 25, 2023

Another action that allows to upload to anaconda a is https://github.com/OpenAstronomy/publish-wheels-anaconda, in particular it allows to remove older wheels.

I don't this we need to implement that as part of our action – too keep it simple though as we are publishing everything on the same repo we could have a cron that cull all the old wheels from here.

@matthewfeickert
Copy link
Member

I don't this we need to implement that as part of our action – too keep it simple though as we are publishing everything on the same repo we could have a cron that cull all the old wheels from here.

@Carreau What's the scope of the action? Is it taking a page from https://github.com/pypa/gh-action-pypi-publish (in terms of scope) and only doing the upload? This is (old wheel removal) is part of my #5 (comment).

Removal is pretty straightforward though and you can avoid a cron job that might remove more then you care about by having someone do something similar to what I did for matplotlib in matplotlib/matplotlib#23349 (now expanded to the following by others):

      - name: Remove old uploads to save space
        run: |
          N_LATEST_UPLOADS=5

          # Remove all _but_ the last "${N_LATEST_UPLOADS}" package versions
          # N.B.: `anaconda show` places the newest packages at the bottom of
          # the output of the 'Versions' section and package versions are
          # preceded with a '   + '.
          anaconda show scipy-wheels-nightly/matplotlib &> >(grep '+') | \
              sed 's/.* + //' | \
              head --lines "-${N_LATEST_UPLOADS}" > remove-package-versions.txt

          while LANG=C IFS= read -r package_version ; do
              echo "Removing scipy-wheels-nightly/matplotlib/${package_version}"
              anaconda --token ${{ secrets.ANACONDA_ORG_UPLOAD_TOKEN }} remove \
                --force \
                "scipy-wheels-nightly/matplotlib/${package_version}"
          done <remove-package-versions.txt

@Carreau
Copy link
Collaborator Author

Carreau commented May 26, 2023

Thanks for the comment and sorry for the confusion.

Basically I don't want to cull each time a new nightly is published, I wan to keep the action to just publish new nightly and as simple as possible. Then we can centrally decide how we cull older wheels separately.

We had some of those discussion in person – like should the action both built the wheel and upload ? Should it also remove, and should the number of wheel kept is configurable. But the more things you try to do, and the more option you had, the more disagreement there is.

https://github.com/OpenAstronomy/publish-wheels-anaconda is another action that takes care of removal

@betatim
Copy link
Contributor

betatim commented May 26, 2023

I think doing the clean up centrally will be easier than asking each project that uploads wheels to also clean up (and pick sensible parameters).

@tupui
Copy link
Member

tupui commented May 30, 2023

+1 to centralise this. The action could be run from this repo I suppose.

If we do the cleanup ourselves, we have to consider the case were projects don't upload nightlies but weeklies or have another timeline. i.e. maybe keep at the last n versions.

@bsipocz
Copy link
Member

bsipocz commented May 30, 2023

i.e. maybe keep at the last n versions.

we also need a time cutoff, maybe a month or 2 months? (see the case of scikit-image in the old location where it was built once, but never again, and therefore was extremely outdated, tripping the CI of a few projects when they assumed to pick up the dev wheel)

@tupui
Copy link
Member

tupui commented May 30, 2023

Agreed, I would give 3 to 6 months to be more generous with "slower" projects. But we could start with a lower bar and revisit if the use case ever comes up.

@bsipocz
Copy link
Member

bsipocz commented May 30, 2023

I would argue that those "slower" projects may not be a good fit for these nightlies, and would not provide many benefits for nightly testers?

@tupui
Copy link
Member

tupui commented May 30, 2023

You could still want to have a way to propose a development version which is easy to pull vs making a release.

e.g. I am working on SALib and we have a very very low volume of commit which does not require frequent updates. Still, some of our users sometimes ask us to test our things before a release. Sure in this case the package is easily built enough, but you might have a more complex project or just add the mechanism as a way to show that your project is serious.

But as I said, we can also wait for folks to ask that and start as you proposed with something a bit more aggressive.

@bsipocz
Copy link
Member

bsipocz commented May 30, 2023

Fair enough, thanks for clarifying with an example.

@tupui
Copy link
Member

tupui commented May 30, 2023

Of course 😃 But now sleep, I should not play with the jet lag haha

@betatim
Copy link
Contributor

betatim commented May 31, 2023

I like the idea of keeping the last N wheels.

Situations like scikit-image sound like something we should notice and somehow address "human to human". For example, do they upload so rarely because the project rarely changes, because something broke, etc.

@matthewfeickert
Copy link
Member

Given #12 (comment) is there agreement on 1-action-per-repository? If so, then if I've understood @tupui correctly this means we need a new remove-nightly-wheels repo (or something of that sort).

@tupui
Copy link
Member

tupui commented May 31, 2023

Why not having a single action which go through all our packages and apply a fixed policy? I think that's what we were saying here no?

The only concern I have is that we need to be careful with this action and its token as it would be an admin token in the wild. So it's going to be a trade-off between security and conveniency for all packages.

Here I think it would be an acceptable risk to use an admin token for that purpose as it's to apply a global policy.

Should we do a small call to discuss these?

@matthewfeickert
Copy link
Member

Why not having a single action which go through all our packages and apply a fixed policy? I think that's what we were saying here no?

Ah so "centralise" as in a "repo as a nighlty CI cron service for the whole org"? I have a few of those for dashboarding services for orgs.

The only benefit I see to having this be a service, compared to an action that projects run in their workflows, is that project maintainers no longer have to be checking that they are keeping things within their quota. Though perhaps the main advantage is that other projects in the org don't also have to ask for other maintainers to clean up space if things are running tight. I guess this is exactly what @betatim meant in

I think doing the clean up centrally will be easier than asking each project that uploads wheels to also clean up (and pick sensible parameters).

👍

Should we do a small call to discuss these?

If it is necessary, sure, though these security details seem reasonable given my understanding now and you can further restrict things to run in a specific GitHub Actions environment.

@tupui
Copy link
Member

tupui commented May 31, 2023

Yeah the idea is that projects should just care about uploading their package. The rest is admin/maintenance and we can do it for them ideally. So we are all on the same page (and it's fair as in the same rules for all) all the time.

@matthewfeickert
Copy link
Member

matthewfeickert commented May 31, 2023

Okay, then if people are in agreement, and someone else who is better at naming things than me chooses a repo name, I can make a PR to that new repo tonight US time that adds this as a GitHub Actions workflow. Should be easy to iterate on the PR to a version people are happy with quickly.

@tupui
Copy link
Member

tupui commented May 31, 2023

Personally I would be ok if this would run on the current action repo. It's related in scope so I would not find this out of place.

@bsipocz
Copy link
Member

bsipocz commented May 31, 2023

+1 on keeping it here

@matthewfeickert
Copy link
Member

Personally I would be ok if this would run on the current action repo. It's related in scope so I would not find this out of place.

+1 on keeping it here

Done in a redone PR #12. c.f. #12 (review) for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants