Auto-purge, get rid of old chart versions #316

Open
jdolitsky opened this issue Mar 20, 2020 · 14 comments
Assignees
Labels
proposal which still in discussion

Comments

@jdolitsky
Contributor

Add feature flags that enable auto-removal of old chart versions in storage based on various age / last used / version parameters

@cep21

cep21 commented Dec 7, 2020

I'm interested in this ticket. It's something I could do out of band, but the missing piece is I don't think there's anything in chartmuseum that tracks last used for charts. Does this data currently exist? If we were to add it, where would it be stored?

@scbizu
Contributor

scbizu commented Dec 7, 2020

@cep21 Glad to hear you're interested. Maybe this is what you're looking for? If you can help us with this feature, it would mean a lot :)

@cep21

cep21 commented Dec 7, 2020

@scbizu I think "last read" makes more sense than "last modified", right? You would want to remove charts people are no longer downloading, for example.

@scbizu
Contributor

scbizu commented Dec 7, 2020

@cep21 You're right, but the storage itself does not support a last-read timestamp yet. Adding a new mechanism to the storage structure would be a huge PR: we would have to handle every request from helm pull and update the last-read timestamp.

I think this is why this issue is still tagged help wanted XD

@cep21

cep21 commented Dec 7, 2020

One idea is to store this information inside redis, for example. Another is to use the backend storage itself to store this information and "sync" some kind of ledger every 60 seconds (for example).

@scbizu
Contributor

scbizu commented Dec 8, 2020

I prefer the second one. AutoPurge should be provided as an interface function, so its implementation can differ between the real storage backends. The expiration duration should be configurable if users enable the auto-purge feature.
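
As a rough illustration of the "interface function" idea, the sketch below defines a purge hook that each backend implements in its own way. It is written in Python for brevity. `PurgeableStorage`, `InMemoryStorage`, and `auto_purge` are hypothetical names for this example only and are not part of chartmuseum's storage API.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from datetime import datetime, timedelta


class PurgeableStorage(ABC):
    """Hypothetical backend interface; not chartmuseum's actual API."""

    @abstractmethod
    def auto_purge(self, expiration: timedelta, now: datetime) -> list[str]:
        """Delete objects older than `expiration` and return their paths."""


class InMemoryStorage(PurgeableStorage):
    """Toy backend that maps an object path to its last-read timestamp."""

    def __init__(self, last_read: dict[str, datetime]) -> None:
        self.last_read = dict(last_read)

    def auto_purge(self, expiration: timedelta, now: datetime) -> list[str]:
        # Collect the expired paths first, then delete them.
        expired = sorted(
            path for path, ts in self.last_read.items() if now - ts > expiration
        )
        for path in expired:
            del self.last_read[path]
        return expired
```

A real backend (S3, GCS, local disk, and so on) would supply its own `auto_purge`, while the expiration duration stays a single user-facing setting.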

@cep21

cep21 commented Dec 8, 2020

and "sync" some kind of ledger every 60 seconds

This could be tricky if people are running multiple chartmuseum instances for redundancy, since we'll have to merge ledgers
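
One possible way to merge ledgers from several instances, sketched in Python under the assumption that each ledger maps a (chart name, version) pair to the most recent read time that instance observed: keep the newest timestamp per key. `merge_ledgers` is a hypothetical helper, and the sketch ignores clock skew between instances.

```python
from __future__ import annotations

from datetime import datetime


def merge_ledgers(
    ledgers: list[dict[tuple[str, str], datetime]],
) -> dict[tuple[str, str], datetime]:
    """Combine per-instance ledgers, keeping the latest read time per chart version."""
    merged: dict[tuple[str, str], datetime] = {}
    for ledger in ledgers:
        for key, ts in ledger.items():
            if key not in merged or ts > merged[key]:
                merged[key] = ts
    return merged
```

Because taking the maximum is commutative, associative, and idempotent, instances could sync in any order, or repeat a sync, and still converge on the same result.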

@vadasambar

It'd be great if we could have more than one condition to decide whether to delete charts. E.g., instead of "Delete all the charts older than 2 months", it would be better to have "Delete all the charts older than 2 months matching a particular regex". This is because you might not want to delete release charts (e.g., 2.1.0), but if you want to delete pre-release charts (e.g., 2.1.0-custom-fix or 2.1.0-pr-3245), you could use a regex to match all the pre-release charts older than the specified time period (see #383).

However, one thing that concerns me about regex is that you'd want to test it first to see which charts that particular regex would delete, to avoid deleting charts you didn't intend to delete.
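
A small Python sketch of combining the age condition with a version regex, returning the matches without deleting anything so the regex can be checked first. `purge_candidates` and the (name, version, created) tuple layout are assumptions made for this example.

```python
from __future__ import annotations

import re
from datetime import datetime, timedelta


def purge_candidates(
    charts: list[tuple[str, str, datetime]],
    max_age: timedelta,
    version_pattern: str,
    now: datetime,
) -> list[tuple[str, str]]:
    """Return (name, version) pairs that are both too old and match the regex."""
    pattern = re.compile(version_pattern)
    return [
        (name, version)
        for name, version, created in charts
        # fullmatch avoids accidental partial matches inside a version string.
        if now - created > max_age and pattern.fullmatch(version)
    ]
```

For instance, the pattern `\d+\.\d+\.\d+-.+` matches `2.1.0-pr-3245` and `2.1.0-custom-fix` but not the release version `2.1.0`.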

@scbizu
Contributor

scbizu commented Feb 8, 2021

The thread is getting too long to follow, so let me summarize the conclusions so far. The key points are listed below:

  • Purge according to last read (or possibly fall back to last modified?)
  • Should solve the multi-tenant problem (purging one tenant should not affect others) <- needs further discussion
  • Purge supports some conditions (e.g., regex rules)
  • Optional: provide a flag such as dry-run that logs the charts that would be purged before they are purged.
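
The first, second, and last points above can be sketched together in Python. This assumes a hypothetical `ChartVersion` record with an optional last-read time; `purge_tenant` is likewise an illustrative name, not an existing function.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

logger = logging.getLogger("purge-sketch")


@dataclass
class ChartVersion:
    tenant: str
    name: str
    version: str
    last_modified: datetime
    last_read: Optional[datetime] = None  # None when no read was ever recorded


def purge_tenant(
    store: list[ChartVersion],
    tenant: str,
    max_age: timedelta,
    now: datetime,
    dry_run: bool = True,
) -> list[ChartVersion]:
    """Purge one tenant's expired charts; with dry_run, only log them."""
    expired = [
        c
        for c in store
        # Use last read when available, otherwise fall back to last modified.
        if c.tenant == tenant and now - (c.last_read or c.last_modified) > max_age
    ]
    for c in expired:
        action = "would purge" if dry_run else "purging"
        logger.info("%s %s/%s-%s", action, c.tenant, c.name, c.version)
        if not dry_run:
            store.remove(c)
    return expired
```

Defaulting `dry_run` to `True` is a design choice in this sketch: a destructive run has to be requested explicitly.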

@scbizu scbizu added this to the v0.14.0 milestone Feb 8, 2021
@scbizu scbizu added the proposal which still in discussion label Feb 8, 2021
@cep21

cep21 commented Feb 8, 2021

Above looks right. Bullet (3) is probably an enhancement off the core request: bullet (1). Also the difference between last read and last modified is pretty huge in both use and ease-of-implementation.

@vadasambar

vadasambar commented Feb 9, 2021

Optional: should provide a flag to determine whether users need to soft delete the chart. (soft delete here means not really purges the chart but logs the will-be-purged charts)

Nitpick: I think it'd be better to call it dry-run instead. soft delete makes me think that the chart would be archived or removed from the index while the data would still be there. What we want soft delete to show (as far as I've understood) is what would be deleted if you ran a delete operation.

Everything else looks good to me @scbizu

@scbizu
Contributor

scbizu commented Jun 1, 2021

I will assign this to myself and draft an implementation this weekend, since I think it would make a big difference in reducing the pressure on the index, so that we can both decrease the latency of our APIs and save disk space in chart storage.

Maybe it will be provided as --per-chart-max-version, which keeps the latest N versions of each chart according to your configuration. However, since I do not know which charts are currently in use, we can add more features later (like pinning some charts so that they are not removed from storage).

(The dry-run option is already implemented inside our company; maybe I can open source it later)

(off-topic: Our CI failed again because the index being refreshed was too large XD)
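
The "keep the latest N versions per chart" idea, including the pinning extension mentioned above, might look roughly like this in Python. It is a sketch only: `versions_to_purge` and `version_key` are hypothetical, "latest" is assumed to mean the highest version number, only plain MAJOR.MINOR.PATCH versions are handled, and a limit of 0 is assumed to disable purging.

```python
from __future__ import annotations

from collections import defaultdict


def version_key(version: str) -> tuple[int, ...]:
    """Sort key for plain MAJOR.MINOR.PATCH versions (no pre-release tags)."""
    return tuple(int(part) for part in version.split("."))


def versions_to_purge(
    charts: list[tuple[str, str]],
    keep: int,
    pinned: frozenset[tuple[str, str]] | set[tuple[str, str]] = frozenset(),
) -> list[tuple[str, str]]:
    """Return (name, version) pairs beyond the newest `keep` versions of each chart."""
    if keep <= 0:  # assumption: a non-positive limit means "never purge"
        return []
    by_name: dict[str, list[str]] = defaultdict(list)
    for name, version in charts:
        by_name[name].append(version)
    doomed: list[tuple[str, str]] = []
    for name, versions in by_name.items():
        newest_first = sorted(versions, key=version_key, reverse=True)
        # Everything after the first `keep` entries is purgeable unless pinned.
        doomed += [(name, v) for v in newest_first[keep:] if (name, v) not in pinned]
    return doomed
```

Sorting on parsed integers rather than raw strings matters here: as strings, `1.10.0` would sort before `1.2.0`.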

scbizu added a commit that referenced this issue Jun 14, 2021
@jdolitsky jdolitsky removed this from the v0.14.0 milestone Jan 25, 2022
scbizu added a commit that referenced this issue Jan 28, 2022
jdolitsky added a commit that referenced this issue Jan 28, 2022
…ption , impls #316 (#466)

Signed-off-by: scnace <[email protected]>

Co-authored-by: Josh Dolitsky <[email protected]>
@jasondamour

Where was this left off? I'm willing to try picking up the remaining work. We have 2.5k charts, and would like to purge as many as possible

@scbizu
Contributor

scbizu commented Jan 29, 2023

@jasondamour This is already implemented. You can use the version at our HEAD and start chartmuseum with the -per-chart-max-version option.
