-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve garbage collection strategy #602
Comments
I think we should handle the two concerns separately, which makes option 1 more appealing to me. On the audit/compliance perspective, I would argue that source controller is not the source of truth in itself for such data, and therefore an audit log feature covering externally consumed data would probably be a more resource-optimised approach towards compliance. |
I would introduce retention as global settings with the following defaults: --artifact-retention-ttl=120s
--artifact-retention-records=2 For the above, the controller will delete artifacts older than 120s while preserving the two most recent ones. I'm reluctant on having these settings in custom resources since they would allow a tenant to fill the source-controller storage by setting: apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
spec:
retention:
ttl: 50000m
records: 50000 |
Due to the fact that we make last-mile modifications, I am not sure this is true.
I would say this could be prevented by allowing an administrator to configure a max, at which the user defined value is capped. |
Ok so maybe the flags should have a |
Closed via #638 |
At present, the garbage collection of an Artifact happens on the next reconciliation run and removes all but current Artifact. As reported in fluxcd/flux2#2464, this can lead to problems when multiple commits are made in a row. Because there is a chance a controller reacting to an update is not able to download the advertised Artifact in time.
In addition, there has been chatter about complying with audits, which may require a (number of) Artifact(s) to be present for a longer period of time to provide a track record.
To improve and/or address this, there are a couple of options we can pursue.
Option 1: taking factor time into account
This would be the simplest option to solve primarily the first mentioned issue. Instead of deleting all but one Artifact, we could delete all that are older than
x
compared to current Artifact. This would provide the controllers consuming Source objects with a certain time frame to complete their started operations.Option 2: allow a controller global retention strategy
This would add flags to the controller config to e.g. keep a specific number of Artifacts around. Allowing cluster administrators to provide a configuration which addresses both the first and second issue.
Option 3: allow an object retention strategy
This would allow specifying a retention strategy on an object, giving the user fine-grain control over the garbage collection process. Allowing users to provide a configuration which addresses both the first and second issue.
Option 4: allow an object retention strategy with controller global fallback
This is a combination of 2 and 3, allowing more fine-grain control in total.
The text was updated successfully, but these errors were encountered: