Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve garbage collection strategy #602

Closed
hiddeco opened this issue Mar 4, 2022 · 5 comments
Closed

Improve garbage collection strategy #602

hiddeco opened this issue Mar 4, 2022 · 5 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@hiddeco
Copy link
Member

hiddeco commented Mar 4, 2022

At present, the garbage collection of an Artifact happens on the next reconciliation run and removes all but current Artifact. As reported in fluxcd/flux2#2464, this can lead to problems when multiple commits are made in a row. Because there is a chance a controller reacting to an update is not able to download the advertised Artifact in time.

In addition, there has been chatter about complying with audits, which may require a (number of) Artifact(s) to be present for a longer period of time to provide a track record.

To improve and/or address this, there are a couple of options we can pursue.

Option 1: taking factor time into account

This would be the simplest option to solve primarily the first mentioned issue. Instead of deleting all but one Artifact, we could delete all that are older than x compared to current Artifact. This would provide the controllers consuming Source objects with a certain time frame to complete their started operations.

Option 2: allow a controller global retention strategy

This would add flags to the controller config to e.g. keep a specific number of Artifacts around. Allowing cluster administrators to provide a configuration which addresses both the first and second issue.

Option 3: allow an object retention strategy

This would allow specifying a retention strategy on an object, giving the user fine-grain control over the garbage collection process. Allowing users to provide a configuration which addresses both the first and second issue.

Option 4: allow an object retention strategy with controller global fallback

This is a combination of 2 and 3, allowing more fine-grain control in total.

@hiddeco hiddeco added the enhancement New feature or request label Mar 4, 2022
@hiddeco hiddeco moved this to Up-Next in Maintainers' Focus Mar 4, 2022
@pjbgf
Copy link
Member

pjbgf commented Mar 4, 2022

I think we should handle the two concerns separately, which makes option 1 more appealing to me.
With that in mind, the TTL could potentially start as an arbitrary value, as those archives are solely used internally by flux and not recommended for user consumption.

On the audit/compliance perspective, I would argue that source controller is not the source of truth in itself for such data, and therefore an audit log feature covering externally consumed data would probably be a more resource-optimised approach towards compliance.

@stefanprodan
Copy link
Member

I would introduce retention as global settings with the following defaults:

--artifact-retention-ttl=120s
--artifact-retention-records=2

For the above, the controller will delete artifacts older than 120s while preserving the two most recent ones.

I'm reluctant on having these settings in custom resources since they would allow a tenant to fill the source-controller storage by setting:

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
spec:
  retention: 
    ttl: 50000m
    records: 50000

@hiddeco
Copy link
Member Author

hiddeco commented Mar 4, 2022

On the audit/compliance perspective, I would argue that source controller is not the source of truth in itself for such data, and therefore an audit log feature covering externally consumed data would probably be a more resource-optimised approach towards compliance.

Due to the fact that we make last-mile modifications, I am not sure this is true.

I'm reluctant on having these settings in custom resources since they would allow a tenant to fill the source-controller storage by setting:

I would say this could be prevented by allowing an administrator to configure a max, at which the user defined value is capped.

@stefanprodan
Copy link
Member

I would say this could be prevented by allowing an administrator to configure a max, at which the user defined value is capped.

Ok so maybe the flags should have a max suffix.

@aryan9600 aryan9600 self-assigned this Mar 22, 2022
@pjbgf pjbgf added this to the GA milestone Mar 29, 2022
@aryan9600
Copy link
Member

Closed via #638

Repository owner moved this from Up-Next to Done in Maintainers' Focus Apr 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: Done
Development

No branches or pull requests

4 participants