Set retention for staging images #525

stp-ip · 2019-12-19T15:12:01Z

We currently set a 60d retention on staging storage and staging gcb storage, but don't enforce any retention for images.

Staging images should be discouraged from being used and therefore adding a retention policy would help setting the right expectations as well as keep our storage needs lower in the long run.

I am proposing the same 60d retention to keep things the same across all staging retention settings. Happy for other suggestions.

Additional notes:
Currently GCR itself doesn't provide retention settings. We could create the retention on the GCR created bucket, but I assume this could lead to weird issues.
The other option could run a prow job every week to clean up older images.

"Manual" removal script example: https://gist.github.com/ahmetb/7ce6d741bd5baa194a3fac6b1fec8bb7

rajibmitra · 2020-02-14T20:29:16Z

I would like to work on prow job that will clean up the older images.

rajibmitra · 2020-02-15T04:09:58Z

/assign fiorm

fejta-bot · 2020-05-18T07:15:24Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2020-06-17T07:58:52Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot · 2020-07-17T08:39:00Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot · 2020-07-17T08:39:16Z

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

spiffxp · 2020-08-03T18:10:23Z

/reopen

k8s-ci-robot · 2020-08-03T18:10:36Z

@spiffxp: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

spiffxp · 2020-08-03T18:14:46Z

I agree 60d for GCR is reasonable. Staging GCR repos have images older than 60d. They should not, or people are going to assume they can use them in perpetuity.

This came up because @msau42 mentioned CSI images were close to 60d and was concerned they would expire and break kubernetes CI. They won't.

$ export prj=k8s-staging-csi; for b in $prj $prj-gcb artifacts.$prj.appspot.com; do echo $b: $(gsutil lifecycle get gs://$b); done
k8s-staging-csi: {"rule": [{"action": {"type": "Delete"}, "condition": {"age": 60}}]}
k8s-staging-csi-gcb: {"rule": [{"action": {"type": "Delete"}, "condition": {"age": 60}}]}
artifacts.k8s-staging-csi.appspot.com: gs://artifacts.k8s-staging-csi.appspot.com/ has no lifecycle configuration.

We should give SIG Storage time to promote their images to k8s.gcr.io and update tests to use them. Then I think we should implement this.

/assign @thockin @dims @bartsmykla
as a heads up

thockin · 2020-08-03T18:26:56Z

I don't object in theory. There isn't a good mechanism to do it, short of writing our own daily things that loops over every staging repo and nukes old images.

bartsmykla · 2020-08-04T07:51:26Z

I can help with our own solution :-)

msau42 · 2020-08-05T21:10:01Z

Is there a recommended way to do canary testing with the 60d removal? For example, in csi, our periodic canary testing tests multiple repos' canary images in one job. But some repos are more active than others, and the inactive ones may not have any merges for > 60d. Is there a way we can keep the canary images around to facilitate this workflow without having to promote the canary tag?

thockin · 2020-08-05T21:27:33Z

Define canary? If you need them long-term, why not promote them?

…

On Wed, Aug 5, 2020 at 2:10 PM Michelle Au ***@***.***> wrote: Is there a recommended way to do canary testing with the 60d removal? For example, in csi, our periodic canary testing tests multiple repos' canary images in one job. But some repos are more active than others, and the inactive ones may not have any merges for > 60d. Is there a way we can keep the canary images around to facilitate this workflow without having to promote the canary tag? — You are receiving this because you were assigned. Reply to this email directly, view it on GitHub <#525 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABKWAVH5FBZLYUJENK6BUTDR7HDDPANCNFSM4J5IUKCA> .

msau42 · 2020-08-05T21:44:39Z

canary in our case is actually images with a "canary" tag. Every pr that merges, we build and repush the "canary" tag. We have a specific canary job that's configured to test using images with the canary tag. We do end up promoting those images with official release tags, and we have separate jobs that test using release images, but we will still have a canary job that tests against head of everything.

thockin · 2020-08-05T23:33:54Z

Yeah, you would not want to promote those, and tags are not mutable in prod anyway. So the goal is to keep the last N builds (N may be 1), regardless of age or whether that build was promoted or not? That seems like a recipe for flakes, no? If we were building our own retirement, we could (for example) always leave a specific tag or something, I guess? But I am still shaky on the idea.

…

On Wed, Aug 5, 2020 at 2:44 PM Michelle Au ***@***.***> wrote: canary in our case is actually images with a "canary" tag. Every pr that merges, we build and repush the "canary" tag. We have a specific canary job that's configured to test using images with the canary tag. We do end up promoting those images with official release tags, and we have separate jobs that test using release images, but we will still have a canary job that tests against head of everything. — You are receiving this because you were assigned. Reply to this email directly, view it on GitHub <#525 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABKWAVHPSXVOCLOUCJEIQP3R7HHFNANCNFSM4J5IUKCA> .

msau42 · 2020-08-06T02:58:27Z

If there's another way we could achieve a "test against latest for all images, even if some of the repos are inactive", open to suggestions.

pohly · 2020-08-06T07:08:40Z

Perhaps we can add a periodic job which rebuilds canary images once a month? The added bonus is that we'll notice if something breaks in the build environment (shouldn't happen, but one never knows...) before actually trying to build a proper release.

thockin · 2020-08-06T16:44:57Z

Why do you want to test against head as opposed to the latest release? Even if that release is explicitly a "daily snapshot" or something? I guess I don't fully know what all the images are or what you are testing...

…

On Thu, Aug 6, 2020 at 12:08 AM Patrick Ohly ***@***.***> wrote: Perhaps we can add a periodic job which rebuilds canary images once a month? The added bonus is that we'll notice if something breaks in the build environment (shouldn't happen, but one never knows...) before actually trying to build a proper release. — You are receiving this because you were assigned. Reply to this email directly, view it on GitHub <#525 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABKWAVALGQLZ2WSHQRQ6GEDR7JJINANCNFSM4J5IUKCA> .

spiffxp · 2020-10-27T16:37:59Z

/lifecycle frozen

pohly · 2020-10-28T18:55:11Z

We do promote our canaries. ... The one configured to run against canary is going to be prone to expiration of staging images for inactive repos

I think you meant "we don't promote our canaries", right?

unless we have something that will periodically build new images

Here's a PR which tentatively defines a job which refreshes "canary" for one repo: kubernetes/test-infra#19734

Release candidates are still problematic. We sometimes need those while preparing new sidecars for an upcoming Kubernetes release. On the other hand, the time period where we do need them might be small enough that the normal retention period is okay, so this might not be a problem?

msau42 · 2020-10-28T20:35:14Z

What I meant was we do promote canary builds to official release version tags. We don't promote the "canary" tag.

Yes I think we can treat release candidates separately. We don't want to promote release candidates and merge any tests that depend on release candidates in k/k

spiffxp · 2021-02-18T18:55:08Z

From https://cloud.google.com/container-registry/docs/managing#deleting_images:

"Do not apply Cloud Storage retention policies to storage buckets used by Container Registry. These policies to not work for managing images in Container Registry storage buckets." - so we're following the recommended guidance by not setting these
https://github.com/sethvargo/gcr-cleaner is a not-official-Google-product that could accomplish this

spiffxp · 2021-07-16T18:21:58Z

/milestone v1.23

ameukam · 2021-12-14T22:09:48Z

/milestone clear

ameukam · 2024-03-03T13:57:56Z

/milestone v1.32

ameukam · 2024-12-09T22:03:00Z

/milestone v1.34

BenTheElder · 2024-12-09T22:27:10Z

We have a 90d retention policy enabled already on the artifact registry images in k8s-staging-images, I think we may roll this up into the AR migration? cc @upodroid

upodroid · 2024-12-09T22:37:53Z

That's correct. We currently have a 90d retention for staging AR registries.

https://github.com/kubernetes/k8s.io/blob/main/infra/gcp/terraform/k8s-staging-images/registries.tf#L49

We might tighten that in the future.

BenTheElder · 2024-12-10T04:35:24Z

If we do any in-place migrations we'll enable this on the ARs then, should be simpler that way.
For everything else, the staging ARs will have it.
/lifecycle active

stp-ip added the wg/k8s-infra label Dec 19, 2019

stp-ip mentioned this issue Jan 8, 2020

support community owned GCS buckets for uploading results to TestGrid #332

Closed

k8s-ci-robot assigned rajibmitra Feb 15, 2020

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 18, 2020

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 17, 2020

k8s-ci-robot closed this as completed Jul 17, 2020

k8s-ci-robot reopened this Aug 3, 2020

k8s-ci-robot assigned bartsmykla, dims and thockin Aug 3, 2020

thockin removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 3, 2020

k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Oct 27, 2020

spiffxp mentioned this issue Oct 29, 2020

[Umbrella Issue] Create a Image Promotion process #157

Closed

spiffxp added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Jan 22, 2021

spiffxp mentioned this issue Apr 1, 2021

dl.k8s.io: Redirect CI URIs to Kubernetes Community infra #1857

Merged

k8s-ci-robot added this to the v1.23 milestone Jul 16, 2021

k8s-ci-robot added sig/k8s-infra Categorizes an issue or PR as relevant to SIG K8s Infra. and removed wg/k8s-infra labels Sep 29, 2021

k8s-ci-robot removed this from the v1.23 milestone Dec 14, 2021

dims assigned ameukam and unassigned dims Jan 31, 2022

k8s-ci-robot added this to the v1.32 milestone Mar 3, 2024

k8s-ci-robot modified the milestones: v1.32, v1.34 Dec 9, 2024

k8s-ci-robot added lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. and removed lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. labels Dec 10, 2024

ameukam added this to SIG K8S Infra Dec 10, 2024

ameukam moved this to Backlog in SIG K8S Infra Dec 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Set retention for staging images #525

Set retention for staging images #525

stp-ip commented Dec 19, 2019 •

edited

Loading

rajibmitra commented Feb 14, 2020

rajibmitra commented Feb 15, 2020 •

edited

Loading

fejta-bot commented May 18, 2020

fejta-bot commented Jun 17, 2020

fejta-bot commented Jul 17, 2020

k8s-ci-robot commented Jul 17, 2020

spiffxp commented Aug 3, 2020

k8s-ci-robot commented Aug 3, 2020

spiffxp commented Aug 3, 2020

thockin commented Aug 3, 2020

bartsmykla commented Aug 4, 2020

msau42 commented Aug 5, 2020

thockin commented Aug 5, 2020 via email

msau42 commented Aug 5, 2020

thockin commented Aug 5, 2020 via email

msau42 commented Aug 6, 2020

pohly commented Aug 6, 2020

thockin commented Aug 6, 2020 via email

spiffxp commented Oct 27, 2020

pohly commented Oct 28, 2020

msau42 commented Oct 28, 2020

spiffxp commented Feb 18, 2021 •

edited

Loading

spiffxp commented Jul 16, 2021

ameukam commented Dec 14, 2021

ameukam commented Mar 3, 2024

ameukam commented Dec 9, 2024

BenTheElder commented Dec 9, 2024 •

edited

Loading

upodroid commented Dec 9, 2024

BenTheElder commented Dec 10, 2024

Set retention for staging images #525

Set retention for staging images #525

Comments

stp-ip commented Dec 19, 2019 • edited Loading

rajibmitra commented Feb 14, 2020

rajibmitra commented Feb 15, 2020 • edited Loading

fejta-bot commented May 18, 2020

fejta-bot commented Jun 17, 2020

fejta-bot commented Jul 17, 2020

k8s-ci-robot commented Jul 17, 2020

spiffxp commented Aug 3, 2020

k8s-ci-robot commented Aug 3, 2020

spiffxp commented Aug 3, 2020

thockin commented Aug 3, 2020

bartsmykla commented Aug 4, 2020

msau42 commented Aug 5, 2020

thockin commented Aug 5, 2020 via email

msau42 commented Aug 5, 2020

thockin commented Aug 5, 2020 via email

msau42 commented Aug 6, 2020

pohly commented Aug 6, 2020

thockin commented Aug 6, 2020 via email

spiffxp commented Oct 27, 2020

pohly commented Oct 28, 2020

msau42 commented Oct 28, 2020

spiffxp commented Feb 18, 2021 • edited Loading

spiffxp commented Jul 16, 2021

ameukam commented Dec 14, 2021

ameukam commented Mar 3, 2024

ameukam commented Dec 9, 2024

BenTheElder commented Dec 9, 2024 • edited Loading

upodroid commented Dec 9, 2024

BenTheElder commented Dec 10, 2024

stp-ip commented Dec 19, 2019 •

edited

Loading

rajibmitra commented Feb 15, 2020 •

edited

Loading

spiffxp commented Feb 18, 2021 •

edited

Loading

BenTheElder commented Dec 9, 2024 •

edited

Loading