optimize k8s.gcr.io backup script #666
Comments
First suggestion would be to only back up a single region. Is copying into the same prefix/timestamp faster? We could copy into the same location with gcrane cp -r and set a long-term retention policy to prevent modifications, but that goes against the GCR team's recommendation not to assume or work with anything in GCS directly; that said, we already do this for prod afaik, so we could do it here as well. Splitting or stretching the backup window had issues, as gcrane doesn't allow rate limits afaik, and by running into quotas we block updates to, say, prow.
How do you mean, exactly? (You can take a look here for some examples of timestamped prefixes.)
I'd rather not deal with GCS (implementation) details. We have been advised a number of times by the GCR team to treat GCS as opaque.
I'm working on a design with some ideas; will share soon. Stay tuned.
Sorry, I think I understand your comment now. Last time I checked, gcrane does not work any faster even if you copy into the same prefix. It takes a long time (hours) to traverse the ~30K images just to realize that nothing needs to be copied. This is different from the promoter, which runs much faster here because it aggressively ignores images that have already been promoted.
This was fixed with #677
/close
@listx: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
We are exceeding quota for GCR by doing a simple copy of ~30K images in each of the 3 regions (for a total of ~90K image copies).
We have spoken with GCR and there is currently no way of giving special quota privileges to particular GCP projects. While this might be worth pursuing (simply increasing quota), I think we can just work on optimizing the backup scripts. There are a couple reasons for this:
1. The current backup does a simple recursive copy:

       gcrane cp -r <prod> <prod-backup>/<timestamp>

   While this works, it is slow (on the order of hours to complete), even if there are 0 changes since the last backup.

2. The prod GCR (k8s-artifacts-prod) will be immutable (the promoter does not allow mutations), so it is essentially an ever-growing GCR. Given this nature, the backup only needs to add the delta of new images since the last snapshot, as the old backed-up copies will never change or be removed.

I'll come up with a more detailed design soon.
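The delta idea above can be sketched roughly as follows. This is a hedged illustration, not the actual backup script: it assumes we can produce a sorted list of image references per registry (e.g. via `gcrane ls` over each repository), and it uses plain text files with made-up image names as stand-ins for those listings.

```shell
#!/bin/sh
set -eu

# Stand-ins for digest listings of the prod registry and the latest
# backup snapshot (in reality these would come from something like
# `gcrane ls` walked over each repo). `comm` requires sorted input.
prod_list="prod-digests.txt"
backup_list="backup-digests.txt"

printf 'gcr.io/k8s/a@sha256:1\ngcr.io/k8s/b@sha256:2\n' | sort > "$prod_list"
printf 'gcr.io/k8s/a@sha256:1\n' | sort > "$backup_list"

# comm -23 prints lines present only in the first file: the images that
# exist in prod but not yet in the backup, i.e. the delta to copy.
comm -23 "$prod_list" "$backup_list" > delta.txt

# The real script would then copy only these references, instead of
# letting gcrane re-traverse all ~30K images on every run.
cat delta.txt
```

The point is that computing the delta up front is cheap (a list comparison), so the expensive copy step only touches new images, which is safe precisely because the backed-up copies are immutable.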
/assign @listx