support community owned GCS buckets for uploading results to TestGrid #332

Closed
BenTheElder opened this issue Aug 13, 2019 · 21 comments

Comments

@BenTheElder
Member

We currently have a somewhat ad-hoc process to allow uploading third-party (i.e. not produced by prow.k8s.io) test results to TestGrid / testgrid.k8s.io without requiring companies / organizations to supply their own GCS buckets.

This process results in using buckets provided by one of Google's teams, in a Google-owned project. Ideally these would be owned and controlled by the project instead.

Current docs are here: https://github.com/kubernetes/test-infra/tree/master/testgrid/conformance

cc @spiffxp

@dims
Member

dims commented Aug 13, 2019

@BenTheElder what are the expiry settings for this bucket? (How long before the data gets wiped?)

@BenTheElder
Member Author

@dims We'd need to ask @michelle192837 what the limits are there, I think it depends on the testgrid updater. Currently we haven't set one because:

  • The data size is typically very small (just some test logs)
  • The current process was meant to be temporary until we had funding etc. for a long term solution

@michelle192837
Contributor

Updater doesn't control the expiry, it should depend on the specific bucket's configuration. Is the question about the bucket where TestGrid results live or where conformance results are uploaded to?

@spiffxp
Member

spiffxp commented Aug 13, 2019

FWIW, any results generated by prow.k8s.io that land in the kubernetes-jenkins GCS bucket (the bulk of the project's CI) age out at 90 days. I think it should be way longer.

$ gsutil lifecycle get gs://kubernetes-jenkins | jq .
{
  "rule": [
    {
      "action": {
        "type": "Delete"
      },
      "condition": {
        "age": 90
      }
    }
  ]
}
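
A longer window would use the same lifecycle shape with a larger `age`, applied with `gsutil lifecycle set`. The following is a minimal sketch only: the helper name `write_lifecycle_config`, the 365-day value, and the `lifecycle.json` file name are illustrative assumptions, not values agreed in this thread.

```shell
# Hypothetical helper: print a lifecycle config that deletes objects
# older than the given number of days (same shape as the output above).
write_lifecycle_config() {
  age_days="$1"
  cat <<EOF
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": ${age_days}}
    }
  ]
}
EOF
}

# 365 is an illustrative value only.
write_lifecycle_config 365 > lifecycle.json

# To apply it (requires write access to the bucket):
#   gsutil lifecycle set lifecycle.json gs://kubernetes-jenkins
```

Generating the file rather than editing it by hand keeps the rule identical across buckets, so only the age differs.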

@BenTheElder
Member Author

> Updater doesn't control the expiry, it should depend on the specific bucket's configuration. Is the question about the bucket where TestGrid results live or where conformance results are uploaded to?

It doesn't control it directly, but presumably there's a lower bound on expiry or else risk of the updater not seeing the results?

@spiffxp
Member

spiffxp commented Aug 13, 2019

Survey of all buckets that kettle currently parses:
$ for b in $(<kettle/buckets.yaml yaml2json | jq -r 'to_entries[].key' | cut -d/ -f1-3); do echo "$b"; gsutil lifecycle get "$b"; done

gs://canonical-kubernetes-tests
gs://canonical-kubernetes-tests/ has no lifecycle configuration.
gs://canonical-kubernetes-tests
gs://canonical-kubernetes-tests/ has no lifecycle configuration.
gs://cel-conformance
AccessDeniedException: 403 [email protected] does not have storage.buckets.get access to cel-conformance.
gs://compute-image-tools-test
AccessDeniedException: 403 [email protected] does not have storage.buckets.get access to compute-image-tools-test.
gs://istio-circleci
gs://istio-circleci/ has no lifecycle configuration.
gs://istio-prow
gs://istio-prow/ has no lifecycle configuration.
gs://k8s-conformance-docker
AccessDeniedException: 403 [email protected] does not have storage.buckets.get access to k8s-conformance-docker.
gs://k8s-conformance-gardener
AccessDeniedException: 403 [email protected] does not have storage.buckets.get access to k8s-conformance-gardener.
gs://k8s-conformance-kind-arm64-openlab
gs://k8s-conformance-kind-arm64-openlab/ has no lifecycle configuration.
gs://k8s-conformance-openstack
gs://k8s-conformance-openstack/ has no lifecycle configuration.
gs://kubernetes-github-redhat
AccessDeniedException: 403 [email protected] does not have storage.buckets.get access to kubernetes-github-redhat.
gs://kubernetes-jenkins
{"rule": [{"action": {"type": "Delete"}, "condition": {"age": 90}}]}
gs://kubernetes-multiarch-e2e-results
BucketNotFoundException: 404 gs://kubernetes-multiarch-e2e-results bucket does not exist.
gs://origin-federated-results
AccessDeniedException: 403 [email protected] does not have storage.buckets.get access to origin-federated-results.
gs://pivotal-e2e-results
AccessDeniedException: 403 [email protected] does not have storage.buckets.get access to pivotal-e2e-results.
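
The repeated `gs://canonical-kubernetes-tests` entry above presumably arises because more than one key in `kettle/buckets.yaml` shares that bucket. A small sketch of reducing the key list to unique bucket roots before querying; `bucket_roots` is a hypothetical helper name and the sample paths are made up for illustration.

```shell
# Hypothetical helper: read gs:// paths on stdin and print each
# bucket root (gs://name) exactly once, sorted.
bucket_roots() {
  cut -d/ -f1-3 | sort -u
}

# Made-up sample input; two keys share the first bucket.
printf '%s\n' \
  'gs://example-a/logs/' \
  'gs://example-a/pr-logs/' \
  'gs://example-b/' | bucket_roots

# In the survey loop above, the helper would replace the bare `cut`:
#   ... | jq -r 'to_entries[].key' | bucket_roots
```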

@michelle192837
Contributor

> > Updater doesn't control the expiry, it should depend on the specific bucket's configuration. Is the question about the bucket where TestGrid results live or where conformance results are uploaded to?
>
> It doesn't control it directly, but presumably there's a lower bound on expiry or else risk of the updater not seeing the results?

I'm still confused I think, but the updater doesn't have expectations around this. Default lookback time is 24 hours, or longer if updating from scratch (days_of_results specified in the config) or max_test_runtime_hours if specified in the config.

@kevinzs2048

kevinzs2048 commented Aug 27, 2019

@BenTheElder Do we still need to implement this first?

@BenTheElder
Member Author

I would hope this would be a pretty small ask but I'm not sure.

It will be hard to get everyone migrated, so I'd like to not add more people to the Google-controlled buckets I manage if we can avoid it.

@dims what needs to happen here?

@dims
Member

dims commented Aug 27, 2019

@BenTheElder we should add a new script in https://github.com/kubernetes/k8s.io/tree/master/infra/gcp, based on the existing ones, that creates just a GCS bucket with the correct retention/auto-deletion. Once that's done, we can start migrating folks off of the Google-controlled buckets. We also have to figure out how to script the service account setup for the folks who need to push to those buckets.

@BenTheElder
Member Author

ACK @dims, sorry this got buried in my inbox :(
pinning this in a tab to get back to.

@BenTheElder
Member Author

retention policy might just be indefinite for these? logs are not particularly large.
otherwise prow's log bucket has the arbitrary policy of 90 days.

looks like we can script service account creation with https://cloud.google.com/iam/docs/creating-managing-service-accounts#iam-service-accounts-create-gcloud

gcloud beta iam service-accounts create [SA-NAME] \
    --description "[SA-DESCRIPTION]" \
    --display-name "[SA-DISPLAY-NAME]"

we'll also need to give that account permission to create GCS objects in the related bucket
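
The two steps could be scripted together roughly as follows. This is a dry-run sketch that only prints the commands; the helper name `print_uploader_setup` and all project, account, and bucket names are placeholders, and the `objectCreator` role is an assumption about the minimum permission an uploader needs.

```shell
# Hypothetical helper: print the commands that would create an uploader
# service account and let it create objects in one bucket.
# Arguments: <project-id> <service-account-name> <bucket-name>
print_uploader_setup() {
  project="$1"
  sa_name="$2"
  bucket="$3"
  # Service account emails follow NAME@PROJECT_ID.iam.gserviceaccount.com.
  sa_email="${sa_name}@${project}.iam.gserviceaccount.com"
  echo "gcloud beta iam service-accounts create ${sa_name} --project ${project} --display-name ${sa_name}"
  echo "gsutil iam ch serviceAccount:${sa_email}:objectCreator gs://${bucket}"
}

# Placeholder values for illustration.
print_uploader_setup example-project example-uploader example-conformance-bucket
```

Printing rather than executing makes it possible to review the exact bindings before anything is changed; piping the output to `sh` would run them.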

@stp-ip
Member

stp-ip commented Dec 19, 2019

Any reason why we would want to keep these around?
We already have a 60d retention for staging buckets for both human and gcb interaction.
To set the same expectations we could use the same 60d retention rate for this as well.

@spiffxp
Member

spiffxp commented Dec 19, 2019

I'm opposed to 60d retention for test results, and would prefer to see it expanded beyond the existing 90d. Such a short retention prevents us from comparing test results across quarters.

@ixdy
Member

ixdy commented Dec 19, 2019

FYI, the 90d retention on gs://kubernetes-jenkins is pretty arbitrary. For background, way back in 2016 I bumped it from 30d to 90d.

@BenTheElder
Member Author

for conformance tests from third parties we also expect relatively low amounts of extra data and relatively low rates of testing compared to our typical test result hosting.

@stp-ip
Member

stp-ip commented Jan 8, 2020

We discussed this partially in the call today (2020-01-08).

The suggestion was to use a production storage bucket, which would come with a 10y retention.

What is the current cost associated?
Probably storage cost won't be a factor, but might be nice to know beforehand.

Context:
Staging storage buckets are set to 60d and staging images might be between 60d to 6m (#525)
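
"Retention" in this thread can mean either a GCS retention policy (a minimum time objects must be kept) or lifecycle auto-deletion (a maximum age), and `gsutil` sets them with different subcommands. A sketch that only prints the relevant command; the helper name, the bucket names, and the reading of "10y" as a retention policy are assumptions.

```shell
# Hypothetical helper: print the gsutil command for one of the two settings.
#   retain <period> <bucket>       retention policy, e.g. 10y
#   expire <config.json> <bucket>  lifecycle rules from a JSON file
print_bucket_policy_cmd() {
  mode="$1"
  value="$2"
  bucket="$3"
  case "$mode" in
    retain) echo "gsutil retention set ${value} gs://${bucket}" ;;
    expire) echo "gsutil lifecycle set ${value} gs://${bucket}" ;;
    *) echo "unknown mode: ${mode}" >&2; return 1 ;;
  esac
}

# Placeholder bucket names for illustration.
print_bucket_policy_cmd retain 10y example-conformance-bucket
print_bucket_policy_cmd expire lifecycle.json example-staging-bucket
```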

@bartsmykla
Contributor

Hi everyone.

Currently we do have ensure-conformance-storage.sh with three buckets being created:

capi-openstack
cri-o
huaweicloud

@BenTheElder @spiffxp @dims can we mark it as done or are we missing anything here?

@dims
Member

dims commented Mar 30, 2020

/close

yep. this is done.

@k8s-ci-robot
Contributor

@dims: Closing this issue.

In response to this:

/close

yep. this is done.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@BenTheElder
Member Author

yes


9 participants