Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Setup a job to automatically run and PR the results of audit/audit-gcp.sh #244

Closed
spiffxp opened this issue May 1, 2019 · 35 comments · Fixed by kubernetes/test-infra#20742
Assignees
Labels
area/access Define who has access to what via IAM bindings, role bindings, policy, etc. area/audit Audit of project resources, audit followup issues, code in audit/ lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Milestone

Comments

@spiffxp
Copy link
Member

spiffxp commented May 1, 2019

/assign @hh
/wg k8s-infra

EDIT: redoing description entirely, things have changed since this issue was created

We want a prowjob that does the following:

  • runs on k8s-infra-prow-build-trusted
  • uses the k8s-infra-gcp-auditor GCP service account via workload identity
  • runs the audit/audit-gcp.sh script
  • if there are changes, opens or updates an existing PR to k8s.io (similar to how prow opens/updates autobump PRs)
  • (optionally) fails if the existing PR has been open for too long

I am super open to suggestions about whether there are better or more-actionable ways to do auditing. But we need to start with something.

I sketched out what such a job would look like here: https://github.com/bashfire/prow-config/blob/435a8039bc9cf496690ad572884a72e9608ebb4e/config/jobs/bashfire/k8s-io.yaml
This is one of the first things we want to setup on a freshly created k8s-infra cluster to be sure we actually have the cluster and all of the IAM policies / roles created properly

First run as Job, next run as CronJob

@spiffxp
Copy link
Member Author

spiffxp commented May 1, 2019

creating the cluster ref: #243

@spiffxp spiffxp added the area/access Define who has access to what via IAM bindings, role bindings, policy, etc. label May 1, 2019
@spiffxp
Copy link
Member Author

spiffxp commented May 1, 2019

CronJob would open a PR (akin to how prow maintains a bump PR?)

@hh
Copy link
Member

hh commented May 1, 2019

Depends on script in this PR: #213

@thockin thockin added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Jul 8, 2019
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 6, 2019
@scottilee
Copy link

@hh is this issue still relevant? If so, it seems like there should be a CronJob that periodically checks that a specified cluster has the required IAM policies and when it doesn't open a PR in this repo to fix? Let me know if I'm missing anything.

@cblecker
Copy link
Member

cblecker commented Nov 2, 2019

/remove-lifecycle stale

I think it is. I think the point is to run this regularly and then output the details

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 2, 2019
@thockin thockin added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Dec 11, 2019
@thockin thockin changed the title Setup a Job in k8s-infra cluster to scrape iam policies Setup a Job in k8s-infra cluster to scrape and audit iam policies Dec 11, 2019
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 10, 2020
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 9, 2020
@spiffxp
Copy link
Member Author

spiffxp commented Apr 15, 2020

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 15, 2020
@spiffxp
Copy link
Member Author

spiffxp commented Apr 27, 2020

I am noodling on something over in https://github.com/bashfire/prow-config/blob/master/config/jobs/bashfire/k8s-io.yaml

The job will output the diff at the end. What I'd like to do next is create a PR if there are any changes, and continue to update that PR until it's been merged

@spiffxp
Copy link
Member Author

spiffxp commented Apr 29, 2020

/assign @spiffxp @bartsmykla
bart to help with refactoring pr-creator

@spiffxp spiffxp modified the milestones: ready-to-migrate, v1.21 Jan 15, 2021
@rikatz
Copy link
Contributor

rikatz commented Jan 20, 2021

/lifecycle active
So bot doesn't close this :P
Will see at least if I can find someone to take a look into this and help :)

@k8s-ci-robot k8s-ci-robot added the lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. label Jan 20, 2021
@spiffxp
Copy link
Member Author

spiffxp commented Jan 21, 2021

Chatted with @hh about this

@spiffxp
Copy link
Member Author

spiffxp commented Jan 21, 2021

/uncc @bartsmykla
/priority critical-urgent
During yesterday's meeting @thockin suggested we should be making no more big changes to infra until we have this taken care of. I don't like blocking everything for too long, but I am inclined to agree.

@k8s-ci-robot k8s-ci-robot added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Jan 21, 2021
@hh
Copy link
Member

hh commented Jan 22, 2021

/assign @hh

@hh
Copy link
Member

hh commented Jan 22, 2021

I didn't get a chance to dive deep into this, but we have a somewhat working similar script that does currently create or update a PR when changes have occurred in conformance data underpinning apisnoop.cncf.io :

https://github.com/cncf/apisnoop/blob/gcb-snoodb-pr-gater/cloudbuild.yaml

@bernokl
Copy link
Contributor

bernokl commented Jan 26, 2021

For this job I would suggest using the working pr-creator from the yaml hh posted above,
The alternative is to use golang-pr creator, I am reluctant to incur the overhead of the golang-pr creator, when the solution is so short.
Is there a reason to use the entire complexity of golang when our entire job that currently works is only this?

cp -r coverage/[artifact] path/to/repo/[artifact]
cd repo
git diff > ../GIT_DIFF
if [ -s ../GIT_DIFF ]
then
git config --global user.email "[email protected]"
git config --global user.name "CNCF CI Bot"
git checkout -b snoop-auto-updates
git add .
git commit -m "Changes in coverage diff $(date)"
git push --force https://$(cat ../github-token.txt)@github.com/cncf/apisnoop snoop-auto-updates
GIT_EDITOR=cat GITHUB_TOKEN=$(cat ../github-token.txt) ../hub pr list | grep "Changes in coverage diff" > ../PR_EXISTS
if [ -s ../PR_EXISTS ]
then
echo "PR exists already"
else
GIT_EDITOR=cat GITHUB_TOKEN=$(cat ../github-token.txt) ../hub pull-request -p
fi

@bernokl
Copy link
Contributor

bernokl commented Jan 26, 2021

Currently we use gcb for our build which means our credentials/secrets are in the gcb project.
We saw that your mockup for this job uses google secrets in trusted cluster.
How do you see the job getting access to those secrets?
Do we want to run this as a gcb job with its own secrets or should this job run in the trusted cluster? it is currently a multi-step job

@spiffxp
Copy link
Member Author

spiffxp commented Feb 2, 2021

/remove-help
Since I think this is being worked by some of @hh's team.

Discussed in meetings but to close the loop here:

  • this should run as a prowjob on k8s-infra-prow-build-trusted, the gcp-auditor service account is available there
  • the github token could either be stored in secret manager or could be added to k8s-infra-prow-build-trusted as a secret (similar to how secrets are manually added to the default cluster used by prow.k8s.io)

@k8s-ci-robot k8s-ci-robot removed the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Feb 2, 2021
@spiffxp
Copy link
Member Author

spiffxp commented Feb 16, 2021

/reopen
/area infra/auditing

@k8s-ci-robot k8s-ci-robot added the area/audit Audit of project resources, audit followup issues, code in audit/ label Feb 16, 2021
@k8s-ci-robot
Copy link
Contributor

@spiffxp: Reopened this issue.

In response to this:

/reopen
/area infra/auditing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Feb 16, 2021
@spiffxp
Copy link
Member Author

spiffxp commented Feb 16, 2021

#1648 - auditor account didn't have access to secretmanager, add a one-off script for this for now

Sample audit PR's manually opened by @hh

I have /held these PRs because I would like to merge the PR I opened last month that annotated many of the changes that have happened in the past 7 months commit by commit. I am in the midst of updating it right now, and suggest I leave out some of the questionable changes I can't account for, for the automation to PR instead (#1534 (comment))

@hh
Copy link
Member

hh commented Feb 24, 2021

We now have a set of partial-audits PR created via a prow job.
They are smaller, though some touch 450 files (boskos-*).
Let me know if some. need broken down into smaller PRs.

It was created by using a gist of a slightly modify audit-gcp.sh over a few runs, using a different branch each time.

The changes to the job are here:

And the resulting list of 10 PRs:

@spiffxp
Copy link
Member Author

spiffxp commented Feb 25, 2021

Copy-pasting from #1676 (comment)

The partial-audit PR's opened by cncf-ci caught almost everything. I reviewed and approved them, but I can't do anything further to merge them until the account has signed the CLA. Use /check-cla to refresh once you've gotten it signed, and they should merge (hopefully without conflict)

The outstanding changes I didn't see covered in them are:

  • audit: add psharma to org admins - a9efe55
  • audit: add psharma as owner to kubernetes-public - f8a9628
  • audit: update gcp-auditor org-scoped roles - f0d0b63

@spiffxp
Copy link
Member Author

spiffxp commented Mar 4, 2021

/close
Thank you for your help in getting this over the line @hh!

I think we can call this done now:

I have some quality of life suggestions for improvements, but I will make those elsewhere.

@k8s-ci-robot
Copy link
Contributor

@spiffxp: Closing this issue.

In response to this:

/close
Thank you for your help in getting this over the line @hh!

I think we can call this done now:

I have some quality of life suggestions for improvements, but I will make those elsewhere.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/access Define who has access to what via IAM bindings, role bindings, policy, etc. area/audit Audit of project resources, audit followup issues, code in audit/ lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants