Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TEP-0091] Add Verification at reconciler #5581

Merged

Conversation

Yongxuanzhang
Copy link
Member

@Yongxuanzhang Yongxuanzhang commented Sep 29, 2022

Changes

This commit is part of TEP-0091, before this commit we only have signing and verification functions but not used. This commit adds verification at reconciler after remote resolution and local resolve is done.

There are mainly 2 parts in this PR:

  1. New feature flag to enable trusted resources and configmap to configure public keys under pkg/apis/config
  2. Task and Pipeline verification under pkg/reconciler

Notes that KMS key is not supported in this commit since the dependency is not fully imported. This commit supports read pem file to public key.

/kind feature

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

  • Has Docs included if any changes are user facing
  • Has Tests included if any functionality added or changed
  • Follows the commit message standard
  • Meets the Tekton contributor standards (including
    functionality, content, code)
  • Has a kind label. You can add one by adding a comment on this PR that contains /kind <type>. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tep
  • Release notes block below has been updated with any user facing changes (API changes, bug fixes, changes requiring upgrade notices or deprecation warnings)
  • Release notes contains the string "action required" if the change requires additional action from users switching to the new release

Release Notes

Trusted Resource feature enable tekton pipeline to verify the resources resolved from resolver. With trusted resource feature, users can configure public keys in configmap and choose to turn on/off this feature via feature flag `resource-verification-mode`. This commit enables mount public key files as secrets into Pipeline and used for verification. Taskrun/Pipelinerun that fail the verification will be marked as `failed` and be stopped from execution if `resource-verification-mode` is set to `enforce`

@tekton-robot
Copy link
Collaborator

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@tekton-robot tekton-robot added release-note-none Denotes a PR that doesnt merit a release note. kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Sep 29, 2022
@tekton-robot tekton-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Sep 29, 2022
@Yongxuanzhang Yongxuanzhang changed the title [TEP-0091] Add Sign CMD and Verification at reconciler [TEP-0091] Add Sign&Verify CMD and Verification at reconciler Sep 29, 2022
@Yongxuanzhang Yongxuanzhang marked this pull request as ready for review September 29, 2022 17:28
@tekton-robot tekton-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 29, 2022
@Yongxuanzhang
Copy link
Member Author

/assign @wlynch

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/sign/main.go Do not exist 14.9%
cmd/verify/main.go Do not exist 22.6%
pkg/apis/config/feature_flags.go 78.9% 78.8% -0.2
pkg/apis/config/store.go 56.2% 55.6% -0.7
pkg/apis/config/trusted_resources.go Do not exist 66.7%
pkg/reconciler/pipelinerun/resources/pipelineref.go 89.7% 87.3% -2.5
pkg/reconciler/taskrun/resources/taskref.go 85.7% 84.7% -1.0
pkg/reconciler/trustedresources/verify.go 87.5% 88.6% 1.1
test/controller.go 19.8% 19.2% -0.6

@Yongxuanzhang
Copy link
Member Author

/assign @dibyom

cmd/sign/main.go Outdated
}
}
if *kmsKey != "" {
signer, err = kms.Get(ctx, *kmsKey, crypto.SHA256)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/s KMS doesn't work quite the same way as cosign since it's trying to minimize deps - you need to import all of the desired providers from https://github.com/sigstore/sigstore/tree/main/pkg/signature/kms manually, which will call ProviderInit to register the prefix.

The reason for this is that the providers are a big source of the massive dependency creep in cosign and related tools, since the cloud providers will pull in their corresponding SDKs.

I don't think we want to put this dependency burden on pipelines - at the very least you should make this its own module, though you may find it easier to just break this out into its own repo in the tektoncd org.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followed up off-thread - KMS dependency is likely unavoidable since we'll need it on the reconciler side anyway.

Import away 🙈

Copy link
Member

@wlynch wlynch Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would still recommend a separate repo for this long term though - you'll likely find it easier to maintain / distribute. Process isn't too bad - just follow the process here: https://github.com/tektoncd/community/blob/main/process.md#proposing-projects

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I remove those deps from this PR? I just tried to push them in a new commit and added 13k lines of code. 😢
Then this commit doesn't really work for kms. (Can be added later when we have a separate repo)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that sounds about right. It's kinda unavoidable given that you're pulling in the SDKs.

limitations under the License.
*/

package main
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of making 2 separate binaries, start prepping this to be a tkn plugin by making this 1 single binary with subcommands.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to keep these 2 binaries before tkn support? Then we depreciate/remove them.
The tkn side may take longer time, including TEP and implementation. Just worried we have no way to sign files

Copy link
Member

@wlynch wlynch Sep 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need a TEP for a tkn plugin - you just have to install the binary with the name tkn-<plugin name> https://github.com/tektoncd/cli/blob/78665b7ea7583a2e76d85d5996924d5bd5ebad82/pkg/plugins/plugins.go#L34-L50

You could still do this with 2 binaries if you wanted, though it becomes a little more annoying to distribute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add these into tkn as subcommand. Created an issue here tektoncd/cli#1733.
So for this commit in the doc I will point to experimental for signing, and when tkn supports we can update the doc

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/sign/main.go Do not exist 14.9%
cmd/verify/main.go Do not exist 22.6%
pkg/apis/config/feature_flags.go 78.9% 78.8% -0.2
pkg/apis/config/store.go 56.2% 55.6% -0.7
pkg/apis/config/trusted_resources.go Do not exist 66.7%
pkg/reconciler/pipelinerun/resources/pipelineref.go 89.7% 87.3% -2.5
pkg/reconciler/taskrun/resources/taskref.go 85.7% 84.7% -1.0
pkg/reconciler/trustedresources/verify.go 87.5% 88.6% 1.1
test/controller.go 19.8% 19.2% -0.6

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/sign/main.go Do not exist 14.9%
cmd/verify/main.go Do not exist 22.6%
pkg/apis/config/feature_flags.go 78.9% 78.8% -0.2
pkg/apis/config/store.go 56.2% 55.6% -0.7
pkg/apis/config/trusted_resources.go Do not exist 66.7%
pkg/reconciler/pipelinerun/resources/pipelineref.go 89.7% 87.3% -2.5
pkg/reconciler/taskrun/resources/taskref.go 85.7% 84.7% -1.0
pkg/reconciler/trustedresources/verify.go 87.5% 88.6% 1.1
test/controller.go 19.8% 19.2% -0.6

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/sign/main.go Do not exist 14.9%
cmd/verify/main.go Do not exist 22.6%
pkg/apis/config/feature_flags.go 78.9% 78.8% -0.2
pkg/apis/config/store.go 56.2% 55.6% -0.7
pkg/apis/config/trusted_resources.go Do not exist 66.7%
pkg/reconciler/pipelinerun/resources/pipelineref.go 89.7% 87.3% -2.5
pkg/reconciler/taskrun/resources/taskref.go 85.7% 84.7% -1.0
pkg/reconciler/trustedresources/verify.go 87.5% 88.6% 1.1
test/controller.go 19.8% 19.2% -0.6

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
cmd/sign/main.go Do not exist 14.9%
cmd/verify/main.go Do not exist 22.6%
pkg/apis/config/feature_flags.go 78.9% 78.8% -0.2
pkg/apis/config/store.go 56.2% 55.6% -0.7
pkg/apis/config/trusted_resources.go Do not exist 66.7%
pkg/reconciler/pipelinerun/resources/pipelineref.go 89.7% 87.3% -2.5
pkg/reconciler/taskrun/resources/taskref.go 85.7% 84.7% -1.0
pkg/reconciler/trustedresources/verify.go 87.5% 88.6% 1.1
test/controller.go 19.8% 19.2% -0.6

@abayer
Copy link
Contributor

abayer commented Sep 29, 2022

Why do we need these new cmd/...s in pipeline.git, rather than, say, tkn.git, or a new repo entirely? We don't have any other binaries in this repo that are meant to be run directly by users, just the binaries for our deployments and the binaries used behind the scenes in pods created for TaskRuns. I'm going to put a hold on this for now, til that question is resolved.

/hold

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 29, 2022
@Yongxuanzhang
Copy link
Member Author

Why do we need these new cmd/...s in pipeline.git, rather than, say, tkn.git, or a new repo entirely? We don't have any other binaries in this repo that are meant to be run directly by users, just the binaries for our deployments and the binaries used behind the scenes in pods created for TaskRuns. I'm going to put a hold on this for now, til that question is resolved.

/hold

Sure, I will remove them and add to tkn

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.9% 78.8% -0.2
pkg/apis/config/store.go 56.2% 55.6% -0.7
pkg/apis/config/trusted_resources.go Do not exist 66.7%
pkg/reconciler/pipelinerun/resources/pipelineref.go 89.7% 87.3% -2.5
pkg/reconciler/taskrun/resources/taskref.go 85.7% 84.7% -1.0
pkg/reconciler/trustedresources/verify.go 87.5% 88.6% 1.1
test/controller.go 19.8% 19.2% -0.6

@abayer
Copy link
Contributor

abayer commented Sep 29, 2022

@tekton-robot tekton-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 31, 2022
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 73.8%
test/controller.go 19.8% 30.4% 10.6
test/trustedresources.go Do not exist 15.4%

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 86.3%
test/controller.go 19.8% 30.4% 10.6
test/trustedresources.go Do not exist 15.4%

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 86.3%
test/controller.go 19.8% 30.4% 10.6
test/trustedresources.go Do not exist 15.4%

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 86.3%
test/controller.go 19.8% 30.4% 10.6
test/trustedresources.go Do not exist 15.4%

@Yongxuanzhang
Copy link
Member Author

/retest

Copy link
Collaborator

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for adding e2e tests for this!!

I wonder if it would make sense to reduce the number of e2e test cases? (Because e2e tests are a bit slow and also tend to be difficult to maintain)

The test cases we have are:

  • "Signed Task Passes Verification",
  • "Tampered Task Fails Verification with tampered content",
  • "Unsigned Task Fails Verification without signature",
  • "Signed Pipeline Passes Verification",
  • "Tampered Pipeline Fails Verification with tampered content",
  • "Unsigned Pipeline Fails Verification without signature",

I think it would be worth thinking about:

  1. What would cause one of these tests to fail that wouldn't be caught by one of the other e2e test cases?
  2. How many of these are directly duplicating test cases we have at the reconciler and unit test level?

Thinking about (1) I'm wondering if we need to cover both Pipelines and Tasks at this level - feel free to push back on this b/c maybe the logic is significantly different but I'm wondering if the logic is similar enough that if there was a problem, both the Pipeline and Task tests would fail (i.e. we only need one of the two sets)

Thinking about (2) I'm wondering if we need to test the error cases at all at this level (or the reverse? maybe we only test that tampered tasks fail) - since the main goal of e2e tests is to make sure everything is wired up properly and we want to leave the specifics to other (faster) tests, maybe we can get away with less here

^^ But take that all with a grain of salt! Maybe it's worth having 6 separate tests here. Or maybe we can just make the Pipeline vs Task optimization and stick with 3. Either way thanks again for writing these!

)

func init() {
os.Setenv("PRIVATE_PASSWORD", password)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this environment variable just for the test or part of the feature? Either way could we document it somewhere?

Copy link
Member Author

@Yongxuanzhang Yongxuanzhang Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh sorry, this is only for the test. I will add comment for it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added comment

ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()
c, namespace := setup(ctx, t, requireAnyGate(neededFeatureFlags))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apologies for the potential scope creep here, but i want to double check that setting these flags as part of this test won't impact any other tests, do you know if that's the case? i'm wondering specifically two things:

  • do we run the e2e tests in parallel? if so, do we do anything to make sure that other tests aren't impacted by controller wide config settings like this?
  • if the test gets interrupted, will teardown occur that will reset these feature flags? (i would expect that the function being called by CleanupOnInterrupt would do something to reset these?)

In any case it's probably outside of the scope of this PR to do anything about these but i want to make sure that if these need to be addressed we do something to track that - BUT if these are already addressed, we should at least document how this works somewhere (maybe in a docstring for the setup function?) - or if there are already docs for this feel free to just point me at them 🙏

Copy link
Member Author

@Yongxuanzhang Yongxuanzhang Nov 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is the tricky part of the tests.

  1. This setup function doesn't set the feature flag, it will only check if the needed flags are enabled or not, if not it will skip the current test. So here since we require alpha feature flag, the test will only run in alpha tests.
c, namespace := setup(ctx, t, requireAnyGate(neededFeatureFlags))
  1. In the current PR we turn on resource-verification-mode by making api request to the api server and will turn it off after the test finished (One issue is that we don't have the secret lister from client side, so there might be some latency between making the call and it really happen).
    Though it seems working now since the CI passes and we're not running them in parallel (no -parallel).
    go_test_e2e -timeout=${E2E_GO_TEST_TIMEOUT} ./test/... || failed=1

    I can add comment here that this would be trouble if we run tests in parallel.

I cannot think of a good solution here since this might be the first case that one feature flag may break others. It may not be worth to add a new CI job at this moment though it can be a way to fix

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setup function doesn't set the feature flag, it will only check if the needed flags are enabled or not, if not it will skip the current test.

ah, kk, thanks for explaining!

I can add comment here that this would be trouble if we run tests in parallel.

kk sounds good - it would be nice if we could actually fail the test if someone tries to do it (if this went wrong, tests might just start randomly failing and it might be hard to trace that back to this comment) but I'm not sure how to do that (probably possible tho?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be not so hard to trace, since the error message will tell that the error is because we enable this feature flag. I think maybe I should raise this to the community and seek more feedback?

t.Fatalf("Failed to create PipelineRun `%s`: %s", pr.Name, err)
}

if tc.wantErr {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend avoiding using the same test code to test error and success cases. it's a Tekton wide standard at https://github.com/tektoncd/community/blob/main/standards.md#tests - would be nice if we included some of the reasoning for this in the docs itself but in the meantime you can see it in the discussion here tektoncd/community#133 (comment) - TL;DR: it makes code confusing and brittle and is much clearer and easier to maintain as separate test cases (even within this test to reason about what it's doing you have to understand that there are multiple if tc.wantErr blocks that completely change the test funcitonality)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I started with having error and success cases separately but merge them to reduce the redundant code. I will divide them again

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will keep this standard in mind!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed this in latest commit

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much! (If there is code it makes sense for them to share, you could always consider creating a library for that code - but also a little repetition is often not as bad as it seems - 'repetition is better than the wrong abstraction' :D (https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction)

@Yongxuanzhang
Copy link
Member Author

Thanks so much for adding e2e tests for this!!

I wonder if it would make sense to reduce the number of e2e test cases? (Because e2e tests are a bit slow and also tend to be difficult to maintain)

The test cases we have are:

  • "Signed Task Passes Verification",
  • "Tampered Task Fails Verification with tampered content",
  • "Unsigned Task Fails Verification without signature",
  • "Signed Pipeline Passes Verification",
  • "Tampered Pipeline Fails Verification with tampered content",
  • "Unsigned Pipeline Fails Verification without signature",

I think it would be worth thinking about:

  1. What would cause one of these tests to fail that wouldn't be caught by one of the other e2e test cases?
  2. How many of these are directly duplicating test cases we have at the reconciler and unit test level?

Thinking about (1) I'm wondering if we need to cover both Pipelines and Tasks at this level - feel free to push back on this b/c maybe the logic is significantly different but I'm wondering if the logic is similar enough that if there was a problem, both the Pipeline and Task tests would fail (i.e. we only need one of the two sets)

Thinking about (2) I'm wondering if we need to test the error cases at all at this level (or the reverse? maybe we only test that tampered tasks fail) - since the main goal of e2e tests is to make sure everything is wired up properly and we want to leave the specifics to other (faster) tests, maybe we can get away with less here

^^ But take that all with a grain of salt! Maybe it's worth having 6 separate tests here. Or maybe we can just make the Pipeline vs Task optimization and stick with 3. Either way thanks again for writing these!

Yeah this is great suggestion! I thought about this when writing the pipeline test. I think the pipeline case with a taskref should be enough since it will also check if the task can pass the verification.

They are all duplicate tests from the unit tests. 😅 Yeah I think we can only keep the Pipeline success and error which pipeline or task is tampered with.

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 86.3%

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 86.3%
test/controller.go 19.8% 30.4% 10.6
test/trustedresources.go Do not exist 15.4%


var (
neededFeatureFlags = map[string]string{
"resource-verification-mode": "enforce",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if i'm understanding how requireAnyGate works correctly, I think there might be something odd about including this flag here? it seems like requireAnyGate will run the current test if any of the passed in flags are set - since this test is providing both resource-verification-mode and enable-api-fields, am I right that that means the test skipping/running would be:

  1. resource-verification-mode set, enable-api-fields, set <-- run the test
  2. resource-verification-mode unset (which is what we'd expect i think), enable-api-fields, set <-- run the test
    3.resource-verification-mode set, enable-api-fields, unset <-- run the test
    4.resource-verification-mode unset, enable-api-fields, unset <-- skip the test

i.e. it would only skip the test in case (4)? i think we'd want to skip the test in case (3) as well? long story short I'm wondering if we should only pass enable-api-fields to requireAnyGate

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're totally correct! It will run if any of these 2 are set. It totally makes sense that we only pass enable-api-fields, I will change it accordingly.

t.Fatalf("Failed to create PipelineRun `%s`: %s", pr.Name, err)
}

if tc.wantErr {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much! (If there is code it makes sense for them to share, you could always consider creating a library for that code - but also a little repetition is often not as bad as it seems - 'repetition is better than the wrong abstraction' :D (https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction)

Copy link
Collaborator

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! A few more comments - and apologies for bringing up a whole new thing around wantErr 🙏

I think the pipeline case with a taskref should be enough since it will also check if the task can pass the verification.

ah good call, that makes a lot of sense

Yeah I think we can only keep the Pipeline success and error which pipeline or task is tampered with.

great!! :D

Namespace: namespace,
Annotations: map[string]string{"foo": "bar"},
},
wantErr: true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apologies, I gave you some feedback about wantErr in the context of the e2e tests but that would apply to the rest of these tests as well - I strongly recommend separating out the error cases into their own tests for improved readability and maintainability

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure! Thanks for reminding me of those tests. Will do!

ctx := context.Background()
ctx, cancel := context.WithCancel(ctx)
defer cancel()
c, namespace := setup(ctx, t, requireAnyGate(neededFeatureFlags))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setup function doesn't set the feature flag, it will only check if the needed flags are enabled or not, if not it will skip the current test.

ah, kk, thanks for explaining!

I can add comment here that this would be trouble if we run tests in parallel.

kk sounds good - it would be nice if we could actually fail the test if someone tries to do it (if this went wrong, tests might just start randomly failing and it might be hard to trace that back to this comment) but I'm not sure how to do that (probably possible tho?)

}

knativetest.CleanupOnInterrupt(func() { tearDown(ctx, t, c, namespace) }, t.Logf)
defer tearDown(ctx, t, c, namespace)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just triple checking - will resource-verification-mode be reset back to it's original value after the test even if the test is interrupted or fails? (i.e. does teardown reset the value?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I didn't put the cleanup code inside the tearDown, I feel like tearDown is mainly used to clean the namespace we created for this test. Our current test need to reset the configmap at system namespace, so to address your comment I create a new helperfunction within this test to clean up the created secret and configmap.

The defer will enable the function to be executed even if the test fails. I have tested this behaviour.

Thank you very much for pointing this out!!

@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 86.3%
test/controller.go 19.8% 30.4% 10.6
test/trustedresources.go Do not exist 15.4%

This commit is part of TEP-0091, before this commit we only have signing
and verification functions but not used. This commit adds verification at
reconciler after remote resolution and local resolution is done.
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/config/feature_flags.go 78.3% 81.2% 2.8
pkg/apis/config/store.go 56.2% 72.2% 16.0
pkg/apis/config/trusted_resources.go Do not exist 75.0%
pkg/reconciler/pipelinerun/pipelinerun.go 85.4% 85.5% 0.2
pkg/reconciler/pipelinerun/resources/pipelineref.go 88.9% 92.2% 3.3
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 94.7% 94.7% 0.0
pkg/reconciler/taskrun/resources/taskref.go 84.9% 88.2% 3.3
pkg/reconciler/taskrun/taskrun.go 81.3% 81.4% 0.1
pkg/trustedresources/verify.go Do not exist 86.3%
test/controller.go 19.8% 30.4% 10.6
test/trustedresources.go Do not exist 15.4%

@Yongxuanzhang
Copy link
Member Author

/retest

Copy link
Collaborator

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the revisions!! The e2e test cases are looking great, I really like the simplicity of having just the 2 cases (one success, one error) 🎉

/lgtm


c, namespace, secretName, signer := setupResourceVerificationConfig(ctx, t, requireAnyGate(neededFeatureFlags))
knativetest.CleanupOnInterrupt(func() { removeResourceVerificationConfig(ctx, t, c, namespace, secretName) }, t.Logf)
defer removeResourceVerificationConfig(ctx, t, c, namespace, secretName)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!! :D :D :D thanks for all the back and forth on this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!!!

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 2, 2022
@tekton-robot tekton-robot merged commit 2d38f5f into tektoncd:main Nov 2, 2022
@JeromeJu JeromeJu mentioned this pull request Nov 2, 2022
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants