RFC: Basic performance test #4378

guillaumerose · 2021-11-17T13:46:01Z

I am trying to write performance test for Tekton. I am looking at finding good metrics to follow and good test scenario.
Here is one: I create N taskRuns per second and observe how long it takes for the controller to create the pod related to each taskRun.

At the beginning, I thought it will be done in few seconds but even at a small rate (2-3 taskRuns per second), after 3min, the controller takes 30 seconds between taskRun creation and pod creation. It seems to be better if I tune QPS and burst of the k8s client of the controller.

I hope I can turn this into a real test that can run in the CI. Let me know what you think, what I should improve!

tekton-robot · 2021-11-17T13:46:03Z

@guillaumerose: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

tekton-robot · 2021-11-17T13:46:10Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign afrittoli
You can assign the PR to them by writing /assign @afrittoli in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

imjasonh · 2021-11-17T13:57:56Z

@guillaumerose Thanks for picking this up and especially for your interest in formalizing and automating the tests. I've definitely done similar load tests in the past without observing the slowness you're seeing, so we must have had a regression at some point. Running these tests regularly would help prevent those in the future. 👍

vdemeester · 2021-11-17T15:10:55Z

/kind feature

vdemeester

I agree with @imjasonh, I think it would be nice to have those running alongside with the nightly build for example. We can start with this very simple test and grow it iteratively.

bobcatfish · 2021-11-17T23:17:13Z

I have a couple of requests:

That we prioritize setting up the automation to run these tests on a nightly basis (and make results visible) - if we're not able to do that I wouldn't want to add tests to the codebase that we don't exercise (this is why we didnt end up merging Initialize load testing #3554 )
I think it would be good to identify the scenarios we want to exercise in the tests up and agree up front before merging, TEP-0036 tries to start the process of defining what we need, @guillaumerose if you're at all interested in picking up that TEP and making a proposal around it you are more than welcome :D

guillaumerose · 2021-11-18T09:47:29Z

Thanks all!

@bobcatfish I will add my thoughts in a PR in this TEP, then we can come back to this 👍

vdemeester · 2021-11-18T09:51:52Z

That we prioritize setting up the automation to run these tests on a nightly basis (and make results visible) - if we're not able to do that I wouldn't want to add tests to the codebase that we don't exercise (this is why we didnt end up merging Initialize load testing #3554 )

@bobcatfish this is a bit chicken-and-egg problem here. We need something to execute on to setup the nightly runs, hence having this in would be first step, 2nd step would be to setup a nightly in dogfooding, and the steps after would be to refine and enhance.

lbernick · 2021-11-19T20:14:30Z

test/performance_test.go

+	rateLimitedCfg.RateLimiter = flowcontrol.NewTokenBucketRateLimiter(qps, burst)
+	rateLimitedKubeClient, rateLimitedCS := clientSets(t, rateLimitedCfg)
+
+	wg := sync.WaitGroup{}


I know there's some ongoing discussion about what exactly to benchmark, but I recently discovered testing.B and I'm wondering if it could be helpful here? It looks like some of timing + waitgroup logic could be replaced with b.runParallel.

Ah I though @guillaumerose used testing.B.., but turns out not. Yes, maybe using it might make sense indeed.

tekton-robot · 2022-02-20T17:23:29Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

tekton-robot · 2022-03-22T17:36:28Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

tekton-robot · 2022-04-21T18:28:43Z

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

tekton-robot · 2022-04-21T18:28:45Z

@tekton-robot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Basic performance test

79be874

tekton-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 17, 2021

tekton-robot requested review from imjasonh and vdemeester November 17, 2021 13:46

tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Nov 17, 2021

tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 17, 2021

vdemeester reviewed Nov 17, 2021

View reviewed changes

lbernick reviewed Nov 19, 2021

View reviewed changes

tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 20, 2022

tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 22, 2022

tekton-robot closed this Apr 21, 2022

This was referenced Sep 2, 2022

Improve DAG validation for pipelines with hundreds of tasks #5421

Merged

Optimize Tekton Pipeline #5452

Open

lbernick mentioned this pull request Jan 24, 2023

Nightly performance tests #6036

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Basic performance test #4378

RFC: Basic performance test #4378

guillaumerose commented Nov 17, 2021

tekton-robot commented Nov 17, 2021

tekton-robot commented Nov 17, 2021

imjasonh commented Nov 17, 2021

vdemeester commented Nov 17, 2021

vdemeester left a comment

bobcatfish commented Nov 17, 2021

guillaumerose commented Nov 18, 2021

vdemeester commented Nov 18, 2021

lbernick Nov 19, 2021

vdemeester Nov 22, 2021

tekton-robot commented Feb 20, 2022

tekton-robot commented Mar 22, 2022

tekton-robot commented Apr 21, 2022

tekton-robot commented Apr 21, 2022

RFC: Basic performance test #4378

RFC: Basic performance test #4378

Conversation

guillaumerose commented Nov 17, 2021

tekton-robot commented Nov 17, 2021

tekton-robot commented Nov 17, 2021

imjasonh commented Nov 17, 2021

vdemeester commented Nov 17, 2021

vdemeester left a comment

Choose a reason for hiding this comment

bobcatfish commented Nov 17, 2021

guillaumerose commented Nov 18, 2021

vdemeester commented Nov 18, 2021

lbernick Nov 19, 2021

Choose a reason for hiding this comment

vdemeester Nov 22, 2021

Choose a reason for hiding this comment

tekton-robot commented Feb 20, 2022

tekton-robot commented Mar 22, 2022

tekton-robot commented Apr 21, 2022

tekton-robot commented Apr 21, 2022