
[Flaky test] Tests in vendor/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go are flaking #114145

Closed
Rajalakshmi-Girish opened this issue Nov 25, 2022 · 28 comments · Fixed by #114940
Labels
kind/flake Categorizes issue or PR as related to a flaky test. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@Rajalakshmi-Girish
Contributor

Which jobs are flaking?

https://prow.ppc64le-cloud.org/job-history/s3/ppc64le-prow-logs/logs/postsubmit-master-golang-kubernetes-unit-test-ppc64le

Which tests are flaking?

Tests in vendor/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go are flaking when run with Go built from master on ppc64le.

TestGracefulTerminationWithKeepListeningDuringGracefulTerminationDisabled
TestGracefulTerminationWithKeepListeningDuringGracefulTerminationEnabled
TestMuxAndDiscoveryComplete
TestPreShutdownHooks/ShutdownSendRetryAfter_is_disabled
TestPreShutdownHooks/ShutdownSendRetryAfter_is_enabled

Since when has it been flaking?

After the commit golang/go@8a81fdf

Testgrid link

No response

Reason for failure (if possible)

The request to the API server is timing out at https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go#L861

The tests pass when the timeout value is increased to 200ms:

[root@raji-workspace server]# go version
go version devel go1.20-8a81fdf165 Sat Nov 19 16:48:07 2022 +0000 linux/ppc64le
[root@raji-workspace server]# go test -race -run TestMuxAndDiscoveryComplete
W1125 12:05:42.845733 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:42.845988 3169366 authentication.go:47] Authentication is disabled
I1125 12:05:42.859571 3169366 secure_serving.go:210] Serving securely on [::]:46773
I1125 12:05:42.859730 3169366 tlsconfig.go:240] "Starting DynamicServingCertificateController"
--- FAIL: TestMuxAndDiscoveryComplete (5.21s)
    genericapiserver_graceful_termination_test.go:890: Sending request - timeout: 100ms, url: https://127.0.0.1:46773/echo?message=attempt-1
    genericapiserver_graceful_termination_test.go:995: [server] seen new connection: &net.TCPConn{conn:net.conn{fd:(*net.netFD)(0xc0007a6080)}}
    genericapiserver_graceful_termination_test.go:865: Still waiting for the server to start - err: <nil>
    genericapiserver_graceful_termination_test.go:890: Sending request - timeout: 100ms, url: https://127.0.0.1:46773/echo?message=attempt-2
    genericapiserver_graceful_termination_test.go:995: [server] seen new connection: &net.TCPConn{conn:net.conn{fd:(*net.netFD)(0xc0007a6100)}}
    ........
    ........
    genericapiserver_graceful_termination_test.go:995: [server] seen new connection: &net.TCPConn{conn:net.conn{fd:(*net.netFD)(0xc000ace180)}}
    genericapiserver_graceful_termination_test.go:865: Still waiting for the server to start - err: <nil>
    genericapiserver_graceful_termination_test.go:878: The server has failed to start - err: timed out waiting for the condition
W1125 12:05:48.059617 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:48.059691 3169366 authentication.go:47] Authentication is disabled
W1125 12:05:48.061665 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:48.061725 3169366 authentication.go:47] Authentication is disabled
W1125 12:05:48.063942 3169366 authorization.go:47] Authorization is disabled
W1125 12:05:48.063993 3169366 authentication.go:47] Authentication is disabled
FAIL
exit status 1
FAIL    k8s.io/apiserver/pkg/server     5.428s
[root@raji-workspace server]#

The run passes after increasing the timeout:

[root@raji-workspace server]# vi genericapiserver_graceful_termination_test.go +861
[root@raji-workspace server]# git diff
diff --git a/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go b/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
index c18ce70c4ea..419e8fb3308 100644
--- a/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
+++ b/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
@@ -858,7 +858,7 @@ func waitForAPIServerStarted(t *testing.T, doer doer) {
        client := newClient(true)
        i := 1
        err := wait.PollImmediate(100*time.Millisecond, 5*time.Second, func() (done bool, err error) {
-               result := doer.Do(client, func(httptrace.GotConnInfo) {}, fmt.Sprintf("/echo?message=attempt-%d", i), 100*time.Millisecond)
+               result := doer.Do(client, func(httptrace.GotConnInfo) {}, fmt.Sprintf("/echo?message=attempt-%d", i), 200*time.Millisecond)
                i++

                if result.err != nil {
[root@raji-workspace server]# go test -race -run TestMuxAndDiscoveryComplete
W1125 12:13:40.991970 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:40.992158 3178806 authentication.go:47] Authentication is disabled
I1125 12:13:41.004942 3178806 secure_serving.go:210] Serving securely on [::]:44603
I1125 12:13:41.004991 3178806 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W1125 12:13:44.200081 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:44.200164 3178806 authentication.go:47] Authentication is disabled
W1125 12:13:44.201983 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:44.202041 3178806 authentication.go:47] Authentication is disabled
W1125 12:13:44.203595 3178806 authorization.go:47] Authorization is disabled
W1125 12:13:44.203649 3178806 authentication.go:47] Authentication is disabled
PASS
ok      k8s.io/apiserver/pkg/server     3.445s
[root@raji-workspace server]#

Anything else we need to know?

Seeing this flakiness only on the ppc64le architecture, and only with Go toolchains built after commit 8a81fdf165facdcefa06531de5af98a4db343035.

Relevant SIG(s)

/sig testing

@Rajalakshmi-Girish Rajalakshmi-Girish added the kind/flake Categorizes issue or PR as related to a flaky test. label Nov 25, 2022
@k8s-ci-robot
Contributor

@Rajalakshmi-Girish: The label(s) sig/ cannot be applied, because the repository doesn't have them.

In response to this:

> (verbatim copy of the issue description above)
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/testing Categorizes an issue or PR as relevant to SIG Testing. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 25, 2022
@k8s-ci-robot
Contributor

@Rajalakshmi-Girish: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dims
Member

dims commented Nov 25, 2022

@Rajalakshmi-Girish i can't seem to find the commit, which PR is that 8a81fdf165facdcefa06531de5af98a4db343035 from?

@Rajalakshmi-Girish
Contributor Author

@Rajalakshmi-Girish i can't seem to find the commit, which PR is that 8a81fdf165facdcefa06531de5af98a4db343035 from?

Sorry for the confusion. It is the golang commit golang/go@8a81fdf

@aojea
Member

aojea commented Nov 26, 2022

Overall, this patch degrades performance by 55% for private key
operations, and 4-5x for (much faster) public key operations.
(Signatures do both, so the slowdown is worse than decryption.)

name old time/op new time/op delta
DecryptPKCS1v15/2048-8 1.50ms ± 0% 2.34ms ± 0% +56.44% (p=0.000 n=8+10)
DecryptPKCS1v15/3072-8 4.40ms ± 0% 6.79ms ± 0% +54.33% (p=0.000 n=10+9)
DecryptPKCS1v15/4096-8 9.31ms ± 0% 15.14ms ± 0% +62.60% (p=0.000 n=10+10)
EncryptPKCS1v15/2048-8 8.16µs ± 0% 355.58µs ± 0% +4258.90% (p=0.000 n=10+9)
DecryptOAEP/2048-8 1.50ms ± 0% 2.34ms ± 0% +55.68% (p=0.000 n=10+9)
EncryptOAEP/2048-8 8.51µs ± 0% 355.95µs ± 0% +4082.75% (p=0.000 n=10+9)
SignPKCS1v15/2048-8 1.51ms ± 0% 2.69ms ± 0% +77.94% (p=0.000 n=10+10)
VerifyPKCS1v15/2048-8 7.25µs ± 0% 354.34µs ± 0% +4789.52% (p=0.000 n=9+9)
SignPSS/2048-8 1.51ms ± 0% 2.70ms ± 0% +78.80% (p=0.000 n=9+10)
VerifyPSS/2048-8 8.27µs ± 1% 355.65µs ± 0% +4199.39% (p=0.000 n=10+10)

Does this have an impact on HTTPS?
Could this be a performance and scalability regression for Kubernetes?

/cc @liggitt @wojtek-t

@liggitt
Member

liggitt commented Nov 28, 2022

how close were ppc64le tests to the 100ms timeout before?

linux and darwin are nowhere remotely close to the timeout on go1.19 or go master

with this diff:

diff --git a/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go b/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
index c18ce70c4ea..40f50d00861 100644
--- a/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
+++ b/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go
@@ -858,8 +858,10 @@ func waitForAPIServerStarted(t *testing.T, doer doer) {
 	client := newClient(true)
 	i := 1
 	err := wait.PollImmediate(100*time.Millisecond, 5*time.Second, func() (done bool, err error) {
+		before := time.Now()
 		result := doer.Do(client, func(httptrace.GotConnInfo) {}, fmt.Sprintf("/echo?message=attempt-%d", i), 100*time.Millisecond)
 		i++
+		fmt.Println("DEBUG", time.Since(before))
 
 		if result.err != nil {
 			t.Logf("Still waiting for the server to start - err: %v", err)

there's a ~tiny slowdown of the dial on linux, and a small slowdown on darwin:

go version
go version go1.19.3 linux/amd64

go test k8s.io/apiserver/pkg/server -v -count 1 | grep DEBUG
DEBUG 5.444688ms
DEBUG 5.130479ms
DEBUG 4.932429ms
DEBUG 6.814828ms
DEBUG 4.945898ms
go version
go version devel go1.20-bb0d8297d7 Mon Nov 28 19:38:37 2022 +0000 linux/amd64

go test k8s.io/apiserver/pkg/server -v -count 1 | grep DEBUG
DEBUG 6.501539ms
DEBUG 5.55922ms
DEBUG 7.475699ms
DEBUG 6.167169ms
DEBUG 5.294349ms
go version
go version go1.19.3 darwin/arm64

go test k8s.io/apiserver/pkg/server -v -count 1 | grep DEBUG
DEBUG 6.289042ms
DEBUG 7.023625ms
DEBUG 7.4925ms
DEBUG 8.781125ms
DEBUG 7.661833ms
go version
go version devel go1.20-bb0d8297d7 Mon Nov 28 19:38:37 2022 +0000 darwin/arm64

go test k8s.io/apiserver/pkg/server -v -count 1 | grep DEBUG
DEBUG 9.331125ms
DEBUG 7.38925ms
DEBUG 10.1135ms
DEBUG 9.592208ms
DEBUG 10.335542ms

what does that show before/after on ppc64le?

@liggitt
Member

liggitt commented Nov 29, 2022

cc @Rajalakshmi-Girish on the question about existing ppc64le timing in #114145 (comment)

@liggitt
Member

liggitt commented Nov 29, 2022

cc @kubernetes/sig-scalability - are we running scalability tests on go tip? did we see any increases in CPU or latency or decreases in throughput in the last two weeks?

@liggitt
Member

liggitt commented Nov 29, 2022

cc @marseel

@tosi3k
Member

tosi3k commented Nov 29, 2022

@liggitt we run scalability tests on go tip (using a fixed version of K8s), but only on the x64 architecture for masters and nodes in the cluster. We also use Kubemark for scale-testing go tip in K8s, so the results might not say much about this regression.

No visible change in the pod throughput in our tests:
[chart: pod throughput across recent runs]

I also looked at some of the most popular <resource, subresource, scope, verb> tuples for the API call latency we measure in that test, but I don't see any visible change there either.

There might be an increase in apiserver CPU usage judging by the chart below but I'm not sure:
[chart: apiserver CPU usage across recent runs]

That said, I'll edit the performance dashboard's config to include more runs to see whether the CPU usage bump is indeed a relatively fresh thing or not.

@Rajalakshmi-Girish
Contributor Author

cc @Rajalakshmi-Girish on the question about existing ppc64le timing in #114145 (comment)

These failures occur only when run with -race flag.
Without -race flag, as you said, it is nowhere remotely close to the timeout even on ppc64le.

go version devel go1.20-8a81fdf165 Sat Nov 19 16:48:07 2022 +0000 linux/ppc64le
[root@raji-workspace ~]# cd kubernetes
[root@raji-workspace kubernetes]# go test k8s.io/apiserver/pkg/server -v -count 1 | grep DEBUG
DEBUG 7.858697ms
DEBUG 7.887596ms
DEBUG 8.770044ms
DEBUG 8.757551ms
DEBUG 8.560058ms

[root@raji-workspace kubernetes]#

But with the -race enabled, it is taking ~200ms on ppc64le after the commit golang/go@8a81fdf

[root@raji-workspace kubernetes]# go version
go version devel go1.20-8a81fdf165 Sat Nov 19 16:48:07 2022 +0000 linux/ppc64le
[root@raji-workspace kubernetes]# go test k8s.io/apiserver/pkg/server -race -v -count 1 | grep DEBUG
DEBUG 189.95721ms
DEBUG 196.049104ms
DEBUG 201.173303ms
DEBUG 201.134113ms
DEBUG 200.713944ms
DEBUG 201.019672ms
DEBUG 200.538204ms
DEBUG 201.055208ms
DEBUG 201.143484ms
DEBUG 200.456862ms
DEBUG 200.959993ms
DEBUG 201.086969ms
DEBUG 200.209768ms
DEBUG 198.47887ms
DEBUG 190.04871ms
DEBUG 200.189186ms
DEBUG 195.657968ms
[root@raji-workspace kubernetes]#

With the immediately preceding commit 5f60f844be, it took between 27ms and 35ms:

[root@raji-workspace ~]# go version
go version devel go1.20-5f60f844be Sat Nov 19 16:45:10 2022 +0000 linux/ppc64le
[root@raji-workspace ~]# cd  kubernetes
[root@raji-workspace kubernetes]# go test k8s.io/apiserver/pkg/server -race -v -count 1 | grep DEBUG
DEBUG 36.729749ms
DEBUG 32.142844ms
DEBUG 27.605452ms
DEBUG 35.726139ms
DEBUG 31.877138ms
[root@raji-workspace kubernetes]#

@Rajalakshmi-Girish
Contributor Author

we run scalability tests on go tip

Do these tests run with -race flag enabled?

@tosi3k
Member

tosi3k commented Nov 29, 2022

we run scalability tests on go tip

Do these tests run with -race flag enabled?

This is an end-to-end scalability test, not a test that one can run through go test.

I also doubt we use the -race flag when building the K8s binaries used in that e2e test, given the runtime overhead the flag imposes - see the Runtime Overhead section of the go.dev race detector article.

@Rajalakshmi-Girish
Contributor Author

This is an end-to-end scalability test, not a test that one can run through go test.

The tests this issue mentions are the unit tests run using the Makefile, which by default has -race flag while running a go test https://github.com/kubernetes/kubernetes/blob/master/hack/make-rules/test.sh#L74

@Rajalakshmi-Girish
Contributor Author

linux and darwin is nowhere remotely close to the timeout on go1.19 or go master

@liggitt
Even on linux/amd64, I see a steep increase in time after the commit golang/go@8a81fdf

[root@toad1 go_master]# go version
go version devel go1.20-5f60f844be Sat Nov 19 16:45:10 2022 +0000 linux/amd64
[root@toad1 go_master]# cd ~/kubernetes/
[root@toad1 kubernetes]# go test k8s.io/apiserver/pkg/server -race -v -count 1 | grep DEBUG
DEBUG 26.158877ms
DEBUG 23.055449ms
DEBUG 21.108503ms
DEBUG 15.722442ms
DEBUG 17.811059ms
[root@toad1 kubernetes]#
[root@toad1 kubernetes]# go version
go version devel go1.20-8a81fdf165 Sat Nov 19 16:48:07 2022 +0000 linux/amd64
[root@toad1 kubernetes]# go test k8s.io/apiserver/pkg/server -race -v -count 1 | grep DEBUG
DEBUG 94.78808ms
DEBUG 79.807975ms
DEBUG 77.941713ms
DEBUG 86.069147ms
DEBUG 78.383754ms
[root@toad1 kubernetes]#

@liggitt
Member

liggitt commented Nov 29, 2022

we run scalability tests on go tip

Do these tests run with -race flag enabled?

definitely not, -race rewrites/instruments code so that its performance is not at all representative of production use.

@liggitt
Member

liggitt commented Nov 29, 2022

if the impact is limited to unit tests with race detection enabled, there's no notable production impact, but I still asked whether the impact was expected or will be optimized in https://go-review.git.corp.google.com/c/go/+/326012/comments/e0d180a5_e40016bc

@liggitt
Member

liggitt commented Nov 29, 2022

looks like there's also a revert CL open at https://go-review.git.corp.google.com/c/go/+/452255 in case it was needed for performance reasons, so I asked there if this level of impact to race detection code was expected

@liggitt
Member

liggitt commented Nov 29, 2022

opened golang/go#56980 upstream

@Rajalakshmi-Girish
Contributor Author

@liggitt Meanwhile, can we bump the timeout value to 200ms here, like we did in #106716?

@liggitt
Member

liggitt commented Dec 7, 2022

that seems reasonable, since the point of this is not testing performance... though 200ms doesn't seem long enough given the comment in #114145 (comment)

@Rajalakshmi-Girish
Contributor Author

though 200ms doesn't seem long enough given the comment in #114145 (comment)

True :(
Will it be too much to ask for 250ms/300ms?

@liggitt
Member

liggitt commented Dec 8, 2022

what timeout is used for this same request in other places in this test? it looks like a 1 second timeout is used

@Rajalakshmi-Girish
Contributor Author

Rajalakshmi-Girish commented Dec 9, 2022

what timeout is used for this same request in other places in this test? it looks like a 1 second timeout is used

True, the same request is using a 1-second timeout at other places in this test!

@Rajalakshmi-Girish
Contributor Author

@liggitt As the same request uses a 1-second timeout elsewhere in this test, can we increase it to 1s at https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/server/genericapiserver_graceful_termination_test.go#L860?

@liggitt
Member

liggitt commented Jan 6, 2023

that seems fine to me.

@liggitt
Member

liggitt commented Jan 6, 2023

xref kubernetes/release#2815

@Rajalakshmi-Girish
Contributor Author

@liggitt Can you PTAL at #114940?

k8s-ci-robot added a commit that referenced this issue Jan 10, 2023
danielvegamyhre pushed a commit to danielvegamyhre/kubernetes that referenced this issue Jan 14, 2023
k8s-ci-robot added a commit that referenced this issue Jan 20, 2023
…ick-of-#114940-upstream-release-1.24

Automated cherry pick of #114940: Fixes the issue #114145
k8s-ci-robot added a commit that referenced this issue Jan 20, 2023
…ick-of-#114940-upstream-release-1.26

Automated cherry pick of #114940: Fixes the issue #114145
k8s-ci-robot added a commit that referenced this issue Jan 20, 2023
…ick-of-#114940-upstream-release-1.25

Automated cherry pick of #114940: Fixes the issue #114145