Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix flakes with rpc integration test #1860

Merged
merged 5 commits into from
Apr 1, 2019

Conversation

nkubala
Copy link
Contributor

@nkubala nkubala commented Mar 21, 2019

This adds retries to the integration tests that retrieve the internal state via RPC and HTTP, so that if skaffold doesn't get into the expected state on the first run we can give it a little time to process.

integration/rpc_test.go Outdated Show resolved Hide resolved
@codecov-io
Copy link

codecov-io commented Mar 23, 2019

Codecov Report

Merging #1860 into master will increase coverage by 1.29%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1860      +/-   ##
==========================================
+ Coverage   50.16%   51.45%   +1.29%     
==========================================
  Files         168      171       +3     
  Lines        7356     7690     +334     
==========================================
+ Hits         3690     3957     +267     
- Misses       3310     3357      +47     
- Partials      356      376      +20
Impacted Files Coverage Δ
pkg/skaffold/deploy/kubectl.go 68.05% <0%> (-6.19%) ⬇️
cmd/skaffold/app/cmd/dev.go 10.09% <0%> (-3.19%) ⬇️
pkg/skaffold/deploy/helm.go 61.92% <0%> (ø) ⬆️
pkg/skaffold/schema/versions.go 65.85% <0%> (ø) ⬆️
pkg/skaffold/event/proto/skaffold.pb.go 5.95% <0%> (ø) ⬆️
pkg/skaffold/schema/v1beta7/config.go
pkg/skaffold/schema/v1beta7/upgrade.go
pkg/skaffold/debug/transform.go 86.2% <0%> (ø)
pkg/skaffold/debug/transform_nodejs.go 82.75% <0%> (ø)
pkg/skaffold/debug/transform_jvm.go 94.84% <0%> (ø)
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9e23d32...358cb19. Read the comment docs.

@@ -61,7 +62,7 @@ func TestEventLogRPC(t *testing.T) {
if err != nil {
t.Logf("unable to establish skaffold grpc connection: retrying...")
attempts++
if attempts == retries {
if attempts == connectionRetries {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to re-try here? Is it because, the --rpc-port which the we ended up picking is not free any because some other test or CI job grabbed it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would happen if skaffold hadn't started up the RPC server yet. it's pretty unlikely that we would ever actually need to retry here, I had this in the original client I wrote so I figured I would throw it in here for robustness.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of connection re-tries is 20 which scares me. If there is a real issue, it would be masked. I am ok with a smaller number of re-tries here like 2.
Another idea: we should log and collect metrics when retries happen. Can help us determine which skaffold code should we stabilize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created one here #1880

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HAHA yeah that does not need to be 20....I think I meant to set it to 2 originally. I'll change it here, and thanks for opening that issue

Copy link
Contributor

@balopat balopat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some comments.

Can you dedupe the 4 (maybe more?) retry loops into a method like Retry(callback func(), waitTime time.duration)?

integration/rpc_test.go Outdated Show resolved Hide resolved
integration/rpc_test.go Outdated Show resolved Hide resolved
integration/rpc_test.go Show resolved Hide resolved
@nkubala nkubala merged commit 01d8d7e into GoogleContainerTools:master Apr 1, 2019
@nkubala nkubala deleted the flake branch April 1, 2019 18:35
@tejal29 tejal29 mentioned this pull request Apr 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants