gracefull shutdown RPC even when build is in error #4384
@dgageot Can you please take a look at this? Thx!
Codecov Report
@@ Coverage Diff @@
## master #4384 +/- ##
==========================================
+ Coverage 71.75% 71.77% +0.02%
==========================================
Files 325 325
Lines 12572 12613 +41
==========================================
+ Hits 9021 9053 +32
- Misses 2976 2982 +6
- Partials 575 578 +3
Continue to review full report at Codecov.
5467e82 to 6649a5f
LGTM, but please put the two explanatory comments in the two places they apply to, for clarity. It is not at all obvious from reading the code why these design choices were made, and while someone might look at this PR's description, it's best to have this kind of context in the code. Thank you for fixing this!!!
893cae9 to 24862cf
24862cf to 5357cf4
Done!
I've been digging into this worrying message from
The grpc-gateway uses
$ curl -i --raw localhost:50052/v1/events
HTTP/1.1 200 OK
Content-Type: application/json
Grpc-Metadata-Content-Type: application/grpc
Date: Mon, 13 Jul 2020 15:04:10 GMT
Transfer-Encoding: chunked
1b5
{"result":{"timestamp":"2020-07-13T15:04:10.200435Z","event":{"metaEvent":{"entry":"Starting Skaffold: \u0026{Version: ConfigVersion:skaffold/v2beta6 GitVersion: GitCommit: GitTreeState: BuildDate: GoVersion:go1.14.3 Compiler:gc Platform:darwin/amd64}","metadata":{"build":{"numberOfArtifacts":1,"builders":[{"type":"BUILDPACKS","count":1}],"type":"LOCAL"},"deploy":{"deployers":[{"type":"KUBECTL","count":1}],"cluster":"MINIKUBE"}}}}}}
93
{"result":{"timestamp":"2020-07-13T15:04:10.201334Z","event":{"devLoopEvent":{"status":"In Progress"}},"entry":"DevInit Iteration 0 in progress"}}
c1
{"result":{"timestamp":"2020-07-13T15:04:10.299620Z","event":{"buildEvent":{"artifact":"skaffold-buildpacks","status":"In Progress"}},"entry":"Build started for artifact skaffold-buildpacks"}}
201
{"result":{"timestamp":"2020-07-13T15:04:11.777416Z","event":{"buildEvent":{"artifact":"skaffold-buildpacks","status":"Failed","err":"failed to fetch builder image 'gcr.io/buildpacks/builder:v1': error getting credentials - err: exit status 1, out: ``","errCode":"BUILD_UNKNOWN","actionableErr":{"errCode":"BUILD_UNKNOWN","message":"failed to fetch builder image 'gcr.io/buildpacks/builder:v1': error getting credentials - err: exit status 1, out: ``"}}},"entry":"Build failed for artifact skaffold-buildpacks"}}
173
{"result":{"timestamp":"2020-07-13T15:04:11.777773Z","event":{"devLoopEvent":{"status":"Failed","err":{"errCode":"BUILD_UNKNOWN","message":"couldn't build \"skaffold-buildpacks\": failed to fetch builder image 'gcr.io/buildpacks/builder:v1': error getting credentials - err: exit status 1, out: ``"}}},"entry":"DevInit Iteration 0 failed with error code BUILD_UNKNOWN"}}
curl: (18) transfer closed with outstanding read data remaining
A chunked connection is normally closed by sending a 0-byte message. But calling
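To illustrate the curl error above, here is a minimal sketch of how a chunked body behaves with and without the terminating zero-length chunk. It uses the standard library's `httputil.NewChunkedReader` on hand-written strings, not Skaffold's or grpc-gateway's code; the helper names `readChunked` and `isTruncated` are made up for this example.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http/httputil"
	"strings"
)

// readChunked decodes an HTTP/1.1 chunked body and returns the payload
// together with any error hit before the terminating chunk was seen.
// (Hypothetical helper, for illustration only.)
func readChunked(raw string) (string, error) {
	body, err := io.ReadAll(httputil.NewChunkedReader(strings.NewReader(raw)))
	return string(body), err
}

// isTruncated reports whether the body ended without the final
// zero-length chunk.
func isTruncated(raw string) bool {
	_, err := readChunked(raw)
	return errors.Is(err, io.ErrUnexpectedEOF)
}

func main() {
	// Each chunk is "<size in hex>\r\n<data>\r\n"; a size of 0 ends the body.
	complete := "5\r\nhello\r\n0\r\n\r\n"
	// Same payload, but the stream stops before the zero-length chunk,
	// as happens when the connection is simply dropped.
	cut := "5\r\nhello\r\n"

	fmt.Println("complete body truncated:", isTruncated(complete))
	fmt.Println("cut body truncated:", isTruncated(cut))
}
```

The hexadecimal lines in the curl output (`1b5`, `93`, `c1`, ...) are these chunk sizes. A reader that never sees the final `0` chunk has to treat the response as cut short, which is what curl reports.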
Fixes: #3970
Maybe fixes #3991
Related: Relevant tracking issues, for context
Merge before/after: Dependent or prerequisite PRs
Description
Another attempt to gracefully shut down RPC.
This mostly brings back what "Gracefully shutdown RPC servers." (#4010) did. However, `GracefulStop` and `server.Shutdown` wait for all idle connections to close. IDEs' RPC client connections never terminate and go into idle state, hence the above functions never return.
In this PR, `GracefulStop` and `server.Shutdown` are wrapped in a timeout context. They wait for at most 1 second, which is enough to read the last events on the channel.
`shutdownAPIServer` was called in `PersistentPostRun`. However, `PersistentPostRun` is only called if a command's `RunE` does not result in an error: https://github.com/spf13/cobra/blob/master/command.go#L843
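The control flow described above can be sketched without cobra itself. The `command` type below is a deliberately simplified stand-in and not cobra's API; `withShutdown` and `errBuildFailed` are likewise hypothetical names. It only shows why a post-run hook is skipped on error and how a `defer` inside `RunE` avoids that.

```go
package main

import (
	"errors"
	"fmt"
)

// errBuildFailed is a placeholder error used by the demo below.
var errBuildFailed = errors.New("build failed")

// command is a simplified stand-in for a cobra command. It is NOT
// cobra's API; it only models the ordering of RunE and the post-run hook.
type command struct {
	RunE              func() error
	PersistentPostRun func()
}

// execute returns early when RunE fails, so the post-run hook is skipped.
func (c *command) execute() error {
	if err := c.RunE(); err != nil {
		return err
	}
	if c.PersistentPostRun != nil {
		c.PersistentPostRun()
	}
	return nil
}

// withShutdown wraps a RunE so that shutdown runs whether or not run
// returns an error. (Hypothetical helper.)
func withShutdown(run func() error, shutdown func()) func() error {
	return func() error {
		defer shutdown()
		return run()
	}
}

func main() {
	failingBuild := func() error { return errBuildFailed }

	hookCalls := 0
	cmd := &command{RunE: failingBuild, PersistentPostRun: func() { hookCalls++ }}
	err := cmd.execute()
	fmt.Println("error:", err, "post-run hook calls:", hookCalls)

	shutdownCalls := 0
	cmd = &command{RunE: withShutdown(failingBuild, func() { shutdownCalls++ })}
	err = cmd.execute()
	fmt.Println("error:", err, "shutdown calls:", shutdownCalls)
}
```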
To make sure the servers shut down gracefully when an error occurs, we need to call `shutdownAPIServer` in the `RunE` of the command being executed.
User facing changes (remove if N/A)
No.
**events API changes**
On any example, please run a command where the build would fail, e.g. `skaffold dev --cache-artifacts=false` (remove the global default repo setting).
On master
*** Before ***
You never see the build failed and devloop iteration failed events for a failed build.
On the branch
On the branch you will see failure events.
Follow-up Work (remove if N/A)