proxy-lifecycle: add HTTP Server with endpoints for proxy lifecycle shutdown #115
pkg/consuldp/consul_dataplane.go
case <-cdp.xdsServerExited():
if err := proxy.Stop(); err != nil {
// Initiate graceful shutdown of Envoy, kill if error
if err := proxy.Quit(); err != nil {
cdp.logger.Error("failed to stop proxy", "error", err)
if err := proxy.Kill(); err != nil {
cdp.logger.Error("failed to kill proxy", "error", err)
}
}
doneCh <- errors.New("xDS server exited unexpectedly")
case <-cdp.metricsConfig.metricsServerExited():
doneCh <- errors.New("metrics server exited unexpectedly")
case <-cdp.lifecycleConfig.lifecycleServerExited():
// Initiate graceful shutdown of Envoy, kill if error
if err := proxy.Quit(); err != nil {
cdp.logger.Error("failed to stop proxy", "error", err)
if err := proxy.Kill(); err != nil {
cdp.logger.Error("failed to kill proxy", "error", err)
}
}
doneCh <- errors.New("proxy lifecycle management server exited unexpectedly")
This should be the only change to existing logic aside from introducing the new proxy lifecycle management server exposing the /shutdown endpoint - attempting to cleanly exit the Envoy process if a subprocess exits unexpectedly instead of the prior hard kill.
url := fmt.Sprintf("http://127.0.0.1:%d%s", port, m.gracefulShutdownPath)
log.Printf("sending request to %s\n", url)

resp, err := http.Get(url)
Could possibly add some way to check that the lifecycle management server blocks for c.shutdownGracePeriod seconds here before calling Proxy.Quit(), but not sure how to best implement that in a non-flaky way.
@@ -32,7 +32,7 @@ jobs:
       - uses: actions/setup-go@4d34df0c2316fe8122ab82dc22947d607c0c91f9 # v4.0.0
         with:
           go-version: ${{ needs.get-go-version.outputs.go-version }}
-      - run: go test ./...
+      - run: go test ./... -p 1 # disable parallelism to avoid port conflicts from default metrics and lifecycle server configuration
What kind of impact does this have on the runtime for these tests?
I'd like to clean this up, but skipped for now in the interest of expediency. It didn't feel substantial enough to warrant the effort at this time, as the full suite still completes in under a minute.
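If this gets cleaned up later, one common way to avoid fixed-port conflicts between parallel test packages is to let the kernel pick a free port by listening on port 0. The sketch below assumes a hypothetical freePort helper; it is not taken from this repository.

```go
package main

import (
	"fmt"
	"net"
)

// freePort asks the kernel for an unused TCP port on the loopback
// interface by binding to port 0, then releases it and returns the number.
// Note there is a small window in which another process could claim the
// port before the caller binds it again.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := freePort()
	if err != nil {
		fmt.Println("could not find a free port:", err)
		return
	}
	fmt.Printf("test server could bind 127.0.0.1:%d\n", port)
}
```

Where the server under test accepts a net.Listener directly, passing the listener itself instead of the port number would close the reuse window mentioned in the comment.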
AdminAddr: cdp.cfg.Envoy.AdminBindAddress,
AdminBindPort: cdp.cfg.Envoy.AdminBindPort,
Curious why these are showing up in this PR
It felt reasonable to pull these out of the consul-dataplane Envoy config at this point when creating the config to pass as the only argument into envoy.NewProxy (happy to change if they're already somewhere else I didn't notice) to make them accessible within pkg/envoy/proxy.go where they're needed for the HTTP calls to Envoy's admin API for the Drain() and Quit() methods.
This was not needed previously, as the Envoy process was just terminated with a process kill signal.
Looks good, and this is super useful! I added some nits and non-blocking questions.
Looking pretty good! I can tell that you put a lot of thought and effort into this!
I only have a few comments, but nothing that I will stop you on.
Note: There is going to be another release of consul-dataplane tomorrow (Thursday, June 1st) so I am going to block here so this doesn't get merged in by accident.
Addressed feedback and pushed some naming changes up into #100, which I'd like to merge before this to get the chain started.
Co-authored-by: Paul Glass <[email protected]>
…tenersEnabled after rebase
* Revert "proxy-lifecycle: catch SIGTERM and initiate graceful shutdown (#130)"
  This reverts commit 40c99dc.
* Revert "proxy-lifecycle: add HTTP Server with endpoints for proxy lifecycle shutdown (#115)"
  This reverts commit 0047e65.
* Revert "cmd: add CLI flags for proxy shutdown lifecycle management (#100)"
  This reverts commit 3d37b9f.
…hutdown (#115)
* envoy: set drain time and strategy passthrough config with sensible defaults for sidecar proxy shutdown lifecycle
* pkg/consuldp/lifecycle: wire up lifecycle mgmt server into consul-dataplane main config
* pkg/envoy: add Drain and Quit methods, rename Stop to Kill
* pkg/consuldp/lifecycle: gracefully shutdown Envoy if xDS or lifecycle mgmt server exits unexpectedly
* ci: disable parallelism in unit tests to avoid port conflicts
* pkg/envoy: add http client to dial Envoy admin interface
* pkg/consuldp/lifecycle: replace http client with proxy manager interface and mock
* test/lifecycle: pick an available port if gracefulPort is unspecified
* pkg/consuldp/lifecycle: check errors and close errorExitCh if any problems gracefully shutting down Envoy
* add changelong
---------
Co-authored-by: Nathan Coleman <[email protected]>
Co-authored-by: Paul Glass <[email protected]>
… lifecycle shutdown into release/1.0.x (#182)
* proxy-lifecycle: add HTTP Server with endpoints for proxy lifecycle shutdown (#115)
* envoy: set drain time and strategy passthrough config with sensible defaults for sidecar proxy shutdown lifecycle
* pkg/consuldp/lifecycle: wire up lifecycle mgmt server into consul-dataplane main config
* pkg/envoy: add Drain and Quit methods, rename Stop to Kill
* pkg/consuldp/lifecycle: gracefully shutdown Envoy if xDS or lifecycle mgmt server exits unexpectedly
* ci: disable parallelism in unit tests to avoid port conflicts
* pkg/envoy: add http client to dial Envoy admin interface
* pkg/consuldp/lifecycle: replace http client with proxy manager interface and mock
* test/lifecycle: pick an available port if gracefulPort is unspecified
* pkg/consuldp/lifecycle: check errors and close errorExitCh if any problems gracefully shutting down Envoy
* add changelong
---------
Co-authored-by: Nathan Coleman <[email protected]>
Co-authored-by: Paul Glass <[email protected]>
* fix automation
* fix unit tests
---------
Co-authored-by: Mike Morris <[email protected]>
Co-authored-by: Nathan Coleman <[email protected]>
Co-authored-by: Paul Glass <[email protected]>
Refs hashicorp/consul-k8s#536, hashicorp/consul-k8s#650
Adds a proxy lifecycle management server and starts it from the consul-dataplane main process. This server exposes an HTTP endpoint (configurable, defaulting to /graceful_shutdown on port 20300) to optionally start draining inbound (external) connections to the managed Envoy proxy, while allowing outbound requests from the application for which this proxy is acting as a sidecar to continue, up to a configurable grace period timeout, to facilitate application shutdown.

Refactors the envoy package proxy manager to introduce new states (stateDraining and stateExited) and new methods (Drain(), Quit() and renaming Stop() to Kill() to avoid confusion and better describe the actual implementation) and implement an interface to allow a mock implementation for testing the lifecycle management server in isolation.

Notes for reviewers
* /graceful_shutdown endpoint explicitly (which will be necessary for handling job termination from a preStop hook) in the future.
* envoy package does not have proper test coverage yet - I'm hoping to add this as a followup by replacing the current fake-envoy implementation with a Go-based version to include an HTTP server mocking the /drain_listeners and /quitquitquit Envoy admin API endpoints.
* envoyExtraArgs refactor will likely conflict with fix: log level present in extra args using envoy-extra-args annotation : NET-2190 #133 and need to be resolved