Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graceful shutdown of keep-alive connections #18890

Closed
pokusak opened this issue Jul 21, 2021 · 5 comments · Fixed by #18919
Closed

Graceful shutdown of keep-alive connections #18890

pokusak opened this issue Jul 21, 2021 · 5 comments · Fixed by #18919
Labels
area/vertx kind/enhancement New feature or request
Milestone

Comments

@pokusak
Copy link

pokusak commented Jul 21, 2021

Description

Currently there is support for graceful shutdown implemented in io.quarkus.vertx.http.runtime.filters.GracefulShutdownFilter it waits for all requests to finish and only then (except timeout) terminate the application. This works well.

If I am not mistaken it only counts requests and lets connections to be closed by underlying implementation (Vertx). But in case connections are open with keep alive connections are not closed. Or at least such information is not sent to client in form of header Connection: close

Implementation ideas

io.quarkus.vertx.http.runtime.filters.GracefulShutdownFilter could add header Connection: close in case its internal state is indicating shutdown (running == false). This can be done either unconditionally as it probably won't cause any problems to clients not using keep alive. Or (and here I am not sure) only in case connection is open with keep alive.

@gsmet
Copy link
Member

gsmet commented Jul 21, 2021

@stuartwdouglas looks like something for you.

stuartwdouglas added a commit to stuartwdouglas/quarkus that referenced this issue Jul 21, 2021
stuartwdouglas added a commit to stuartwdouglas/quarkus that referenced this issue Jul 21, 2021
stuartwdouglas added a commit to stuartwdouglas/quarkus that referenced this issue Jul 21, 2021
@quarkus-bot quarkus-bot bot added this to the 2.2 - main milestone Jul 22, 2021
@pokusak
Copy link
Author

pokusak commented Jul 22, 2021

@stuartwdouglas thanks for very quick reaction. May be I am wrong here but is placement of your fix right? Of course when 503 is returned it doesn't hurt to add Connection: close. But this is case of completely new connection. I was mainly thinking about existing connections which are open with keep alive. Once current request in such connection is done it should be closed by Connection: close header. Therefore I think right place to add header is requestDoneHandler. Isn't it?

@stuartwdouglas
Copy link
Member

responseDoneHandler is called when the response is done (i.e. has been sent to the client). You can't add a header to the connections that are currently being processed, they are likely being processed by different threads so any attempt to modify the non thread safe header map from another thread would cause all kinds of problems, and in general tracking this is way more expensive than the the counter approach.

@gsmet gsmet modified the milestones: 2.2 - main, 2.1.0.Final Jul 22, 2021
gsmet pushed a commit that referenced this issue Jul 22, 2021
@pokusak
Copy link
Author

pokusak commented Jul 22, 2021

Sorry for being so stubborn. How will this affect currently open connections with http keep alive? Those connections are open and clients are allowed to send requests through them. (Clients are not aware of running shutdown.)
Case I am trying to describe might be like this:

  1. Quarkus application deployed to Kubernetes.
  2. Clients are using http keep alive
  3. Cluster needs to move pod so sends SIGTERM to app
  4. Application is in grace period. So it should process any incoming requests (cluster is allowed some small amount of time to send requests even to terminating pod since it may take some time to propagate information from pods to ingress)
  5. Application should close any connections currently open.
  6. Ideally in grace period there is no running requests, no new are coming in. Even if not this case, app will be terminated.

As far as I understand current code and behavior of application point 4 is problem since we are already in shutdown grace period but we are not informing clients to close connection. In such case we will probably still reply with HTTP 200 which is OK, but information about closing connection should be added. In current implementation with your fix information about closing will be added only in combination with HTTP 503.

@stuartwdouglas
Copy link
Member

Basically what happens is:

  • Shutdown grace period the readiness check is changed to report that the pod is no longer ready, requests are still processed as normal
  • After the grace period any currently running requests (not connections) are allowed to finish
  • Any request that is received at this point is rejected, even if it is an existing connection

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/vertx kind/enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants