
Feature request: Drain based on annotation #4188

Closed
anton-johansson opened this issue Jun 11, 2019 · 23 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@anton-johansson

Is this a request for help?
No

What keywords did you search in NGINX Ingress controller issues before filing this one?
nginx ingress controller drain annotation
nginx ingress controller drain label

I found #2322, a similar request that was closed because drain is a commercial-only feature. I understand that, but maybe we can work out a solution that works without the NGINX Plus function (as was already done for sticky sessions). If I understand things correctly, we use Lua scripting to handle the load balancing and sticky sessions. It should be possible to check upstream pod annotations there to decide whether or not a pod should be considered for new sessions.


Is this a BUG REPORT or FEATURE REQUEST?
Feature request

NGINX Ingress controller version:

0.24.1

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Bare metal, 3 masters and 5 worker nodes.
  • OS: Ubuntu 18.04.2 LTS (Bionic Beaver)
  • Kernel: 4.15.0-50-generic
  • Install tools: I use Ansible to install Kubernetes services, as described here.
  • Others:

Feature request:

I have a scenario similar to the one described in issue #2322. Our application does not have session replication, and we need a better way of running version rollouts. We need a way to tell NGINX not to send new sessions to older deployments. I was thinking that we could use a Pod-level annotation for this, maybe:

nginx.ingress.kubernetes.io/drain: true

... or this one (to avoid mixing it up with NGINX Plus' built-in functionality):

nginx.ingress.kubernetes.io/accept-new-sessions: false

This would of course only have an effect when sticky sessions are used, which could be a little confusing.
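
For illustration only, here is a rough sketch of how the proposed annotation might be applied to a pod from an older rollout (the annotation does not exist in ingress-nginx, and the pod name and image are made up):

apiVersion: v1
kind: Pod
metadata:
  name: my-app-old-abc12                                       # hypothetical pod from the previous rollout
  annotations:
    nginx.ingress.kubernetes.io/accept-new-sessions: "false"   # proposed annotation, not implemented
spec:
  containers:
    - name: app
      image: registry.example.com/my-app:1.0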

Looking around in the code, I assume that this functionality would take place somewhere in the pick_new_upstream function of sticky.lua.

Thoughts? Ideas? I could try and see if I can develop the changes myself, but I need to know if it's something that is actually wanted and if it's the right approach.

@anton-johansson anton-johansson changed the title Drain based on annotation Feature request: Drain based on annotation Jun 13, 2019
@aledbf aledbf added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 11, 2019
@aledbf
Member

aledbf commented Sep 3, 2019

@anton-johansson please check #4514 (not merged yet)

@anton-johansson
Author

anton-johansson commented Sep 11, 2019

@aledbf That looks interesting indeed, but I'm not sure it's related to marking certain pods as draining (i.e. not receiving new sessions).

@ElvinEfendi
Member

@anton-johansson can you not edit the specific pod's manifest and change the readiness probe in a way that makes it fail? Then ingress-nginx will remove that pod from its list and stop proxying new connections to it (it will still process the existing ones).
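
For illustration, since probe fields on an already-running pod generally cannot be patched in place, one sketch of making a readiness probe fail on demand is to have it check a flag file that can be deleted when the pod should stop receiving new traffic (container name, image and file path are assumptions):

containers:
  - name: app                                   # assumed container name
    image: registry.example.com/my-app:1.0      # assumed image
    readinessProbe:
      exec:
        command: ["test", "-f", "/tmp/ready"]   # fails once the file is removed, e.g. kubectl exec <pod> -- rm /tmp/ready
      periodSeconds: 5
      failureThreshold: 1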

@anton-johansson
Author

Are you sure that's how it works? I get the feeling that pods that turn Unready will be removed from load balancing altogether, including existing sessions.

If I'm wrong though, your solution seems perfectly valid and is surely something that I'd like to try out. :D

@ElvinEfendi
Member

Give it a try ;)

@wknapik

wknapik commented Sep 27, 2019

We have this exact problem. Stateful app, no way to migrate sessions, need draining until sessions expire when updating.

I implemented a PoC based on nginx-ingress 0.25.1 and it seems to be working, but I had to patch sticky.lua to modify pick_new_upstream. I really want to avoid this, since it will require endless maintenance - I can already see that the patch will not work against upstream master.

So... Ideally, I'd like to see session draining based on annotations implemented in nginx-ingress, but if that's not going to happen, I'd at least like to be able to implement it myself without needing to patch upstream code.

It could be done with minimal effort - say, pick_new_upstream could exclude upstreams listed in a shared nginx dict with a specific name. Or maybe we could provide a Lua snippet via a configmap that returns a dict of upstreams to exclude (like get_drained_upstreams, mimicking get_failed_upstreams)? Anything like that would work.

Once I'm done turning the PoC into something production-ready, I'd be happy to open a PR, but it would only be a PR for the hook described above, since every complete implementation would/could be different.

@anton-johansson
Author

@wknapik: Cool! I'd love to see your patch. I don't want to run a patched version either, but I'm still interested in seeing your solution.

Either way, an annotation-based approach seems like the optimal solution.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 26, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 25, 2020
@wknapik

wknapik commented Jan 25, 2020

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 25, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 24, 2020
@wknapik

wknapik commented Apr 26, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 26, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 25, 2020
@wknapik

wknapik commented Jul 27, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 27, 2020
@ArronaxKP

This would be a nice feature to have. The only ingress controller I have found that supports this without additional cost or a deployment outside of the cluster is jcmoraisjr/haproxy-ingress.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 17, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 16, 2021
@evenh

evenh commented Jan 16, 2021

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 16, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 16, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 16, 2021
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@angeloxx

angeloxx commented Sep 10, 2024

Even if the issue is closed (and outdated), I'll add our experience from recent days. We used the canary deployment strategy: you can create a new ingress with the same host that points to a service with a reduced set of pods (selecting only the pods where new users should land) and set:

nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "100"

All new users are directed to that subset, old users keep their sessions on the old pods, and when the migration has finished you can remove the canary ingress. The persistence cookie will still be valid and the standard ingress will honor it.
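
For illustration, a minimal sketch of such a canary ingress (host, service and resource names are assumptions):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "100"   # route all new sessions to the canary backend
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com                            # same host as the primary ingress
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-new                       # service selecting only the new pods
                port:
                  number: 80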
