-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: declare node ready while decommissioning #43889
server: declare node ready while decommissioning #43889
Conversation
started a discussion about the change here |
I got a nod from @asubiotto on the mailing list, so I think this is good to go |
Before this patch, a decommissioning or draining, but otherwise healthy, node would return an error from the /health?ready=1 endpoint, thus declaring itself unready and signaling load balancers to send traffic away. This patch makes it so that the decommissioning status no longer matters for the readyness determination. Draining nodes continue to declare themselves unready. Note that decommissioning nodes typically go through draining at the end of the process. The justification is that a node can be decommissioning for an arbitrary amount of time. During that time it can continue to hold leases, etc. So trying to avoid traffic is not particularly desirable. In fact, some people even want to keep a node in a decommissioning state indefinitely (by restarting a decommissioning node without recommissioning it). Also, the server.shutdown.drain_wait cluster setting is there to give load balancers ample time to find out about a draining node. Also, tactically, the code is simplified. Release note (general change): A node no longer declares itself to not be ready through the /health/ready=1 endpoint while it's in the process of decommissioning. It continues to declare itself unready while draining.
59f874a
to
139bd21
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)
Build failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
43417: sql: remove restriction of foreign key on multiple outbound columns r=lucy-zhang a=otan Resolves #38850. Removing the restriction of foreign keys on multiple outbound columns. There is one minor issue we need to address which is done here -- if the constraint name already exists, append a number next to it as postgres does as well. We can make this fancier if we wish later. This PR comes with a test suite of corner cases that we suspect may be hinted at being wrong. It uses both legacy and the new optimizer settings for foreign key checks. Release note (sql change): We previously had a restriction that foreign keys can only reference one outbound column at any given point in time, e.g. in `CREATE TABLE test(a int)`, we cannot have two foreign keys on column a. This PR removes this restriction. 43889: server: declare node ready while decommissioning r=andreimatei a=andreimatei Before this patch, a decommissioning or draining, but otherwise healthy, node would return an error from the /health?ready=1 endpoint, thus declaring itself unready and signaling load balancers to send traffic away. This patch makes it so that the decommissioning status no longer matters for the readyness determination. Draining nodes continue to declare themselves unready. Note that decommissioning nodes typically go through draining at the end of the process. The justification is that a node can be decommissioning for an arbitrary amount of time. During that time it can continue to hold leases, etc. So trying to avoid traffic is not particularly desirable. In fact, some people even want to keep a node in a decommissioning state indefinitely (by restarting a decommissioning node without recommissioning it). Also, the server.shutdown.drain_wait cluster setting is there to give load balancers ample time to find out about a draining node. Also, tactically, the code is simplified. Release note (general change): A node no longer declares itself to not be ready through the /health/ready=1 endpoint while it's in the process of decommissioning. It continues to declare itself unready while draining. Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Andrei Matei <[email protected]>
Build succeeded |
Before this patch, a decommissioning or draining, but otherwise healthy,
node would return an error from the /health?ready=1 endpoint, thus
declaring itself unready and signaling load balancers to send traffic
away. This patch makes it so that the decommissioning status no longer
matters for the readyness determination. Draining nodes continue to
declare themselves unready.
Note that decommissioning nodes typically go through draining at the
end of the process.
The justification is that a node can be decommissioning for an arbitrary
amount of time. During that time it can continue to hold leases, etc. So
trying to avoid traffic is not particularly desirable. In fact, some
people even want to keep a node in a decommissioning state indefinitely
(by restarting a decommissioning node without recommissioning it).
Also, the server.shutdown.drain_wait cluster setting is there to give
load balancers ample time to find out about a draining node.
Also, tactically, the code is simplified.
Release note (general change): A node no longer declares itself to not
be ready through the /health/ready=1 endpoint while it's in the process
of decommissioning. It continues to declare itself unready while
draining.