server: declare node ready while decommissioning
Before this patch, a decommissioning or draining, but otherwise healthy,
node would return an error from the /health?ready=1 endpoint, thus
declaring itself unready and signaling load balancers to send traffic
away. This patch makes it so that the decommissioning status no longer
matters for the readiness determination. Draining nodes continue to
declare themselves unready.
Note that decommissioning nodes typically go through draining at the
end of the process.

The justification is that a node can be decommissioning for an arbitrary
amount of time. During that time it can continue to hold leases, etc. So
trying to avoid traffic is not particularly desirable. In fact, some
people even want to keep a node in a decommissioning state indefinitely
(by restarting a decommissioning node without recommissioning it).
Also, the server.shutdown.drain_wait cluster setting is there to give
load balancers ample time to find out about a draining node.
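
For illustration, a load-balancer-style probe of the readiness endpoint might look like the sketch below. Only the `/health?ready=1` path comes from this commit. The `probeReady` and `fakeNode` helpers are hypothetical, the fake server stands in for a real node, and the 503 status it returns while draining is an assumption made for the example; the probe itself simply treats any non-200 response as unready.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// probeReady issues a GET to <baseURL>/health?ready=1 and reports whether
// the node answered 200 OK. Any other status is treated as unready.
func probeReady(client *http.Client, baseURL string) (bool, error) {
	resp, err := client.Get(baseURL + "/health?ready=1")
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK, nil
}

// fakeNode is a hypothetical stand-in for a node's HTTP endpoint. It answers
// the readiness check with an error status only while draining (the exact
// status code, 503, is an assumption of this sketch).
func fakeNode(draining bool) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/health" && r.URL.Query().Get("ready") == "1" && draining {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	for _, draining := range []bool{false, true} {
		srv := fakeNode(draining)
		ready, err := probeReady(srv.Client(), srv.URL)
		srv.Close()
		fmt.Println("draining:", draining, "ready:", ready, "err:", err)
	}
}
```

A real load balancer would run such a probe on an interval, which is why the drain wait needs to be long enough to cover at least one probe period.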

Also, tactically, the code is simplified.

Release note (general change): A node no longer declares itself unready
through the /health?ready=1 endpoint while it is decommissioning. It
continues to declare itself unready while draining.
andreimatei committed Jan 14, 2020
1 parent 811f75a commit 139bd21
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions pkg/server/status.go
@@ -50,6 +50,7 @@ import (
 	"github.com/cockroachdb/cockroach/pkg/storage"
 	"github.com/cockroachdb/cockroach/pkg/storage/storagepb"
 	"github.com/cockroachdb/cockroach/pkg/util/contextutil"
+	"github.com/cockroachdb/cockroach/pkg/util/hlc"
 	"github.com/cockroachdb/cockroach/pkg/util/httputil"
 	"github.com/cockroachdb/cockroach/pkg/util/log"
 	"github.com/cockroachdb/cockroach/pkg/util/stop"
@@ -670,9 +671,9 @@ func (s *statusServer) Details(
 	if err != nil {
 		return nil, grpcstatus.Error(codes.Internal, err.Error())
 	}
-	ls := l.LivenessStatus(s.admin.server.clock.PhysicalTime(), 0 /* threshold */)
-	isHealthy := ls == storagepb.NodeLivenessStatus_LIVE
-	if !isHealthy {
+	nowHlc := hlc.Timestamp{WallTime: s.admin.server.clock.PhysicalNow()}
+	isReady := l.IsLive(nowHlc) && !l.Draining
+	if !isReady {
 		return nil, grpcstatus.Error(codes.Unavailable, "node is not ready")
 	}
