Add node shutdown API for shutting down nodes cleanly #70338

Closed
18 of 22 tasks
dakrone opened this issue Mar 11, 2021 · 9 comments
Labels
:Core/Infra/Node Lifecycle Node startup, bootstrapping, and shutdown >feature Meta Team:Core/Infra Meta label for core/infra team

Comments

@dakrone
Member

dakrone commented Mar 11, 2021

This issue supersedes #49064, which will be closed.

The node shutdown API should provide a safe way for operators to shut down a node, ensuring that all relevant orchestration steps are taken to prevent cluster instability and data loss. The feature can be used to decommission, power-cycle, or upgrade nodes.

An example of marking a node as part of the shutdown:

PUT /_nodes/<node_id>/shutdown
{
  "type": "remove",¹
  "reason": "shutdown of node so we can remove it from the cluster"²
}
¹ The type of decommission, in this case either a "remove" (the node is never coming back) or a "restart"
² A free-text description, entered by the user, of the reason the node is being shut down
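For illustration, here is a minimal Python sketch of registering a shutdown over the REST API as proposed above. The cluster URL and node ID are hypothetical placeholders, not values from this issue.

```python
import requests

# Hypothetical values for illustration only; substitute your own cluster URL
# and the ID of the node being decommissioned.
ES_URL = "http://localhost:9200"
NODE_ID = "node-id-1"

# Register the node as shutting down. "remove" signals the node is never
# coming back, so shards should be migrated off before it is stopped.
resp = requests.put(
    f"{ES_URL}/_nodes/{NODE_ID}/shutdown",
    json={
        "type": "remove",
        "reason": "shutdown of node so we can remove it from the cluster",
    },
)
resp.raise_for_status()
print(resp.json())
```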

And retrieving the shutdown status:

GET /_nodes/<node_id>/shutdown

{
  "node": "data-node-1",
  "node_id": "node-id-1",
  "type": "remove",
  "reason": "shutdown of node so we can remove it from the cluster",
  "status": {¹
     "shutdown_status": "IN_PROGRESS",²
     "shard_migration": {
       "status": "IN_PROGRESS",
       "shard_migrations_remaining": 7,³
       "time_started": "<user readable date>",
       "time_started_millis": 234091892
     },
     "persistent_tasks": {
       "status": "IN_PROGRESS",⁴
       "tasks_remaining": 2,⁵
       "error": "ICouldntStopTheTasksException[i can't do that dave]...etc stacktrace etc...",
       "time_started": "<user readable date>",
       "time_started_millis": 128391987
     },
     "plugins": {
       "status": "NOT_STARTED"⁶
     },
     "data_loss_on_removal": false⁷
  },
  "time_since_shutdown": "1.2h",⁸
  "time_since_shutdown_millis": 4320000,
  "shutdown_started": "<user readable date>",⁹
  "shutdown_started_millis": 128391987
}
1. Shows the current state of the shutdown for this node. This can be used by operators to track progress.
2. Overall shutdown status. Possible values are: "IN_PROGRESS", "COMPLETE", "STALLED". If the shutdown is STALLED, an error field will also be returned containing the reason the shutdown is stalled (e.g. no nodes can take the remaining shards).
3. How many shards remain to be migrated off of this node.
4. Whether in-progress persistent tasks have been halted and new tasks have been blocked.
5. The number of tasks that need to be completed before shutdown.
6. Whether plugins have indicated that they are ready for shutdown.
7. Whether data loss could occur if the node were terminated now.
8. How long the shutdown has been ongoing.
9. When the shutdown was initiated.
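As a companion to the sketch above, here is a minimal Python polling loop that waits for the shutdown to complete or stall, assuming the same hypothetical cluster URL and node ID; the field names follow the example response in this proposal.

```python
import time

import requests

# Same hypothetical cluster URL and node ID as in the sketch above.
ES_URL = "http://localhost:9200"
NODE_ID = "node-id-1"

while True:
    resp = requests.get(f"{ES_URL}/_nodes/{NODE_ID}/shutdown")
    resp.raise_for_status()
    status = resp.json()["status"]["shutdown_status"]

    if status == "COMPLETE":
        print("All shutdown steps finished; the node can be stopped safely.")
        break
    if status == "STALLED":
        # A stalled shutdown carries an error describing why (e.g. no nodes
        # can take the remaining shards) and needs operator intervention.
        print("Shutdown stalled:", resp.json()["status"])
        break

    time.sleep(30)  # shard migration is still IN_PROGRESS; poll again later
```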

Here are some high-level tasks that need to be completed for this:


Phase 2:

  • Add "REPLACE" shutdown type
    • Add REST and cluster state support for the "REPLACE" shutdown type (@gwbrown)
    • Add allocation decider and change existing deciders to handle node replacements (@dakrone)
  • Upgrades to persistent task handling
    • Cancel pre-existing tasks running on a node that is marked as shutting down (@dakrone)
    • Hook persistent task state into shutdown status API (@dakrone)
  • Enhance the data tier allocation decider to allow migrating to a different tier if all nodes in a certain tier are shut down (possibly?)
@dakrone dakrone added >feature :Core/Infra/Core Core issues without another label Meta labels Mar 11, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Mar 11, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

dakrone added a commit to dakrone/elasticsearch that referenced this issue Mar 22, 2021
This commit adds the rest endpoints for the node shutdown API. These APIs are behind the
`es.shutdown_feature_flag_enabled` feature flag for now, as development is ongoing.

Currently these APIs do not do anything, returning immediately. We plan to implement them for real
in subsequent work.

Relates to elastic#70338
dakrone added a commit that referenced this issue Mar 23, 2021
This commit adds the rest endpoints for the node shutdown API. These APIs are behind the
`es.shutdown_feature_flag_enabled` feature flag for now, as development is ongoing.

Currently these APIs do not do anything, returning immediately. We plan to implement them for real
in subsequent work.

Relates to #70338
dakrone added a commit to dakrone/elasticsearch that referenced this issue Mar 23, 2021
This commit adds the rest endpoints for the node shutdown API. These APIs are behind the
`es.shutdown_feature_flag_enabled` feature flag for now, as development is ongoing.

Currently these APIs do not do anything, returning immediately. We plan to implement them for real
in subsequent work.

Relates to elastic#70338
@gwbrown gwbrown added :Core/Infra/Node Lifecycle Node startup, bootstrapping, and shutdown and removed :Core/Infra/Core Core issues without another label labels Mar 24, 2021
dakrone added a commit to dakrone/elasticsearch that referenced this issue Apr 26, 2021
This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of
nodes (candidates) that are not currently shutting down.

It does not yet cancel tasks that may already be running on the nodes that are shut down, that will
be added in a subsequent request.

Relates to elastic#70338
dakrone added a commit that referenced this issue Apr 28, 2021
This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of
nodes (candidates) that are not currently shutting down.

It does not yet cancel tasks that may already be running on the nodes that are shut down, that will
be added in a subsequent request.

Relates to #70338
dakrone added a commit to dakrone/elasticsearch that referenced this issue Apr 28, 2021
This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of
nodes (candidates) that are not currently shutting down.

It does not yet cancel tasks that may already be running on the nodes that are shut down, that will
be added in a subsequent request.

Relates to elastic#70338
dakrone added a commit that referenced this issue Apr 28, 2021
…72426)

* Don't assign persistent tasks to nodes shutting down (#72260)

This commit changes the `PersistentTasksClusterService` to limit nodes for a task to a subset of
nodes (candidates) that are not currently shutting down.

It does not yet cancel tasks that may already be running on the nodes that are shut down, that will
be added in a subsequent request.

Relates to #70338

* Fix transport client usage in test
dakrone added a commit to dakrone/elasticsearch that referenced this issue May 3, 2021
Originally these were stored in the cluster state using a single class, however, they will need to
be different objects without common parts, and they will be calculated on the fly rather than
persisted into cluster state.

This removes the `NodeShutdownComponentStatus` class, as it's no longer needed.

Relates to elastic#70338
dakrone added a commit that referenced this issue May 4, 2021
Originally these were stored in the cluster state using a single class, however, they will need to
be different objects without common parts, and they will be calculated on the fly rather than
persisted into cluster state.

This removes the NodeShutdownComponentStatus class, as it's no longer needed.

Relates to #70338
dakrone added a commit to dakrone/elasticsearch that referenced this issue May 4, 2021
Originally these were stored in the cluster state using a single class, however, they will need to
be different objects without common parts, and they will be calculated on the fly rather than
persisted into cluster state.

This removes the NodeShutdownComponentStatus class, as it's no longer needed.

Relates to elastic#70338
dakrone added a commit to dakrone/elasticsearch that referenced this issue Jun 22, 2021
…c#74267)

This converts the system property feature flag 'es.shutdown_feature_flag_enabled' to a regular
non-dynamic node setting. This setting can only be set to 'true' on a snapshot build of
Elasticsearch (not a release build).

Relates to elastic#70338
dakrone added a commit that referenced this issue Jun 22, 2021
…74267) (#74446)

This converts the system property feature flag 'es.shutdown_feature_flag_enabled' to a regular
non-dynamic node setting. This setting can only be set to 'true' on a snapshot build of
Elasticsearch (not a release build).

Relates to #70338
dakrone added a commit to dakrone/elasticsearch that referenced this issue Aug 2, 2021
It previously defaulted to false. The setting can still only be set to 'true' on a
non-release (snapshot) build of Elasticsearch.

Relates to elastic#70338
elasticsearchmachine pushed a commit that referenced this issue Aug 2, 2021
…#75962)

* Flip node shutdown feature flag to default to true on snapshot builds

It previously defaulted to false. The setting can still only be set to 'true' on a
non-release (snapshot) build of Elasticsearch.

Relates to #70338

* Handle case where operator privileges are enabled
elasticsearchmachine pushed a commit to elasticsearchmachine/elasticsearch that referenced this issue Aug 2, 2021
…elastic#75962)

* Flip node shutdown feature flag to default to true on snapshot builds

It previously defaulted to false. The setting can still only be set to 'true' on a
non-release (snapshot) build of Elasticsearch.

Relates to elastic#70338

* Handle case where operator privileges are enabled
lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this issue Aug 3, 2021
…elastic#75962)

* Flip node shutdown feature flag to default to true on snapshot builds

It previously defaulted to false. The setting can still only be set to 'true' on a
non-release (snapshot) build of Elasticsearch.

Relates to elastic#70338

* Handle case where operator privileges are enabled
dakrone added a commit to dakrone/elasticsearch that referenced this issue Oct 11, 2021
This commit enhances `DiskThresholdMonitor` so that indices that have a flood-stage block will not
have the block removed while they reside on a node that is part of a "REPLACE"-type node shutdown.

This prevents a situation where a node is blocked due to disk usage, then during the replacement the
block is removed while shards are relocating to the target node, indexing occurs, and then the
target runs out of space due to the additional documents.

Relates to elastic#70338 and elastic#76247
dakrone added a commit that referenced this issue Oct 12, 2021
#78942)

This commit enhances `DiskThresholdMonitor` so that indices that have a flood-stage block will not
have the block removed while they reside on a node that is part of a "REPLACE"-type node shutdown.

This prevents a situation where a node is blocked due to disk usage, then during the replacement the
block is removed while shards are relocating to the target node, indexing occurs, and then the
target runs out of space due to the additional documents.

Relates to #70338 and #76247
dakrone added a commit to dakrone/elasticsearch that referenced this issue Oct 12, 2021
elastic#78942)

This commit enhances `DiskThresholdMonitor` so that indices that have a flood-stage block will not
have the block removed while they reside on a node that is part of a "REPLACE"-type node shutdown.

This prevents a situation where a node is blocked due to disk usage, then during the replacement the
block is removed while shards are relocating to the target node, indexing occurs, and then the
target runs out of space due to the additional documents.

Relates to elastic#70338 and elastic#76247
# Conflicts:
#	server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdMonitorTests.java
elasticsearchmachine pushed a commit that referenced this issue Oct 12, 2021
#78942) (#79008)

This commit enhances `DiskThresholdMonitor` so that indices that have a flood-stage block will not
have the block removed while they reside on a node that is part of a "REPLACE"-type node shutdown.

This prevents a situation where a node is blocked due to disk usage, then during the replacement the
block is removed while shards are relocating to the target node, indexing occurs, and then the
target runs out of space due to the additional documents.

Relates to #70338 and #76247
# Conflicts:
#	server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdMonitorTests.java
dakrone added a commit to dakrone/elasticsearch that referenced this issue Oct 14, 2021
This commit allows replica shards that have existing data on disk to be re-allocated to the target
of a "REPLACE" type node shutdown. Prior to this if the target node of a shutdown were to restart,
the replicas would not be allowed to be allocated even if their data existed on disk.

Relates to elastic#70338 as a follow-up to elastic#76247
dakrone added a commit that referenced this issue Oct 15, 2021
…ement (#79171)

This commit allows replica shards that have existing data on disk to be re-allocated to the target
of a "REPLACE" type node shutdown. Prior to this if the target node of a shutdown were to restart,
the replicas would not be allowed to be allocated even if their data existed on disk.

Relates to #70338 as a follow-up to #76247
dakrone added a commit to dakrone/elasticsearch that referenced this issue Oct 15, 2021
…ement (elastic#79171)

This commit allows replica shards that have existing data on disk to be re-allocated to the target
of a "REPLACE" type node shutdown. Prior to this if the target node of a shutdown were to restart,
the replicas would not be allowed to be allocated even if their data existed on disk.

Relates to elastic#70338 as a follow-up to elastic#76247
elasticsearchmachine pushed a commit that referenced this issue Oct 15, 2021
…ement (#79171) (#79266)

This commit allows replica shards that have existing data on disk to be re-allocated to the target
of a "REPLACE" type node shutdown. Prior to this if the target node of a shutdown were to restart,
the replicas would not be allowed to be allocated even if their data existed on disk.

Relates to #70338 as a follow-up to #76247
pgomulka added a commit that referenced this issue Dec 16, 2021
This PR adds full cluster restart and rolling upgrade tests,
to ensure that Node Shutdown handles BWC correctly.

Relates #70338
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Dec 16, 2021
This PR adds full cluster restart and rolling upgrade tests,
to ensure that Node Shutdown handles BWC correctly.

Relates elastic#70338
elasticsearchmachine pushed a commit that referenced this issue Dec 16, 2021
This PR adds full cluster restart and rolling upgrade tests,
to ensure that Node Shutdown handles BWC correctly.

Relates #70338
@colings86 colings86 removed their assignment Dec 20, 2021
@dakrone
Member Author

dakrone commented May 3, 2022

I believe since this API has been released, we can close this issue. Any further work can go into dedicated issues.

@dakrone dakrone closed this as completed May 3, 2022