
Adding a deprecation info API check for too many shards #85967

Merged

Conversation

masseyke (Member)

This commit makes sure that there is enough room in a cluster to add a small number of shards during an upgrade. The
information is exposed in the deprecation info API as a cluster configuration check.
Closes #85702
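The general shape of such a check can be sketched as follows. This is a hypothetical illustration, not the actual implementation: the class name, method names, and the size of the shard buffer are all assumptions made for the example.

```java
// Hypothetical sketch of the kind of check this PR describes: before an
// upgrade, verify the cluster has headroom under cluster.max_shards_per_node
// for a handful of new shards (e.g. small indices created during the upgrade).
public class ShardLimitHeadroomCheck {

    // Illustrative buffer size; the real number of shards needed is not
    // specified here.
    static final int SHARDS_NEEDED_FOR_UPGRADE = 5;

    static boolean hasRoomForUpgrade(long currentOpenShards, int maxShardsPerNode, int dataNodeCount) {
        // The shard limit is enforced cluster-wide: setting value * data node count.
        long clusterShardLimit = (long) maxShardsPerNode * dataNodeCount;
        return currentOpenShards + SHARDS_NEEDED_FOR_UPGRADE <= clusterShardLimit;
    }

    public static void main(String[] args) {
        // 2 data nodes at the default limit of 1000 shards per node => 2000 total.
        System.out.println(hasRoomForUpgrade(1990, 1000, 2)); // true:  1995 <= 2000
        System.out.println(hasRoomForUpgrade(1998, 1000, 2)); // false: 2003 >  2000
    }
}
```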

@elasticmachine (Collaborator)

Pinging @elastic/es-data-management (Team:Data Management)

@dakrone dakrone self-requested a review April 28, 2022 15:44
@joegallo (Contributor)

joegallo commented Apr 28, 2022

Do we actually need room for the replicas? That is, does something in the process truly require these indices to be green or is (merely) yellow sufficient?

As an edge case, I'm not sure your API is correctly shaped. Imagine a three data node cluster where two of the three nodes are already at cluster.max_shards_per_node and the last node is 20 shards under that. I think the math you're doing internally would indicate that "yes, I can totally allocate 5 new shards + 1 replica each", but that isn't actually so. I think checkShardLimit is closer to the right API already.

edit: a little additional commentary: personally, I think it's fine if the system gets the above case wrong; I'd just rather the wrong logic live in ClusterDeprecationChecks (maybe with a "this is a bit of a hack, but c'est la vie"-type comment) than in ShardLimitValidator -- the latter is much closer to the core of the system, so it's more important that it be technically correct.
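The edge case above can be illustrated numerically. This sketch is hypothetical (the method names and numbers are assumptions, not Elasticsearch code): a cluster-wide total check can pass even though per-node limits make the allocation impossible.

```java
// Illustration of the edge case: three data nodes, two already at
// cluster.max_shards_per_node, the third 20 shards under it.
public class PerNodeShardRoomExample {

    // Cluster-wide math: total shards + needed vs. maxShardsPerNode * nodeCount.
    static boolean clusterWideCheckPasses(int[] shardsPerNode, int maxShardsPerNode, int needed) {
        long total = 0;
        for (int s : shardsPerNode) {
            total += s;
        }
        return total + needed <= (long) maxShardsPerNode * shardsPerNode.length;
    }

    // Per-node view: the most headroom available on any single node.
    static int roomOnLeastLoadedNode(int[] shardsPerNode, int maxShardsPerNode) {
        int room = 0;
        for (int s : shardsPerNode) {
            room = Math.max(room, maxShardsPerNode - s);
        }
        return room;
    }

    public static void main(String[] args) {
        int max = 1000;
        int[] nodes = {1000, 1000, 980}; // two nodes already at the limit

        // 5 new shards + 1 replica each = 10 shards needed.
        // Cluster-wide math says there is room: 2980 + 10 <= 3000.
        System.out.println(clusterWideCheckPasses(nodes, max, 10)); // true

        // But a primary and its replica cannot share a node, and only one
        // node is under the limit, so the replicas would stay unassigned.
        System.out.println(roomOnLeastLoadedNode(nodes, max)); // 20
    }
}
```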

@masseyke (Member, Author)

Yeah, my initial thought was to just check that we're under the limit. But it sounds like the problem is that during an upgrade Kibana creates more indices (see #85702 (comment) for a little more), so you need some extra room. My thinking (and maybe it's not great) was that a tiny cluster like you describe wouldn't be on cloud, and you wouldn't let something like the deprecation API stop you from upgrading. But I don't know how we handle supporting upgrades on both cloud and tiny self-managed clusters, unless we kick this check over to cloud.

@dakrone (Member) left a comment

> But I don't know how we handle both supporting upgrades on cloud and tiny self-managed clusters, unless we kick this check over to cloud to do.

Since this is a dynamic setting, this is not the end of the world to support. For example, a tiny self-managed cluster could bump the limit up prior to its upgrade.
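Concretely, the dynamic setting mentioned above can be raised with the cluster update settings API before upgrading. A sketch in console form (the value 1500 is illustrative, not a recommendation):

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.max_shards_per_node": 1500
  }
}
```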

Comment on lines 22 to 24:

```java
"The cluster has too many shards to be able to upgrade",
"https://ela.st/es-deprecation-7-shard-limit",
"Delete indices to clear up space",
```

Is this message from another deprecation error? Otherwise maybe we can be more specific and say something like "there is not enough room for N more shards, increase the cluster.max_shards_per_node or cluster.max_shards_per_node.frozen settings or remove indices to clear up resources."

(using "space" is hard here, because it makes it sound like it's disk-related, when really it's just our arbitrary limits)

masseyke (Member, Author)

It might be used in another error message, but in this case I think it was just me trying to walk the line between being too technical and not technical enough to be useful. I'll reword it.

masseyke (Member, Author)

I intentionally didn't include cluster.max_shards_per_node.frozen in the reworded version because we're not checking that setting (the assumption is that the new small index would not be a frozen index).

@masseyke masseyke requested a review from dakrone May 3, 2022 22:59
@masseyke (Member, Author)

masseyke commented May 4, 2022

@elasticmachine update branch

@dakrone (Member) left a comment

LGTM, sorry I took so long for review on this.

@masseyke masseyke merged commit 192b4ca into elastic:master May 6, 2022
@masseyke masseyke deleted the feature/deprecation-info-shard-check branch May 6, 2022 13:03
masseyke added a commit to masseyke/elasticsearch that referenced this pull request May 6, 2022
This commit makes sure that there is enough room in a cluster to add a small number of shards during an upgrade. The information is exposed in the deprecation info API as a cluster configuration check.
Closes elastic#85702
@elasticsearchmachine (Collaborator)

💔 Backport failed

| Status | Branch | Result |
| --- | --- | --- |
| | 7.17 | Commit could not be cherry-picked due to conflicts |
| | 8.2 | |

You can use sqren/backport to backport manually by running `backport --upstream elastic/elasticsearch --pr 85967`.

elasticsearchmachine pushed a commit that referenced this pull request May 6, 2022
Adding a deprecation info API check for too many shards (#85967) (#86518)

* Adding a deprecation info API check for too many shards (#85967)

This commit makes sure that there is enough room in a cluster to add a small number of shards during an upgrade. The information is exposed in the deprecation info API as a cluster configuration check.
Closes #85702

* fixing unit test
elasticsearchmachine pushed a commit that referenced this pull request May 6, 2022
ShardLimitValidatorTests was accidentally broken as part of #85967
because the new assertions in that class were not taking the type of
nodes into account ("normal" or "frozen"). This is a simple change to
take that into account.

Successfully merging this pull request may close these issues.

Emit critical deprecation warning if maximum open shards exceeded