Voting config exclusions should work with absent nodes #50836
Pinging @elastic/es-distributed (:Distributed/Cluster Coordination)
Thanks for looking at this.
We would like to entirely drop support for the more complex node resolution process from DiscoveryNodes#resolveNodes and only support node names and node IDs.
I suggest doing this by deprecating support for describing the nodes using POST /_cluster/voting_config_exclusions/{node_name} and instead adding query parameters ?node_names={...} and ?node_ids={...}. We could use these to add VotingConfigExclusions for the corresponding nodes, using a placeholder for nodeId or nodeName if it is not known at resolution time.
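The placeholder idea above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the class `AbsentNodeExclusionSketch`, the record `Exclusion`, the marker value `ABSENT_MARKER`, and both method names are hypothetical, and the cluster membership is reduced to plain maps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class AbsentNodeExclusionSketch {
    // Hypothetical placeholder for an ID or name that is unknown at resolution time.
    static final String ABSENT_MARKER = "_absent_";

    // Simplified stand-in for a voting config exclusion (requires Java 16+ for records).
    record Exclusion(String nodeId, String nodeName) {}

    // Resolve requested node names against the currently known nodes (name -> ID).
    // A name that matches no known node still yields an exclusion, with a placeholder ID.
    static List<Exclusion> resolveByNames(Map<String, String> knownNameToId, String... nodeNames) {
        List<Exclusion> exclusions = new ArrayList<>();
        for (String name : nodeNames) {
            String id = knownNameToId.getOrDefault(name, ABSENT_MARKER);
            exclusions.add(new Exclusion(id, name));
        }
        return exclusions;
    }

    // Resolve requested node IDs against the currently known nodes (ID -> name).
    // An ID that matches no known node still yields an exclusion, with a placeholder name.
    static List<Exclusion> resolveByIds(Map<String, String> knownIdToName, String... nodeIds) {
        List<Exclusion> exclusions = new ArrayList<>();
        for (String id : nodeIds) {
            String name = knownIdToName.getOrDefault(id, ABSENT_MARKER);
            exclusions.add(new Exclusion(id, name));
        }
        return exclusions;
    }

    public static void main(String[] args) {
        // "node-2" is not a current cluster member, so its ID is the placeholder.
        System.out.println(resolveByNames(Map.of("node-1", "id-1"), "node-1", "node-2"));
    }
}
```

The point of the sketch is only that resolution never fails for an absent node: the caller always gets one exclusion per requested name or ID.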
I see. I tried to add the suggested query params as follows, and this gave me an error. It seems that because query params may be optional, they are not taken into account when checking for url uniqueness.
In addition, when I keep just one of those two, it still conflicts with RestClearVotingConfigExclusionsAction as it's registered with the url. What do you think about registering with the following urls instead?
@DaveCTurner I just pushed a new commit that uses a new API to exclude nodes based only on ID or name. Could you please let me know if this is heading in the right direction?
Thanks for the update @zacharymorn and apologies for taking longer than normal to get back to you. I have left some more comments on the approach.
@DaveCTurner I’ve taken your suggestions and made a new commit. The changes could use more unit tests, but I would like to get your feedback early. Could you please take a look and let me know if it is heading in the right direction?
Thanks @zacharymorn, this looks neater. I left a few more comments.
@@ -47,7 +53,7 @@
     * @param nodeDescriptions Descriptions of the nodes to add - see {@link DiscoveryNodes#resolveNodes(String...)}
     */
    public AddVotingConfigExclusionsRequest(String[] nodeDescriptions) {
This constructor is only used in tests, and it looks like we could migrate all of those tests over to using node names instead of node descriptions. Some of them would also be neater if we used a varargs:
-    public AddVotingConfigExclusionsRequest(String[] nodeDescriptions) {
+    public AddVotingConfigExclusionsRequest(String... nodeNames) {
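To illustrate why a varargs constructor makes the tests neater, here is a small sketch with a hypothetical stand-in class (`NodeNamesRequest` is not the real request class). A `String...` parameter accepts both a comma-separated argument list and an existing array, so call sites that pass a `String[]` keep compiling.

```java
class NodeNamesRequest {
    private final String[] nodeNames;

    // Varargs constructor: callers may pass zero or more names, or an existing String[].
    NodeNamesRequest(String... nodeNames) {
        // Defensive copy so later changes to the caller's array do not affect the request.
        this.nodeNames = nodeNames.clone();
    }

    String[] nodeNames() {
        return nodeNames.clone();
    }

    public static void main(String[] args) {
        // Neater call site in tests: no explicit array construction needed.
        NodeNamesRequest fromVarargs = new NodeNamesRequest("node-1", "node-2");
        // Call sites that already hold an array continue to work unchanged.
        NodeNamesRequest fromArray = new NodeNamesRequest(new String[] {"node-1", "node-2"});
        System.out.println(fromVarargs.nodeNames().length + " " + fromArray.nodeNames().length);
    }
}
```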
If we migrate these tests now to use node names instead of descriptions, I'm a bit concerned that we may not have tests to prove that the changes are still backward compatible and don't have bugs that may break logic based on nodeDescriptions, before it is fully migrated to nodeIds / nodeNames. What do you think?
Sorry, you're right, I didn't quite mean "all" these tests. We should comprehensively test the different kinds of node resolution by strengthening AddVotingConfigExclusionsRequestTests. The other tests can move over to node names without loss of coverage IMO.
Makes sense.
This isn't resolved yet?
Sorry, I missed this earlier. Done in commit 02a3533.
Sorry, when I re-read this thread I somehow got the wrong idea that this constructor needed to be removed. I reverted that commit and tried again in commit 5c7a226. Could you let me know if this looks good?
}
else {
    Map<String, String> existingNodesNameId = new HashMap<>();
    for (DiscoveryNode node : this) {
Maybe just the master-eligible nodes?
Done.
Sorry, a question just came up when I looked at this again. When we resolve by nodeId, we use ALL existing nodes to check whether it exists, not just the master-eligible ones. Should we keep this behavior the same for resolving by node name as well?
I'll open a new comment thread on the newly-moved code.
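For readers following this thread, the two behaviors being compared can be sketched as below. This is an illustration under stated assumptions only: `Node`, `nameToId`, and the `masterEligibleOnly` flag are hypothetical names, and a plain list stands in for the cluster's node collection.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MasterEligibleLookupSketch {
    // Simplified stand-in for a cluster node (requires Java 16+ for records).
    record Node(String id, String name, boolean masterEligible) {}

    // Build a name -> ID lookup, either from all nodes or only from master-eligible ones.
    static Map<String, String> nameToId(List<Node> nodes, boolean masterEligibleOnly) {
        Map<String, String> result = new HashMap<>();
        for (Node node : nodes) {
            if (masterEligibleOnly && !node.masterEligible()) {
                continue; // skip nodes that can never be part of the voting configuration
            }
            result.put(node.name(), node.id());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Node> nodes = List.of(
            new Node("id-1", "master-1", true),
            new Node("id-2", "data-1", false));
        System.out.println(nameToId(nodes, true).keySet());
        System.out.println(nameToId(nodes, false).keySet());
    }
}
```

Whichever variant is chosen, the question raised above is about applying the same rule to both the name-based and the ID-based lookup.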
@DaveCTurner I've added more tests, which I believe cover most of my changes. Could you please take a look and let me know if more are needed?
Thanks @zacharymorn, I have done another pass and left more comments.
@elasticmachine ok to test
In elastic#50836 we deprecated the existing voting config exclusions API and added a new one. This commit adjusts the docs to match.
…inTaskExecutor.java Co-Authored-By: David Turner
…nfiguration/AddVotingConfigExclusionsRequest.java Co-Authored-By: David Turner
…ordinatorTests.java Co-Authored-By: David Turner
LGTM
Today the voting config exclusions API accepts node filters and resolves them to a collection of node IDs against the current cluster membership.

This is problematic since we may want to exclude nodes that are not currently members of the cluster. For instance:

- if attempting to remove a flaky node from the cluster you cannot reliably exclude it from the voting configuration since it may not reliably be a member of the cluster
- if `cluster.auto_shrink_voting_configuration: false` then naively shrinking the cluster will remove some nodes but will leave their node IDs in the voting configuration. The only way to clean up the voting configuration is to grow the cluster back to its original size (potentially replacing some of the voting configuration) and then use the exclusions API.

This commit adds an alternative API that accepts node names and node IDs but not node filters in general, and deprecates the current node-filters-based API.

Relates elastic#47990.
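As a rough illustration of how a client might address the new API, the sketch below builds the request path for `POST /_cluster/voting_config_exclusions` with the `node_names` and `node_ids` query parameters proposed earlier in this conversation. The class and method names are hypothetical, and joining multiple values with commas is an assumption rather than something stated in this thread.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class VotingExclusionPathSketch {
    static final String ENDPOINT = "/_cluster/voting_config_exclusions";

    // Path and query string for excluding nodes by name (to be sent as a POST).
    static String byNames(String... nodeNames) {
        return ENDPOINT + "?node_names=" + encodeList(nodeNames);
    }

    // Path and query string for excluding nodes by ID (to be sent as a POST).
    static String byIds(String... nodeIds) {
        return ENDPOINT + "?node_ids=" + encodeList(nodeIds);
    }

    // URL-encode each value and join with commas (comma separation is an assumption).
    private static String encodeList(String[] values) {
        List<String> encoded = new ArrayList<>();
        for (String value : values) {
            encoded.add(URLEncoder.encode(value, StandardCharsets.UTF_8));
        }
        return String.join(",", encoded);
    }

    public static void main(String[] args) {
        // The named nodes do not have to be current members of the cluster.
        System.out.println(byNames("node-1", "node-2"));
        System.out.println(byIds("id-1"));
    }
}
```

Because the exclusion is described by name or ID rather than by a node filter, nothing in the request depends on the node being present when the call is made.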
Backport of #50836 to 7.x. Co-authored-by: zacharymorn
I will follow up with a PR to remove the deprecated API in the next few days.
Voting config exclusions should work with absent nodes. For details, please see #47990.