Support dynamic node role #3436

Merged
merged 10 commits into from
Jun 14, 2022
Original file line number Diff line number Diff line change
@@ -10,9 +10,9 @@
v: true
node_selector:
# Only send request to nodes in <2.0 versions, especially during ':qa:mixed-cluster:v1.x.x#mixedClusterTest'.
# Because YAML REST test takes the minimum OpenSearch version in the cluster to apply the filter in 'skip' section,
# Because YAML REST test takes the minimum OpenSearch version in the cluster to apply the filter in 'skip' section,
# see OpenSearchClientYamlSuiteTestCase#initAndResetContext() for detail.
# During 'mixedClusterTest', the cluster can be mixed with nodes in 1.x and 2.x versions,
# During 'mixedClusterTest', the cluster can be mixed with nodes in 1.x and 2.x versions,
# so node_selector is required, and only filtering version in 'skip' is not enough.
version: "1.0.0 - 1.4.99"

@@ -32,17 +32,17 @@

- match:
$body: |
/ #ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role cluster_manager name
^ ((\d{1,3}\.){3}\d{1,3} \s+ \d+ \s+ \d* \s+ (-)?\d* \s+ ((-)?\d*(\.\d+)?)? \s+ ((-)?\d*(\.\d+)?)?\s+ ((-)?\d*(\.\d+)?)? \s+ (-|[cdhilmrstvw]{1,11}) \s+ [-*x] \s+ (\S+\s?)+ \n)+ $/
/ #ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role node.roles cluster_manager name
^ ((\d{1,3}\.){3}\d{1,3} \s+ \d+ \s+ \d* \s+ (-)?\d* \s+ ((-)?\d*(\.\d+)?)? \s+ ((-)?\d*(\.\d+)?)?\s+ ((-)?\d*(\.\d+)?)? \s+ (-|[cdhilmrstvw]{1,11}) (\s+ (-|\w+(,\w+)*+))? \s+ [-*x] \s+ (\S+\s?)+ \n)+ $/

- do:
cat.nodes:
v: true

- match:
$body: |
/^ ip \s+ heap\.percent \s+ ram\.percent \s+ cpu \s+ load_1m \s+ load_5m \s+ load_15m \s+ node\.role \s+ cluster_manager \s+ name \n
((\d{1,3}\.){3}\d{1,3} \s+ \d+ \s+ \d* \s+ (-)?\d* \s+ ((-)?\d*(\.\d+)?)? \s+ ((-)?\d*(\.\d+)?)? \s+ ((-)?\d*(\.\d+)?)? \s+ (-|[cdhilmrstvw]{1,11}) \s+ [-*x] \s+ (\S+\s?)+ \n)+ $/
/^ ip \s+ heap\.percent \s+ ram\.percent \s+ cpu \s+ load_1m \s+ load_5m \s+ load_15m \s+ node\.role (\s+ node\.roles)? \s+ cluster_manager \s+ name \n
((\d{1,3}\.){3}\d{1,3} \s+ \d+ \s+ \d* \s+ (-)?\d* \s+ ((-)?\d*(\.\d+)?)? \s+ ((-)?\d*(\.\d+)?)? \s+ ((-)?\d*(\.\d+)?)? \s+ (-|[cdhilmrstvw]{1,11}) (\s+ (-|\w+(,\w+)*+ ))? \s+ [-*x] \s+ (\S+\s?)+ \n)+ $/
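
The optional node.roles column added to the test regex accepts either `-` or a comma-separated list of role names. A minimal sketch of that sub-pattern in isolation follows; the class and method names are hypothetical, and it assumes the spaces inside the YAML test regex are insignificant (free-spacing mode), so they are dropped here.

```java
import java.util.regex.Pattern;

// Illustrative only: checks sample node.roles cell values against the
// sub-pattern used in the YAML test. Not part of the OpenSearch code base.
final class NodeRolesColumnPattern {

    // "-" for a coordinating-only node, or one or more \w+ names joined by commas.
    // The possessive quantifier (*+) mirrors the pattern in the test.
    static final Pattern NODE_ROLES = Pattern.compile("-|\\w+(,\\w+)*+");

    static boolean isValidColumn(String value) {
        // matches() requires the whole value to fit the pattern.
        return NODE_ROLES.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidColumn("-"));              // true
        System.out.println(isValidColumn("data,test_role")); // true
        System.out.println(isValidColumn("data,"));          // false
    }
}
```

Because `\w` includes the underscore, names such as cluster_manager fit the pattern, while a trailing comma or an empty cell does not.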

- do:
cat.nodes:
Original file line number Diff line number Diff line change
@@ -567,10 +567,10 @@ private static Map<String, DiscoveryNodeRole> rolesToMap(final Stream<DiscoveryN
private static Map<String, DiscoveryNodeRole> roleMap = rolesToMap(DiscoveryNodeRole.BUILT_IN_ROLES.stream());

public static DiscoveryNodeRole getRoleFromRoleName(final String roleName) {
if (roleMap.containsKey(roleName) == false) {
throw new IllegalArgumentException("unknown role [" + roleName + "]");
if (roleMap.containsKey(roleName)) {
Collaborator

@reta reta Jun 10, 2022

While experimenting, I ran into an interesting issue: role names are case-sensitive. For example:

node.roles: [ master, Master]

These are two different roles. Previously that was fine: OpenSearch would refuse to start because Master was not recognized as a valid role. You may already see where I am going: with dynamic roles, such a set of roles becomes perfectly valid, and confusing:

curl http://localhost:9200/_cat/nodes
127.0.0.1 14 84 7 2.24 2.35 3.06 m Master,master * my-ThinkPad-P15-Gen-1

I think we should now be stricter about how roles are treated, and the checks should be case-insensitive: in the example above, master and Master should be reduced to the same role, master. @ylwu-amzn wdyt?

@ylwu-amzn can I have you look over this?

Member

The whole case-sensitive thing can be dealt with as a separate issue IMO.

Contributor Author

Sure, thanks for the reminder. I was busy with something else and missed this comment.

@reta that's a great point! I will change the code to make role names case-insensitive.
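
For illustration, here is a minimal sketch of what case-insensitive handling could look like: role names are folded to lower case before lookup, so master and Master collapse into one role. The helper class and its methods are hypothetical and are not taken from this PR.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of OpenSearch: folds configured role names
// to lower case so that "master" and "Master" become the same role name.
final class RoleNameNormalizer {

    private RoleNameNormalizer() {}

    // Locale.ROOT avoids locale-specific case mappings (e.g. the Turkish dotless i).
    static String normalize(String roleName) {
        return roleName.trim().toLowerCase(Locale.ROOT);
    }

    // Returns the distinct normalized names, preserving first-seen order.
    static Set<String> normalizeAll(List<String> configuredRoles) {
        Set<String> normalized = new LinkedHashSet<>();
        for (String role : configuredRoles) {
            normalized.add(normalize(role));
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Prints [master, ml]
        System.out.println(normalizeAll(Arrays.asList("master", "Master", "ml")));
    }
}
```

Whether duplicates that differ only by case should be merged silently, as here, or rejected at startup is a separate design choice.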

return roleMap.get(roleName);
}
return roleMap.get(roleName);
return new DiscoveryNodeRole.UnknownRole(roleName, roleName, false);
Collaborator

@reta reta May 26, 2022

It seems we should not use UnknownRole here but instead introduce the concept of a DynamicRole, complementing the built-in roles. Why: UnknownRole is meant to handle migrations, where a built-in role may be removed in a new version but still be present in the old one. A DynamicRole would make that distinction explicit.

Contributor Author

Thanks, good point. Since we support dynamic/custom roles, it will be hard to differentiate an UnknownRole (for example, a deprecated built-in role like master) from a DynamicRole (for example, an ml role), as neither is in the built-in role list. So how about we just rename UnknownRole to DynamicRole?

Contributor Author

@ylwu-amzn ylwu-amzn May 31, 2022

Oh, I think we should keep UnknownRole for backward compatibility (bwc). If we plan to release this change in OS 2.1, the logic at https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/cluster/node/DiscoveryNode.java#L331 will be:

if (in.getVersion().onOrAfter(Version.V_2_1_0)) {
    roles.add(new DiscoveryNodeRole.DynamicRole(roleName, roleNameAbbreviation, canContainData));
} else {
    roles.add(new DiscoveryNodeRole.UnknownRole(roleName, roleNameAbbreviation, canContainData));
}
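
To make the lookup change under discussion concrete, here is a rough, self-contained sketch of the fallback behaviour: a built-in name resolves to its shared instance, and any other name yields a new role whose abbreviation equals its name instead of throwing IllegalArgumentException. The `Role` class and `RoleLookupSketch` are simplified stand-ins invented for this example, not the OpenSearch classes.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for illustration only; these are not the OpenSearch classes.
final class RoleLookupSketch {

    static final class Role {
        final String name;
        final String abbreviation;
        final boolean builtIn;

        Role(String name, String abbreviation, boolean builtIn) {
            this.name = name;
            this.abbreviation = abbreviation;
            this.builtIn = builtIn;
        }
    }

    // A tiny, assumed subset of built-in roles keyed by role name.
    private static final Map<String, Role> BUILT_IN = new HashMap<>();

    static {
        BUILT_IN.put("data", new Role("data", "d", true));
        BUILT_IN.put("ingest", new Role("ingest", "i", true));
    }

    // Built-in names return the shared instance; unrecognized names produce a
    // dynamic role that reuses the name as its abbreviation.
    static Role fromName(String roleName) {
        Role builtIn = BUILT_IN.get(roleName);
        if (builtIn != null) {
            return builtIn;
        }
        return new Role(roleName, roleName, false);
    }

    public static void main(String[] args) {
        Role ml = fromName("ml");
        System.out.println(ml.name + " builtIn=" + ml.builtIn);
        Role data = fromName("data");
        System.out.println(data.name + " builtIn=" + data.builtIn);
    }
}
```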

}

public static Set<DiscoveryNodeRole> getPossibleRoles() {
Original file line number Diff line number Diff line change
@@ -58,7 +58,6 @@ public abstract class DiscoveryNodeRole implements Comparable<DiscoveryNodeRole>
private static final DeprecationLogger deprecationLogger = DeprecationLogger.getLogger(DiscoveryNodeRole.class);
public static final String MASTER_ROLE_DEPRECATION_MESSAGE =
"Assigning [master] role in setting [node.roles] is deprecated. To promote inclusive language, please use [cluster_manager] role instead.";

private final String roleName;

/**
@@ -299,7 +298,7 @@ public Setting<Boolean> legacySetting() {

/**
* Represents an unknown role. This can occur if a newer version adds a role that an older version does not know about, or a newer
* version removes a role that an older version knows about.
* version removes a role that an older version knows about, or some custom role for extension function provided by plugin.
*/
static class UnknownRole extends DiscoveryNodeRole {

Original file line number Diff line number Diff line change
@@ -195,10 +195,12 @@ protected Table getTableWithHeader(final RestRequest request) {
table.addCell("load_5m", "alias:l;text-align:right;desc:5m load avg");
table.addCell("load_15m", "alias:l;text-align:right;desc:15m load avg");
table.addCell("uptime", "default:false;alias:u;text-align:right;desc:node uptime");
// TODO: Deprecate "node.role", use "node.roles" which shows full node role names
table.addCell(
"node.role",
"alias:r,role,nodeRole;desc:m:master eligible node, d:data node, i:ingest node, -:coordinating node only"
);
table.addCell("node.roles", "alias:rs,all roles;desc: -:coordinating node only");
// TODO: Remove the header alias 'master', after removing MASTER_ROLE. It's added for compatibility when using parameter 'h=master'.
table.addCell("cluster_manager", "alias:cm,m,master;desc:*:current cluster manager");
table.addCell("name", "alias:n;desc:node name");
@@ -423,12 +425,22 @@ Table buildTable(
table.addCell(jvmStats == null ? null : jvmStats.getUptime());

final String roles;
final String allRoles;
if (node.getRoles().isEmpty()) {
roles = "-";
allRoles = "-";
} else {
roles = node.getRoles().stream().map(DiscoveryNodeRole::roleNameAbbreviation).sorted().collect(Collectors.joining());
List<DiscoveryNodeRole> knownNodeRoles = node.getRoles()
.stream()
.filter(DiscoveryNodeRole::isKnownRole)
.collect(Collectors.toList());
roles = knownNodeRoles.size() > 0
? knownNodeRoles.stream().map(DiscoveryNodeRole::roleNameAbbreviation).sorted().collect(Collectors.joining())
: "-";
allRoles = node.getRoles().stream().map(DiscoveryNodeRole::roleName).sorted().collect(Collectors.joining(","));
}
table.addCell(roles);
table.addCell(allRoles);
table.addCell(clusterManagerId == null ? "x" : clusterManagerId.equals(node.getId()) ? "*" : "-");
table.addCell(node.getName());
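
The column logic in this hunk can be read in isolation as the sketch below: the node.role cell joins the sorted abbreviations of known roles only, while the node.roles cell joins the sorted full names of all roles with commas, and both fall back to `-`. The `Role` class here is a simplified stand-in with a `known` flag; it is an assumption made for this example and not the real DiscoveryNodeRole.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative stand-in types; not the OpenSearch implementation.
final class RoleColumnsSketch {

    static final class Role {
        final String name;
        final String abbreviation;
        final boolean known;

        Role(String name, String abbreviation, boolean known) {
            this.name = name;
            this.abbreviation = abbreviation;
            this.known = known;
        }
    }

    // node.role: sorted abbreviations of known roles, or "-" if there are none.
    static String abbreviatedColumn(List<Role> roles) {
        List<Role> knownRoles = roles.stream().filter(r -> r.known).collect(Collectors.toList());
        if (knownRoles.isEmpty()) {
            return "-";
        }
        return knownRoles.stream().map(r -> r.abbreviation).sorted().collect(Collectors.joining());
    }

    // node.roles: sorted full names of every role joined by commas, or "-" if empty.
    static String fullNamesColumn(List<Role> roles) {
        if (roles.isEmpty()) {
            return "-";
        }
        return roles.stream().map(r -> r.name).sorted().collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        List<Role> roles = Arrays.asList(new Role("test_role", "test_role", false), new Role("data", "d", true));
        System.out.println(abbreviatedColumn(roles));                 // d
        System.out.println(fullNamesColumn(roles));                   // data,test_role
        System.out.println(fullNamesColumn(Collections.emptyList())); // -
    }
}
```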

Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
/*
* SPDX-License-Identifier: Apache-2.0
*
* The OpenSearch Contributors require contributions made to
* this file be licensed under the Apache-2.0 license or a
* compatible open source license.
*/

package org.opensearch.cluster.node;

public class DiscoveryNodeRoleGenerator {

public static DiscoveryNodeRole createUnknownRole(String roleName) {
return new DiscoveryNodeRole.UnknownRole(roleName, roleName, false);
Member

I think UnknownRole, aka NamedRole, should have a constructor that takes the roleName, and we don't need a generator.

Contributor Author

@ylwu-amzn ylwu-amzn Jun 1, 2022

If we remove this generator, we need to make the DynamicRole class public and expose its constructor as public. That seems to open the door to tampering with node roles. For example, a plugin could create a new DynamicRole directly in code rather than rely on opensearch.yml, which breaks the design. For example, if a user tweaks the node roles by changing image_processing to ml, then an ml task may be dispatched to the wrong node.

This generator is only for testing. How about we keep this?

}
}
Original file line number Diff line number Diff line change
@@ -15,6 +15,7 @@

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import static org.hamcrest.Matchers.containsString;

@@ -54,4 +55,23 @@ public void testMasterRoleDeprecationMessage() {
assertEquals(Collections.singletonList(DiscoveryNodeRole.MASTER_ROLE), NodeRoleSettings.NODE_ROLES_SETTING.get(roleSettings));
assertWarnings(DiscoveryNodeRole.MASTER_ROLE_DEPRECATION_MESSAGE);
}

public void testUnknownNodeRoleAndBuiltInRoleCanCoexist() {
String testRole = "test_role";
Settings roleSettings = Settings.builder().put(NodeRoleSettings.NODE_ROLES_SETTING.getKey(), "data, " + testRole).build();
List<DiscoveryNodeRole> nodeRoles = NodeRoleSettings.NODE_ROLES_SETTING.get(roleSettings);
assertEquals(2, nodeRoles.size());
assertEquals(DiscoveryNodeRole.DATA_ROLE, nodeRoles.get(0));
assertEquals(testRole, nodeRoles.get(1).roleName());
assertEquals(testRole, nodeRoles.get(1).roleNameAbbreviation());
}

public void testUnknownNodeRoleOnly() {
String testRole = "test_role";
Settings roleSettings = Settings.builder().put(NodeRoleSettings.NODE_ROLES_SETTING.getKey(), testRole).build();
List<DiscoveryNodeRole> nodeRoles = NodeRoleSettings.NODE_ROLES_SETTING.get(roleSettings);
assertEquals(1, nodeRoles.size());
assertEquals(testRole, nodeRoles.get(0).roleName());
assertEquals(testRole, nodeRoles.get(0).roleNameAbbreviation());
}
}
Original file line number Diff line number Diff line change
@@ -40,14 +40,22 @@
import org.opensearch.cluster.ClusterName;
import org.opensearch.cluster.ClusterState;
import org.opensearch.cluster.node.DiscoveryNode;
import org.opensearch.cluster.node.DiscoveryNodeRole;
import org.opensearch.cluster.node.DiscoveryNodeRoleGenerator;
import org.opensearch.cluster.node.DiscoveryNodes;
import org.opensearch.common.Table;
import org.opensearch.common.settings.Settings;
import org.opensearch.test.OpenSearchTestCase;
import org.opensearch.test.rest.FakeRestRequest;
import org.opensearch.threadpool.TestThreadPool;
import org.junit.Before;

import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

import static java.util.Collections.emptyMap;
import static java.util.Collections.emptySet;
@@ -64,18 +72,15 @@ public void setUpAction() {
}

public void testBuildTableDoesNotThrowGivenNullNodeInfoAndStats() {
ClusterName clusterName = new ClusterName("cluster-1");
DiscoveryNodes.Builder builder = DiscoveryNodes.builder();
builder.add(new DiscoveryNode("node-1", buildNewFakeTransportAddress(), emptyMap(), emptySet(), Version.CURRENT));
DiscoveryNodes discoveryNodes = builder.build();
ClusterState clusterState = mock(ClusterState.class);
when(clusterState.nodes()).thenReturn(discoveryNodes);

ClusterStateResponse clusterStateResponse = new ClusterStateResponse(clusterName, clusterState, false);
NodesInfoResponse nodesInfoResponse = new NodesInfoResponse(clusterName, Collections.emptyList(), Collections.emptyList());
NodesStatsResponse nodesStatsResponse = new NodesStatsResponse(clusterName, Collections.emptyList(), Collections.emptyList());

action.buildTable(false, new FakeRestRequest(), clusterStateResponse, nodesInfoResponse, nodesStatsResponse);
testBuildTableWithRoles(emptySet(), (table) -> {
Map<String, List<Table.Cell>> nodeInfoMap = table.getAsMap();
List<Table.Cell> cells = nodeInfoMap.get("node.role");
assertEquals(1, cells.size());
assertEquals("-", cells.get(0).value);
cells = nodeInfoMap.get("node.roles");
assertEquals(1, cells.size());
assertEquals("-", cells.get(0).value);
});
}

public void testCatNodesWithLocalDeprecationWarning() {
@@ -89,4 +94,51 @@ public void testCatNodesWithLocalDeprecationWarning() {

terminate(threadPool);
}

public void testBuildTableWithUnknownRoleOnly() {
Set<DiscoveryNodeRole> roles = new HashSet<>();
String roleName = "test_role";
DiscoveryNodeRole testRole = DiscoveryNodeRoleGenerator.createUnknownRole(roleName);
roles.add(testRole);

testBuildTableWithRoles(roles, (table) -> {
Map<String, List<Table.Cell>> nodeInfoMap = table.getAsMap();
List<Table.Cell> cells = nodeInfoMap.get("node.roles");
assertEquals(1, cells.size());
assertEquals(roleName, cells.get(0).value);
});
}

public void testBuildTableWithBothBuiltInAndUnknownRoles() {
Set<DiscoveryNodeRole> roles = new HashSet<>();
roles.add(DiscoveryNodeRole.DATA_ROLE);
String roleName = "test_role";
DiscoveryNodeRole testRole = DiscoveryNodeRoleGenerator.createUnknownRole(roleName);
roles.add(testRole);

testBuildTableWithRoles(roles, (table) -> {
Map<String, List<Table.Cell>> nodeInfoMap = table.getAsMap();
List<Table.Cell> cells = nodeInfoMap.get("node.roles");
assertEquals(1, cells.size());
assertEquals("data,test_role", cells.get(0).value);
});
}

private void testBuildTableWithRoles(Set<DiscoveryNodeRole> roles, Consumer<Table> verificationFunction) {
ClusterName clusterName = new ClusterName("cluster-1");
DiscoveryNodes.Builder builder = DiscoveryNodes.builder();

builder.add(new DiscoveryNode("node-1", buildNewFakeTransportAddress(), emptyMap(), roles, Version.CURRENT));
DiscoveryNodes discoveryNodes = builder.build();
ClusterState clusterState = mock(ClusterState.class);
when(clusterState.nodes()).thenReturn(discoveryNodes);

ClusterStateResponse clusterStateResponse = new ClusterStateResponse(clusterName, clusterState, false);
NodesInfoResponse nodesInfoResponse = new NodesInfoResponse(clusterName, Collections.emptyList(), Collections.emptyList());
NodesStatsResponse nodesStatsResponse = new NodesStatsResponse(clusterName, Collections.emptyList(), Collections.emptyList());

Table table = action.buildTable(false, new FakeRestRequest(), clusterStateResponse, nodesInfoResponse, nodesStatsResponse);

verificationFunction.accept(table);
}
}