Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing _list/shards API for closed indices #16606

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

gargharsh3134
Copy link
Contributor

@gargharsh3134 gargharsh3134 commented Nov 11, 2024

Description

By default, the Broadcast requests use IndicesOptions.strictExpandOpenAndForbidClosed(). For _list/shards API, concrete indices from strategy are being passed to IndicesStats Action with the default indicesOptions.
Details around IndexNameExpressionResolver which are causing issues with the list/shards API:

  1. Concrete indices if specified in request with indices options as IndicesOptions.strictExpandOpenAndForbidClosed(), it would result in an error (index_closed_exception).
  2. Concrete index even if is hidden, will get evaluated and fetched. Basically, if wildcard is to be expanded the IndexOption will ignore hidden indices, but if concretely specified it will get included. Similar is the behaviour with existing _cat/indices API, which by default (curl localhost:9200/_cat/indices), omits the hidden indices but if one of them is explicitly queried, it will return the stats (curl localhost:9200/_cat/indices/hidden-index).

_list/shards API throws index_closed_exception whenever the cluster has closed indices. This change is to resolve and ignore the closed indices from getting explicitly passed to IndicesStatsAction.

Functional Testing

On a local cluster with 2 indices:
open-test-index -> Open index with 1P and 1R.
new-test-idx -> closed index with 1P and 2 R.

gkharsh@bcd07443f159 OpenSearch % curl localhost:9200/_cat/indices
green open  open-test-index rvZzxo5BSnK3TFlmlPQMkw 1 1 0 0 416b 208b
green close new-test-idx    eHm4npFbTQ-dfU8abSLJoQ 1 2 

Existing _cat/shards behaviour:

Unfiltered default query outputs closed shards without stats.

gkharsh@bcd07443f159 OpenSearch % curl "localhost:9200/_cat/shards"                                           
open-test-index 0 p STARTED 0 208b 127.0.0.1 data1
open-test-index 0 r STARTED 0 208b 127.0.0.1 data2
new-test-idx    0 p STARTED        127.0.0.1 data1
new-test-idx    0 r STARTED        127.0.0.1 data2
new-test-idx    0 r STARTED        127.0.0.1 data3


Explicitly querying for a concrete closed index, throws error.


gkharsh@bcd07443f159 OpenSearch % curl "localhost:9200/_cat/shards/new-test-idx"   
{"error":{"root_cause":[{"type":"index_closed_exception","reason":"closed","index":"new-test-idx","index_uuid":"eHm4npFbTQ-dfU8abSLJoQ"}],"type":"index_closed_exception","reason":"closed","index":"new-test-idx","index_uuid":"eHm4npFbTQ-dfU8abSLJoQ"},"status":400}%    

Wildcards only match open indices and response is empty if only closed indices match the expression

gkharsh@bcd07443f159 OpenSearch % curl "localhost:9200/_cat/shards/new-test-idx*"
gkharsh@bcd07443f159 OpenSearch % 


Fixed _list/shards behaviour:

Unfiltered default query will output closed shards without stats.

gkharsh@bcd07443f159 OpenSearch % curl "localhost:9200/_list/shards"             
new-test-idx    0 p STARTED        127.0.0.1 data1
new-test-idx    0 r STARTED        127.0.0.1 data2
new-test-idx    0 r STARTED        127.0.0.1 data3
open-test-index 0 p STARTED 0 208b 127.0.0.1 data1
open-test-index 0 r STARTED 0 208b 127.0.0.1 data2
next_token null

Explicitly querying for a concrete closed index, will output closed shards without stats instead of throwing error.


gkharsh@bcd07443f159 OpenSearch % curl "localhost:9200/_list/shards/new-test-idx" 
new-test-idx 0 p STARTED   127.0.0.1 data1
new-test-idx 0 r STARTED   127.0.0.1 data2
new-test-idx 0 r STARTED   127.0.0.1 data3
next_token null


Wildcards continue to match only open indices and response is empty if only closed indices match the expression


gkharsh@bcd07443f159 OpenSearch % curl "localhost:9200/_list/shards/new-test-idx*"
next_token null


Related Issues

Resolves #16626

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for c62e5f2: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❌ Gradle check result for 691803c: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions github-actions bot added bug Something isn't working Cluster Manager labels Nov 13, 2024
Copy link
Contributor

✅ Gradle check result for 4663f2b: SUCCESS

Copy link

codecov bot commented Nov 13, 2024

Codecov Report

Attention: Patch coverage is 15.78947% with 16 lines in your changes missing coverage. Please review.

Project coverage is 72.12%. Comparing base (c82cd2e) to head (40b2641).
Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
...admin/cluster/shards/TransportCatShardsAction.java 0.00% 12 Missing ⚠️
...ction/admin/indices/stats/IndicesStatsRequest.java 50.00% 2 Missing ⚠️
...min/indices/stats/TransportIndicesStatsAction.java 33.33% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #16606      +/-   ##
============================================
- Coverage     72.51%   72.12%   -0.40%     
+ Complexity    65562    65186     -376     
============================================
  Files          5318     5318              
  Lines        303945   303959      +14     
  Branches      43976    43978       +2     
============================================
- Hits         220413   219228    -1185     
- Misses        65798    66751     +953     
- Partials      17734    17980     +246     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -167,6 +167,10 @@ public enum Option {
EnumSet.of(Option.ALLOW_NO_INDICES, Option.IGNORE_UNAVAILABLE),
EnumSet.of(WildcardStates.OPEN, WildcardStates.CLOSED, WildcardStates.HIDDEN)
);
public static final IndicesOptions LENIENT_EXPAND_OPEN_FORBID_CLOSED = new IndicesOptions(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we check the behavior in existing _cat/shards API for hidden or system indices? dothey show up by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shwetathareja yes, they do get displayed as part of _cat/shards by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Should we remove Option.FORBID_CLOSED_INDICES from the options and have only EnumSet.of(Option.ALLOW_NO_INDICES) for first argument.
  • Should we add WildcardStates.CLOSED to expand the closed index also ?

yes, they do get displayed as part of _cat/shards by default

How are the hidden/system index getting displayed without specifying the option WildcardStates.HIDDEN

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove Option.FORBID_CLOSED_INDICES from the options and have only EnumSet.of(Option.ALLOW_NO_INDICES) for first argument.

This is done to NOT fetch and simply ignore the closed indices.

Should we add WildcardStates.CLOSED to expand the closed index also ?

Kept this to keep it consistent with the existing behaviour of _cat/shards. I'm thinking to keep it as is for now, and re-visit later once we decide on a unified behaviour for both the APIs. But open to suggestion, please let me know if we should add this now itself.

How are the hidden/system index getting displayed without specifying the option WildcardStates.HIDDEN

They just get shown in the response without any stats similar to closed shards.

Copy link
Contributor

✅ Gradle check result for 06a1118: SUCCESS

Copy link
Contributor

@rajiv-kv rajiv-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you briefly explain why the behaviour of _cat/shards and _list/shards is different in PR description ?

@@ -167,6 +167,10 @@ public enum Option {
EnumSet.of(Option.ALLOW_NO_INDICES, Option.IGNORE_UNAVAILABLE),
EnumSet.of(WildcardStates.OPEN, WildcardStates.CLOSED, WildcardStates.HIDDEN)
);
public static final IndicesOptions LENIENT_EXPAND_OPEN_FORBID_CLOSED = new IndicesOptions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Should we remove Option.FORBID_CLOSED_INDICES from the options and have only EnumSet.of(Option.ALLOW_NO_INDICES) for first argument.
  • Should we add WildcardStates.CLOSED to expand the closed index also ?

yes, they do get displayed as part of _cat/shards by default

How are the hidden/system index getting displayed without specifying the option WildcardStates.HIDDEN

@@ -125,4 +128,33 @@ public void onFailure(Exception e) {
latch.await();
}

public void testCatShardsSuccessWithPaginationWithClosedIndices() throws InterruptedException, ExecutionException {
Copy link
Contributor

@rajiv-kv rajiv-kv Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Can we include closed indices across pages and verify that all the required fields are present in the output ?
  • might be also good to add a test that does /_cat/shards call and ensures that the response match with /_list/shards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expanded the test to make both cat/shards and list/shards call, and verify that the number of shardRoutings to be displayed and number of stats fetched are same for both the calls.

shardsRequest.setCancelAfterTimeInterval(NO_TIMEOUT);
shardsRequest.setIndices(Strings.EMPTY_ARRAY);
ActionFuture<CatShardsResponse> response = client().execute(CatShardsAction.INSTANCE, shardsRequest);
assertTrue(response.get().getResponseShards().stream().anyMatch(shard -> shard.getIndexName().equals("test-3")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also include a test to verify hidden and system indices ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the IT to include hidden indices, in the revision. There isn't any setting of system index, its just Open and Closed, out of which the indices could be hidden or not, so just added a hidden index in the test case.

Copy link
Contributor

❌ Gradle check result for 692eeb6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Contributor

❕ Gradle check result for 503683f: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link
Contributor

❕ Gradle check result for 537fb2d: UNSTABLE

  • TEST FAILURES:
      2 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link
Contributor

✅ Gradle check result for 40b2641: SUCCESS

@@ -57,6 +57,7 @@
public class IndicesStatsRequest extends BroadcastRequest<IndicesStatsRequest> {

private CommonStatsFlags flags = new CommonStatsFlags();
private boolean skipIndexNameResolver = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be ser/deser for transport out/in

@Override
protected String[] resolveConcreteIndexNames(ClusterState clusterState, IndicesStatsRequest request) {
if (request.skipIndexNameResolver()) {
return request.indices();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that clusterState could have changed, should we return only indices that are present in cluster-state ?

* IndicesStats. Since pagination strategy always passes concrete indices to TransportIndicesStatsAction,
* the default behaviour of StrictExpandOpenAndForbidClosed leads to errors if closed indices are encountered.
*/
private List<String> filterPaginationResponse(ClusterState clusterState, List<String> strategyIndices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • nit :- Can we have the method return string[]
  • method name can be filterClosedIndices

private List<String> filterPaginationResponse(ClusterState clusterState, List<String> strategyIndices) {
return strategyIndices.stream().filter(index -> {
IndexMetadata metadata = clusterState.metadata().indices().get(index);
return metadata != null && metadata.getState().equals(IndexMetadata.State.OPEN);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we check for !Closed ?

@@ -63,6 +69,7 @@ public void doExecute(Task parentTask, CatShardsRequest shardsRequest, ActionLis
clusterStateRequest.clear().nodes(true).routingTable(true).indices(shardsRequest.getIndices());
} else {
clusterStateRequest.clear().nodes(true).routingTable(true).indices(shardsRequest.getIndices()).metadata(true);
clusterStateRequest.indicesOptions(IndicesOptions.lenientExpandHidden());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the default option is LENIENT_EXPAND_OPEN for clusterStateRequest. Why is this required to be changed ?

.put(IndexMetadata.SETTING_INDEX_HIDDEN, true)
.build()
);
ensureGreen();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Can we also add system indices ?
  • Do we need to also include datastreams and aliases ?

listShardsRequest.setCancelAfterTimeInterval(NO_TIMEOUT);
listShardsRequest.setIndices(Strings.EMPTY_ARRAY);
listShardsRequest.setPageParams(new PageParams(null, PageParams.PARAM_ASC_SORT_VALUE, pageSize));
ActionFuture<CatShardsResponse> listShardsResponse = client().execute(CatShardsAction.INSTANCE, listShardsRequest);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assert on length to verify all shards are returned ?

@@ -125,4 +130,101 @@ public void onFailure(Exception e) {
latch.await();
}

public void testListShardsWithClosedAndHiddenIndices() throws InterruptedException, ExecutionException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add tests with cluster-state containing

  • only closed index
  • only hidden index
  • combination of hidden and closed
    We can repeat the same queries as this test and verify output response

);

// Verifying responses when hidden indices are queried with wildcards: /_list/shards/test-hidden-idx*
// Shards for hidden index should appear in response without stats
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this typo error ? Does it return stats for hidden index always ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Cluster Manager
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

[BUG] _list/shards API failing with index_closed_exception
4 participants