
Simplify IndicesShardStoresAction #94507

Conversation

DaveCTurner
Contributor

  • No need to use an `AsyncShardFetch` here, there is no caching
  • Response may be very large, introduce chunking
  • Fan-out may be very large, introduce throttling
  • Processing time may be nontrivial, introduce cancellability
  • Eliminate many unnecessary intermediate data structures
  • Do shard-level response processing more eagerly
  • Determine allocation from `RoutingTable` not `RoutingNodes`
  • Add tests

Relates #81081
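To make the chunking bullet concrete: the sketch below is a minimal, generic illustration (plain Java, invented names, not the actual `ChunkedToXContent` plumbing used in the PR) of exposing a large response as an iterator of small fragments so the transport layer can write it out incrementally instead of serialising one huge object in a single pass.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: the real change uses Elasticsearch's chunked
// XContent infrastructure, but the idea is the same: expose the response as a
// sequence of small, independently serialisable pieces.
class ChunkedShardStores {
    private final Map<String, List<String>> storesByIndex;

    ChunkedShardStores(Map<String, List<String>> storesByIndex) {
        this.storesByIndex = storesByIndex;
    }

    // One chunk per index; callers write chunks out one at a time rather than
    // building the whole serialised response body up front.
    Iterator<String> chunks() {
        return storesByIndex.entrySet().stream()
            .map(entry -> entry.getKey() + " -> " + entry.getValue())
            .iterator();
    }
}
```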

@DaveCTurner added the >non-issue, :Distributed Coordination/Allocation, and v8.8.0 labels on Mar 14, 2023
@elasticsearchmachine added the Team:Distributed (Obsolete) label on Mar 14, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

private final String[] concreteIndices;
private final RoutingTable routingTable;
private final Metadata metadata;
private final Map<String, Map<Integer, List<StoreStatus>>> indicesStatuses;
Contributor

Why not `Map<ShardId, List<StoreStatus>>`?

Contributor Author

Really just because that's what the response wants.
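For context, a sketch of why that shape is convenient: the response groups store statuses by index name and then by shard number, so accumulating results directly in that shape avoids a `ShardId`-keyed intermediate that would need regrouping later. This is an illustrative fragment only (it assumes Elasticsearch's `ShardId` and `IndicesShardStoresResponse.StoreStatus` types and omits the surrounding class and imports), not the code from this PR.

```java
// Illustrative fragment, not the actual implementation: accumulate per-shard results
// directly in the index-name -> shard-number -> statuses shape the response wants,
// so no regrouping is needed when the response is built.
private final Map<String, Map<Integer, List<StoreStatus>>> indicesStatuses = new ConcurrentHashMap<>();

private void addStoreStatus(ShardId shardId, StoreStatus storeStatus) {
    indicesStatuses
        .computeIfAbsent(shardId.getIndexName(), indexName -> new ConcurrentHashMap<>())
        .computeIfAbsent(shardId.getId(), shardNumber -> new CopyOnWriteArrayList<>())
        .add(storeStatus);
}
```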

this.failures = new ConcurrentLinkedQueue<>();
this.outerListener = new RefCountingListener(1, listener.map(ignored -> {
task.ensureNotCancelled();
return new IndicesShardStoresResponse(Map.copyOf(indicesStatuses), List.copyOf(failures));
Contributor

Oh, was `IndicesShardStoresResponse` requiring a map of maps before?

@@ -66,6 +63,8 @@ public class TransportIndicesShardStoresAction extends TransportMasterNodeReadAc

private static final Logger logger = LogManager.getLogger(TransportIndicesShardStoresAction.class);

static final int CONCURRENT_REQUESTS_LIMIT = 100; // TODO configurable?
Contributor

Should it be converted to a `Setting`, or was it intentionally left as a TODO?
Not sure if this is important, though, as it was not limited previously.

Contributor Author

Well it's worth a discussion. Bounding concurrency here might make things slower. 100 feels high enough to me to have limited impact, whilst being low enough to avoid this API being actively harmful to an enormous cluster.

Contributor

Seems it could also depend on CPU/network. Have we ever recorded this being an issue in the wild when it was unlimited?
If not, I am leaning towards making it configurable with a huge default (1000?) and letting users set it lower if/when an issue is detected.
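As a rough illustration of what bounding the fan-out means here, a minimal generic sketch (invented names, a plain `Semaphore` and `CompletableFuture`; the PR itself uses Elasticsearch's non-blocking listener infrastructure rather than blocking a dispatch loop like this): at most `limit` shard-level requests are in flight at once, and each completion frees a slot for the next.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Semaphore;

class ThrottledFanOut {
    // Dispatch one store-listing request per shard, but never more than `limit` at a time.
    static void fetchAll(List<String> shardIds, int limit, ExecutorService executor) throws InterruptedException {
        Semaphore permits = new Semaphore(limit);
        for (String shardId : shardIds) {
            permits.acquire(); // blocks the dispatch loop once `limit` requests are in flight
            CompletableFuture
                .runAsync(() -> fetchShardStores(shardId), executor)
                .whenComplete((ignored, error) -> permits.release());
        }
    }

    static void fetchShardStores(String shardId) {
        // placeholder for the per-shard "list shard stores" call
    }
}
```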

idegtiarenko previously approved these changes on Mar 24, 2023
@DaveCTurner dismissed idegtiarenko's stale review on March 27, 2023 09:10

I'm going to mark this as not-approved while I think about the concurrent-requests-limit part. Otherwise I'll forget that this is outstanding.

@DaveCTurner
Contributor Author

Ok, I pushed d0333bf, which makes the limit configurable on a request-by-request basis (using the same parameter as we use in the search API: `max_concurrent_shard_requests`). I've left the default at 100.
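For reference, a hypothetical usage sketch with the low-level REST client (the index name is invented, and exposure of the parameter on the shard stores endpoint is assumed from the comment above rather than verified against the docs):

```java
import java.io.IOException;

import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

class ShardStoresExample {
    // Hypothetical usage: the index name is made up, and this assumes the endpoint
    // accepts max_concurrent_shard_requests as described in the comment above.
    static Response listShardStores(RestClient client) throws IOException {
        Request request = new Request("GET", "/my-index/_shard_stores");
        request.addParameter("max_concurrent_shard_requests", "100"); // 100 is the default mentioned in the PR
        return client.performRequest(request);
    }
}
```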

@DaveCTurner added the auto-merge-without-approval label on Mar 27, 2023
@DaveCTurner
Contributor Author

@elasticmachine please run elasticsearch-ci/docs

@elasticsearchmachine merged commit e377a86 into elastic:main on Mar 27, 2023
@DaveCTurner deleted the 2023-03-13-IndicesShardStores-simplify branch on March 27, 2023 15:34
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Mar 28, 2023
Added in elastic#94507 but without the comment from elastic#93101, which this commit
fixes.
DaveCTurner added a commit that referenced this pull request Apr 6, 2023
Added in #94507 but without the comment from #93101, which this commit
fixes.