-
Notifications
You must be signed in to change notification settings - Fork 25.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify IndicesShardStoresAction
#94507
Simplify IndicesShardStoresAction
#94507
Conversation
- No need to use an `AsyncShardFetch` here, there is no caching - Response may be very large, introduce chunking - Fan-out may be very large, introduce throttling - Processing time may be nontrivial, introduce cancellability - Eliminate many unnecessary intermediate data structures - Do shard-level response processing more eagerly - Determine allocation from `RoutingTable` not `RoutingNodes` - Add tests Relates elastic#81081
Pinging @elastic/es-distributed (Team:Distributed) |
private final String[] concreteIndices; | ||
private final RoutingTable routingTable; | ||
private final Metadata metadata; | ||
private final Map<String, Map<Integer, List<StoreStatus>>> indicesStatuses; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not Map<ShardId, List<StoreStatus>>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really just because that's what the response wants.
this.failures = new ConcurrentLinkedQueue<>(); | ||
this.outerListener = new RefCountingListener(1, listener.map(ignored -> { | ||
task.ensureNotCancelled(); | ||
return new IndicesShardStoresResponse(Map.copyOf(indicesStatuses), List.copyOf(failures)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, was IndicesShardStoresResponse
requiring map of maps before?
@@ -66,6 +63,8 @@ public class TransportIndicesShardStoresAction extends TransportMasterNodeReadAc | |||
|
|||
private static final Logger logger = LogManager.getLogger(TransportIndicesShardStoresAction.class); | |||
|
|||
static final int CONCURRENT_REQUESTS_LIMIT = 100; // TODO configurable? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be converted to a Setting or was it intentionally left as a TODO?
Not sure if this is important though as it was not limited previously
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it's worth a discussion. Bounding concurrency here might make things slower. 100 feels high enough to me to have limited impact, whilst being low enough to avoid this API being actively harmful to an enormous cluster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems it could also depend on CPU/network. Have we ever recorded this was an issue in the wild when it was unlimited?
If not I am leaning towards approach of making it configurable with a huge default (1000?) and let them set it lower if/when issue is detected.
I'm going to mark this as not-approved while I think about the concurrent-requests-limit part. Otherwise I'll forget that this is outstanding.
Ok I pushed d0333bf which makes the limit configurable on a request-by-request basis (using the same parameter as we use in the search API: |
@elasticmachine please run elasticsearch-ci/docs |
Added in elastic#94507 but without the comment from elastic#93101, which this commit fixes.
AsyncShardFetch
here, there is no cachingRoutingTable
notRoutingNodes
Relates #81081