Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Concurrent Index Auto Create Failing Bulk Requests #82541

Conversation

original-brownbear
Copy link
Member

Batching these requests introduced a bug where auto-create requests for system
indices would fail because system indices are always auto-created and thus
throw resource-already-exists if multiple equal ones are batched together
even though the index doesn't yet exist in the cluster state but only
in the intermediary task executor state.
This leads to bulk requests ignoring the exeception (thinking that the index
already exists) in their auto-create callback when the request doesn't yet
exist.
Fixed by deduplicating these requests for now, added a TODO to do it a little
nicer down the road but this fix is somewhat urgent as it breaks ML integ
tests.

Non-issue as this hasn't been released yet.

relates #82159

Batching these requests introduced a bug where auto-create requests for system
indices would fail because system indices are always auto-created and thus
throw resource-already-exists if multiple equal ones are batched together
even though the index doesn't yet exist in the cluster state but only
in the intermediary task executor state.
This leads to bulk requests ignoring the exeception (thinking that the index
already exists) in their auto-create callback when the request doesn't yet
exist.
Fixed by deduplicating these requests for now, added a TODO to do it a little
nicer down the road but this fix is somewhat urgent as it breaks ML integ
tests.
@original-brownbear original-brownbear added >non-issue :Data Management/Indices APIs APIs to create and manage indices and templates v8.1.0 labels Jan 13, 2022
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Jan 13, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

builder.success(task);
} catch (Exception e) {
builder.failure(task, e);
}
}
state = allocationService.reroute(state, "auto-create");
if (state != currentState) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small optimization in case the full batch doesn't do anything no need to reroute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this to the clusterStatePublished event and pass it off to ClusterService#rerouteService for batching with other pending reroutes too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about that. This adds one more CS update if no reroutes are pending and saves a reroute every now and then. A reroute is pretty fast these days though. With all the fixes we added recently I'm not sure it's worth it in 8.1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm it's not trivially fast in a large cluster still so I think we still prefer batching. Yes it's one more publication in an otherwise-quiet cluster but that's not really where the problems arise.

ClusterState state = currentState;
for (AckedClusterStateUpdateTask task : tasks) {
final Map<CreateIndexRequest, CreateIndexTask> successfulRequests = new HashMap<>(tasks.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: this and line 109 are good candidates to use var to simplify declaration

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok but I left some thoughts and suggestions.

builder.success(task);
} catch (Exception e) {
builder.failure(task, e);
}
}
state = allocationService.reroute(state, "auto-create");
if (state != currentState) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move this to the clusterStatePublished event and pass it off to ClusterService#rerouteService for batching with other pending reroutes too?

private final class CreateIndexTask extends AckedClusterStateUpdateTask {

final CreateIndexRequest request;
final AtomicReference<String> indexNameRef;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be an AtomicReference or could it just be a volatile field?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly enough yes, with the weird way the listener is constructed here it currently seems like we need this.

request.masterNodeTimeout(),
request.timeout(),
false
private final class CreateIndexTask extends AckedClusterStateUpdateTask {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we getting much benefit from the extends AckedClusterStateUpdateTask here? I think if we dropped that and just implemented AckedClusterStateTaskListener directly we wouldn't need to construct the listener up front and could probably solve the TODO about deduplicating the listeners.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ I agree, the main motivation of this PR is to unblock some broken tests quickly (also in ML QA). Are we good doing that kind of refactoring in a follow-up? :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, maybe just leave a TODO here tho.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO added!

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@original-brownbear
Copy link
Member Author

Thanks David + Ievgen!

@original-brownbear original-brownbear merged commit 584df17 into elastic:master Jan 14, 2022
@original-brownbear original-brownbear deleted the fix-auto-create-system-index-concurrency branch January 14, 2022 14:49
@original-brownbear original-brownbear restored the fix-auto-create-system-index-concurrency branch April 18, 2023 20:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Indices APIs APIs to create and manage indices and templates >non-issue Team:Data Management Meta label for data/management team v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants