-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make reindexing managed by a persistent task #43382
Make reindexing managed by a persistent task #43382
Conversation
Pinging @elastic/es-distributed |
This PR still needs a few cleanups. I will assign reviewers and comment when it is ready. |
@henningandersen @ywelsch I have adjusted this PR for some change in the direction of returning the task-id from the allocated persistent task. I have not adjusted it to work with rethrottling yet. But it appears that the other test are passing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This approach looks good. Apart from minor comments I think it is ready to merge to feature branch. Feel free to defer some of them to follow-ups if that is easier.
* task can't totally validate until it starts but this is better than | ||
* nothing. | ||
*/ | ||
ActionRequestValidationException validationException = internal.getReindexRequest().validate(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can go outside the if
now.
} | ||
} | ||
|
||
public static class Response extends AcknowledgedResponse { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we either return acknowledged=true or an exception. Could this just be an ActionResponse then?
|
||
import java.util.List; | ||
|
||
public class TransportGetReindexJobTaskAction extends TransportTasksAction<ReindexTask, GetReindexJobTaskAction.Request, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks unused now?
rethrottle(logger, clusterService.localNode().getId(), client, bulkByScrollTask, request.getRequestsPerSecond(), listener); | ||
} else if (task instanceof ReindexTask) { | ||
BulkByScrollTask childTask = null; | ||
for (Task task1 : taskManager.getTasks().values()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
task1
rename to childCandidate
?
return jobException; | ||
} | ||
|
||
public TaskId getTaskId() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to getEphemeralTaskId(), it could make call-site code a little easier to read (at least I mixed it up with the persistent task id).
@@ -196,7 +196,7 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws | |||
} | |||
builder.field("running_time_in_nanos", runningTimeNanos); | |||
builder.field("cancellable", cancellable); | |||
if (parentTaskId.isSet()) { | |||
if (parentTaskId.isSet() && parentTaskId.getNodeId().equals("cluster") == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let us add a todo here, to ensure we revisit this. I think this means that cluster:
parent ids no longer appear in the _tasks
output, which could harm diagnostics/troubleshooting.
@ywelsch - This is probably ready for another look. I have address Henning's comments. The primary issue currently is due to a task submitting another task, there are some races with the child tasks being available for a follow-up rethrottle action. I have currently resolved this for testing purposes by a spin loop waiting for all the tasks to be available in Rethrottle action. I think the fix is to extract the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few more minor comments, and one item (rethrottle) that we probably need to discuss
modules/reindex/src/main/java/org/elasticsearch/index/reindex/StartReindexJobAction.java
Outdated
Show resolved
Hide resolved
modules/reindex/src/main/java/org/elasticsearch/index/reindex/TransportReindexAction.java
Outdated
Show resolved
Hide resolved
modules/reindex/src/main/java/org/elasticsearch/index/reindex/TransportReindexAction.java
Outdated
Show resolved
Hide resolved
modules/reindex/src/main/java/org/elasticsearch/index/reindex/TransportRethrottleAction.java
Show resolved
Hide resolved
...es/reindex/src/main/java/org/elasticsearch/index/reindex/TransportStartReindexJobAction.java
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/reindex/ReindexTask.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/reindex/ReindexTask.java
Outdated
Show resolved
Hide resolved
@ywelsch I made the changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This is related to #42612. Currently the reindexing transport action
creates a task on the local coordinator node. Unfortunately this is not
resilient to coordinator node failures. This commit adds a new action
that creates a reindexing job as a persistent task.