Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Restore analytics state if available #47128

Conversation

dimitris-athanasiou
Copy link
Contributor

This commit restores the model state if available in data
frame analytics jobs.

In addition, this changes the start API so that a stopped job
can be restarted. As we now store the progress in the state index
when the task is stopped, we can use it to determine what state
the job was in when it got stopped.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@benwtrent benwtrent self-requested a review September 25, 2019 16:48
}

public TaskParams(StreamInput in) throws IOException {
this.id = in.readString();
this.version = Version.readVersion(in);
if (in.getVersion().onOrAfter(Version.V_7_5_0)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there BWC tests out there yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet. I'll add them right after merging this one in to save the trouble with flipping the versions here.


static {
PARSER.declareString(ConstructingObjectParser.constructorArg(), DataFrameAnalyticsConfig.ID);
PARSER.declareString(ConstructingObjectParser.constructorArg(), DataFrameAnalyticsConfig.VERSION);
PARSER.declareObjectArray(ConstructingObjectParser.constructorArg(), PhaseProgress.PARSER, PROGRESS_ON_START);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this required or optional? There are no null checks in the ctor, so it seems like it is optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This is optional.

this.id = Objects.requireNonNull(id);
this.version = Objects.requireNonNull(version);
this.progressOnStart = progressOnStart;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.progressOnStart = progressOnStart;
this.progressOnStart = progressOnStart == null ? null : Collections.unmodifiableList(progressOnStart);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the constructor used by the parser to pass an empty list when progress_on_start is missing. The main constructor is now calling Collections.unmodifiableList which throws if the list is null.

}

public TaskParams(StreamInput in) throws IOException {
this.id = in.readString();
this.version = Version.readVersion(in);
if (in.getVersion().onOrAfter(Version.V_7_5_0)) {
progressOnStart = in.readList(PhaseProgress::new);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this nullable or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On version 7.5.0 onwards it will never be null. I'm addressing the null issues in total.

break;
case RESUMING_REINDEXING:
case RESUMING_ANALYZING:
toValidateMappingsListener.onResponse(startContext);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to validate the mappings can be merged if we are already analyzing and have already reindexed all the data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are in resuming_reindexing we have not necessarily reindexed all the data. In general, I'd rather we perform a redundant check here and keep the path the same rather than skipping it and trying to reason about whether there are some cases that the mappings check would be necessary.

this.config = config;
this.progressOnStart = progressOnStart;
for (PhaseProgress phase : progressOnStart) {
LOGGER.info("[{}] Progress [{}] is [{}]", config.getId(), phase.getPhase(), phase.getProgressPercent());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like a debug level to me?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, that was a leftover. We don't really need it at all.

@dimitris-athanasiou
Copy link
Contributor Author

run elasticsearch-ci/2

@dimitris-athanasiou
Copy link
Contributor Author

run elasticsearch-ci/bwc

@dimitris-athanasiou
Copy link
Contributor Author

@elasticmachine update branch

This commit restores the model state if available in data
frame analytics jobs.

In addition, this changes the start API so that a stopped job
can be restarted. As we now store the progress in the state index
when the task is stopped, we can use it to determine what state
the job was in when it got stopped.
In order to be able to distinguish between a job
that runs for the first time and another that is restarting,
we ensure reindexing progress is reported to be at least 1
for a running task.

Also fixes an issue in the tests.
@dimitris-athanasiou dimitris-athanasiou force-pushed the restore-analytics-state-if-available branch from 4f1eddc to 689ee27 Compare October 1, 2019 14:49
@dimitris-athanasiou dimitris-athanasiou merged commit f47da1d into elastic:master Oct 1, 2019
@dimitris-athanasiou dimitris-athanasiou deleted the restore-analytics-state-if-available branch October 1, 2019 17:03
dimitris-athanasiou added a commit to dimitris-athanasiou/elasticsearch that referenced this pull request Oct 1, 2019
This commit restores the model state if available in data
frame analytics jobs.

In addition, this changes the start API so that a stopped job
can be restarted. As we now store the progress in the state index
when the task is stopped, we can use it to determine what state
the job was in when it got stopped.

Note that in order to be able to distinguish between a job
that runs for the first time and another that is restarting,
we ensure reindexing progress is reported to be at least 1
for a running task.
dimitris-athanasiou added a commit that referenced this pull request Oct 2, 2019
This commit restores the model state if available in data
frame analytics jobs.

In addition, this changes the start API so that a stopped job
can be restarted. As we now store the progress in the state index
when the task is stopped, we can use it to determine what state
the job was in when it got stopped.

Note that in order to be able to distinguish between a job
that runs for the first time and another that is restarting,
we ensure reindexing progress is reported to be at least 1
for a running task.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants