Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[7.x][ML] Only report complete writing_results progress after complet… #49577

Conversation

dimitris-athanasiou
Copy link
Contributor

…ion (#49551)

We depend on the number of data frame rows in order to report progress
for the writing of results, the last phase of a job run. However, results
include other objects than just the data frame rows (e.g, progress, inference model, etc.).

The problem this commit fixes is that if we receive the last data frame row results
we'll report that progress is complete even though we still have more results to process
potentially. If the job gets stopped for any reason at this point, we will not be able
to restart the job properly as we'll think that the job was completed.

This commit addresses this by limiting the max progress we can report for the
writing_results phase before the results processor completes to 98.
At the end, when the process is done we set the progress to 100.

The commit also improves failure capturing and reporting in the results processor.

Backport of #49551

…ion (elastic#49551)

We depend on the number of data frame rows in order to report progress
for the writing of results, the last phase of a job run. However, results
include other objects than just the data frame rows (e.g, progress, inference model, etc.).

The problem this commit fixes is that if we receive the last data frame row results
we'll report that progress is complete even though we still have more results to process
potentially. If the job gets stopped for any reason at this point, we will not be able
to restart the job properly as we'll think that the job was completed.

This commit addresses this by limiting the max progress we can report for the
writing_results phase before the results processor completes to 98.
At the end, when the process is done we set the progress to 100.

The commit also improves failure capturing and reporting in the results processor.

Backport of elastic#49551
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@dimitris-athanasiou dimitris-athanasiou merged commit c23a218 into elastic:7.x Nov 26, 2019
@dimitris-athanasiou dimitris-athanasiou deleted the only-report-complete-progress-after-completion-7x branch November 26, 2019 10:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants