TimingStats document is not deleted on job deletion. #44957

Closed
przemekwitek opened this issue Jul 29, 2019 · 2 comments · Fixed by #45840
Labels
:ml Machine learning >non-issue

Comments

@przemekwitek
Contributor

I came across this issue while running a datafeed job several times in a row, each time with the same job name. The stats (most visibly timing_stats.bucket_count) kept growing, as if they were accumulating across multiple job runs.
After analyzing the code, I found that the TimingStats document is not deleted when the job is deleted.

@elasticmachine
Collaborator

Pinging @elastic/ml-core

@przemekwitek
Contributor Author

Actually, my conclusions were premature.
There is in fact code (in TransportDeleteJobAction.java) that deletes from the shared results index every document whose 'job_id' matches the job being deleted:

    DeleteByQueryRequest request = new DeleteByQueryRequest(indexNames.get());
    ConstantScoreQueryBuilder query =
            new ConstantScoreQueryBuilder(new TermQueryBuilder(Job.ID.getPreferredName(), jobId));
    request.setQuery(query);
    request.setIndicesOptions(MlIndicesUtils.addIgnoreUnavailable(IndicesOptions.lenientExpandOpen()));
    request.setSlices(AbstractBulkByScrollRequest.AUTO_SLICES);
    request.setAbortOnVersionConflict(false);
    request.setRefresh(true);

So the increased 'bucket_count' in 'timing_stats' is not related to the TimingStats document not being deleted.
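
To illustrate why a term query on the job id also covers the TimingStats document, here is a minimal sketch that replaces the shared results index with an in-memory list. It is not the Elasticsearch code: the class `SharedResultsIndexSketch`, the helpers `sampleIndex` and `deleteByJobId`, the `type` field, and the job names are all hypothetical and exist only for this example. The only assumption carried over from the snippet above is that every document of a job, including its timing stats, has a `job_id` field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SharedResultsIndexSketch {

    // Hypothetical in-memory stand-in for the shared results index.
    // Each map is one document; "type" is an illustrative field only.
    public static List<Map<String, String>> sampleIndex() {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(Map.of("job_id", "job-a", "type", "bucket"));
        docs.add(Map.of("job_id", "job-a", "type", "timing_stats"));
        docs.add(Map.of("job_id", "job-b", "type", "timing_stats"));
        return docs;
    }

    // Mimics the effect of a delete-by-query with a term filter on job_id:
    // removes every document of the given job, whatever its type,
    // and returns how many documents were removed.
    public static int deleteByJobId(List<Map<String, String>> index, String jobId) {
        int before = index.size();
        index.removeIf(doc -> jobId.equals(doc.get("job_id")));
        return before - index.size();
    }

    public static void main(String[] args) {
        List<Map<String, String>> index = sampleIndex();
        int deleted = deleteByJobId(index, "job-a");
        System.out.println("deleted=" + deleted + ", remaining=" + index);
    }
}
```

Because the filter looks only at the job id and not at the document type, the timing stats entry for the deleted job goes away together with its other results, while documents of other jobs stay untouched.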

I've filed a separate issue (#45839) that focuses on these large 'bucket_count' values.
I've also implemented a test (#45840) that proves that the TimingStats document gets deleted together with the job.

Closing this issue as irrelevant.
