Stats actions should discard intermediate state on cancellation #82337

Closed
Tracked by #77466
DaveCTurner opened this issue Jan 7, 2022 · 1 comment · Fixed by #82685
Labels: >bug, :Data Management/Stats (Statistics tracking and retrieval APIs), Team:Data Management (Meta label for data/management team)

Comments

@DaveCTurner
Contributor

DaveCTurner commented Jan 7, 2022

Most stats actions fan out to various nodes in the cluster and collect per-node responses which are then aggregated into the final result. The per-node responses may sometimes be many MBs in size. If the client cancels the request by closing its connection then we broadcast the cancellation to all the target nodes and wait for them to respond with a TaskCancelledException before discarding the intermediate results. It's possible for one of the target nodes to take many minutes to respond to the cancellation if, for instance, it is overwhelmed by GC activity. In that case we retain many MBs of unnecessary intermediate state for many minutes.

We should instead react to the cancellation by immediately discarding the intermediate results and dropping any further results that arrive to free up this unnecessary memory usage. One possible way to do this would be to allow a CancellableTask to accumulate listeners which are completed by CancellableTask#onCancelled().
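
A minimal sketch of that idea, for illustration only. `CancellableTaskSketch` and `NodeResponseCollector` are hypothetical names and do not reflect the real `CancellableTask` API or whatever the eventual fix looks like. It assumes per-node responses can be treated as opaque byte arrays and that the collector registers a listener which drops its buffer as soon as the task is cancelled, without waiting for the remote nodes to acknowledge the cancellation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a cancellable task; NOT the real Elasticsearch class.
class CancellableTaskSketch {
    private final List<Runnable> listeners = new ArrayList<>();
    private boolean cancelled;

    // Registers a listener, or runs it immediately if the task is already cancelled.
    void addCancellationListener(Runnable listener) {
        synchronized (this) {
            if (!cancelled) {
                listeners.add(listener);
                return;
            }
        }
        listener.run();
    }

    // Marks the task cancelled and completes every registered listener exactly once.
    void cancel() {
        final List<Runnable> toRun;
        synchronized (this) {
            if (cancelled) {
                return;
            }
            cancelled = true;
            toRun = new ArrayList<>(listeners);
            listeners.clear();
        }
        // Run outside the lock so a slow listener cannot block other callers.
        toRun.forEach(Runnable::run);
    }

    synchronized boolean isCancelled() {
        return cancelled;
    }
}

// Hypothetical coordinator-side buffer for per-node responses.
class NodeResponseCollector {
    // Set to null once discarded so the buffered payloads become unreachable.
    private Map<String, byte[]> responses = new HashMap<>();

    NodeResponseCollector(CancellableTaskSketch task) {
        task.addCancellationListener(this::discard);
    }

    // Returns true if the response was retained, false if it was dropped.
    synchronized boolean onNodeResponse(String nodeId, byte[] payload) {
        if (responses == null) {
            return false; // cancelled: drop late arrivals instead of buffering them
        }
        responses.put(nodeId, payload);
        return true;
    }

    private synchronized void discard() {
        responses = null;
    }

    synchronized int retainedCount() {
        return responses == null ? 0 : responses.size();
    }
}
```

With this shape, calling `task.cancel()` releases the buffered responses straight away, and any later `onNodeResponse` call returns `false` rather than retaining the payload. A listener registered after cancellation runs immediately, so a collector created late never buffers anything.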

Relates #55550 (comment) which contains a list of some of the more important cases of this to address.

DaveCTurner added the >bug, needs:triage, and :Data Management/Stats labels and removed the needs:triage label on Jan 7, 2022
elasticmachine added the Team:Data Management label on Jan 7, 2022
@elasticmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)


3 participants