[BUG] Spilling logic can spill data that cannot be freed #6864

Closed
jlowe opened this issue Oct 19, 2022 · 2 comments · Fixed by #7572
Labels: cudf_dependency (an issue or PR with this label depends on a new feature in cudf), performance (a performance related task/issue), reliability (features to improve reliability, or bugs that severely impact the reliability of the plugin)

Comments

@jlowe
Member

jlowe commented Oct 19, 2022

The spill code in RapidsBufferStore will spill data from a buffer that is not currently spillable. This is very wasteful when the thread that has locked the buffer is blocked on the very allocation call that triggered the spill. Such a buffer cannot be freed to satisfy the allocation, and spilling it in the hope that it will be released later is a complete waste of time and I/O bandwidth if it is never unlocked.

Rather than spilling all buffers, the code should track spillable (i.e., not currently acquired) buffers and spill only those. The logic that waits for pending buffer frees should instead wait for buffers to be released, spilling them only at that point.
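To make the proposal concrete, here is a minimal sketch of a store that only spills buffers nobody has acquired. It is not the RapidsBufferStore code: `SpillableStoreSketch`, its methods, and the string buffer IDs are all hypothetical names invented for illustration, and "spilling" is reduced to removing the entry and reporting its ID.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the plugin's RapidsBufferStore.
final class SpillableStoreSketch {
    private static final class Tracked {
        final long sizeBytes;
        int acquireCount; // > 0 means some thread currently holds the buffer
        Tracked(long sizeBytes) { this.sizeBytes = sizeBytes; }
    }

    // Insertion order stands in for spill priority in this sketch.
    private final Map<String, Tracked> buffers = new LinkedHashMap<>();

    synchronized void add(String id, long sizeBytes) {
        buffers.put(id, new Tracked(sizeBytes));
    }

    synchronized void acquire(String id) {
        buffers.get(id).acquireCount++;
    }

    synchronized void release(String id) {
        Tracked t = buffers.get(id);
        if (t.acquireCount == 0) {
            throw new IllegalStateException("buffer " + id + " is not acquired");
        }
        t.acquireCount--;
        // Wake any thread waiting for a buffer to become spillable.
        notifyAll();
    }

    // Spills only buffers that are not acquired, until targetBytes is reached
    // or no spillable buffer remains. Returns the IDs that were spilled.
    synchronized List<String> spill(long targetBytes) {
        List<String> spilled = new ArrayList<>();
        long freed = 0;
        Iterator<Map.Entry<String, Tracked>> it = buffers.entrySet().iterator();
        while (freed < targetBytes && it.hasNext()) {
            Map.Entry<String, Tracked> e = it.next();
            if (e.getValue().acquireCount == 0) {
                freed += e.getValue().sizeBytes;
                spilled.add(e.getKey());
                it.remove();
            }
        }
        return spilled;
    }
}
```

With this shape, an acquired buffer is skipped by `spill` and becomes a candidate again only after `release`, which is also the point where a waiting allocator would be woken.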

jlowe added the "? - Needs Triage" and "performance" labels on Oct 19, 2022
@abellina
Collaborator

Related issue: #6561. In that one we want to make sure the spill framework is told when the underlying buffer is actually freed, so it should be considered when working on this issue.

abellina added the "reliability" label on Oct 21, 2022
sameerz removed the "? - Needs Triage" label on Oct 25, 2022
@abellina
Collaborator

The cuDF side of this change is rapidsai/cudf#12125. We have decided that a callback is all we need from the memory buffer: we would make a buffer spillable once its refCount reaches 1 (i.e., it is held only by the spillable framework).
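A rough sketch of that callback idea follows. The classes below are hypothetical stand-ins, not the cuDF `MemoryBuffer` API or the plugin's spill framework; the only thing taken from the comment above is the rule that a buffer becomes spillable when its reference count drops back to 1.

```java
import java.util.function.IntConsumer;

// Hypothetical reference-counted buffer that reports each close to a callback.
final class RefCountedBufferSketch {
    private int refCount = 1; // the initial reference belongs to the framework
    private IntConsumer onClose = remaining -> {};

    synchronized void setOnClose(IntConsumer handler) {
        onClose = handler;
    }

    synchronized void incRefCount() {
        refCount++;
    }

    synchronized void close() {
        if (refCount <= 0) {
            throw new IllegalStateException("buffer already freed");
        }
        refCount--;
        // Report the remaining count while still holding the buffer lock.
        onClose.accept(refCount);
    }
}

// Hypothetical framework-side handle that owns the initial reference.
final class SpillableHandleSketch {
    private final RefCountedBufferSketch buffer;
    private boolean spillable = true; // only the framework holds it at first

    SpillableHandleSketch(RefCountedBufferSketch buffer) {
        this.buffer = buffer;
        // Spillable again once only the framework's reference is left.
        buffer.setOnClose(remaining -> {
            if (remaining == 1) {
                spillable = true;
            }
        });
    }

    // Hands the buffer to a caller; it cannot be spilled until they close it.
    RefCountedBufferSketch acquire() {
        synchronized (buffer) {
            spillable = false;
            buffer.incRefCount();
            return buffer;
        }
    }

    boolean isSpillable() {
        synchronized (buffer) {
            return spillable;
        }
    }
}
```

The handle guards its flag with the buffer's own lock so that the callback, `acquire`, and `isSpillable` never take two locks in different orders; a real implementation may well make a different choice here.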
