
[FEA] Allow broadcast tables to be spilled if necessary #836

Closed
jlowe opened this issue Sep 23, 2020 · 1 comment · Fixed by #6604
Assignees
Labels
P0 Must have for release reliability Features to improve reliability or bugs that severely impact the reliability of the plugin

Comments

jlowe commented Sep 23, 2020

Is your feature request related to a problem? Please describe.
Currently broadcast tables are "intentionally leaked" in GPU memory, as they are only cleaned up when garbage collected. It would be nice if instead of leaving them permanently in GPU memory until garbage collected we were able to spill them to host memory (and ultimately disk if necessary) when GPU memory is low.

Describe the solution you'd like
We should add broadcast tables to the spillable buffer framework. There could be some performance considerations if broadcast tables are spilled and then constantly fetched from host memory to be used (the spill framework currently does not migrate a buffer's recorded location from host back to device once spilled). However, running a bit slower beats crashing due to OOM, so this would be a good first step.
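The idea described above can be illustrated with a small self-contained sketch. This is not the spark-rapids implementation (which wraps GPU tables, e.g. via `SpillableColumnarBatch`); it is a hypothetical `SpillableStore` that models the behavior: buffers live in limited "device" memory, the least-recently-used ones spill to "host" memory when space runs out, and, matching the caveat above, a spilled buffer is fetched from host on every subsequent access rather than migrated back to the device.

```python
class SpillableStore:
    """Conceptual model of a spillable-buffer framework (hypothetical names).

    Buffers are registered against a fixed device-memory budget; when a new
    buffer would exceed the budget, least-recently-used buffers are spilled
    to host memory. Spilled buffers stay on host and are re-read from there
    on every access, mirroring the performance caveat described above.
    """

    def __init__(self, device_limit_bytes):
        self.device_limit = device_limit_bytes
        self.device = {}  # buf_id -> bytes; insertion order approximates LRU
        self.host = {}    # buf_id -> bytes (spilled copies)
        self.used = 0

    def add(self, buf_id, data):
        """Register a buffer on the device, spilling others to make room."""
        self._make_room(len(data))
        self.device[buf_id] = data
        self.used += len(data)

    def get(self, buf_id):
        """Fetch a buffer, refreshing its LRU position if still on device."""
        if buf_id in self.device:
            data = self.device.pop(buf_id)
            self.device[buf_id] = data  # move to most-recently-used end
            return data
        # Once spilled, the buffer's recorded location stays on host,
        # so every access pays the host-fetch cost.
        return self.host[buf_id]

    def _make_room(self, needed):
        """Spill LRU device buffers until `needed` bytes fit in the budget."""
        while self.used + needed > self.device_limit and self.device:
            victim, data = next(iter(self.device.items()))
            del self.device[victim]
            self.used -= len(data)
            self.host[victim] = data  # spill device -> host


store = SpillableStore(device_limit_bytes=8)
store.add("broadcast_a", b"1234")
store.add("broadcast_b", b"5678")
store.add("broadcast_c", b"9999")  # forces "broadcast_a" to spill to host
assert store.get("broadcast_a") == b"1234"  # now served from host memory
```

A real implementation would also need a disk tier below host memory and reference counting so a buffer in use cannot be spilled mid-read, but the core trade-off is the same: spilled broadcast data stays reachable at the cost of slower access.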

@jlowe jlowe added feature request New feature or request ? - Needs Triage Need team to review and classify labels Sep 23, 2020
@sameerz sameerz added P1 Nice to have for release and removed ? - Needs Triage Need team to review and classify labels Sep 29, 2020
abellina commented Nov 24, 2020

As noted in #1168, we need to make the broadcast variable spillable, and we also need to figure out how to deal with, for example, the projected batches in the join, which are based on the broadcast variable. In my tests so far, the memory used by these batches is higher than the memory used by the broadcast variable itself.

@revans2 revans2 mentioned this issue Apr 8, 2022
14 tasks
@revans2 revans2 added the reliability Features to improve reliability or bugs that severely impact the reliability of the plugin label Apr 12, 2022
@sameerz sameerz added P0 Must have for release and removed feature request New feature or request P1 Nice to have for release labels Sep 14, 2022
gerashegalov added a commit to gerashegalov/spark-rapids that referenced this issue Sep 23, 2022
Fixes NVIDIA#836. This is an MVP utilizing SpillableColumnarBatch wrapper.

Signed-off-by: Gera Shegalov <[email protected]>
revans2 pushed a commit that referenced this issue Oct 3, 2022
Fixes #836. This is an MVP utilizing SpillableColumnarBatch wrapper.

Signed-off-by: Gera Shegalov <[email protected]>
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023
…IDIA#836)

Signed-off-by: spark-rapids automation <[email protected]>

Signed-off-by: spark-rapids automation <[email protected]>