[FEA] Update GpuHashAggregate to use OOM retry framework #7256

Closed
4 tasks
revans2 opened this issue Dec 5, 2022 · 1 comment
Labels: reliability (Features to improve reliability or bugs that severely impact the reliability of the plugin)

Comments

revans2 (Collaborator) commented Dec 5, 2022

Is your feature request related to a problem? Please describe.
This is very similar to #7254

Describe the solution you'd like
Hash aggregate operators can use an exceptionally large amount of GPU memory, in some cases more than the default lease provides.

We should do tasks similar to the ones planned for GpuWindowExec.

  • Do a high water mark estimation of how much memory will be needed in the worst case to complete the hash aggregate, given the input sizes and the aggregations being done.
  • Request a higher lease if needed.
  • Experiment with RMM high water mark tracking to see how good our estimate is, and verify that we are not missing something.
  • Write scale tests to verify that our estimation code does not underestimate the amount of memory needed.
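
To make the first two tasks concrete, here is a rough sketch of what a worst-case estimate and lease check might look like. None of this is the plugin's code: `AggSpec`, `estimate_high_water_mark`, and `additional_lease_needed` are hypothetical names, and the model (every input row becomes its own group, with input and output resident at the same time) is an assumption that ignores hash table overhead and any temporary allocations made during the aggregation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AggSpec:
    """Hypothetical description of one aggregation."""
    name: str
    bytes_per_group: int  # assumed size of the intermediate state kept per group


def estimate_high_water_mark(input_bytes: int,
                             num_rows: int,
                             key_bytes_per_row: int,
                             aggs: Sequence[AggSpec]) -> int:
    """Worst-case estimate, assuming no rows share a grouping key."""
    worst_case_groups = num_rows
    state_bytes_per_group = sum(a.bytes_per_group for a in aggs)
    output_bytes = worst_case_groups * (key_bytes_per_row + state_bytes_per_group)
    # Assume the input batch and the aggregated output coexist in memory.
    return input_bytes + output_bytes


def additional_lease_needed(estimate: int, current_lease: int) -> int:
    """Bytes to request beyond the current lease (0 if the lease suffices)."""
    return max(0, estimate - current_lease)
```

The RMM tracking and scale testing tasks would then compare an estimate like this against the observed peak, so that any term missing from the model shows up as an underestimate.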
revans2 added the "feature request" and "? - Needs Triage" labels Dec 5, 2022
mattahrens added the "reliability" label and removed the "? - Needs Triage" label Dec 6, 2022
mattahrens changed the title from "[FEA] Update GpuHashAggregate to use GpuMemoryLeaseManager" to "[FEA] Update GpuHashAggregate to use OOO retry framework" Jan 27, 2023
sameerz changed the title from "[FEA] Update GpuHashAggregate to use OOO retry framework" to "[FEA] Update GpuHashAggregate to use OOM retry framework" Feb 18, 2023
abellina (Collaborator) commented

This was done in #7771

sameerz removed the "feature request" label Mar 1, 2023
mattahrens added the "feature request" label Mar 10, 2023
sameerz removed the "feature request" label Apr 14, 2023