Is your feature request related to a problem? Please describe.
This is very similar to #7254
Describe the solution you'd like
For join operators we can run into situations where one or both sides of the join need to be completely in GPU memory while we stream through the other side of the join. We are working on some out-of-core fixes to try to help with these situations, but either way at some point we are likely going to need more memory than we get in the default lease.
We should do the same tasks as with GpuWindowExec.
Combine the join with any GpuCoalesceBatchExec nodes that would precede it.
Do a high-water-mark estimation of how much memory will be needed in the worst case to complete the join, given the input sizes.
Request a higher lease if needed.
Experiment with RMM high-water-mark tracking to see how good our estimate is, and verify that we are not missing something.
Write scale tests to verify that our estimation code does not underestimate the amount of memory needed.
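To make the estimation and lease steps above concrete, here is a minimal sketch in Python. It is not the plugin's code: `JoinInputs`, `estimate_join_high_water_mark`, `extra_lease_needed`, the 4-byte gather index size, and the caller-supplied `max_output_rows` cap are all assumptions made for illustration. It assumes the worst case holds the build side, one stream batch, two gather maps, and the gathered output in GPU memory at the same time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinInputs:
    """Hypothetical description of the inputs to one join step."""
    build_bytes: int         # size of the side held fully in GPU memory
    build_rows: int
    stream_batch_bytes: int  # size of one batch from the streamed side
    stream_batch_rows: int


def _avg_row_bytes(total_bytes: int, rows: int) -> int:
    """Average row size, rounded up; 0 for an empty table."""
    return 0 if rows == 0 else -(-total_bytes // rows)


def estimate_join_high_water_mark(inputs: JoinInputs,
                                  max_output_rows: int,
                                  gather_index_bytes: int = 4) -> int:
    """Assumed worst case: inputs, both gather maps, and output all resident."""
    # One gather map per side, one index per output row (assumption).
    gather_maps = 2 * gather_index_bytes * max_output_rows
    # Each output row carries one build row plus one stream row.
    out_row = (_avg_row_bytes(inputs.build_bytes, inputs.build_rows)
               + _avg_row_bytes(inputs.stream_batch_bytes, inputs.stream_batch_rows))
    output = max_output_rows * out_row
    return inputs.build_bytes + inputs.stream_batch_bytes + gather_maps + output


def extra_lease_needed(estimate: int, default_lease: int) -> int:
    """Bytes to request beyond the default lease, never negative."""
    return max(0, estimate - default_lease)
```

Comparing this kind of estimate against the peak actually recorded by the allocator is what the RMM high-water-mark experiment would check. The scale tests would then assert that the estimate is never below the measured peak.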
mattahrens changed the title from "[FEA] Update GpuHashJoin and GpuBroadcastHashJoin to use GpuMemoryLeaseManager" to "[FEA] Update GpuHashJoin and GpuBroadcastHashJoin to use OOO retry framework" on Jan 27, 2023
sameerz changed the title from "[FEA] Update GpuHashJoin and GpuBroadcastHashJoin to use OOO retry framework" to "[FEA] Update GpuHashJoin and GpuBroadcastHashJoin to use OOM retry framework" on Feb 18, 2023
This issue is being used for join OOM retry work in 23.04. We will file additional issues for work in future releases.
This covers two OOM failures we were seeing when testing with low memory. #7930 adds retrying without splits for cases where we were running out of memory on Table.gather calls (JoinGatherer.gatherNext). #7902 adds retrying without splits for cases where we were running out of memory during SplittableJoinIterator.createGatherer, typically when creating the gather maps.
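For readers unfamiliar with the pattern, "retrying without splits" can be sketched roughly as follows. This is an illustration in Python, not the code from #7930 or #7902: `GpuOutOfMemory`, `with_retry_no_split`, and the `free_memory` callback are hypothetical names. The idea shown is that on an OOM the same operation is rerun on the same, unsplit input after other memory has been freed or spilled.

```python
class GpuOutOfMemory(Exception):
    """Hypothetical stand-in for a GPU allocation failure."""


def with_retry_no_split(operation, free_memory, max_attempts=3):
    """Run operation(); on OOM, free memory and rerun it on the same input.

    The input is never split into smaller pieces: each retry repeats the
    whole operation, so it only helps if free_memory() can release enough.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except GpuOutOfMemory:
            if attempt == max_attempts:
                raise  # out of attempts, surface the failure to the caller
            free_memory()  # e.g. spill other buffers before trying again
```

Under these assumptions, a gather call or gather-map creation would be passed in as `operation`. Splitting the input and retrying on smaller pieces is a separate strategy that this sketch deliberately does not attempt.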