
[FEA] Split stream side join batches into smaller sizes if the join explodes #2408

Closed
revans2 opened this issue May 13, 2021 · 2 comments · Fixed by #2433
Assignees
Labels
task Work required that improves the product but is not user facing

Comments

@revans2
Collaborator

revans2 commented May 13, 2021

We have started working on joins that are too large to fit in GPU memory. The first step was to get gather maps back from the join, instead of materializing the results directly, and then output the results in appropriately sized batches. For the most part this is working really well (we can run all but one of the NDS queries at scale factor 3000). For query 72, however, it is not great. In a typical run we see a join that really explodes: when joining a 600MB table with a 400MB table, the join tries to allocate over 260GB just for the gather map (the row count increases about 2,000x on average).
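To illustrate the first step above, here is a minimal sketch (not actual spark-rapids code; the function name and shapes are illustrative) of producing bounded-size output batches from a gather map instead of materializing the whole join result at once:

```python
# Illustrative sketch: instead of gathering the entire join output in one
# materialization, slice the gather map and gather one bounded batch at a time.
def gather_in_batches(gather_map, batch_rows):
    """Yield slices of the gather map; each slice would be handed to a
    gather kernel to materialize one output batch of at most batch_rows."""
    for start in range(0, len(gather_map), batch_rows):
        yield gather_map[start:start + batch_rows]
```

The point is that each batch's size is bounded by `batch_rows`, so the output never has to exist in GPU memory all at once, even when the gather map itself is large.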

We run into issues in part because this allocation request is so large that we end up spilling a lot of data to host memory, or even to disk. We could avoid that spilling if we knew in advance that the join would explode, because then we could split the input batches up.

Part of the problem is that this is an equi-join combined with an inequality join condition, so the 2,000x increase in size is measured before filtering; after the filter it is only a 200x explosion. The AST work for joins in cuDF will therefore be a huge win for us. But even then the gather map for this join would be 26GB, which is larger than fits on a typical 16GB GPU.
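As a back-of-the-envelope check of the figures above (the 2,000x and 200x explosion factors and the 260GB gather map are the numbers quoted in this issue):

```python
# The inequality filter keeps 200/2000 = 1/10 of the joined rows, so the
# post-filter gather map is about a tenth of the pre-filter size.
rows_explosion_before_filter = 2000
rows_explosion_after_filter = 200
gather_map_bytes_before = 260 * 10**9  # 260 GB, from this issue

gather_map_bytes_after = (gather_map_bytes_before
                          * rows_explosion_after_filter
                          // rows_explosion_before_filter)
print(gather_map_bytes_after)  # 26 GB, still larger than a 16GB GPU
```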

The idea is to use rapidsai/cudf#8237 to compute the join output size up front and check it against a budget for the gather map. If the size exceeds that budget, we use the result to cut the input batches into smaller pieces that should fit.
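A minimal sketch of that budget check, assuming cuDF can report the join output row count up front (rapidsai/cudf#8237); the names and the budget value are illustrative, not the actual spark-rapids API:

```python
GATHER_MAP_BUDGET_BYTES = 1 << 30  # assumed budget: 1 GiB per gather map
BYTES_PER_MAP_ROW = 4              # a gather map entry is a 32-bit row index

def pieces_needed(join_output_rows: int,
                  budget_bytes: int = GATHER_MAP_BUDGET_BYTES) -> int:
    """How many pieces to split the stream-side batch into so that each
    piece's gather map is expected to stay within the budget, assuming the
    explosion is roughly uniform across the stream-side rows."""
    map_bytes = join_output_rows * BYTES_PER_MAP_ROW
    return max(1, -(-map_bytes // budget_bytes))  # ceiling division
```

For the 260GB gather map described above (65 billion output rows at 4 bytes each), this would ask for a split into a few hundred pieces under a 1 GiB budget.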

Note that for left-semi and left-anti joins the gather map will never be larger than the number of rows in the left table, and they always build on the right side. Because of that we already know our limit and it should be reasonable, so this feature does not apply to them.

For full outer joins we need complete sets of keys for both sides, so we would need to do something more complicated: probably starting with a sort merge join #2354, and then, if we have a customer whose join is still exploding, slicing it up into smaller chunks that honor key boundaries.
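The key-boundary slicing mentioned above can be sketched like this (a hypothetical helper, not actual spark-rapids code): split sorted input into roughly fixed-size chunks, but never cut in the middle of a run of equal keys, so every row for a given key lands in the same chunk.

```python
def slice_on_key_boundaries(sorted_keys, target_rows):
    """Split a sorted key column into (start, end) row ranges of roughly
    target_rows each, never splitting a run of equal keys across chunks."""
    chunks = []
    start, n = 0, len(sorted_keys)
    while start < n:
        end = min(start + target_rows, n)
        # Push the cut past the current run of equal keys so that all rows
        # for one key stay together.
        while end < n and sorted_keys[end] == sorted_keys[end - 1]:
            end += 1
        chunks.append((start, end))
        start = end
    return chunks
```

Note that a chunk can exceed `target_rows` when a single key has many duplicate rows, which is exactly the heavily skewed case that makes these joins explode in the first place.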

@revans2 revans2 added ? - Needs Triage Need team to review and classify task Work required that improves the product but is not user facing labels May 13, 2021
@revans2 revans2 self-assigned this May 18, 2021
@revans2 revans2 removed the ? - Needs Triage Need team to review and classify label May 18, 2021
@revans2 revans2 added this to the May 10 - May 21 milestone May 18, 2021
@revans2
Collaborator Author

revans2 commented May 18, 2021

Will file a separate follow-on issue for anything missing after #2433 goes in.

@revans2 revans2 linked a pull request May 19, 2021 that will close this issue
@revans2
Collaborator Author

revans2 commented May 19, 2021

I filed #2440 and #2441 as follow-on issues.

@revans2 revans2 closed this as completed May 19, 2021