[FEA] Split stream side join batches into smaller sizes if the join explodes #2408
Labels: task
We have started to work on joins that are too large to fit in GPU memory. The first step was to get the gather maps back from the join instead of materializing the results directly, and then output the results in appropriately sized batches. For the most part this is working really well: we can run all of the NDS queries except one at scale factor 3000. For query 72, however, it is not great. In a typical run we see a join that really explodes: when joining a 600MB table with a 400MB table, the join tries to allocate over 260GB just for the gather map (on average the row count increases about 2,000x).
We run into issues in part because this allocation request is so large that we end up spilling a lot of data to host memory, or even disk. We could avoid that if we knew up front that the join would explode and split the input batches accordingly.
Part of the problem is that this is an equi-join combined with an inequality condition, so the 2,000x increase in size is before filtering; after the filter it is only a 200x explosion. The AST work for joins in cudf will therefore be a huge win for us, but even then the gather map for this join would be 26GB, which is more than fits on a typical 16GB GPU.
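As a rough sanity check on those numbers (my assumption: a gather map holds one 4-byte cudf size_type index per output row, so its size scales linearly with the output row count), the reported sizes are consistent with each other:

```scala
object GatherMapBackOfEnvelope {
  // Assumption: one 4-byte index per output row in a gather map.
  val BytesPerIndex = 4L

  def gatherMapGB(outputRows: Long): Double = outputRows * BytesPerIndex / 1e9

  def main(args: Array[String]): Unit = {
    // A ~260GB map implies roughly 65 billion output rows before the
    // inequality filter; a 10x reduction (200x instead of 2,000x) still
    // leaves ~6.5 billion rows and a ~26GB map.
    println(gatherMapGB(65L * 1000 * 1000 * 1000)) // ~260.0
    println(gatherMapGB(6500L * 1000 * 1000))      // ~26.0
  }
}
```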
The idea is to take rapidsai/cudf#8237 and check the predicted gather map size against a budget. If the size exceeds that budget, we use the result to cut the input batches into smaller pieces that should fit.
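A minimal sketch of what that could look like, using placeholder types and names (nothing here is existing plugin or cudf API; `estimatedOutputRows` stands in for whatever rapidsai/cudf#8237 ends up returning):

```scala
// Placeholder for a stream-side columnar batch; only the row count matters here.
final case class Batch(numRows: Long)

object JoinBatchSplitter {
  // Assumption: one 4-byte cudf size_type index per output row.
  val BytesPerGatherMapRow = 4L

  // If the predicted gather map exceeds the budget, split the stream-side
  // batch into enough pieces that each piece should fit, assuming the
  // explosion is roughly uniform across input rows.
  def planSplits(streamBatch: Batch,
                 estimatedOutputRows: Long,
                 budgetBytes: Long): Seq[Batch] = {
    val mapBytes = estimatedOutputRows * BytesPerGatherMapRow
    if (mapBytes <= budgetBytes || streamBatch.numRows <= 1) {
      Seq(streamBatch)
    } else {
      val numSplits =
        math.ceil(mapBytes.toDouble / budgetBytes).toLong.min(streamBatch.numRows)
      val rowsPerSplit = math.ceil(streamBatch.numRows.toDouble / numSplits).toLong
      (0L until streamBatch.numRows by rowsPerSplit).map { start =>
        Batch(math.min(rowsPerSplit, streamBatch.numRows - start))
      }
    }
  }
}
```

The uniformity assumption will not always hold in practice, so a real implementation would likely need to re-check each slice and split again if one still blows the budget.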
Note that for left-semi and left-anti joins the gather map can never be larger than the number of rows in the left table, and those joins always build on the right side. That means we already know the limit, and it should be reasonable, so this feature does not apply to them.
For full outer joins we need the complete set of keys for both sides, so we would need to do something more complicated: probably starting with a sort merge join (#2354) and then, if a customer still hits an exploding join, slicing the input into smaller chunks that honor key boundaries.
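For illustration only, a sketch of the key-boundary-honoring slicing over already-sorted keys (a hypothetical helper, not tied to any existing API):

```scala
import scala.collection.mutable.ArrayBuffer

object KeyBoundaryChunker {
  // Pick chunk boundaries over sorted join keys so that no key group is
  // ever split across two chunks, even if a chunk overshoots the target.
  def chunkRanges[K](sortedKeys: IndexedSeq[K], targetRows: Int): Seq[Range] = {
    val splits = ArrayBuffer(0)
    var i = targetRows
    while (i < sortedKeys.length) {
      // Slide forward until the key changes so the cut lands on a boundary.
      while (i < sortedKeys.length && sortedKeys(i) == sortedKeys(i - 1)) i += 1
      if (i < sortedKeys.length) splits += i
      i += targetRows
    }
    splits += sortedKeys.length
    splits.sliding(2).map(p => p(0) until p(1)).filter(_.nonEmpty).toSeq
  }
}
```

For example, `chunkRanges(Vector(1, 1, 1, 2, 2, 3), targetRows = 2)` yields the row ranges [0, 3), [3, 5) and [5, 6): the first chunk overshoots the target rather than split the run of 1s.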