Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix struct row comparator's exception on empty structs #10604

Merged
merged 4 commits into from
Apr 8, 2022

Conversation

sperlingxx
Copy link
Contributor

Fixes #10603

This PR is to fix a bug of the optimized struct row comparator. For now, the struct row comparator assumes that structs being compared are non-empty, so it fails when comparing empty structs.

@sperlingxx sperlingxx requested a review from devavret April 6, 2022 11:22
@sperlingxx sperlingxx requested a review from a team as a code owner April 6, 2022 11:22
@sperlingxx sperlingxx requested a review from harrism April 6, 2022 11:22
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Apr 6, 2022
@sperlingxx sperlingxx added bug Something isn't working 3 - Ready for Review Ready for review by team non-breaking Non-breaking change and removed libcudf Affects libcudf (C++/CUDA) code. labels Apr 6, 2022
@sperlingxx sperlingxx added the libcudf Affects libcudf (C++/CUDA) code. label Apr 6, 2022
Copy link
Contributor

@devavret devavret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine. Any perf hit? You can check using SORT_NVBENCH

@devavret
Copy link
Contributor

devavret commented Apr 6, 2022

I didn't know there could be empty structs. Does this work if a struct contains 2 struct children, the first of which is empty?

1 similar comment
@devavret
Copy link
Contributor

devavret commented Apr 6, 2022

I didn't know there could be empty structs. Does this work if a struct contains 2 struct children, the first of which is empty?

@codecov
Copy link

codecov bot commented Apr 6, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.06@bc8f578). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 89eff27 differs from pull request most recent head c2176c2. Consider uploading reports for the commit c2176c2 to get more accurate results

@@               Coverage Diff               @@
##             branch-22.06   #10604   +/-   ##
===============================================
  Coverage                ?   86.34%           
===============================================
  Files                   ?      140           
  Lines                   ?    22280           
  Branches                ?        0           
===============================================
  Hits                    ?    19237           
  Misses                  ?     3043           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc8f578...c2176c2. Read the comment docs.

@jrhemstad
Copy link
Contributor

I didn't know there could be empty structs. Does this work if a struct contains 2 struct children, the first of which is empty?

I'm a bit skeptical of that too, though there doesn't seem to be anything in the Arrow spec that forbids it: https://arrow.apache.org/docs/format/Columnar.html#struct-layout

Comment on lines +200 to +202
if (lcol.num_child_columns() == 0) {
return cuda::std::make_pair(weak_ordering::EQUIVALENT, depth);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devavret I assume there is some check earlier on in hose code that verifies corresponding struct columns have the same number of children?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. If it reaches here, they're the same structure.

Copy link
Contributor Author

@sperlingxx sperlingxx Apr 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know there could be empty structs. Does this work if a struct contains 2 struct children, the first of which is empty?

It does work. I added a test case to cover this scenario. IIUC, following the rule of decompose_structs, the second child will be detached from the root struct.

   S0
   / \
  S1  S2

=>

S0
|
S1   S2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation of decompose_structs itself was broken because it operated under the assumption that struct has at least 1 child. If S1 was empty then it would create

S0
|
S1
|
S2

I fixed the impl so this is good to go.

@jrhemstad
Copy link
Contributor

If empty structs always compare equal, then can't we just strip struct columns without children during the pre-processing step?

@sperlingxx
Copy link
Contributor Author

If empty structs always compare equal, then can't we just strip struct columns without children during the pre-processing step?

Because empty structs may carry null masks, which triggers the null comparsion.

@ttnghia
Copy link
Contributor

ttnghia commented Apr 7, 2022

Rerun tests.

@devavret
Copy link
Contributor

devavret commented Apr 7, 2022

@sperlingxx mind if I push changes to this? I don't think the changes are sufficient. There is probably another corner case here. #10604 (comment)

@sperlingxx
Copy link
Contributor Author

SORT_NVBENCH

I added 0 as the depth of struct. Below is the perf result of my local machine:

### [0] NVIDIA TITAN RTX

|     NumRows     | Depth | Nulls | Samples |  CPU Time  | Noise  |  GPU Time  | Noise  |
|-----------------|-------|-------|---------|------------|--------|------------|--------|
|     2^10 = 1024 |     0 |     0 |  11824x |  45.422 us | 11.03% |  42.301 us |  7.77% |
|   2^18 = 262144 |     0 |     0 |   4576x | 112.615 us |  2.82% | 109.586 us |  0.65% |
| 2^26 = 67108864 |     0 |     0 |     44x |  11.388 ms |  0.13% |  11.385 ms |  0.13% |
|     2^10 = 1024 |     1 |     0 |   1630x | 309.766 us |  1.00% | 306.816 us |  0.30% |
|   2^18 = 262144 |     1 |     0 |    499x |   1.006 ms |  0.41% |   1.003 ms |  0.28% |
| 2^26 = 67108864 |     1 |     0 |     11x | 179.348 ms |  0.30% | 179.345 ms |  0.30% |
|     2^10 = 1024 |     8 |     0 |    543x | 924.738 us |  0.37% | 921.886 us |  0.20% |
|   2^18 = 262144 |     8 |     0 |    163x |   3.073 ms |  0.23% |   3.070 ms |  0.21% |
| 2^26 = 67108864 |     8 |     0 |     11x | 589.395 ms |  0.08% | 589.395 ms |  0.08% |
|     2^10 = 1024 |     0 |     1 |   7424x |  70.415 us |  4.55% |  67.393 us |  0.84% |
|   2^18 = 262144 |     0 |     1 |   1716x | 294.552 us |  1.10% | 291.521 us |  0.38% |
| 2^26 = 67108864 |     0 |     1 |     11x |  78.508 ms |  0.09% |  78.505 ms |  0.09% |
|     2^10 = 1024 |     1 |     1 |   1553x | 324.959 us |  1.02% | 322.001 us |  0.43% |
|   2^18 = 262144 |     1 |     1 |   3232x |   1.032 ms | 14.20% |   1.028 ms | 14.20% |
| 2^26 = 67108864 |     1 |     1 |     11x | 180.008 ms |  0.31% | 180.005 ms |  0.31% |
|     2^10 = 1024 |     8 |     1 |   1050x | 940.746 us |  0.60% | 937.648 us |  0.50% |
|   2^18 = 262144 |     8 |     1 |    162x |   3.105 ms |  0.23% |   3.102 ms |  0.20% |
| 2^26 = 67108864 |     8 |     1 |     11x | 588.961 ms |  0.15% | 588.961 ms |  0.15% |

Signed-off-by: sperlingxx <[email protected]>
@devavret
Copy link
Contributor

devavret commented Apr 7, 2022

SORT_NVBENCH

I added 0 as the depth of struct. Below is the perf result of my local machine:

### [0] NVIDIA TITAN RTX

|     NumRows     | Depth | Nulls | Samples |  CPU Time  | Noise  |  GPU Time  | Noise  |
|-----------------|-------|-------|---------|------------|--------|------------|--------|
| 2^26 = 67108864 |     1 |     0 |     11x | 179.348 ms |  0.30% | 179.345 ms |  0.30% |
| 2^26 = 67108864 |     8 |     0 |     11x | 589.395 ms |  0.08% | 589.395 ms |  0.08% |

So a minor perf hit for increasing depth vs old one #10164 (comment)

----------------------------------------------------------
Benchmark                                            Time 
----------------------------------------------------------
Sort<false>/unstable/67108864/1/manual_time        174 ms 
Sort<false>/unstable/67108864/8/manual_time        517 ms 

Depth 1 time increased from 174 ms to 179 ms and depth 8 increased from 517 ms to 589 ms. Not bad. Correctness is more important.

@sperlingxx
Copy link
Contributor Author

@sperlingxx mind if I push changes to this? I don't think the changes are sufficient. There is probably another corner case here. #10604 (comment)

Of course not. You are the champion of this area, please go ahead :)

@sperlingxx
Copy link
Contributor Author

Rerun tests.

std::vector<std::unique_ptr<cudf::column>> child_columns2;
auto child_col_1 = cudf::make_structs_column(6, {}, UNKNOWN_NULL_COUNT, std::move(null_mask2));
child_columns2.push_back(std::move(child_col_1));
int_col col4{{0, 1, 2, 3, 4, 5}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverse this and the answer will not be the one you expect. You'd expect 415320 but you'd get 145320.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 0ee82bc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the TL;DR on the change made in #0ee82bc?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than flattening the struct and then traversing backwards to create branches, we now create branches while traversing. In the earlier method we depended on finding a leaf column to know when to break off the branch.

auto got3 = sorted_order(input3, {order::ASCENDING});
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected3, got3->view());
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected3, got3->view(), debug_output_level::ALL_ERRORS);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we remove debug_output_level::ALL_ERRORS?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, why not.

@sperlingxx
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit bc43e6a into rapidsai:branch-22.06 Apr 8, 2022
@sperlingxx sperlingxx deleted the fix_empty_struct_cmp branch April 8, 2022 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team bug Something isn't working libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] struct row comparator crashes on empty structs
4 participants