Fix offset_end iterator for lists_column_view, which was not correctl… #7551

Merged
2 commits merged on Mar 10, 2021

Conversation

ttnghia
Contributor

@ttnghia ttnghia commented Mar 10, 2021

Fix the `offset_end` iterator in `lists_column_view`. Because the offsets column has one more element than the column has rows, `offset_end` should be computed as `offset_begin() + size() + 1`. The same result can be obtained with `offset_begin() + offsets().size()`.

This PR blocks #7528, thus it must be merged before that PR.
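The offsets arithmetic in the description above can be sketched with a toy model. This is only an illustration, not libcudf code: `toy_lists_column` and `row_lengths` are hypothetical names, and the accessors merely borrow the `offset_begin`/`offset_end` naming from the description. It assumes a non-sliced column whose offsets vector has exactly one more entry than there are rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a lists column (not the libcudf class).
// Row i spans [offsets[i], offsets[i + 1]), so there is one more offset
// than there are rows.
struct toy_lists_column {
  std::vector<int> offsets;  // e.g. {0, 2, 5, 6} describes 3 rows

  // Number of rows.
  std::size_t size() const
  {
    assert(!offsets.empty());
    return offsets.size() - 1;
  }

  const int* offset_begin() const { return offsets.data(); }

  // One past the last offset: begin + size() + 1, not begin + size().
  const int* offset_end() const { return offset_begin() + size() + 1; }
};

// Length of each row, computed from adjacent offsets in [begin, end).
std::vector<int> row_lengths(const toy_lists_column& col)
{
  std::vector<int> lengths;
  for (const int* it = col.offset_begin(); it + 1 < col.offset_end(); ++it) {
    lengths.push_back(*(it + 1) - *it);
  }
  return lengths;
}
```

With offsets `{0, 2, 5, 6}` the range spans four offsets for three rows. Stopping at `offset_begin() + size()` would leave the final offset out of the range.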

@ttnghia ttnghia requested a review from a team as a code owner March 10, 2021 15:36
@ttnghia ttnghia requested review from trxcllnt and jrhemstad March 10, 2021 15:36
@ttnghia ttnghia added the libcudf, 3 - Ready for Review, 5 - Ready to Merge, bug, and breaking labels Mar 10, 2021
@codecov

codecov bot commented Mar 10, 2021

Codecov Report

Merging #7551 (d95af41) into branch-0.19 (2e4b5a6) will increase coverage by 0.44%.
The diff coverage is n/a.

@@               Coverage Diff               @@
##           branch-0.19    #7551      +/-   ##
===============================================
+ Coverage        81.90%   82.35%   +0.44%     
===============================================
  Files              101      101              
  Lines            16900    17283     +383     
===============================================
+ Hits             13842    14233     +391     
+ Misses            3058     3050       -8     
| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/core/column/decimal.py | 93.33% <0.00%> (-2.50%) ⬇️ |
| python/cudf/cudf/core/column/numerical.py | 94.85% <0.00%> (-0.17%) ⬇️ |
| python/cudf/cudf/io/feather.py | 100.00% <0.00%> (ø) |
| python/cudf/cudf/comm/serialize.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/_fuzz_testing/io.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/core/column/struct.py | 100.00% <0.00%> (ø) |
| python/dask_cudf/dask_cudf/_version.py | 0.00% <0.00%> (ø) |
| python/dask_cudf/dask_cudf/io/tests/test_csv.py | 100.00% <0.00%> (ø) |
| python/dask_cudf/dask_cudf/io/tests/test_orc.py | 100.00% <0.00%> (ø) |
| python/dask_cudf/dask_cudf/io/tests/test_json.py | 100.00% <0.00%> (ø) |

... and 39 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e4b5a6...d95af41.

@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

@gpucibot merge.

@kkraus14
Collaborator

kkraus14 commented Mar 10, 2021

> @gpucibot merge.

We require two cpp approvals before merging.

@kkraus14
Collaborator

cc @trxcllnt

@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

> > @gpucibot merge.
>
> We require two cpp approvals before merging.

Hah, github tells me "1 approving review" 😄

@ttnghia ttnghia removed the request for review from trxcllnt March 10, 2021 21:32
@ttnghia ttnghia added the 0 - Blocked label Mar 10, 2021
@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

@gpucibot merge.

@kkraus14 kkraus14 removed the 0 - Blocked and 3 - Ready for Review labels Mar 10, 2021
@kkraus14
Collaborator

kkraus14 commented Mar 10, 2021

Why is this a breaking change? Isn't this a bug fix without API/ABI change?

@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

> Why is this a breaking change? Isn't this just an internal bug fix?

Sorry, it's breaking in my other PR. Maybe I misunderstood the meaning of breaking.

Hah, without the "breaking" label, the CI check fails.

@kkraus14 kkraus14 added the non-breaking label and removed the breaking label Mar 10, 2021
@ttnghia ttnghia removed the non-breaking label Mar 10, 2021
@kkraus14
Collaborator

> Sorry, it's breaking in my other PR. Maybe I misunderstood the meaning of breaking.

Got it. The breaking and non-breaking labels refer to API/ABI/behavior changes that would break downstream consumers of libcudf.

@ttnghia ttnghia added the breaking label Mar 10, 2021
@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

@gpucibot merge!!!!!!!!!

Apparently the bot never listens to me.

@kkraus14
Collaborator

Why is this a breaking change? This is a bug fix for incorrect results with no API or ABI breaking changes?

@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

> Why is this a breaking change? This is a bug fix for incorrect results with no API or ABI breaking changes?

There should not be any breaking change here. But if I remove the breaking label, the CI check fails.

@kkraus14 kkraus14 added the non-breaking label and removed the breaking label Mar 10, 2021
@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

Here we go. Adding the non-breaking label also fixes it. I didn't know about that solution.

@kkraus14
Collaborator

There's a label checker that checks that either the breaking or non-breaking label is set to indicate the behavior. Please ask if there's uncertainty instead of pushing incorrect information.

@kkraus14
Collaborator

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 2d055c3 into rapidsai:branch-0.19 Mar 10, 2021
@ttnghia
Contributor Author

ttnghia commented Mar 10, 2021

> There's a label checker that checks that either the breaking or non-breaking label is set to indicate the behavior. Please ask if there's uncertainty instead of pushing incorrect information.

Thanks very much, Keith.

@ttnghia ttnghia deleted the fix_list_last_offset branch March 10, 2021 22:52
rapids-bot bot pushed a commit that referenced this pull request Mar 12, 2021
This is another fix for the `offsets_end()` iterator in `lists_column_view`. The last fix (#7551) was still not correct: that iterator should not be computed using the size of the `offsets()` child column, because that child holds the offsets of the original (non-sliced) column. Instead, it should be computed using the `size()` of the current column.

Interestingly, my previous fix passed all the unit tests, since thrust does not throw anything (like access violation) when the input range is larger than the output range.
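The difference between the two ways of computing the end of the offsets range can be sketched with a toy sliced view. This is a hedged illustration rather than the libcudf implementation: `toy_sliced_view`, its members, and `slice_row_lengths` are hypothetical, and positions are modeled as indices into the shared offsets instead of iterators. It assumes the begin position already accounts for the slice's starting row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a sliced lists view (names are not libcudf's).
// The offsets vector belongs to the original (non-sliced) column and is
// shared; `start` and `rows` select a window of rows out of it.
struct toy_sliced_view {
  const std::vector<int>* parent_offsets;
  std::size_t start;  // index of the first row in the slice
  std::size_t rows;   // number of rows in the slice

  std::size_t size() const { return rows; }

  // Positions are indices into the parent's offsets, so the out-of-range
  // case can be inspected without dereferencing anything.
  std::size_t offsets_begin() const { return start; }

  // Uses the view's own size(): exactly size() + 1 offsets are covered.
  std::size_t offsets_end() const { return offsets_begin() + size() + 1; }

  // Uses the shared offsets' size: too large for any proper slice, and past
  // the end of the shared offsets whenever start > 0.
  std::size_t offsets_end_from_child_size() const
  {
    return offsets_begin() + parent_offsets->size();
  }
};

// Row lengths of the slice, reading only offsets in [offsets_begin, offsets_end).
std::vector<int> slice_row_lengths(const toy_sliced_view& v)
{
  assert(v.offsets_end() <= v.parent_offsets->size());
  std::vector<int> lengths;
  for (std::size_t i = v.offsets_begin(); i + 1 < v.offsets_end(); ++i) {
    lengths.push_back((*v.parent_offsets)[i + 1] - (*v.parent_offsets)[i]);
  }
  return lengths;
}
```

For parent offsets `{0, 2, 5, 6, 10}` and a slice of two rows starting at row 1, the `size()`-based end covers the three offsets `2, 5, 6`. The child-size-based end lands at index 6, beyond the five offsets that exist. A test that only compares the rows it expects would not necessarily notice the oversized range.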

Authors:
  - Nghia Truong (@ttnghia)

Approvers:
  - Jake Hemstad (@jrhemstad)
  - David (@davidwendt)

URL: #7575
hyperbolic2346 pushed a commit to hyperbolic2346/cudf that referenced this pull request Mar 25, 2021
rapidsai#7551)

Fix the `offset_end` iterator in `lists_column_view`. Since the offset column size is one element larger than the number of column rows, the `offset_end` should be computed as `offset_begin() + size() + 1`. This can also be done by `offset_begin() + offsets().size()`.

This PR blocks rapidsai#7528, thus it must be merged before that PR.

Authors:
  - Nghia Truong (@ttnghia)

Approvers:
  - Jake Hemstad (@jrhemstad)
  - Mike Wilson (@hyperbolic2346)
  - Vukasin Milovanovic (@vuule)

URL: rapidsai#7551
@ttnghia ttnghia self-assigned this Apr 25, 2021
Labels
5 - Ready to Merge, bug, libcudf, non-breaking
5 participants