Fix slicing and arrow representations of decimal columns #7755

Merged (6 commits) on Apr 1, 2021


@vyasr (Contributor) commented Mar 29, 2021

Slices of decimal columns were not passing the appropriate offset to the new column views being created. Additionally, the conversion of decimal columns to and from arrow did not include the offset. This made all slices of decimal columns appear invalid.
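To make the failure mode concrete, here is a minimal pure-Python sketch of what happens when a slice does not forward its offset. `ColumnView`, `slice`, and `slice_dropping_offset` are hypothetical names invented for this illustration, not cuDF or libcudf APIs, and the sketch assumes `0 <= start <= stop <= size`.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List


@dataclass
class ColumnView:
    """Hypothetical stand-in for a column view: a window onto a shared buffer."""

    buffer: List[Decimal]  # backing data, shared between views and never copied
    offset: int            # index in `buffer` of this view's first element
    size: int              # number of elements visible through this view

    def to_list(self) -> List[Decimal]:
        return self.buffer[self.offset : self.offset + self.size]

    def slice(self, start: int, stop: int) -> "ColumnView":
        # Correct: the child's offset is the parent's offset plus `start`.
        return ColumnView(self.buffer, self.offset + start, stop - start)

    def slice_dropping_offset(self, start: int, stop: int) -> "ColumnView":
        # Buggy variant: the size is right, but the offset is not forwarded,
        # so the view silently starts at the beginning of the buffer.
        return ColumnView(self.buffer, 0, stop - start)


data = [Decimal("1.1"), Decimal("2.2"), Decimal("3.3"), Decimal("4.4")]
col = ColumnView(data, offset=0, size=4)

good = col.slice(1, 3)
bad = col.slice_dropping_offset(1, 3)
print(good.to_list())  # elements 1 and 2 of the buffer
print(bad.to_list())   # elements 0 and 1: same size, wrong data
```

Because both views report the same size, the dropped offset only shows up when the values themselves are compared, which is why a test that checks contents (not just length) is needed to catch it.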

Thanks to @shwina for helping me trace these bugs back to their source!

Edit: This PR now also includes a fix for slices of decimal columns that should produce empty results (e.g. a slice that starts past the end of the column).
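One way to reason about the empty-slice case is to normalize the requested bounds against the column length before building a view. The sketch below uses Python's built-in `slice.indices` for that; `normalize_slice` is a hypothetical helper written for illustration and is not the code changed in this PR.

```python
def normalize_slice(start, stop, length):
    """Return (offset, size) for a unit-step slice of a column with `length` rows."""
    # slice.indices clamps out-of-range bounds and resolves negative indices.
    begin, end, _ = slice(start, stop).indices(length)
    # If the clamped start is at or past the clamped stop, the slice is empty.
    return begin, max(end - begin, 0)


print(normalize_slice(1, 3, 5))      # in-range slice
print(normalize_slice(10, None, 5))  # starts past the end: size 0
print(normalize_slice(3, 1, 5))      # start after stop: size 0
```

With this normalization, a slice that starts past the end resolves to a valid offset with size 0 rather than an out-of-bounds view.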

@vyasr requested a review from a team as a code owner March 29, 2021 21:36
@github-actions bot added the "Python" (Affects Python cuDF API.) label Mar 29, 2021
@vyasr added the "bug" (Something isn't working) and "non-breaking" (Non-breaking change) labels Mar 29, 2021
@vyasr changed the title from "Fix/decimal column bugs" to "Fix slicing and arrow representations of decimal columns" Mar 29, 2021
@codecov bot commented Mar 29, 2021

Codecov Report

Merging #7755 (d58c8f8) into branch-0.19 (7871e7a) will increase coverage by 0.80%.
The diff coverage is n/a.

❗ Current head d58c8f8 differs from pull request most recent head fae2063. Consider uploading reports for the commit fae2063 to get more accurate results

@@               Coverage Diff               @@
##           branch-0.19    #7755      +/-   ##
===============================================
+ Coverage        81.86%   82.67%   +0.80%     
===============================================
  Files              101      103       +2     
  Lines            16884    17548     +664     
===============================================
+ Hits             13822    14507     +685     
+ Misses            3062     3041      -21     

| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/utils/dtypes.py | 83.54% <0.00%> (-5.97%) ⬇️ |
| python/cudf/cudf/utils/gpu_utils.py | 53.65% <0.00%> (-4.88%) ⬇️ |
| python/cudf/cudf/core/column/lists.py | 87.32% <0.00%> (-4.08%) ⬇️ |
| python/dask_cudf/dask_cudf/backends.py | 87.50% <0.00%> (-2.13%) ⬇️ |
| python/cudf/cudf/core/abc.py | 87.23% <0.00%> (-1.14%) ⬇️ |
| python/cudf/cudf/core/column/decimal.py | 94.36% <0.00%> (-0.51%) ⬇️ |
| python/cudf/cudf/core/column/column.py | 87.53% <0.00%> (-0.23%) ⬇️ |
| python/cudf/cudf/utils/utils.py | 85.36% <0.00%> (-0.07%) ⬇️ |
| python/cudf/cudf/io/feather.py | 100.00% <0.00%> (ø) |
| python/cudf/cudf/utils/ioutils.py | 78.71% <0.00%> (ø) |

... and 49 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b937112...fae2063.

@brandon-b-miller (Contributor) commented

Looks good. There are some tests for offset and size in test_column.py; I think this was not caught earlier because that set of tests only covers the dtypes that pandas supports. I'd suggest adding a small test there.

@brandon-b-miller (Contributor) left a comment

Couple quick test cases and this should be good to go.

@vyasr vyasr requested a review from brandon-b-miller March 31, 2021 21:45
@kkraus14 added the "5 - Ready to Merge" (Testing and reviews complete, ready to merge) label Apr 1, 2021
@kkraus14 (Collaborator) commented Apr 1, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 93050d4 into rapidsai:branch-0.19 Apr 1, 2021
@vyasr vyasr deleted the fix/decimal_column_bugs branch January 14, 2022 18:04
Labels: 5 - Ready to Merge, bug, non-breaking, Python
4 participants