Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for Decimal/Fixed-point to ORC reader #7970

Conversation

rgsl888prabhu
Copy link
Contributor

@rgsl888prabhu rgsl888prabhu commented Apr 15, 2021

Added support for Decimal/fixed-point column in ORC reader along with test cases. All decimal columns would be read as Decimal64 type column, and if precision is >18, it will loudly fail. This PR also remove couple of options which are of no use after the addition of Decimal support.

#7126

@rgsl888prabhu rgsl888prabhu added 3 - Ready for Review Ready for review by team Python Affects Python cuDF API. 4 - Needs cuDF (Python) Reviewer cuIO cuIO issue labels Apr 15, 2021
@rgsl888prabhu rgsl888prabhu self-assigned this Apr 15, 2021
@rgsl888prabhu rgsl888prabhu requested a review from a team as a code owner April 15, 2021 17:05
@rgsl888prabhu rgsl888prabhu requested a review from a team as a code owner April 15, 2021 17:05
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Apr 15, 2021
@rgsl888prabhu rgsl888prabhu changed the title Adding support Decimal/Fixed-point to ORC reader Adding support for Decimal/Fixed-point to ORC reader Apr 15, 2021
@rgsl888prabhu rgsl888prabhu added breaking Breaking change feature request New feature or request labels Apr 15, 2021
@rgsl888prabhu rgsl888prabhu requested review from vuule and devavret April 15, 2021 18:02
Copy link
Contributor

@devavret devavret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! There may be more to ORC logic that I couldn't properly check but the code looks fine.

@rgsl888prabhu
Copy link
Contributor Author

rerun tests

@codecov
Copy link

codecov bot commented Apr 16, 2021

Codecov Report

Merging #7970 (bf0e29a) into branch-0.20 (599f62d) will increase coverage by 0.21%.
The diff coverage is 92.81%.

❗ Current head bf0e29a differs from pull request most recent head c95db42. Consider uploading reports for the commit c95db42 to get more accurate results
Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.20    #7970      +/-   ##
===============================================
+ Coverage        82.30%   82.52%   +0.21%     
===============================================
  Files              101      103       +2     
  Lines            17053    17296     +243     
===============================================
+ Hits             14035    14273     +238     
- Misses            3018     3023       +5     
Impacted Files Coverage Δ
python/cudf/cudf/core/column/__init__.py 100.00% <ø> (ø)
python/cudf/cudf/core/column/column.py 88.48% <85.36%> (+1.04%) ⬆️
python/cudf/cudf/core/_internals/where.py 94.61% <94.61%> (ø)
python/cudf/cudf/__init__.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/__init__.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/_compat.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/_internals/__init__.py 100.00% <100.00%> (ø)
python/cudf/cudf/core/column/datetime.py 89.40% <100.00%> (+0.19%) ⬆️
python/cudf/cudf/utils/dtypes.py 83.44% <0.00%> (-6.45%) ⬇️
python/cudf/cudf/utils/gpu_utils.py 53.65% <0.00%> (-4.88%) ⬇️
... and 28 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18964ff...c95db42. Read the comment docs.

@rgsl888prabhu
Copy link
Contributor Author

@vuule

Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presently surprised with how small the implementation changes are. Requesting a few small changes, but looks good overall.

cpp/src/io/orc/reader_impl.cu Outdated Show resolved Hide resolved
cpp/src/io/orc/stripe_data.cu Outdated Show resolved Hide resolved
@rgsl888prabhu rgsl888prabhu requested a review from vuule April 19, 2021 22:13
Copy link
Contributor

@galipremsagar galipremsagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pytest and Cython changes Look good! 👍

@galipremsagar galipremsagar added 5 - Ready to Merge Testing and reviews complete, ready to merge and removed 4 - Needs cuDF (Python) Reviewer 3 - Ready for Review Ready for review by team labels Apr 20, 2021
@rgsl888prabhu
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit d501d2c into rapidsai:branch-0.20 Apr 20, 2021
rapids-bot bot pushed a commit that referenced this pull request Apr 22, 2021
`decimals_as_float64` was recently removed in #7970. This PR fix benchmark builds by removing it's usage from benchmarks.

Authors:
  - Christopher Harris (https://github.com/cwharris)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Conor Hoekstra (https://github.com/codereport)
  - Vukasin Milovanovic (https://github.com/vuule)
  - Devavret Makkar (https://github.com/devavret)

URL: #8007
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge breaking Breaking change cuIO cuIO issue feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. Python Affects Python cuDF API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants