[REVIEW] Add support for `decimal128` in cudf python #9533

galipremsagar · 2021-10-27T14:22:01Z

Resolves: #10031

Note: The CI for this PR will not pass until #9986 is admin-merged. An admin merge is needed because #9986 also requires the changes in this PR.


vuule

optional suggestion

python/cudf/cudf/_lib/orc.pyx


vyasr

I have a couple final small suggestions but I'm going ahead and approving right now. It looks great!

python/cudf/cudf/core/dtypes.py

vyasr · 2022-01-14T22:07:08Z

python/cudf/cudf/core/dtypes.py

+    name = "decimal128"
+    MAX_PRECISION = 38
+    itemsize = 16
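As a rough check on how the two constants in this hunk relate, the sketch below derives a maximum precision from an item size. It assumes the decimal's unscaled value is stored as a signed two's-complement integer of `itemsize` bytes, which the diff itself does not state; `max_decimal_precision` is a hypothetical helper, not part of cudf.

```python
def max_decimal_precision(itemsize: int) -> int:
    """Largest digit count p such that every p-digit integer fits in a
    signed two's-complement integer that is `itemsize` bytes wide."""
    max_value = 2 ** (8 * itemsize - 1) - 1
    # max_value itself has len(str(max_value)) digits, but not every number
    # with that many digits fits, so one digit is given up.
    return len(str(max_value)) - 1


# Assumed item sizes for 32-, 64- and 128-bit decimals.
precisions = {itemsize: max_decimal_precision(itemsize) for itemsize in (4, 8, 16)}
print(precisions)
```

Under that assumption, an `itemsize` of 16 gives a precision of 38, matching `MAX_PRECISION` in the diff.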


I don't know if we have a convention for class variables being upper or lower case, but I would suggest being consistent at least within these classes in lieu of a broader rule. It is odd to see MAX_PRECISION capitalized while itemsize is lowercase.

In column.pyx we use the itemsize attribute to calculate a column's base_size:

    @property
    def base_size(self):
        return int(self.base_data.size / self.dtype.itemsize)

Should we capitalize the class constant to ITEMSIZE and introduce a property in the parent DecimalType class that just returns the class constant?

    @property
    def itemsize(self):
        return self.ITEMSIZE

I made the above changes; if you think they were unnecessary, let me know and I'll revert them.

Unless it needs to be evaluated per instance per access, there's no need for a property IMO. A lowercase attribute (for consistency with np.dtype) is fine.

I take that back: a property does effectively make this a read-only attribute.

Ah I missed that numpy.dtype has an itemsize attribute and that we needed to match that. @shwina IMO in general we shouldn't use properties over class attributes just to enforce immutability. If something really belongs to the class, not to instances, I think it is more Pythonic to put it in the class and allow users to shoot themselves in the foot by modifying it (as long as we document appropriately). That said, in this case we should match whatever numpy is doing, so @galipremsagar's solution looks fine to me.
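The trade-off discussed in this thread can be sketched as follows. This is a minimal illustration, not the code merged in this PR: `DecimalDtypeSketch`, `Decimal128Sketch`, and the free function `base_size` are hypothetical names chosen for the example.

```python
class DecimalDtypeSketch:
    """Hypothetical stand-in for the parent decimal dtype class."""

    ITEMSIZE = None  # bytes per element; subclasses override this

    @property
    def itemsize(self):
        # Lowercase accessor, matching the attribute name on numpy.dtype.
        # No setter is defined, so assigning on an instance raises
        # AttributeError.
        return self.ITEMSIZE


class Decimal128Sketch(DecimalDtypeSketch):
    MAX_PRECISION = 38
    ITEMSIZE = 16


def base_size(nbytes, dtype):
    # Same arithmetic as the base_size property quoted in the thread:
    # total bytes divided by bytes per element.
    return int(nbytes / dtype.itemsize)


dtype = Decimal128Sketch()
try:
    dtype.itemsize = 8
except AttributeError:
    read_only = True
else:
    read_only = False

print(dtype.itemsize, base_size(64, dtype), read_only)
```

The property only protects instances. The class constant itself can still be reassigned (for example `Decimal128Sketch.ITEMSIZE = 8`), which is the "shoot themselves in the foot" case mentioned above.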


Resolves C++ side of #9980.

This PR is breaking because Arrow only has a notion of `decimal128` (see `arrow::Type::DECIMAL`). We can still support both `decimal64` **and** `decimal128` for `to_arrow`, but for `from_arrow` it only makes sense to support one of them, and `decimal128` (now that we have it) is the logical choice. Therefore, switching the return type of a column coming `from_arrow` from `decimal64` to `decimal128` is a breaking change.

Requires:
* #7314
* #9533

Authors:
- Conor Hoekstra (https://github.com/codereport)

Approvers:
- Devavret Makkar (https://github.com/devavret)
- Mike Wilson (https://github.com/hyperbolic2346)

codecov · 2022-01-18T19:22:55Z

Codecov Report

Merging #9533 (e4ccbb2) into branch-22.02 (967a333) will decrease coverage by 0.07%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##           branch-22.02    #9533      +/-   ##
================================================
- Coverage         10.49%   10.41%   -0.08%     
================================================
  Files               119      119              
  Lines             20305    20541     +236     
================================================
+ Hits               2130     2139       +9     
- Misses            18175    18402     +227

| Impacted Files | Coverage Δ | |
|---|---|---|
| python/custreamz/custreamz/kafka.py | `29.16% <0.00%> (-0.63%)` | ⬇️ |
| python/dask_cudf/dask_cudf/sorting.py | `92.66% <0.00%> (-0.25%)` | ⬇️ |
| python/dask_cudf/dask_cudf/core.py | `70.85% <0.00%> (-0.17%)` | ⬇️ |
| python/cudf/cudf/__init__.py | `0.00% <0.00%> (ø)` | |
| python/cudf/cudf/api/types.py | `0.00% <0.00%> (ø)` | |
| python/cudf/cudf/core/frame.py | `0.00% <0.00%> (ø)` | |
| python/cudf/cudf/core/index.py | `0.00% <0.00%> (ø)` | |
| python/cudf/cudf/io/parquet.py | `0.00% <0.00%> (ø)` | |
| python/cudf/cudf/core/dtypes.py | `0.00% <0.00%> (ø)` | |
| python/cudf/cudf/core/scalar.py | `0.00% <0.00%> (ø)` | |

... and 31 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Last update 45c20d1...e4ccbb2.

galipremsagar · 2022-01-18T19:24:53Z

@gpucibot merge

galipremsagar · 2022-01-18T19:39:52Z

rerun tests

choekstra added 30 commits July 20, 2021 02:07

Initial changes

c8a171c

More changes

afe6ec6

Small cleanup

43b615a

Small cleanup

ebedcad

Removal of device_storage_type_id, formatting and more

1d2e0b4

Formatting

2ea39fe

cudf::round support for __int128_t

606d6e3

Enable tests & fixes

ee70203

Missing changes

fd6157b

Scan, column_wrapper, orc, etc

d4506af

Binop changes

791e91c

detail::to_string

ad5fe35

Aggregation changes

7cc9db1

Small fix in fixed_point.hpp

5dd6874

Enable quantile

a89f958

Comment update

a16a2b8

REDUCTION_TEST working changes

e89a9ba

ROLLING_TEST changes

7ef28bf

Initial changes for STRINGS_TEST

7fd4ac4

STRINGS changes

016c35a

Clean up

dbd0504

Merge remote-tracking branch 'upstream/branch-21.10' into decimal128

9c764e6

std::is_same_v

bf34d20

is_integral & is_arithmetic

103a4db

Clean up

575fca7

Fixes / cleanup

8549753

DECIMAL128 custom reduction tests

22de55a

Another REDUCTION test

5b69c0c

numeric_limits / temporary cleanup

95667c8

More changes, 10+ files

825ab86

galipremsagar and others added 4 commits January 14, 2022 12:18

Merge remote-tracking branch 'upstream/branch-22.02' into python_deci…

658473f

…mal128

Update python/cudf/cudf/core/column/decimal.py

fb5b8d2

Co-authored-by: Vyas Ramasubramani <[email protected]>

address reviews

700fa59

Merge branch 'python_decimal128' of https://github.com/galipremsagar/…

8c133a0

…cudf into python_decimal128

galipremsagar requested a review from vyasr January 14, 2022 20:41

vuule reviewed Jan 14, 2022

python/cudf/cudf/_lib/orc.pyx (outdated)

galipremsagar and others added 3 commits January 14, 2022 14:55

Update python/cudf/cudf/_lib/orc.pyx

00959cb

Co-authored-by: Vukasin Milovanovic <[email protected]>

merge

3ceb94b

Merge branch 'python_decimal128' of https://github.com/galipremsagar/…

18d51f3

…cudf into python_decimal128

vyasr approved these changes Jan 14, 2022


galipremsagar added 2 commits January 18, 2022 06:33

Merge remote-tracking branch 'upstream/branch-22.02' into python_deci…

f1b3bb3

…mal128

address review comments

2aa6ab8

Merge branch 'rapidsai:branch-22.02' into python_decimal128

e4ccbb2

galipremsagar removed request for robertmaynard, quasiben, mythrocks, codereport, brandon-b-miller and skirui-source January 18, 2022 19:24

galipremsagar added the `5 - Ready to Merge` label and removed the `3 - Ready for Review` and `4 - Needs cuDF (Python) Reviewer` labels Jan 18, 2022

rapids-bot bot merged commit 8d7330f into rapidsai:branch-22.02 Jan 18, 2022

codereport mentioned this pull request Feb 7, 2022

[QST] Output Type of DecimalType Binary Operation #10230

Closed
