-
Notifications
You must be signed in to change notification settings - Fork 915
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Avoid decimal
type narrowing for decimal binops
#10299
Changes from 6 commits
7b084c1
1f5ca2b
6d2cbab
7f6ec3e
d43e944
840630b
a2c6e8a
bd54b5e
c12f9f7
71572ec
56987a4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -364,18 +364,51 @@ def _get_decimal_type(lhs_dtype, rhs_dtype, op): | |
else: | ||
raise NotImplementedError() | ||
|
||
if isinstance(lhs_dtype, type(rhs_dtype)): | ||
# SCENARIO 1: If `lhs_dtype` & `rhs_dtype` are same, then try to | ||
# see if `precision` & `scale` can be fit into this type. | ||
try: | ||
return lhs_dtype.__class__(precision=precision, scale=scale) | ||
except ValueError: | ||
# Call to _validate fails, which means we need | ||
# to goto SCENARIO 3. | ||
pass | ||
else: | ||
# SCENARIO 2: If `lhs_dtype` & `rhs_dtype` are of different dtypes, | ||
# then try to see if `precision` & `scale` can be fit into the type | ||
# with greater MAX_PRECISION (i.e., the bigger dtype). | ||
try: | ||
if lhs_dtype.MAX_PRECISION > rhs_dtype.MAX_PRECISION: | ||
galipremsagar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return lhs_dtype.__class__(precision=precision, scale=scale) | ||
else: | ||
return rhs_dtype.__class__(precision=precision, scale=scale) | ||
except ValueError: | ||
# Call to _validate fails, which means we need | ||
# to goto SCENARIO 3. | ||
pass | ||
|
||
# SCENARIO 3: If either of the above two scenarios fail, then get the | ||
# MAX_PRECISION of `lhs_dtype` & `rhs_dtype` so that we can only check | ||
# and return a dtype that is greater than or equal to input dtype that | ||
# can fit `precision` & `scale`. | ||
lhs_rhs_max_precision = max( | ||
galipremsagar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
lhs_dtype.MAX_PRECISION, rhs_dtype.MAX_PRECISION | ||
) | ||
for decimal_type in ( | ||
cudf.Decimal32Dtype, | ||
cudf.Decimal64Dtype, | ||
cudf.Decimal128Dtype, | ||
): | ||
try: | ||
min_decimal_type = decimal_type(precision=precision, scale=scale) | ||
except ValueError: | ||
# Call to _validate fails, which means we need | ||
# to try the next dtype | ||
pass | ||
else: | ||
return min_decimal_type | ||
if decimal_type.MAX_PRECISION >= lhs_rhs_max_precision: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It looks like this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I might be wrong here -- I see you're constructing the returned type with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It's not a bug, the dtype is expected to have
This was a necessary duplication because we want to pick a |
||
try: | ||
min_decimal_type = decimal_type( | ||
precision=precision, scale=scale | ||
) | ||
galipremsagar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
except ValueError: | ||
# Call to _validate fails, which means we need | ||
# to try the next dtype | ||
pass | ||
galipremsagar marked this conversation as resolved.
Show resolved
Hide resolved
|
||
else: | ||
return min_decimal_type | ||
|
||
raise OverflowError("Maximum supported decimal type is Decimal128") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a comment above,
Do we only support add/sub/mul/div operations right now in Python because of limitations in this function? I know that other operations are implemented in libcudf, so piping that through might be a significant improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not just
binary_operation_fixed_point_scale
but I think support for other binop's are not supported from libcudf side.Looking into
binary_operation_fixed_point_scale
, it seems the formula forDIV
is wrong? I could be wrong here but don't match what is specified here: https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sqlThough libcudf doesn't take
precision
as input the python side will need calculation so probably better to have those two computations in a single place rather than having to have to look at two places.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Support for other operators exists, e.g.
MOD
/PMOD
/PYMOD
: #10179.I'm fine with keeping both precision/scale calculations together here. I just wanted to make a note to ask, since I saw the comment above.
There may or may not be issues with the scale/precision calculations. I think the page you referenced has different conventions than libcudf. In my understanding:
Neither value appears to correspond to the linked SQL docs. That page appears to always use powers of 10 for both scale and precision. Also the definition of scale is the negative of libcudf's definition. It does not surprise me that these different conventions would result in different expressions. I spent an hour looking into this but I have no idea how to make the two definitions mathematically correspond.
Working through an example calculation here, for the SQL docs:
I was confused and gave up at this point -- how could
1.28
havep=8, s=6
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bdice, I think @codereport would have a better understanding on this than me. But I'm merging these changes for now and we can have a follow-up PR if changes need to be done.