[BUG] Inconsistent handling of integer division by zero #12092

Closed
wence- opened this issue Nov 8, 2022 · 2 comments · Fixed by #12074
Labels: bug, 2 - In Progress

wence- commented Nov 8, 2022

Describe the bug

Type promotion to handle division by zero in __floordiv__ or __mod__ is inconsistently applied if the 0 is already a cudf.Scalar, depending on whether the dtypes of the Scalar and the divided Column match.

Steps/Code to reproduce bug

import cudf

s = cudf.Series([1, 2], dtype="int64")
print(s // 0)
# 0    inf
# 1    inf
# dtype: float64
print(s // cudf.Scalar(0, dtype="int32"))
# 0    inf
# 1    inf
# dtype: float64
print(s // cudf.Scalar(0)) # equivalent to cudf.Scalar(0, dtype="int64")
# 0    9223372036854775807
# 1    9223372036854775807
# dtype: int64
# Technically worse than this, since I think we end up 
# with division by zero of signed integer types in libcudf
# which is UB

Expected behavior

I should always get promotion to float64 (does it need to be float64?) and then inf in the output.
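
To make the expected semantics concrete, here is a minimal plain-Python sketch. It is not cudf code: `expected_floordiv` is a hypothetical helper, and the handling of negative numerators (`-inf`) and of `0 // 0` (`nan`) is an assumption modelled on pandas, not something stated in this issue.

```python
import math


def expected_floordiv(values, divisor):
    """Floor-divide a list of ints by an int scalar (hypothetical helper).

    A non-zero divisor keeps integer results. A zero divisor promotes
    every result to float, so the answer can be expressed as +/-inf or nan
    instead of hitting an integer division by zero.
    """
    if divisor != 0:
        return [v // divisor for v in values]

    promoted = []
    for v in values:
        if v > 0:
            promoted.append(math.inf)
        elif v < 0:
            promoted.append(-math.inf)  # assumption: pandas-like sign handling
        else:
            promoted.append(math.nan)   # assumption: 0 // 0 is treated as nan
    return promoted


print(expected_floordiv([1, 2], 0))
print(expected_floordiv([4, 5], 2))
```

The point is only that the result for a zero divisor should not depend on how the zero was wrapped before it reached the division.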

Why does this happen?

NumericalColumn._binaryop checks for division by zero if the op is __mod__ or __floordiv__, but doesn't handle the case where other is a cudf.Scalar. NumericalColumn._wrap_binop_normalization hands back a cudf.Scalar if it is passed a Scalar with the same dtype as the column in question. So in this case, we don't check to see if we have a zero and don't promote to float.
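
The control flow described above can be modelled without a GPU. The sketch below is a simplified stand-in, not the cudf implementation: `Scalar`, `normalize`, `result_dtype_buggy`, and `result_dtype_fixed` are all hypothetical names, and the assumption that a Scalar of a different dtype is reduced to a form the zero check recognises (modelled here as a plain Python value) is inferred from the observed output, not from the source. The "fixed" variant is one possible approach and not necessarily what #12074 does.

```python
from dataclasses import dataclass


@dataclass
class Scalar:
    """Hypothetical stand-in for cudf.Scalar: a value tagged with a dtype."""
    value: int
    dtype: str


def normalize(other, column_dtype):
    """Model of the normalization step described in the issue."""
    # A Scalar whose dtype matches the column is handed back unchanged.
    if isinstance(other, Scalar) and other.dtype == column_dtype:
        return other
    # Assumption: anything else ends up in a form the zero check
    # understands, modelled here as a plain Python value.
    return other.value if isinstance(other, Scalar) else other


def result_dtype_buggy(column_dtype, other):
    """Zero check that ignores Scalar operands, as described in the issue."""
    other = normalize(other, column_dtype)
    if not isinstance(other, Scalar) and other == 0:
        return "float64"  # promoted, so the result can hold inf
    return column_dtype   # no promotion: integer division by zero follows


def result_dtype_fixed(column_dtype, other):
    """Zero check that also looks inside Scalar operands."""
    other = normalize(other, column_dtype)
    value = other.value if isinstance(other, Scalar) else other
    return "float64" if value == 0 else column_dtype


for divisor in (0, Scalar(0, "int32"), Scalar(0, "int64")):
    print(divisor, result_dtype_buggy("int64", divisor),
          result_dtype_fixed("int64", divisor))
```

In this model only the same-dtype `Scalar(0, "int64")` slips past the buggy check, which matches the three cases in the reproducer.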

wence- added the bug and Needs Triage labels on Nov 8, 2022
wence- self-assigned this on Nov 8, 2022
wence- added a commit to wence-/cudf that referenced this issue on Nov 8, 2022
wence- added the 2 - In Progress label and removed the Needs Triage label on Nov 14, 2022

wence- commented Nov 14, 2022

A further inconsistency is that pandas special-cases bool dtypes and raises NotImplementedError or ZeroDivisionError (depending on ???) for division of a bool series by a bool.

It is also inconsistent in how it handles division by a scalar boolean versus division by a series of booleans.


wence- commented Nov 16, 2022

Opened pandas-dev/pandas#49699 to ask.
