[FEA] Add binop operators that match Spark's expected NaN behavior #4752
Comments
So that readers don't have to compile and run your tests just to think about this bug, please print out the values you get in |
Thanks for looking @harrism. I have updated the bug with the test output. I will update it further with the entire output in the AM |
It would be a lot easier to understand the issue if the expected/actual behavior were summarized in a table. Can you please fill in the
|
Pandas behaviour in the above cases:
In [8]: x = pd.Series([np.inf, 1.02, 5.0, np.nan, -43.2, -np.inf])
In [9]: np.nan > x
Out[9]:
0 False
1 False
2 False
3 False
4 False
5 False
dtype: bool
In [10]: np.nan == x
Out[10]:
0 False
1 False
2 False
3 False
4 False
5 False
dtype: bool
In [11]: np.nan != x
Out[11]:
0 True
1 True
2 True
3 True
4 True
5 True
dtype: bool
This looks to be in line with the IEEE 754 standard. |
Since Spark's behavior does not conform to IEEE 754, I'm going to label this as a feature request rather than a bug. Supporting this behavior in libcudf will require adding new binop operators that support the non-conformant behavior, like |
@jrhemstad I have updated the table. |
@jrhemstad Is this still valid, considering we have decided to abide by the NaN behavior set by IEEE 754? |
Thanks for the reminder. Per conversation in #4760, this should be implemented via composition of other libcudf features. Closing. |
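For readers wondering what "composition" could look like, here is a minimal sketch in plain Python rather than libcudf. It assumes Spark SQL's documented NaN semantics (NaN equals NaN, and NaN sorts as larger than any other numeric value, including infinity). The helper names `spark_eq` and `spark_gt` are hypothetical and exist only for this illustration; each one combines an ordinary IEEE 754 comparison with an is-NaN check.

```python
import math

def spark_eq(a: float, b: float) -> bool:
    # IEEE 754 equality, plus the assumed Spark rule that NaN == NaN is true.
    return a == b or (math.isnan(a) and math.isnan(b))

def spark_gt(a: float, b: float) -> bool:
    # IEEE 754 greater-than, plus the assumed Spark rule that NaN is
    # larger than every non-NaN value (including +inf).
    return a > b or (math.isnan(a) and not math.isnan(b))

# Same values as the pandas example in this thread.
x = [math.inf, 1.02, 5.0, math.nan, -43.2, -math.inf]

eq_result = [spark_eq(math.nan, v) for v in x]
gt_result = [spark_gt(math.nan, v) for v in x]

print(eq_result)
print(gt_result)
```

A columnar version would presumably follow the same shape: compute an is-NaN mask for each input, run the standard comparison, and combine the results with logical AND/OR.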
Describe the bug
Spark expects NaNs to behave differently than they do in other languages. Currently, cudf does not behave the way Spark expects.
Steps/Code to reproduce bug
The above tests fail with the following error
Here is the output from the Java unit test displaying the entire column output
Expected behavior
The above tests should pass
Additional context
The above tests are for float32, but similar tests should pass for float64.