Fix sort_values when column is all empty strings #12988
Conversation
if last := divisions[col].iloc[-1]:
    divisions[col].iloc[-1] = chr(ord(last[0]) + 1)
else:
    divisions[col].iloc[-1] = chr(1)  # b/c "" < chr(1)
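For readers without a GPU at hand, the branch above can be sketched in plain Python. This is only an illustration of the idea: `bump_last_division` is a hypothetical helper name, not part of cudf or dask_cudf, and it works on a single `str` rather than on a divisions frame.

```python
def bump_last_division(last: str) -> str:
    """Return a string that sorts strictly after `last` (hypothetical helper)."""
    if last:
        # Replace the whole string by the successor of its first character,
        # e.g. "abc" -> "b". Note: chr() raises ValueError if last[0] is
        # already the largest code point, chr(0x10FFFF).
        return chr(ord(last[0]) + 1)
    # The empty string sorts before every non-empty string, so chr(1) works.
    return chr(1)


for value in ["", "abc", "zebra"]:
    bound = bump_last_division(value)
    # The bound must be strictly greater than the original last division.
    assert value < bound
```

The empty-string branch is the case this PR adds: without it, `last[0]` would raise an `IndexError` on `""`.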
Interestingly, using `chr(0)` here doesn't work, because `drop_duplicates` below drops it by treating `""` and `chr(0)` the same, even though `"" < chr(0)`. Do we care that `chr(1)` is not printable?
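As a point of comparison, here is how plain Python strings behave for the three values under discussion. This is a minimal sketch using only built-in `str` operations; it says nothing about cudf internals and only shows the ordering and distinctness one would expect.

```python
empty, nul, soh = "", chr(0), chr(1)

# Lexicographic ordering: the empty string sorts before any non-empty string.
ordering_holds = empty < nul < soh

# In plain Python, "" and chr(0) are distinct values, so a set keeps both.
distinct_in_python = len({empty, nul}) == 2

# chr(1) (start of heading) is a control character and is not printable.
soh_printable = soh.isprintable()

print(ordering_holds, distinct_in_python, soh_printable)  # True True False
```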
This is actually a bug in `[i]loc.__setitem__`:
import cudf
x = cudf.Series(["", chr(0)])
x.drop_duplicates() == x # True
y = cudf.Series(["a", "b"])
y.iloc[0] = ""
y.iloc[1] = chr(0)
y.iloc[0] == y.iloc[1] # True
I guess in the case that `last` is the empty string, it suffices to provide any non-empty string as the upper bound on division?
> it suffices to provide any non-empty string

Yup. How about "wence was here"?
"nonempty"? I don't like to leave footprints :).
"this string intentionally left empty"
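A quick plain-Python check that each placeholder proposed in this thread would serve as an upper bound for a column of empty strings. It relies only on built-in string comparison and assumes the division logic needs nothing more than a value strictly greater than `""`.

```python
candidates = [
    "wence was here",
    "nonempty",
    "this string intentionally left empty",
]

# Every non-empty string compares strictly greater than the empty string.
all_bound_empty = all("" < c for c in candidates)

# Sorting confirms that "" comes first regardless of which candidate is used.
ordered = sorted(candidates + [""])
```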
I like David's suggestion (or we can wait for #12991 and use `chr(0)`)
/merge
Description
See test for simple MRE.
This fixes rapidsai/cugraph#3058
Checklist