[BUG] Index.difference does not uniquify output for duplicate indexes #14489
Describe the bug
cudf computes Index.difference with a left anti join. This preserves duplicates in the left index. In contrast, pandas always produces an index with uniquified values.

Steps/Code to reproduce bug
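The original reproducer is not preserved in this copy of the issue. As a minimal sketch of the behavior described above, the following uses pandas only, with made-up index values, and emulates the left anti join with a boolean mask (an assumption for illustration, not cudf's actual code path). On a machine with cudf installed, the commented-out lines show the call the issue is about.

```python
import pandas as pd

# Illustrative values only; the left index contains duplicates.
left = pd.Index([1, 1, 2, 2, 3])
right = pd.Index([2])

# pandas: the difference is computed on the unique values of the left index.
pandas_result = left.difference(right)
print(pandas_result.tolist())

# Emulated left anti join: keep every left entry with no match in `right`.
# Duplicates in the left index survive, which is the behavior reported for cudf.
anti_join_result = left[~left.isin(right)]
print(anti_join_result.tolist())

# One of the fixes suggested in this issue: drop duplicates from the result.
fixed = anti_join_result.drop_duplicates()
print(fixed.tolist())

# With cudf available, the comparison would look like this (not run here):
# import cudf
# cudf.Index([1, 1, 2, 2, 3]).difference(cudf.Index([2]))
```

Here the emulated anti join keeps both copies of 1, while pandas returns each remaining value once.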
Expected behavior
Match pandas, either by calling drop_duplicates or using libcudf.search.contains (see also #14487).

Notes
Doesn't apply to MultiIndex right now, because that goes through pandas (although that's a perf bug that should be fixed).