REF: de-duplicate IntervalIndex._intersection #41929
wow luv it!
failures look spurious
This might have caused a big slowdown; see https://pandas.pydata.org/speed/pandas/#index_object.IntervalIndexMethod.time_intersection_both_duplicate?python=3.8&Cython=0.29.21&p-param1=100000&commits=17391997-969688a9 (that commit range also includes some other commits).
The indexing bugfix might play a part, but it does seem likely that this is the culprit. From profiling, it looks like drop_duplicates is the big time sink.
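For anyone who wants to reproduce that kind of profile locally, here is a rough sketch using cProfile. The index sizes and the way the duplicates are built are assumptions made for illustration; they are not the actual asv benchmark setup, and the timings and hot spots will vary by pandas version.

```python
import cProfile
import io
import pstats

import numpy as np
import pandas as pd

# Hypothetical data (NOT the asv benchmark's setup): two IntervalIndex
# objects that each contain every interval twice.
n = 10_000
left = np.repeat(np.arange(n // 2), 2)
idx_a = pd.IntervalIndex.from_arrays(left, left + 1)
idx_b = pd.IntervalIndex.from_arrays(left[: n // 2], left[: n // 2] + 1)

# Profile a single intersection call.
profiler = cProfile.Profile()
profiler.enable()
result = idx_a.intersection(idx_b)
profiler.disable()

# Show the ten entries with the largest cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time makes it easy to see whether a helper such as drop_duplicates dominates the call, which is the pattern described above.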
I've got an IntervalArray.unique branch in the works that cuts this from 4x down to 2x; still a ways to go.
made possible by #41863