BUG: Bug in MultiIndex.has_duplicates when having many levels causes an indexer overflow (GH9075) #9077
so this is implemented on a Should implement the trivial version of
This can overflow with int64 as well. It should follow something like this, and ideally be factored out into one place.
@behzadnouri you are probably right, but I don't think this is a practical overflow (unlike a groupby, which deals with a theoretical space). Can you come up with an example that actually does overflow? (Obviously using int64.)
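To make the int64 concern concrete, here is a minimal sketch (not code from this PR) that checks whether the product of the level sizes still fits in an int64. The helper name `fits_in_int64` and the level sizes are made up for illustration. Note that the size of the key space depends on the number of levels and their cardinalities, not on the number of rows.

```python
import numpy as np

def fits_in_int64(shape):
    """Return True if the product of the level sizes fits in an int64.

    Python ints have arbitrary precision, so this product cannot wrap.
    """
    total = 1
    for size in shape:
        total *= int(size)
    return total <= np.iinfo(np.int64).max

# Six levels with 1000 distinct values each: 10**18 keys, still representable.
six_levels_ok = fits_in_int64([1000] * 6)

# Seven such levels: 10**21 keys, more than int64 can hold.
seven_levels_ok = fits_in_int64([1000] * 7)
```

Whether such an index counts as practical is the open question in this thread; the sketch only shows where the arithmetic limit sits.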
this commit, which says:
the
What do you think?
  group_index = np.zeros(len(self), dtype='i8')
  for i in range(len(shape)):
      stride = np.prod([x for x in shape[i + 1:]], dtype='i8')
-     group_index += self.labels[i] * stride
+     group_index += _ensure_int64(self.labels[i]) * stride
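For readers following along, the change above can be sketched outside of pandas roughly as follows. This is an illustration under assumptions, not the PR's code: `flat_group_index` is a hypothetical name, `.astype('i8')` stands in for the internal `_ensure_int64`, and the label arrays are invented. Whether NumPy upcasts a narrow label array on its own depends on the operand types and the promotion rules in effect, so the explicit cast removes the doubt.

```python
import numpy as np

def flat_group_index(labels, shape):
    """Combine per-level integer labels into one int64 key per row."""
    group_index = np.zeros(len(labels[0]), dtype='i8')
    for i in range(len(shape)):
        # Number of key slots taken up by all the levels to the right.
        stride = np.prod(shape[i + 1:], dtype='i8')
        # Cast before multiplying so the product is computed in int64.
        group_index += labels[i].astype('i8') * stride
    return group_index

# Two levels with 101 distinct values each, stored compactly as int8.
labels = [np.array([0, 100, 100], dtype='i1'),
          np.array([5, 7, 7], dtype='i1')]
shape = (101, 101)

keys = flat_group_index(labels, shape)
has_duplicates = len(np.unique(keys)) < len(keys)

# For contrast: multiplying entirely in int8 wraps around silently.
wrapped = labels[0] * np.array([101], dtype='i1')
```

Rows 1 and 2 carry the same labels, so they get the same key and `has_duplicates` is true. The `wrapped` array shows what goes wrong when the multiplication stays in the narrow dtype.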
This will break if any of the self.labels[i] are -1.
So if there are any NaNs, what would you do?
If any of the labels are -1, we just need to lift the labels and the size by one, just as in the _maybe_lift function, which is part of this PR.
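A minimal sketch of the lifting idea, assuming -1 marks a missing value in a label array. The helpers `lift` and `group_index` are hypothetical stand-ins written for this illustration; they are not the `_maybe_lift` from this PR, and the labels are invented.

```python
import numpy as np

def lift(labels, size):
    """Shift labels up by one if any are -1, so -1 becomes slot 0."""
    labels = np.asarray(labels, dtype='i8')
    if (labels == -1).any():
        return labels + 1, size + 1
    return labels, size

def group_index(labels, shape):
    """Strided int64 key per row, as in the loop under review."""
    out = np.zeros(len(labels[0]), dtype='i8')
    for i in range(len(shape)):
        stride = np.prod(shape[i + 1:], dtype='i8')
        out += np.asarray(labels[i], dtype='i8') * stride
    return out

# Row 0 is (missing, 3) and row 1 is (0, missing): two different rows.
labels = [np.array([-1, 0]), np.array([3, -1])]
shape = (3, 4)

# Without lifting, both rows collapse onto the same key.
raw_keys = group_index(labels, shape)

# After lifting each level, the keys are distinct again.
lifted = [lift(lab, size) for lab, size in zip(labels, shape)]
lifted_labels = [lab for lab, _ in lifted]
lifted_shape = tuple(size for _, size in lifted)
safe_keys = group_index(lifted_labels, lifted_shape)
```

With the raw labels both rows map to key -1, which would be reported as a duplicate. After lifting, the shape becomes (4, 5) and the rows map to 4 and 5.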
Closing in favor of #9101.
closes #9075