-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Categorical.copy deep kwarg #27024
Conversation
Codecov Report
@@ Coverage Diff @@
## master #27024 +/- ##
==========================================
+ Coverage 91.99% 92% +<.01%
==========================================
Files 180 180
Lines 50774 50760 -14
==========================================
- Hits 46712 46703 -9
+ Misses 4062 4057 -5
Continue to review full report at Codecov.
|
pandas/core/arrays/categorical.py
Outdated
@@ -2295,7 +2298,7 @@ def unique(self): | |||
|
|||
# unlike np.unique, unique1d does not sort | |||
unique_codes = unique1d(self.codes) | |||
cat = self.copy() | |||
cat = self.copy() # Don't need deep here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm...if you're going to state this, I might briefly explain why.
if self.is_extension: | ||
values = values.copy(deep=True) | ||
else: | ||
values = values.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is a deep-copy needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gfyoung just pushed addressing most of your comments. for this one: because without deep=True, it isn't a "real" copy. i.e. ExtensionArray.copy(deep=True)
behaves the same as np.ndarray.copy()
.
other3 = data.copy() | ||
|
||
other3[0] = other3[1] | ||
assert data[0] == data[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of tests bundled into one function. I would consider breaking this up into at least four different tests, especially since you "reset" every time by doing a copy of some kind of data
.
What was the decision in the issue? I thought we were tending toward deprecating |
I don't think a decision has been reached there (though I agree with your assessment of the momentum). Until/unless that change is made, this is the right move. Largely motivated by wanting to clear road-blocks to #27015. |
yeah let's not add the deep kwarg on EA .copy(), it doesn't really make sense; we may actually be able to deprecate it entirely on Series/DataFrame as well (separate issue). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments
The issue is that EA already has the deep, kwarg, but Categorical doesn't. I'm pretty sure that this will fix some latent bugs for other EAs, will check and add tests if so. |
Superceded by #27083 |
Would close #26995 if I hadn't just updated that to reflect the fact that several other pandas-internal EAs don't handle the
deep
kwarg correctly.git diff upstream/master -u -- "*.py" | flake8 --diff