-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Series constructor not respecting CoW when called with BlockManager #52017
Conversation
pandas/core/series.py
Outdated
@@ -379,6 +379,9 @@ def __init__( | |||
copy: bool = False, | |||
fastpath: bool = False, | |||
) -> None: | |||
if isinstance(data, SingleBlockManager) and using_copy_on_write() and not copy: | |||
data = data.copy(deep=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this be done inside the isinstance(data, (SingleBlockManager, SingleArrayManager))
block below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a bunch of branches where we handle BlockManagers, so we could yes but we'd have to duplicate the copy call a bunch of times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But shouldn't this only be reached by internal calls where we should be specifying copy explicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theoretically yes, practically I don't know if users use this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in practice e.g. pyarrow does but i think we should get them to stop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you be ok with adding this for 2.0 and deprecating for 2.1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a bunch of branches where we handle BlockManagers, so we could yes but we'd have to duplicate the copy call a bunch of times.
I would duplicate it a bit, or at least for this initial block, and then maybe once after that.
The first if
case here is meant to be our "fastpath" creating a series from just a manager, and to keep this as fast as possible, I would move this check once inside that block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved it down
BTW we should deprecate accepting managers in the constructor and wean 3rd party libraries off of it |
def test_series_from_block_manager(using_copy_on_write, idx, dtype, fastpath): | ||
ser = Series([1, 2, 3], dtype="int64") | ||
ser_orig = ser.copy() | ||
ser2 = Series(ser._mgr, dtype=dtype, fastpath=fastpath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to pass index=idx
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes, thx
If we want to pursue that, I think it would be best to start with proposing and implementing an alternative option to use, since we still need some way to do this for both advanced users (like pyarrow) and subclasses (they need to be able to customize this constructor from a manager, xref #51772) (I personally don't have a strong opinion about this. I think it would certainly make sense to have a separate constructor if we were designing it, just not fully sure it is worth changing now) |
…nager # Conflicts: # pandas/tests/copy_view/test_constructors.py
Addressed all comments, will merge to get in. |
…CoW when called with BlockManager
…cting CoW when called with BlockManager) (#52275) Backport PR #52017: BUG: Series constructor not respecting CoW when called with BlockManager Co-authored-by: Patrick Hoefler <[email protected]>
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.