Proof of concept for Copy-on-Write implementation #41878
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1894,6 +1894,16 @@ def _setitem_single_column(self, loc: int, value, plane_indexer): | |
""" | ||
pi = plane_indexer | ||
|
||
Review comment: #42887 would make for a good precursor |
||
if not hasattr(self.obj._mgr, "blocks"): | ||
Review comment: shouldn't this be an ArrayManager test? why the different semantics?
Reply: This is just a check to know if it is an ArrayManager, without actually importing it (core/indexing.py currently doesn't import anything from internals).
Reply: yah we should for sure have 1 canonical way of doing this check so we can grep for all the places we do it
Reply: -> #44676 |
||
# ArrayManager | ||
Review comment: comment on why this is AM-specific? |
||
if com.is_null_slice(pi) or com.is_full_slice(pi, len(self.obj)): | ||
Review comment: xref #44353 (doesn't need to be resolved for this PR, but will make lots of things easier) |
||
arr = self.obj._sanitize_column(value) | ||
self.obj._mgr.iset(loc, arr) | ||
else: | ||
self.obj._mgr.column_setitem(loc, plane_indexer, value) | ||
self.obj._clear_item_cache() | ||
return | ||
|
||
ser = self.obj._ixs(loc, axis=1) | ||
|
||
# perform the equivalent of a setitem on the info axis | ||
|
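The ArrayManager branch in the hunk above can be illustrated with a small standalone sketch. Everything here is a toy stand-in rather than pandas code: `ToyArrayManager`, `uses_array_manager`, and `set_single_column` are hypothetical names, columns are plain Python lists, and the two slice helpers are simplified versions of the checks the diff calls through `com`. The sketch only shows the dispatch: a null or full slice replaces the whole column via `iset`, anything else mutates the existing column via `column_setitem`.

```python
def is_null_slice(obj) -> bool:
    # True for slice(None), i.e. obj[:]
    return (
        isinstance(obj, slice)
        and obj.start is None
        and obj.stop is None
        and obj.step is None
    )


def is_full_slice(obj, length: int) -> bool:
    # True for slice(0, length), i.e. a slice covering every row
    return (
        isinstance(obj, slice)
        and obj.start == 0
        and obj.stop == length
        and obj.step is None
    )


class ToyArrayManager:
    """Hypothetical column store: one list per column, no `blocks` attribute."""

    def __init__(self, arrays):
        self.arrays = arrays

    def iset(self, loc, arr):
        # Swap in a new column object; the old one is left untouched.
        self.arrays[loc] = arr

    def column_setitem(self, loc, indexer, value):
        # Mutate the existing column object in place.
        self.arrays[loc][indexer] = value


def uses_array_manager(mgr) -> bool:
    # Same duck-typing idea as the diff: no `blocks` attribute means ArrayManager.
    return not hasattr(mgr, "blocks")


def set_single_column(mgr, loc, indexer, value, nrows):
    if not uses_array_manager(mgr):
        raise NotImplementedError("block-based path is not part of this sketch")
    if is_null_slice(indexer) or is_full_slice(indexer, nrows):
        mgr.iset(loc, list(value))
    else:
        mgr.column_setitem(loc, indexer, value)
```

Wrapping the `hasattr` test in a single helper, as the sketch does, is one way to get the "1 canonical way of doing this check" asked for in the review thread; whether pandas adopts such a helper is tracked in the linked issue, not decided here.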
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -351,8 +351,12 @@ def where(self: T, other, cond, align: bool, errors: str) -> T: | |
errors=errors, | ||
) | ||
|
||
def setitem(self: T, indexer, value) -> T: | ||
return self.apply("setitem", indexer=indexer, value=value) | ||
def setitem(self: T, indexer, value, inplace=False) -> T: | ||
Review comment: why is this change necessary?
Reply: To follow the change that was needed in the ArrayManager (to ensure every mutation of the values happens in the manager); see #41879 for this change as a separate precursor PR |
||
if inplace: | ||
assert self.ndim == 1 | ||
self._block.values[indexer] = value | ||
else: | ||
return self.apply("setitem", indexer=indexer, value=value) | ||
|
||
def putmask(self, mask, new, align: bool = True): | ||
|
||
|
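To make the two `setitem` modes in this hunk concrete, here is a minimal sketch using a hypothetical one-dimensional `ToySingleManager` backed by a plain list (not the pandas class). It assumes, as the diff does, that the in-place path mutates the values the manager already owns and returns nothing, while the default path leaves the original untouched and returns a new manager.

```python
class ToySingleManager:
    """Hypothetical 1-D manager that owns a single list of values."""

    def __init__(self, values):
        self.values = values

    def setitem(self, indexer, value, inplace=False):
        if inplace:
            # Mutation happens inside the manager, on the values it owns.
            self.values[indexer] = value
            return None
        # Default path: copy, modify the copy, and hand back a new manager.
        new_values = list(self.values)
        new_values[indexer] = value
        return ToySingleManager(new_values)
```

Routing the in-place write through the manager means the manager is the only place that touches the values, which is where a Copy-on-Write check could later be hooked in.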
@@ -566,6 +570,9 @@ def copy(self: T, deep=True) -> T: | |
------- | ||
BlockManager | ||
""" | ||
if deep is None: | ||
# preserve deep copy for BlockManager with copy=None | ||
deep = True | ||
# this preserves the notion of view copying of axes | ||
if deep: | ||
# hit in e.g. tests.io.json.test_pandas | ||
|
Review comment: The copy=None is what changes DataFrame.rename to use a shallow copy with CoW instead of doing a full copy for the ArrayManager. This is eventually passed to self.copy(deep=copy), and deep=None is used to signal a shallow copy for ArrayManager while for now preserving the deep copy for BlockManager.
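The difference in how the two managers treat deep=None can be sketched with two toy classes. `ToyBlockStore` and `ToyArrayStore` are hypothetical names, columns are plain lists, and the only behavior modeled is the one described in the comment: deep=None falls back to a deep copy in the block-based case and to a shallow copy in the array-based case.

```python
import copy


class ToyBlockStore:
    """Hypothetical stand-in for the block-based manager."""

    def __init__(self, arrays):
        self.arrays = arrays

    def copy(self, deep=True):
        if deep is None:
            # Preserve the existing behavior: None still means a deep copy.
            deep = True
        arrays = copy.deepcopy(self.arrays) if deep else list(self.arrays)
        return ToyBlockStore(arrays)


class ToyArrayStore:
    """Hypothetical stand-in for the array-based manager."""

    def __init__(self, arrays):
        self.arrays = arrays

    def copy(self, deep=True):
        if deep is None:
            # None signals a shallow copy: the column objects are shared.
            deep = False
        arrays = copy.deepcopy(self.arrays) if deep else list(self.arrays)
        return ToyArrayStore(arrays)
```

In the shallow case the new store has its own list of columns but shares the column objects themselves, so something else (Copy-on-Write, in the PR) has to make sure a later write to one store does not leak into the other.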