REF: Manager.fast_xs to return SingleBlockManager instead of array #47077

Merged
merged 1 commit into from
May 22, 2022


jorisvandenbossche

I ran into this while working on #46958, but it can be split off into its own PR and is potentially useful on its own.

Currently the fast_xs method returns an array, but in the only two places where it is used (xs / _ixs), this array is directly wrapped in a Series. Letting fast_xs return a manager instead gives a slight simplification and faster Series construction.
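To illustrate the shape of the change described above, here is a minimal pure-Python sketch. All class names (`ToyBlockManager`, `ToySingleBlockManager`, `ToySeries`) and the `fast_xs_array` method are hypothetical stand-ins invented for this illustration; they are not the pandas internals and do not reproduce the actual diff.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ToySingleBlockManager:
    """Hypothetical stand-in for a one-dimensional manager."""
    values: List[Any]
    axis_labels: List[str]


@dataclass
class ToySeries:
    """Hypothetical stand-in for a Series backed by a manager."""
    mgr: ToySingleBlockManager
    name: Any = None


@dataclass
class ToyBlockManager:
    """Hypothetical stand-in for a 2D manager: one list of values per column."""
    columns: List[str]
    data: List[List[Any]]

    def fast_xs_array(self, loc: int) -> List[Any]:
        # Old shape of the API: return only the raw row values.
        return [col[loc] for col in self.data]

    def fast_xs(self, loc: int) -> ToySingleBlockManager:
        # New shape of the API: return a manager that already carries its axis.
        row = [col[loc] for col in self.data]
        return ToySingleBlockManager(row, list(self.columns))


mgr = ToyBlockManager(columns=["a", "b"], data=[[1, 2], [3, 4]])

# Before: the caller rebuilds a manager around the raw array.
old_row = ToySeries(
    ToySingleBlockManager(mgr.fast_xs_array(0), list(mgr.columns)), name=0
)

# After: the caller wraps the returned manager directly.
new_row = ToySeries(mgr.fast_xs(0), name=0)
```

Both paths produce the same row in this toy model; the difference is only where the manager gets built, which is the point at which reference tracking could be attached.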

This is also useful for #46958 because CoW is handled at the manager level. When creating a subset of a DataFrame (here in xs), that is ideally done by creating a subset manager (with proper references if needed), as is already done, for example, when accessing a column as a Series, instead of going through raw arrays.

@jorisvandenbossche added the Internals (Related to non-user accessible pandas implementation) label May 20, 2022
@jorisvandenbossche added this to the 1.5 milestone May 20, 2022

@jreback left a comment

name=self.index[i],
dtype=new_values.dtype,
).__finalize__(self)
copy = isinstance(new_mgr.array, np.ndarray) and new_mgr.array.base is None

Not for this PR, but we should handle this line more systematically; it will be wrong for NDArrayBackedEAs.
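A small sketch of one way such a check can misfire, under the assumption that the concern is about arrays that wrap an ndarray rather than being one. `ToyBackedArray`, its `_ndarray` attribute, and the `owns_data` helper are hypothetical and only serve the illustration; only `np.ndarray` and its `.base` attribute are real NumPy API.

```python
import numpy as np


class ToyBackedArray:
    """Hypothetical stand-in for an array type that wraps an ndarray."""

    def __init__(self, ndarray):
        self._ndarray = ndarray


def copy_flag_as_in_diff(arr) -> bool:
    # Mirrors the expression from the diff line under review.
    return isinstance(arr, np.ndarray) and arr.base is None


def owns_data(arr) -> bool:
    # Hypothetical helper: unwrap one level before checking `.base`.
    inner = getattr(arr, "_ndarray", arr)
    return isinstance(inner, np.ndarray) and inner.base is None


fresh = np.array([1, 2, 3])                    # owns its memory, base is None
view = fresh[:2]                               # a view, base is `fresh`
wrapped = ToyBackedArray(np.array([1, 2, 3]))  # wrapper around an owning array

print(copy_flag_as_in_diff(fresh), copy_flag_as_in_diff(view))
print(copy_flag_as_in_diff(wrapped), owns_data(wrapped))
```

In this toy setup the `isinstance` test rejects the wrapper before `.base` is ever inspected, so the flag does not reflect whether the wrapped data is freshly allocated.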


@jbrockmendel left a comment


LGTM

@jreback merged commit d5ba8c0 into pandas-dev:main May 22, 2022
@jorisvandenbossche deleted the internals-fast_xs branch May 22, 2022 09:50
Labels
Internals Related to non-user accessible pandas implementation
3 participants