Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: regression in DataFrame.shift method performance #42246

Closed
jorisvandenbossche opened this issue Jun 26, 2021 · 5 comments
Closed

PERF: regression in DataFrame.shift method performance #42246

jorisvandenbossche opened this issue Jun 26, 2021 · 5 comments
Labels
Performance Memory or execution speed performance Regression Functionality that used to work in a prior pandas version
Milestone

Comments

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Jun 26, 2021

Performance regression in shift caused by #40536

I don't know if it's a particularly useful benchmark, but there is a big slowdown potentially related to this PR: https://pandas.pydata.org/speed/pandas/#frame_methods.Shift.time_shift?python=3.8&Cython=0.29.21&p-axis=1&commits=05a0e98a-bc62e768

Originally posted by @jorisvandenbossche in #40536 (comment)

@jorisvandenbossche jorisvandenbossche added Performance Memory or execution speed performance Regression Functionality that used to work in a prior pandas version labels Jun 26, 2021
@jorisvandenbossche jorisvandenbossche added this to the 1.3 milestone Jun 26, 2021
@jbrockmendel
Copy link
Member

2 possible improvements come to mind:

  1. Change concat to allow disabling consolidation

  2. the axis=1 path added in REF: move shift logic from BlockManager to DataFrame #40536 may be unnecessary (there's a comment that suggests otherwise; would need to look more closely) when isinstance(self._mgr, BlockManager) and self._mgr.nblocks == 1, in which case we could do self.T.shift(n).T

@simonjayhawkins
Copy link
Member

2 possible improvements come to mind:

because the regression is a labelled as a refactor and the test fixed is for ArrayManager which is not yet public, is reverting for 1.3 also an option? (and do the refactor for 1.4 instead)

@jbrockmendel
Copy link
Member

is reverting for 1.3 also an option? (and do the refactor for 1.4 instead)

sure

@simonjayhawkins
Copy link
Member

moving to 1.4, xref #42322

@jbrockmendel
Copy link
Member

Closing. The PR that caused the perf regression was reverted, and my initial idea of eventually un-reverting looks very unlikely.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Memory or execution speed performance Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

No branches or pull requests

3 participants