PERF: ffill/bfill with non-numpy dtypes #53950

Merged
phofl merged 3 commits into pandas-dev:main from perf-ea-pad on Jul 2, 2023


jbrockmendel

Implement ffill/bfill in terms of 'take', avoiding an object cast.

import pandas as pd

ser = pd.Series(range(10**5), dtype="int64[pyarrow]")
ser[5000] = pd.NA

%timeit ser.ffill()
609 µs ± 25.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)  # <- branch
5.49 ms ± 158 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <- main

Some of the perf difference is likely due to no longer issuing a PerformanceWarning.
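
To illustrate what "in terms of 'take'" can mean, here is a minimal NumPy sketch of a forward fill that first computes an integer indexer and then gathers values with a single `take`. This is not the code from this PR: the helper names `ffill_indexer` and `ffill_via_take` are hypothetical, and representing missing values with a separate boolean mask is an assumption made for the example.

```python
import numpy as np


def ffill_indexer(mask: np.ndarray) -> np.ndarray:
    """Return, for each position, the index of the most recent
    non-missing position, or -1 if there is none yet.

    `mask` is a boolean array where True marks a missing value.
    (Hypothetical helper, for illustration only.)
    """
    # Missing slots contribute -1; valid slots contribute their own index.
    positions = np.where(mask, -1, np.arange(len(mask)))
    # A running maximum carries the last valid index forward.
    return np.maximum.accumulate(positions)


def ffill_via_take(values: np.ndarray, mask: np.ndarray):
    """Forward-fill `values` by gathering with `take`, never casting to object.

    Returns the filled values and a new mask; positions before the first
    valid value stay missing.
    """
    indexer = ffill_indexer(mask)
    still_missing = indexer == -1
    # Replace -1 with 0 so the gather is in bounds; those slots remain masked.
    safe_indexer = np.where(still_missing, 0, indexer)
    return values.take(safe_indexer), still_missing


values = np.array([10, 20, 30, 40, 50])
mask = np.array([True, False, True, True, False])
filled, new_mask = ffill_via_take(values, mask)
```

The values keep their original dtype because the fill is only an index gather. A backward fill can be sketched the same way by running the computation on the reversed mask. With a pandas extension array, `ExtensionArray.take(indexer, allow_fill=True)` treats `-1` entries as missing, so the in-bounds substitution above would not be needed there.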

@jbrockmendel jbrockmendel requested a review from WillAyd as a code owner June 30, 2023 16:59
@phofl phofl added the Performance Memory or execution speed performance label Jul 2, 2023
@phofl phofl added this to the 2.1 milestone Jul 2, 2023
@phofl phofl merged commit 4da9cb6 into pandas-dev:main Jul 2, 2023

phofl commented Jul 2, 2023

thx @jbrockmendel

@jbrockmendel jbrockmendel deleted the perf-ea-pad branch July 2, 2023 21:42
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023