BUG/PERF: Series(index=MultiIndex).rename losing EA dtypes #50930

Merged: 1 commit, Jan 23, 2023

lukemanley (Member):

Calling Series.rename on a Series with a MultiIndex loses the extension array (EA) dtypes of the index levels:

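import pandas as pd
import numpy as np

# Level "A" is categorical; level "B" uses the nullable Int64 extension dtype.
lev0 = pd.Index(np.arange(1000), dtype="Int64").astype("category")
lev1 = pd.Index(np.arange(1000), dtype="Int64")

mi = pd.MultiIndex.from_product([lev0, lev1], names=["A", "B"])
ser1 = pd.Series(1, index=mi)

# Rename label 10 to 11 in level 1 ("B") only.
ser2 = ser1.rename({10: 11}, level=1)

# The level dtypes should be unchanged by the rename.
print(ser2.index.dtypes)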

main:

A       int64
B       int64
dtype: object

PR:

A    category
B       Int64
dtype: object
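
The comparison above can also be written as a small check rather than read off printed output. The following is only a sketch: `level_dtypes_preserved` is a hypothetical helper name, not part of pandas or of this PR, and it relies solely on `MultiIndex.dtypes` and `pandas.api.types.is_dtype_equal`. Whether the final line prints True depends on the installed pandas version (the fix is milestoned for 2.0).

```python
import pandas as pd
from pandas.api.types import is_dtype_equal


def level_dtypes_preserved(before: pd.MultiIndex, after: pd.MultiIndex) -> bool:
    """Hypothetical helper: True if each level has the same dtype in both indexes."""
    if before.nlevels != after.nlevels:
        return False
    # MultiIndex.dtypes is a Series holding one dtype per level.
    return all(
        is_dtype_equal(a, b) for a, b in zip(before.dtypes, after.dtypes)
    )


# Small version of the reproducer from this PR description.
lev0 = pd.Index([0, 1, 2], dtype="Int64").astype("category")
lev1 = pd.Index([0, 1, 2], dtype="Int64")
mi = pd.MultiIndex.from_product([lev0, lev1], names=["A", "B"])
ser = pd.Series(1, index=mi)

renamed = ser.rename({1: 5}, level=1)
# Result depends on the pandas version in use.
print(level_dtypes_preserved(ser.index, renamed.index))
```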

Performance improves as well:

%timeit ser1.rename({10: 11}, level=1)

1.25 s ± 52.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)  -> main
350 ms ± 8.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)  -> PR
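
For anyone repeating the timing outside IPython, where `%timeit` is unavailable, a rough equivalent with the standard-library `timeit` module might look like the sketch below. `time_rename` and its parameters are hypothetical names introduced for illustration; absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def time_rename(n: int = 1000, repeat: int = 3) -> float:
    """Hypothetical helper: best wall-clock time (seconds) for one rename call."""
    # Same construction as the reproducer, with n labels per level.
    lev0 = pd.Index(np.arange(n), dtype="Int64").astype("category")
    lev1 = pd.Index(np.arange(n), dtype="Int64")
    mi = pd.MultiIndex.from_product([lev0, lev1], names=["A", "B"])
    ser = pd.Series(1, index=mi)

    # One call per measurement, repeated; report the fastest run.
    timings = timeit.repeat(
        lambda: ser.rename({10: 11}, level=1), number=1, repeat=repeat
    )
    return min(timings)
```

Calling `time_rename()` with the defaults uses the same 1000 x 1000 index as the measurement above; a smaller `n` gives a quicker smoke test.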

lukemanley added the Bug, Performance, and MultiIndex labels on Jan 22, 2023
mroeschke added this to the 2.0 milestone on Jan 23, 2023
mroeschke merged commit 79b2610 into pandas-dev:main on Jan 23, 2023

mroeschke (Member):

Thanks @lukemanley

pooja-subramaniam pushed a commit to pooja-subramaniam/pandas that referenced this pull request Jan 25, 2023
lukemanley deleted the series-rename-multiindex branch on February 23, 2023 01:38

Successfully merging this pull request may close these issues.

BUG: Rename MultiIndex level destroys CategoricalIndex on other levels