Allow for dict-like argument to Categorical.rename_categories #17336

Closed
linusmarco opened this issue Aug 25, 2017 · 1 comment · Fixed by #17586

Proposal

I think it would be great if pandas.core.categorical.Categorical.rename_categories could take a dict-like argument as an alternative to the index-like argument that it currently requires. This would make the syntax more similar to that of DataFrame.rename. The proposed change would allow for something like this:

Code Sample

>>> df['categorical_var']
0    apples
1    oranges
2    apples
3    apples

>>> # rename categories from 'apples' and 'oranges' to 'red fruits' and 'orange fruits'
>>> df['categorical_var'].cat.rename_categories({'apples': 'red fruits', 'oranges': 'orange fruits'})
0    red fruits
1    orange fruits
2    red fruits
3    red fruits
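
For anyone who wants to try the proposed call, here is a minimal runnable sketch. It assumes a pandas version that already includes the change referenced as #17586, and the sample DataFrame is made up to mirror the example above. On a Series, the method is reached through the `.cat` accessor.

```python
import pandas as pd

# Hypothetical sample data mirroring the example above.
df = pd.DataFrame(
    {"categorical_var": pd.Categorical(["apples", "oranges", "apples", "apples"])}
)

# Dict-like argument: keys are current category names, values are the new names.
# Assumes a pandas release where rename_categories accepts a dict.
renamed = df["categorical_var"].cat.rename_categories(
    {"apples": "red fruits", "oranges": "orange fruits"}
)

print(renamed.tolist())
print(list(renamed.cat.categories))
```

The call returns a new Series; `df['categorical_var']` itself is left unchanged.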

Why?

1. Allows the developer to execute the rename without any prior knowledge of the order of the categories.

This:

>>> df['categorical_var']
0    apples
1    oranges
2    apples
3    apples

>>> df['categorical_var'].cat.rename_categories({'apples': 'red fruits', 'oranges': 'orange fruits'})
0    red fruits
1    orange fruits
2    red fruits
3    red fruits

instead of this:

>>> df['categorical_var']
0    apples
1    oranges
2    apples
3    apples

>>> # to figure out the current category order
>>> print(df['categorical_var'].cat.categories)
['apples', 'oranges']

>>> # rename based on order
>>> df['categorical_var'].cat.rename_categories(['red fruits', 'orange fruits'])
0    red fruits
1    orange fruits
2    red fruits
3    red fruits

2. Allows the developer to rename one category, or a small subset of categories, without having to restate the rest of the category names.

This:

>>> df['categorical_var']
0    apples
1    oranges
2    apples
3    apples
4    green fruits
5    yellow fruits
6    blue fruits
7    purple fruits

>>> df['categorical_var'].cat.rename_categories({'apples': 'red fruits', 'oranges': 'orange fruits'})
0    red fruits
1    orange fruits
2    red fruits
3    red fruits
4    green fruits
5    yellow fruits
6    blue fruits
7    purple fruits

instead of this:

>>> df['categorical_var']
0    apples
1    oranges
2    apples
3    apples
4    green fruits
5    yellow fruits
6    blue fruits
7    purple fruits

>>> # to find the full list of categories
>>> print(df['categorical_var'].cat.categories)
['apples', 'oranges', 'green fruits', 'yellow fruits', 'blue fruits', 'purple fruits']

>>> # rename must contain the full category list
>>> df['categorical_var'].cat.rename_categories(['red fruits', 'orange fruits', 'green fruits', 'yellow fruits', 'blue fruits', 'purple fruits'])
0    red fruits
1    orange fruits
2    red fruits
3    red fruits
4    green fruits
5    yellow fruits
6    blue fruits
7    purple fruits

3. As I mentioned above, this also makes the syntax more similar to the DataFrame.rename syntax, which I think increases overall usability.
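
For comparison, this is the DataFrame.rename behavior the proposal is modeled on: a dict maps old labels to new ones, and labels not mentioned in the dict are left as they are. The frame and column names below are invented for illustration.

```python
import pandas as pd

# Hypothetical frame; only column "a" is renamed.
frame = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Dict-like argument: unmentioned columns ("b", "c") are left untouched.
renamed_frame = frame.rename(columns={"a": "alpha"})

print(list(renamed_frame.columns))
```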

I would be happy to start working on this if others agree that adding the feature makes sense.

gfyoung added the Categorical and Enhancement labels Aug 26, 2017

gfyoung commented Aug 26, 2017

@linusmarco: Thanks for reporting this! I think that makes good sense, especially since it would be consistent with DataFrame.rename. Have a look to see what you would need to do to make this work.
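
As a rough idea of what this could involve, here is one possible sketch that translates a dict into the full list the existing list-like argument expects. The helper name `rename_categories_by_mapping` is hypothetical, and this is not necessarily how the eventual fix was implemented. It also works as a workaround on versions where only the list-like form is accepted.

```python
import pandas as pd


def rename_categories_by_mapping(cat, mapping):
    """Hypothetical helper: rename only the categories named in `mapping`.

    Categories missing from `mapping` keep their current name, and the
    category order is preserved.
    """
    new_categories = [mapping.get(name, name) for name in cat.categories]
    # The list-like form of rename_categories is positional, so the list
    # built above must cover every existing category in order.
    return cat.rename_categories(new_categories)


cat = pd.Categorical(["apples", "oranges", "green fruits", "apples"])
result = rename_categories_by_mapping(cat, {"apples": "red fruits"})

print(list(result.categories))
print(list(result))
```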
