BUG: groupby over a CategoricalIndex in axis=1 #18432

Closed
ekisslinger opened this issue Nov 22, 2017 · 1 comment · Fixed by #18525
Labels: Bug; Categorical (Categorical Data Type); Groupby; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

@ekisslinger

Code Sample

from pandas import DataFrame, CategoricalIndex

cat_index = CategoricalIndex(['a', 'b', 'a', 'b'], categories=['a', 'b'])
df = DataFrame(data=1.0, index=[0, 1], columns=cat_index)
print(df.groupby(axis=1, level=0).sum())
# Attempting a groupby using a CategoricalIndex results in:
#     ValueError: Categorical dtype grouper must have len(grouper) == len(data)

Problem description

Grouping over a CategoricalIndex on the columns (axis=1) raises a ValueError in pandas 0.21.0.

Expected Output

     a    b
0  2.0  2.0
1  2.0  2.0

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-696.10.3.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US
LOCALE: en_US.ISO8859-1

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.7.2
Cython: None
numpy: 1.13.3
scipy: 1.0.0
pyarrow: 0.7.1
xarray: None
IPython: 6.2.1
sphinx: 1.6.5
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: None
matplotlib: 2.0.2
openpyxl: 2.5.0b1
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.6
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jreback

jreback commented Nov 25, 2017

This works on the transpose.

In [60]: df.T.groupby(level=0).sum()
Out[60]: 
     0    1
a  2.0  2.0
b  2.0  2.0

A pull request to fix this would be great, @ekisslinger.
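Since the groupby works on the transpose, one possible workaround is to transpose the grouped result back, which should give the layout shown under Expected Output. This is only a minimal sketch built from the frame in the report; the name `workaround` is illustrative and not part of the original thread.

```python
from pandas import CategoricalIndex, DataFrame

# Same frame as in the report: two rows, categorical column labels.
cat_index = CategoricalIndex(['a', 'b', 'a', 'b'], categories=['a', 'b'])
df = DataFrame(data=1.0, index=[0, 1], columns=cat_index)

# Group the transposed frame by its (now row) index, then transpose back
# so the categories end up as columns again.
workaround = df.T.groupby(level=0).sum().T
print(workaround)
```

The double transpose copies the data, so this is probably only reasonable for frames that fit comfortably in memory.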

jreback added the labels Bug; Categorical (Categorical Data Type); Difficulty Intermediate; Groupby; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) on Nov 25, 2017
jreback added this to the Next Major Release milestone on Nov 25, 2017
jreback changed the title from "Attempting a groupby over a CategoricalIndex for the .columns index results in a ValueError when using Pandas 0.21.0" to "BUG: groupby over a CategoricalIndex in axis=1" on Nov 25, 2017
jreback modified the milestones: Next Major Release, 0.21.1 on Nov 27, 2017