BUG: Multi grouper containing a categorical not dropped from index when using groupby with as_index=False #8869
Comments
This is a bug.
FYI this function is causing the problem. It disregards the value of it is also causing other issues:
Also, because the column goes into a MultiIndex at some point, the category type is lost regardless of the value of
@behzadnouri Thanks for looking into it. This behaviour (keeping the cartesian product) is expected when grouping with a categorical column, see #8138.

    In [43]: df
    Out[43]:
       jim joe  jolie
    0    0   a     84
    1    1   b     23
    2    2   c     25

    In [44]: df.groupby(['jim', 'joe']).agg('mean')
    Out[44]:
             jolie
    jim joe
    0   a       84
    1   b       23
    2   c       25

    In [45]: df['joe'] = df['joe'].astype('category')

    In [46]: df.groupby(['jim', 'joe']).agg('mean')
    Out[46]:
             jolie
    jim joe
    0   a       84
        b      NaN
        c      NaN
    1   a      NaN
        b       23
        c      NaN
    2   a      NaN
        b      NaN
        c       25

As for your second point (losing the categorical dtype), I don't think it makes a difference right now, but it could bite us when/if a categorical index (#7629) is implemented. I would then expect the
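For readers on a newer pandas, the cartesian-product behaviour described above can be sketched as follows. This is only an illustration, not part of the original thread: it assumes pandas 0.23 or later, where `groupby` accepts an `observed` keyword (which did not exist when this issue was filed), and it rebuilds the `jim`/`joe`/`jolie` frame from the session above.

```python
import pandas as pd

# Same data as the session above, with 'joe' already categorical.
df = pd.DataFrame({
    "jim": [0, 1, 2],
    "joe": pd.Categorical(["a", "b", "c"]),
    "jolie": [84, 23, 25],
})

# Assumption: pandas >= 0.23, where the `observed` keyword is available.
# observed=False keeps every (jim, joe) combination, including the ones
# that never occur in the data; their mean is NaN.
full = df.groupby(["jim", "joe"], observed=False)["jolie"].mean()

# observed=True keeps only the combinations actually present in the data.
seen = df.groupby(["jim", "joe"], observed=True)["jolie"].mean()

print(len(full), len(seen))
print(full.isna().sum())
```

Passing `observed` explicitly sidesteps the fact that its default value differs between pandas versions.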
Hello,
The following example definitely seems like a bug. The grouper is not dropped from the index of the resulting DataFrame, even when as_index=False. Actually, even the aggregation step fails completely, so there may be more to it, as shown in this example.
result
Compare to the expected result:
expected
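The original snippets for `result` and `expected` are not preserved in this text. As a rough sketch of the kind of check the report describes, the following groups by an integer column plus a categorical column with `as_index=False` and tests whether the group keys come back as ordinary columns. The frame is borrowed from the comment session in this thread, and `observed=True` (pandas 0.23 or later) is an assumption made here to keep the example small; neither is taken from the original report.

```python
import pandas as pd

# Hypothetical reproduction frame; the issue's own example is not shown here.
df = pd.DataFrame({
    "jim": [0, 1, 2],
    "joe": pd.Categorical(["a", "b", "c"]),
    "jolie": [84, 23, 25],
})

# Assumption: pandas >= 0.23 for the `observed` keyword.
flat = df.groupby(["jim", "joe"], as_index=False, observed=True).mean()

# With as_index=False the group keys should be regular columns ...
keys_are_columns = {"jim", "joe"} <= set(flat.columns)
# ... and should not appear as index level names.
keys_in_index = any(name in ("jim", "joe") for name in flat.index.names)

print(flat)
print(keys_are_columns, keys_in_index)
```

If `keys_in_index` comes out true on a given pandas version, that version still shows the behaviour reported here.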