BUG: incorrectly output index ordering with an ordered Categorical and pivot #8731

Closed
jreback opened this issue Nov 4, 2014 · 0 comments
Labels
Categorical (Categorical Data Type), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)

jreback commented Nov 4, 2014

xref #8860; the solution might be the same.

Example from Stack Overflow:

In [32]: df = DataFrame({'Sales' : [100,120,220], 'Month' : ['January','January','January'], 'Year' : [2013,2014,2013]})

In [33]: df
Out[33]: 
     Month  Sales  Year
0  January    100  2013
1  January    120  2014
2  January    220  2013

In [34]: df['Month'] = df['Month'].astype('category').cat.set_categories(['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December'])

In [35]: df.dtypes
Out[35]: 
Month    category
Sales       int64
Year        int64
dtype: object

In [36]: df['Month']
Out[36]: 
0    January
1    January
2    January
Name: Month, dtype: category
Categories (12, object): [January < February < March < April ... September < October < November < December]

In [37]: df.pivot_table(values="Sales", index="Month")
Out[37]: 
Month
January      146.666667
February            NaN
March               NaN
April               NaN
May                 NaN
June                NaN
July                NaN
August              NaN
September           NaN
October             NaN
November            NaN
December            NaN
Name: Sales, dtype: float64

In [38]: df.pivot_table(values="Sales", index="Month").index
Out[38]: Index([u'January', u'February', u'March', u'April', u'May', u'June', u'July', u'August', u'September', u'October', u'November', u'December'], dtype='object')

In [39]: result = df.pivot_table(values='Sales', index="Month", columns="Year", aggfunc="sum")

In [40]: result
Out[40]: 
Year       2013  2014
Month                
April       NaN   NaN
August      NaN   NaN
December    NaN   NaN
February    NaN   NaN
January     320   120
July        NaN   NaN
June        NaN   NaN
March       NaN   NaN
May         NaN   NaN
November    NaN   NaN
October     NaN   NaN
September   NaN   NaN
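
In Out[40] above, the Month index comes out in alphabetical order (April, August, December, ...) rather than in category order. A minimal sketch of a check for the expected ordering is below. It is not taken from the fix itself; the sample data, the shortened four-month category list, and the explicit `observed=True` argument (which needs a pandas release that supports it in `pivot_table`) are assumptions made to keep the example small and version-independent.

```python
import pandas as pd

# Assumed sample data: four months, deliberately listed out of calendar order.
months = ["January", "February", "March", "April"]

df = pd.DataFrame({
    "Sales": [100, 120, 220, 90, 60],
    "Month": ["March", "January", "April", "February", "January"],
    "Year": [2013, 2014, 2013, 2014, 2013],
})

# Ordered categorical: January < February < March < April.
df["Month"] = pd.Categorical(df["Month"], categories=months, ordered=True)

# observed=True keeps only the categories that actually occur in the data.
result = df.pivot_table(values="Sales", index="Month", columns="Year",
                        aggfunc="sum", observed=True)

index_order = list(result.index)   # order of the rows in the pivoted table
alphabetical = sorted(months)      # the ordering reported in this issue

print(index_order)
print(index_order == months)       # True when category order is respected
```

Comparing `index_order` against `months` rather than `alphabetical` is the point of the check: the two lists differ for month names, so an alphabetical sort of the result index would make the comparison fail.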

jreback added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) and Categorical (Categorical Data Type) labels · Nov 4, 2014
jreback added this to the 0.15.2 milestone · Nov 4, 2014
jreback modified the milestones: 0.16.0, 0.15.2 · Dec 3, 2014
jreback modified the milestones: 0.16.0, Next Major Release · Mar 6, 2015
jreback modified the milestones: 0.19.2, Next Major Release · Dec 6, 2016
jreback closed this as completed in 1725d24 · Dec 6, 2016
jorisvandenbossche modified the milestones: 0.19.0, 0.19.2 · Dec 6, 2016