Difference in groupby behavior between Pandas 0.13.1 and 0.15.2 #9560

Closed
dakoner opened this issue Feb 26, 2015 · 1 comment

dakoner commented Feb 26, 2015

Hi, I am seeing a difference in behavior on this groupby between Pandas 0.13.1 and 0.15.2. Specifically, it's like 0.15.2 is doing a cross join while 0.13.1 isn't.

import pandas

# Index on columns 'a' and 'b', then group on both index levels.
df = pandas.DataFrame([
    {'a': 1, 'b': 2, 'c': 3},
    {'a': 4, 'b': 5, 'c': 6},
]).set_index(list('ab'))
print(df.groupby(level=list('ab')).mean())

0.13.1 produces:

     c
a b   
1 2  3
4 5  6
[2 rows x 1 columns]

while 0.15.2 produces

      c
a b    
1 2   3
  5 NaN
4 2 NaN
  5   6

Basically, it is the same table, but with extra NaN rows for the cross combinations (1, 5) and (4, 2), which never occur in the data.

We're wondering if this behavior is intentional, or a bug. It wasn't entirely clear from the set of release notes that the groupby behavior changed so much.

jreback (Contributor) commented Feb 27, 2015

The resulting Cartesian product of the index levels (i.e. the NaN entries) was a bug in 0.15+ (maybe only in 0.15.2). It was fixed in #9177, and the fix will be in 0.16.0 (coming soon).
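
For readers who want to see the two behaviors side by side without installing a specific pandas version, here is a minimal plain-Python sketch. It is only an illustration of the idea, not pandas' implementation: the helper names `group_mean_observed` and `expand_to_cartesian` are hypothetical. The first groups on the (a, b) pairs that actually occur, which matches the 0.13.1 output; the second reindexes that result onto the full product of the level values, which reproduces the extra NaN rows seen in 0.15.2.

```python
from collections import defaultdict
from itertools import product

# Same two rows as in the report above.
rows = [
    {'a': 1, 'b': 2, 'c': 3},
    {'a': 4, 'b': 5, 'c': 6},
]


def group_mean_observed(rows, keys, value):
    """Mean of `value` for each key combination that actually occurs."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row[value])
    return {key: sum(vals) / len(vals) for key, vals in sorted(groups.items())}


def expand_to_cartesian(result, n_keys):
    """Reindex onto the full product of level values, filling gaps with NaN."""
    levels = [sorted({key[i] for key in result}) for i in range(n_keys)]
    return {key: result.get(key, float('nan')) for key in product(*levels)}


observed = group_mean_observed(rows, ['a', 'b'], 'c')  # 0.13.1-style result
expanded = expand_to_cartesian(observed, 2)            # 0.15.2-style result

print(observed)
print(expanded)
```

The observed result has two entries, while the expanded one has four, two of them NaN. A quick sanity check on any pandas version is that the groupby result has as many rows as there are distinct (a, b) pairs in the input.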
