KeyError: '__dummy__' for pd.crosstab in pandas #10291

Closed
songhuiming opened this issue Jun 5, 2015 · 12 comments
Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Comments

@songhuiming

I get KeyError: '__dummy__' when I run the following:

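import numpy as np
import pandas as pd

np.random.seed(seed=99)
s = np.random.randint(1, 10, 200)
s = pd.Series(np.where(s > 9, np.nan, s))
s1 = s[:100]   # index labels 0..99
s2 = s[100:]   # index labels 100..199, no overlap with s1
pd.crosstab(s1, s2)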

KeyError: '__dummy__'
@lexual
Contributor

lexual commented Jun 6, 2015

Even simpler example. Perhaps something to do with the indices not overlapping at all.

import pandas as pd

s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])  # shares no index labels with s1
pd.crosstab(s1, s2)
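
For contrast, here is a small sketch of the same data cross-tabulated by position rather than by index label. It assumes positional pairing is what the caller wants, which the thread does not state; resetting the index is just one way to give both Series the same labels.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])

# Assumption: the values should be paired by position.
# Dropping the original labels gives both Series the index 0, 1, 2,
# so the alignment no longer introduces missing values.
table = pd.crosstab(s1.reset_index(drop=True), s2.reset_index(drop=True))
print(table)
```

With overlapping labels every row has both a row key and a column key, so each of the three pairs is counted once.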

@lexual
Contributor

lexual commented Jun 6, 2015

Believe this is the root cause of things:

http://pandas.pydata.org/pandas-docs/stable/groupby.html#na-group-handling

@lexual
Contributor

lexual commented Jun 6, 2015

Yes, http://pandas.pydata.org/pandas-docs/stable/groupby.html#na-group-handling is the cause.

Because the two Series share no index labels, aligning them leaves a NaN in one of the grouping keys on every row, and groupby excludes rows whose key is NaN from its result.

You then end up with an empty dataframe, and that is the cause of the KeyError: the code accesses df['__dummy__'] on an empty dataframe.
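
A minimal sketch of the mechanism described above, using the simpler example from earlier in the thread. The column names row and col are illustrative only and are not the names crosstab uses internally.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])

# Building a frame from the two Series aligns them on the union of their
# index labels, so every row has a NaN in one of the two columns.
df = pd.DataFrame({"row": s1, "col": s2})
rows_with_nan = int(df.isna().any(axis=1).sum())

# By default groupby drops rows whose key is NaN, so no groups survive.
n_groups = len(df.groupby(["row", "col"]).size())

print(df)
print("rows with a NaN key:", rows_with_nan)
print("groups:", n_groups)
```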

@jreback
Contributor

jreback commented Jun 7, 2015

Yeah, this should just return an empty frame, as there are no cross-tabulations.

@jreback jreback added Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jun 7, 2015
@jreback jreback added this to the Next Major Release milestone Jun 7, 2015
@lexual
Contributor

lexual commented Jun 8, 2015

So this is not a bug?

should we:

  • raise exception
  • return an empty dataframe?

@jreback
Contributor

jreback commented Jun 8, 2015

return an empty frame

@dan7davis

I'm getting the same KeyError: '__dummy__' for my grouped data.

And I'm not really sure how to fix it / what you mean by 'return an empty frame.' Care to dumb it down/show precisely what you mean?

Thanks!

@jreback
Contributor

jreback commented Jan 5, 2016

@dan7davis this needs a fix that returns an empty frame by catching the KeyError raised in the example above.

See https://github.com/pydata/pandas/blob/master/pandas/tools/pivot.py#L151; it just needs something like:

try:
    table = table[values[0]]
except KeyError:
    pass
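
Until such a fix lands in pandas itself, the same idea can be sketched on the caller's side. The helper name safe_crosstab is hypothetical and not part of pandas; it only assumes that the failure surfaces as KeyError('__dummy__'), as reported in this issue.

```python
import pandas as pd


def safe_crosstab(index, columns):
    """Hypothetical wrapper: return an empty frame instead of raising
    KeyError('__dummy__') when there is nothing to cross-tabulate."""
    try:
        return pd.crosstab(index, columns)
    except KeyError as err:
        # Only swallow the specific error from this issue; re-raise the rest.
        if err.args and err.args[0] == "__dummy__":
            return pd.DataFrame()
        raise


s1 = pd.Series([1, 2, 3], index=[1, 2, 3])
s2 = pd.Series([4, 5, 6], index=[4, 5, 6])
result = safe_crosstab(s1, s2)
print(result.empty)
```

On a pandas version where the call already returns an empty frame, the wrapper simply passes that result through.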

@dan7davis

@jreback problem solved. thank you! really appreciate the alacrity

@jreback
Contributor

jreback commented Jan 5, 2016

want to do a pull request to fix in master?

@dan7davis

I'm (very) new to coding/python/GitHub, so unfortunately I have no idea
what that means. But it sounds useful for me to know & helpful for others,
so I'd be happy to learn/try..


@jreback
Contributor

jreback commented Jan 6, 2016

Contributing is a great way to learn; see our docs: http://pandas.pydata.org/pandas-docs/stable/contributing.html

Any questions, please ask.
