Index is not hashable #2461

Closed
ghost opened this issue Dec 9, 2012 · 10 comments

Comments

@ghost

ghost commented Dec 9, 2012

hash(df.index)
> /home/user1/src/pandas/pandas/core/index.py(346)__hash__()
    345     def __hash__(self):
--> 346         return hash(self.view(np.ndarray))
    347 

/home/user1/src/pandas/pandas/core/index.pyc in __hash__(self)
    344 
    345     def __hash__(self):
--> 346         return hash(self.view(np.ndarray))
    347 
    348     def __setitem__(self, key, value):

TypeError: unhashable type: 'numpy.ndarray'
@wesm
Member

wesm commented Dec 9, 2012

This could be made to work, but it won't make Index a valid dict key (since == is a vector compare).

@ghost
Author

ghost commented Dec 10, 2012

Since index keys are immutable, a hash based on them would have enabled a fast check for
index inequality without a full vector compare. Not sure it's worth it.

@wesm
Member

wesm commented Dec 10, 2012

I'd thought at one point about adding an MD5 hash. That would work for indexes that don't contain Python objects

@wesm
Member

wesm commented Dec 10, 2012

Well, I guess you could hash the python objects then md5 the array of hash values. Some probability of collisions / equal md5 hash but unequal indexes. harrumph
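
A minimal sketch of the "hash the objects, then md5 the array of hash values" idea, using only the standard library. The function name `labels_digest` is hypothetical and nothing here is pandas API; it only illustrates the mechanism and the collision caveat mentioned above.

```python
import hashlib
from array import array


def labels_digest(labels):
    """Return an MD5 hex digest computed over the per-label Python hashes."""
    # hash() each label, pack the results as signed 64-bit integers,
    # then MD5 the packed bytes.
    hashes = array("q", (hash(label) for label in labels))
    return hashlib.md5(hashes.tobytes()).hexdigest()


same_a = labels_digest(["x", "y", "z"])
same_b = labels_digest(["x", "y", "z"])
ints_a = labels_digest([1, 2, 3])
ints_b = labels_digest([1, 2, 4])

print(same_a == same_b)   # equal labels give equal digests
print(ints_a == ints_b)   # these two label lists give different digests
```

Two caveats with this sketch: Python randomizes string hashes per process by default, so the digest is only comparable within one process, and labels that hash alike (for example `1` and `1.0`) produce the same digest, so a matching digest still needs a real comparison to confirm equality.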

@ghost
Author

ghost commented Dec 10, 2012

It would need to be at construction time o'course, otherwise no gain over vector compare:
all(idx1 == idx2) does not fail fast, so the best case and worst case are the same.
But it turned out that for my case it was enough to check for identity rather than equality.
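
To make the trade-off concrete, here is a rough sketch of a container that pays for its hash once at construction and then uses identity and the cached hash as cheap checks before any element-wise compare. `FrozenLabels` and its `equals` method are hypothetical names for illustration, not pandas classes.

```python
class FrozenLabels:
    """Hypothetical immutable label container with a hash cached at construction."""

    def __init__(self, values):
        self._values = tuple(values)
        # Computed once, up front; later comparisons reuse it.
        self._hash = hash(self._values)

    def __hash__(self):
        return self._hash

    def equals(self, other):
        if self is other:
            # Identity: no element compare needed.
            return True
        if self._hash != other._hash:
            # Different hashes: the contents cannot be equal.
            return False
        # Equal hashes may still be a collision, so confirm with a full compare.
        return self._values == other._values


a = FrozenLabels([1, 2, 3])
b = FrozenLabels([1, 2, 3])
c = FrozenLabels([1, 2, 4])

print(a.equals(a), a.equals(b), a.equals(c))
```

The hash only speeds up the unequal case; equal contents still cost a full compare unless the two objects are identical.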

@wesm
Member

wesm commented Dec 10, 2012

We should just write a custom set of vector compares. The problem with np.array_equal is that it compares all elements rather than failing at the first unequal pair.
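
A pure-Python illustration of the fail-fast behaviour described above. The name `equals_fail_fast` is hypothetical, and a real version for pandas would presumably live in compiled code; this only shows the control flow.

```python
def equals_fail_fast(left, right):
    """Compare two sequences element by element, stopping at the first mismatch."""
    if len(left) != len(right):
        return False
    # all() over a generator stops as soon as one pair compares unequal,
    # so the remaining elements are never visited.
    return all(a == b for a, b in zip(left, right))


print(equals_fail_fast([1, 2, 3], [1, 2, 3]))
print(equals_fail_fast([9, 2, 3], [1, 2, 3]))
```

With this shape the best case (a mismatch in the first pair) costs one comparison, while the worst case (equal sequences) still visits every element.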

@wesm
Member

wesm commented Jan 18, 2013

Marked as 0.11. Should think about this eventually

@ghost
Author

ghost commented Apr 18, 2013

marked 'someday'.

@ghost ghost mentioned this issue Jun 19, 2013
@jtratner
Contributor

jtratner commented Oct 1, 2013

The is_ method covers some potential uses for this (i.e., comparing views). It would definitely be nice to at least have a short-circuiting equality function that we could use in equals.

@ghost ghost assigned jtratner Oct 1, 2013
@ghost
Author

ghost commented Jan 10, 2014

In the year since, I haven't needed this again.
Intuitively, Index is immutable -> should be hashable. But indices have name(s) and those are mutable.
And this raises tricky issues of identity, equality and the context-specific "right thing to do".

Would need a much stronger use case to commit to semantics or an implementation,
and I can't think of one now. Leave it until that mystical, musical, majestical day.

@ghost ghost closed this as completed Jan 10, 2014
This issue was closed.