Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PMap.__getitem__ gets the wrong value #212

Open
brunonicko opened this issue Sep 29, 2020 · 5 comments
Open

PMap.__getitem__ gets the wrong value #212

brunonicko opened this issue Sep 29, 2020 · 5 comments

Comments

@brunonicko
Copy link

PMap.__getitem__ will return the wrong value if the object used for the key defines a unique __hash__ and a custom __eq__ method.

Repro (tested with python 3.8 and pyrsistent 0.17.3):

from pyrsistent import pmap

class CustomObject(object):
    __slots__ = ()

    def __hash__(self):
        return object.__hash__(self)

    def __eq__(self, other):  # Having __eq__ declared causes the failure
        return True  # Changing this to False will fail with a different error

keys = []
py_dict = {}
for i in range(10000):
    key = CustomObject()
    keys.append(key)
    py_dict[key] = i

p_map = pmap(py_dict)
assert len(py_dict) == len(p_map)
assert set(py_dict.keys()) == set(p_map.keys())

for i, key in enumerate(keys):
    assert py_dict[key] == i
    assert p_map[key] == i  # Fails here when `i` is a big number, __getitem__ gets the wrong value
@brunonicko
Copy link
Author

Sorry, I didn't introduce myself. My name is Bruno and I work with python development for visual effects pipelines.
I really like pyrsistent and I think it's an awesome tool for developing modern Python applications. I came across this bug recently, and I am trying to figure out why it's behaving that way. I also noticed this is not a problem when running Python 2.7 and pyrsistent 0.16.1, so maybe it's because of some behaviour introduced in newer python versions that I'm not aware of. I'll keep investigating to see if I can find more information about what's going on.
Cheers.

@brunonicko
Copy link
Author

Thinking more about it, I guess my example breaks the rules of how hashing and equality should work in good python code in the first place. So even though pmap seems to behave differently than a regular dict in this case, maybe it doesn't matter since this particular example is bad and should not be implemented that way anyway.
Feel free to just reject the request, and sorry for the confusion :)

@tobgu
Copy link
Owner

tobgu commented Sep 30, 2020

Hi Bruno!

OK, no worries. Having two object that compare equal but result in different hashes seems odd to me. I'm a bit surprised about the difference between Python 2 and 3 though. I need to do some investigation. Hope to find some time in the next days.

@alchayward
Copy link

alchayward commented Jan 5, 2021

This difference comes from an implementation detail in the python dict c-code. Infact, the code in the example for the ordinary dict only passes because the keys are enumerated in the order they were added to the dict. If you include the lines
import random; random.shuffle(keys)
before the for loop, the assert will (usually) fail.

As this behaviour is not in the spec for dict, it may change in the future.

@pdivos
Copy link

pdivos commented Feb 21, 2021

just a suggestion, but eq returning True with hashcode returning nonequal values could throw an exception at runtime? to me that would feel the right thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants