Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try fixing pickle load issue #28

Merged
merged 2 commits into from
Feb 11, 2021
Merged

Conversation

mvdbeek
Copy link
Contributor

@mvdbeek mvdbeek commented Feb 10, 2021

We ran into the following traceback while unpickling
and object that contained a NestedMutableDict:

In [2]: dataset = pickle.load(open('/tmp/hda.pickle', 'rb'))
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-719a86be550f> in <module>
----> 1 dataset = pickle.load(open('/tmp/hda.pickle', 'rb'))
~/src/galaxy/.venv/lib/python3.8/site-packages/sqlalchemy_json/track.py in __setitem__(self, key, value)
    104
    105     def __setitem__(self, key, value):
--> 106         self.changed('__setitem__: %r=%r', key, value)
    107         super(TrackedDict, self).__setitem__(key, self.convert(value, self))
    108
~/src/galaxy/.venv/lib/python3.8/site-packages/sqlalchemy_json/track.py in changed(self, message, *args)
     36             logger.debug('%s: %s', self._repr(), message % args)
     37         logger.debug('%s: changed', self._repr())
---> 38         if self.parent is not None:
     39             self.parent.changed()
     40         elif isinstance(self, Mutable):
AttributeError: 'NestedMutableDict' object has no attribute 'parent'

This works as a quick fix for us. The pickle behavior of not running
though __init__ is decribed here

I'm not sure if this is the right fix, any help is appreciated @edelooff.

@edelooff
Copy link
Owner

Hi @mvdbeek, thanks for the description and proposed solution. I'm not sure moving the instance variable to the class definition is the correct one. It works and doesn't do anything untoward, but it looks like a bug due to class vs instance variable confusion.

I tried playing around with __getstate__ and __setstate__ a little, but that didn't lead to anything either. The documentation does contain a note that provided some direction though:

Note: At unpickling time, some methods like __getattr__(), __getattribute__(), or __setattr__() may be called upon the instance. In case those methods rely on some internal invariant being true, the type should implement __new__() to establish such an invariant, as __init__() is not called when unpickling an instance.

So what I tried locally is replacing the __init__ with a __new__ method, as follows:

    def __new__(cls, *args, **kwds):
        tracked = super().__new__(cls, *args, **kwds)
        tracked.parent = None
        return tracked

I've tried that locally with a few simple tests and everything seems to behave as it should, but if you could give it a try on your more real-world test case, I'd like to hear whether it works for you. For completeness I've included the test script:

import pickle

from sqlalchemy_json import NestedMutableDict


one = NestedMutableDict({"numbers": [1, 2, 3, 4]})
two = NestedMutableDict({"numbers": [5, 6, 7]})
one_reloaded = pickle.loads(pickle.dumps(one))
two_reloaded = pickle.loads(pickle.dumps(two))
assert one == one_reloaded
assert two == two_reloaded

one_reloaded["numbers"].append(5)
assert one_reloaded["numbers"] == [1, 2, 3, 4, 5]
assert one["numbers"] == [1, 2, 3, 4]

assert one_reloaded["numbers"].parent is one_reloaded
assert two_reloaded["numbers"].parent is two_reloaded
assert one_reloaded is not two_reloaded

print("Pickle -> unpickle works")

@mvdbeek
Copy link
Contributor Author

mvdbeek commented Feb 11, 2021

excellent, yes, that works too!

Many thanks @edeloof, this looks much better than adding parent to the
class variable.
@mvdbeek
Copy link
Contributor Author

mvdbeek commented Feb 11, 2021

I've added your proposed fix and added some unit tests going through the examples from the readme.

@edelooff edelooff merged commit 796d100 into edelooff:master Feb 11, 2021
@edelooff
Copy link
Owner

Thanks a bunch, especially for the tests! I'll look into actually doing a new release to PyPI based on the recent changes when I find the time for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants