
With NumPy 1.20, SymPy generated code cannot be serialized with dill #405

Open
cuihantao opened this issue Mar 4, 2021 · 15 comments

@cuihantao

This issue has been posted to the NumPy repository. Since it's related to dill, I'm reposting it here. The post to NumPy is at numpy/numpy#18547

I have been using SymPy to generate NumPy code through lambdify and using dill to serialize the code. Since upgrading to NumPy 1.20.1, some generated code can no longer be serialized because of a RecursionError. The example code works with NumPy 1.19.

I have posted this issue in the SymPy and NumPy Gitter rooms, and I'm posting my bisecting results here.

Reproducing code example:

Prerequisites: SymPy 1.7.1, dill 0.3.3, NumPy 1.20.1

import dill
import sympy
from sympy.abc import x

dill.settings['recurse'] = True

expr = sympy.sympify('re(x)')   # fails in 1.20.1, works in 1.19

# expr = sympy.sympify('log(x)')   # works in both 1.20.1 and 1.19

lfunc = sympy.lambdify(x, expr, 'numpy')

with open("out.pkl", 'wb') as f:
    dill.dump(lfunc, f)

Based on my non-exhaustive testing, the error occurs when the function includes re or im. Other functions can be serialized successfully.
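The kind of per-function check described above could be automated with a small harness like the one below. This is only a sketch: `probe_serialization` is a hypothetical helper name, not part of dill or SymPy, and the demo uses the standard-library `pickle` as a stand-in serializer so that it runs without dill installed.

```python
import pickle


def probe_serialization(candidates, dumps):
    """Try to serialize each candidate and record the outcome.

    `candidates` maps a label to an object; `dumps` is any callable that
    serializes a single object (hypothetical helper, for illustration).
    """
    results = {}
    for label, obj in candidates.items():
        try:
            dumps(obj)
        except RecursionError:
            # The failure mode reported in this issue.
            results[label] = "RecursionError"
        except Exception as exc:  # any other serialization failure
            results[label] = type(exc).__name__
        else:
            results[label] = "ok"
    return results


# Stand-in demo with the standard library only: plain pickle cannot
# serialize a lambda, so the second entry is reported as a failure.
demo = probe_serialization({"integer": 1, "lambda": lambda v: v}, pickle.dumps)
print(demo)
```

For the case in this issue, one would presumably pass `dill.dumps` as the serializer and a dict of lambdified functions (for example one built from `re(x)` and one from `log(x)`) as the candidates.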

Error message:

Traceback (most recent call last):
  File "test_dill.py", line 12, in <module>
    dill.dump(lfunc, f)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/_dill.py", line 267, in dump
    Pickler(file, protocol, **_kwds).dump(obj)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/_dill.py", line 454, in dump
    StockPickler.dump(self, obj)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/pickle.py", line 487, in dump
    self.save(obj)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/_dill.py", line 1422, in save_function
    globs = globalvars(obj, recurse=True, builtin=True)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/detect.py", line 220, in globalvars
    func.update(globalvars(nested_func, True, builtin))
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/detect.py", line 220, in globalvars
    func.update(globalvars(nested_func, True, builtin))
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/detect.py", line 220, in globalvars
    func.update(globalvars(nested_func, True, builtin))
  [Previous line repeated 981 more times]
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/detect.py", line 213, in globalvars
    func.update(nestedglobals(getattr(orig_func, func_code)))
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/site-packages/dill/detect.py", line 170, in nestedglobals
    dis.dis(func) #XXX: dis.dis(None) disassembles last traceback
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/dis.py", line 79, in dis
    _disassemble_recursive(x, file=file, depth=depth)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/dis.py", line 373, in _disassemble_recursive
    disassemble(co, file=file)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/dis.py", line 369, in disassemble
    _disassemble_bytes(co.co_code, lasti, co.co_varnames, co.co_names,
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/dis.py", line 401, in _disassemble_bytes
    for instr in _get_instructions_bytes(code, varnames, names,
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/dis.py", line 321, in _get_instructions_bytes
    labels = findlabels(code)
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/dis.py", line 437, in findlabels
    for offset, op, arg in _unpack_opargs(code):
  File "/Users/hcui7/miniconda3/envs/np120/lib/python3.8/dis.py", line 421, in _unpack_opargs
    for i in range(0, len(code), 2):
RecursionError: maximum recursion depth exceeded in comparison

NumPy/Python version information:

1.20.1 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:12:38)
[Clang 11.0.1 ]

@mmckerns
Member

mmckerns commented Mar 9, 2021

If you haven't cross-posted to sympy, you may want to do that. Two things you may want to try, and report the results here: (1) rerun with dill.detect.trace(True), and (2) rerun with dill.settings['recurse'] = True.

@cuihantao
Author

Thanks, @mmckerns. I believe dill.settings['recurse'] = True was used with my original run.

@mmckerns
Member

Indeed. I didn't see that. Posting the trace may be insightful, however.

@cuihantao
Author

Thanks. With dill.detect.trace(True), I got one additional line of warning at the beginning, but it does not seem to be informative...

F1: <function _lambdifygenerated at 0x7fe96469b3a0>
---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-3-0c1e12e829b4> in <module>
     13 
     14 with open("out.pkl", 'wb') as f:
---> 15     dill.dump(lfunc, f)

~/miniconda3/envs/np120/lib/python3.8/site-packages/dill/_dill.py in dump(obj, file, protocol, byref, fmode, recurse, **kwds)
    265     _kwds = kwds.copy()
    266     _kwds.update(dict(byref=byref, fmode=fmode, recurse=recurse))
--> 267     Pickler(file, protocol, **_kwds).dump(obj)
    268     return
    269 

~/miniconda3/envs/np120/lib/python3.8/site-packages/dill/_dill.py in dump(self, obj)
    452             raise PicklingError(msg)
    453         else:
--> 454             StockPickler.dump(self, obj)
    455         stack.clear()  # clear record of 'recursion-sensitive' pickled objects
    456         return

~/miniconda3/envs/np120/lib/python3.8/pickle.py in dump(self, obj)
    485         if self.proto >= 4:
    486             self.framer.start_framing()
--> 487         self.save(obj)
    488         self.write(STOP)
    489         self.framer.end_framing()

~/miniconda3/envs/np120/lib/python3.8/pickle.py in save(self, obj, save_persistent_id)
    558             f = self.dispatch.get(t)
    559             if f is not None:
--> 560                 f(self, obj)  # Call unbound method with explicit self
    561                 return
    562 

~/miniconda3/envs/np120/lib/python3.8/site-packages/dill/_dill.py in save_function(pickler, obj)
   1420             # recurse to get all globals referred to by obj
   1421             from .detect import globalvars
-> 1422             globs = globalvars(obj, recurse=True, builtin=True)
   1423             # remove objects that have already been serialized
   1424            #stacktypes = (ClassType, TypeType, FunctionType)

~/miniconda3/envs/np120/lib/python3.8/site-packages/dill/detect.py in globalvars(func, recurse, builtin)
    218                    #func.remove(key) if key in func else None
    219                     continue  #XXX: globalvars(func, False)?
--> 220                 func.update(globalvars(nested_func, True, builtin))
    221     elif iscode(func):
    222         globs = vars(getmodule(sum)).copy() if builtin else {}

... last 1 frames repeated, from the frame below ...

~/miniconda3/envs/np120/lib/python3.8/site-packages/dill/detect.py in globalvars(func, recurse, builtin)
    218                    #func.remove(key) if key in func else None
    219                     continue  #XXX: globalvars(func, False)?
--> 220                 func.update(globalvars(nested_func, True, builtin))
    221     elif iscode(func):
    222         globs = vars(getmodule(sum)).copy() if builtin else {}

RecursionError: maximum recursion depth exceeded in comparison

@mmckerns
Member

The trace says that it first pickles the function _lambdifygenerated, and the next step would be to pickle the first argument to this function... but it quickly bypasses dill and, at L454 of _dill.py, goes into pickle. So the trace shows where to start. Thanks for that.

@cuihantao
Author

Thanks. You might be interested in the discussion in the NumPy repository. Thanks for helping.

numpy/numpy#18547

@asmeurer

Is the fact that the lambdified function is dynamically generated somehow tricking dill into doing the wrong thing for the real function used inside of it? As I pointed out on the NumPy issue, it doesn't happen if you use a normal function instead. Also if there is anything we can do to make lambdified functions serialize better we would be happy to make that change in SymPy.

@mmckerns
Member

@asmeurer: I'll dedicate a little bit of time this week to investigate it, and see if I can come up with a solution.

@mmckerns
Member

I can confirm I can reproduce the error. @asmeurer: I seem to remember a previous issue with serializing a real -- I'll see if I can dig that up, if it exists.

@mmckerns
Member

mmckerns commented Jun 5, 2022

Confirming this is still an issue with:

>>> numpy.__version__
'1.22.4'
>>> dill.__version__
'0.3.6.dev0'
>>> sympy.__version__
'1.10.1'

Also, posting traceback with recurse = False:

>>> dill.settings['recurse'] = False
>>> dill.dumps(lfunc)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mmckerns/lib/python3.9/site-packages/dill-0.3.6.dev0-py3.9.egg/dill/_dill.py", line 364, in dumps
    dump(obj, file, protocol, byref, fmode, recurse, **kwds)#, strictio)
  File "/Users/mmckerns/lib/python3.9/site-packages/dill-0.3.6.dev0-py3.9.egg/dill/_dill.py", line 336, in dump
    Pickler(file, protocol, **_kwds).dump(obj)
  File "/Users/mmckerns/lib/python3.9/site-packages/dill-0.3.6.dev0-py3.9.egg/dill/_dill.py", line 620, in dump
    StockPickler.dump(self, obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 487, in dump
    self.save(obj)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "/Users/mmckerns/lib/python3.9/site-packages/dill-0.3.6.dev0-py3.9.egg/dill/_dill.py", line 2114, in save_function
    _save_with_postproc(pickler, (_create_function, (
  File "/Users/mmckerns/lib/python3.9/site-packages/dill-0.3.6.dev0-py3.9.egg/dill/_dill.py", line 1284, in _save_with_postproc
    pickler._batch_setitems(iter(source.items()))
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 997, in _batch_setitems
    save(v)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/pickle.py", line 578, in save
    rv = reduce(self.proto)
TypeError: cannot pickle 'PyCapsule' object

This case may be fixed by: #477

@anivegesana
Contributor

I think that the recurse issue is related to #466. The function dill.detect.globalvars recurses indefinitely due to a cycle. It requires fixes to the globalvars function so that it correctly handles a global dict being shared between multiple functions.

>>> dill.detect.globalvars(lfunc)
dill.detect.globalvars <function _lambdifygenerated at 0x10ef8a050>
dill.detect.globalvars <function real at 0x10ebbc940>
dill.detect.globalvars <function _real_dispatcher at 0x10ebbc670>
dill.detect.globalvars <function real at 0x10ebbc8b0>
dill.detect.globalvars <built-in function asanyarray>
dill.detect.globalvars None
dill.detect.globalvars <function real at 0x10ebbc940>
dill.detect.globalvars <function _real_dispatcher at 0x10ebbc670>
dill.detect.globalvars <function real at 0x10ebbc8b0>
dill.detect.globalvars <built-in function asanyarray>
dill.detect.globalvars None

@mmckerns
Member

mmckerns commented Jun 6, 2022

The function dill.detect.globalvars recurses indefinitely due to a cycle. It requires fixes to the globalvars function that correctly handles a global dict being shared between multiple functions.

Yeah... I'm not sure if it's due to the issue you quoted, but definitely globalvars needs some cycle-breaking.
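As a rough illustration of what cycle-breaking could look like, the sketch below walks the globals referenced by a function while remembering which functions it has already visited. It is not dill's implementation: `reachable_globals` is a hypothetical name, it only inspects `co_names`, and `ping`/`pong` are made-up functions that share one globals dict and name each other.

```python
import types


def reachable_globals(func, _seen=None):
    """Collect globals referenced by `func` and by the functions it refers to.

    Simplified illustration only: visited functions are tracked by identity,
    so a cycle ends the walk instead of recursing until RecursionError.
    """
    if _seen is None:
        _seen = set()
    if not isinstance(func, types.FunctionType) or id(func) in _seen:
        return {}
    _seen.add(id(func))

    found = {}
    for name in func.__code__.co_names:
        if name in func.__globals__:
            value = func.__globals__[name]
            found[name] = value
            # Recurse into referenced functions; the visited set breaks cycles.
            found.update(reachable_globals(value, _seen))
    return found


# Two functions in the same module that refer to each other form a cycle.
def ping(n):
    return n if n <= 0 else pong(n - 1)


def pong(n):
    return n if n <= 0 else ping(n - 1)


print(sorted(reachable_globals(ping)))
```

Without the `_seen` check, the walk from `ping` to `pong` and back would never terminate, which resembles the repeating `real` / `_real_dispatcher` pattern in the trace quoted earlier in the thread.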

@anivegesana
Contributor

It turns out the recurse issue was magically fixed by #486 somehow. Nonetheless, globalvars still needs some cycle-breaking.

@cuihantao
Author

cuihantao commented Aug 28, 2022

I seem to remember a previous issue with serializing a real

The issue still exists, and it might have to do with dill.detect.globalvars.

The code below works fine:

import numpy as np
import dill

dill.settings['recurse'] = False

def foo(a):
    return np.real(a) / np.real(a)

dill.detect.globalvars(foo)

But importing real from numpy will cause an error in resolving the globalvars. The code below fails:

from numpy import real

def bar(a):
    return real(a) / real(a)

dill.detect.globalvars(bar)  # maximum recursion depth reached
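One way to see how the two spellings differ is to list which global names each function actually looks up and what kind of object each name resolves to. The sketch below uses the standard-library `json` module as a stand-in for NumPy so that it runs anywhere; `global_lookups` is a hypothetical helper. It only shows what the two forms look up and does not by itself establish why globalvars behaves differently.

```python
import dis
import json
from json import dumps


def via_module(obj):
    # Looks up the module `json` as a global, then the attribute `dumps`.
    return json.dumps(obj)


def via_name(obj):
    # Looks up the function `dumps` directly as a global.
    return dumps(obj)


def global_lookups(func):
    """Map each name loaded with LOAD_GLOBAL to the type name of its value."""
    names = {
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname == "LOAD_GLOBAL"
    }
    return {name: type(func.__globals__[name]).__name__ for name in names}


print(global_lookups(via_module))
print(global_lookups(via_name))
```

In the first form the only global is a module object, while in the second the global is itself a Python function with its own globals, which is the kind of object a recursive globals walk might descend into.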

Although dill.dump() works for both foo and bar, the recursion issue causes weird behaviors in my program. I've not been able to generate an MWE, but if you are interested, you can install the andes package with pip install git+https://github.com/cuihantao/andes@develop and do

import andes

import dill
dill.settings['recurse'] = True

from multiprocess import Pool

# `andes.system.example()` returns a new object every time

one_system = [andes.system.example()]

systems = [andes.system.example(), andes.system.example()]

first_system = [systems[0]]

def ex_runner(system):
    return system

# `pool` was not defined in the original snippet; a default Pool is assumed here
pool = Pool()

ret = pool.map(ex_runner, systems)  # fails: recursion issue
ret = pool.map(ex_runner, first_system)  # fails: recursion issue
ret = pool.map(ex_runner, one_system)  # works, but like `first_system`, `one_system` is a single-item list

dill.detect.globalvars(one_system[0].Fault.calls.s["gf"])  # fails due to recursion

I followed the discussions above and tried dill.settings['recurse'] = False. It seems to work around the issue.
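If the workaround should apply only around specific calls rather than globally, the setting could be switched temporarily. The sketch below assumes that `dill.settings` behaves like an ordinary mutable mapping, as the `dill.settings['recurse'] = ...` assignments in this thread suggest; `temporary_setting` is a hypothetical helper and the demo uses a plain dict as a stand-in.

```python
from contextlib import contextmanager


@contextmanager
def temporary_setting(settings, key, value):
    """Set settings[key] = value inside the `with` block, then restore it."""
    missing = object()
    previous = settings.get(key, missing)
    settings[key] = value
    try:
        yield settings
    finally:
        if previous is missing:
            del settings[key]
        else:
            settings[key] = previous


# Stand-in mapping so the sketch runs without dill installed;
# in real use one would pass `dill.settings` instead.
fake_settings = {"recurse": True}
with temporary_setting(fake_settings, "recurse", False):
    inside = fake_settings["recurse"]
after = fake_settings["recurse"]
print(inside, after)
```

The previous value is restored even if the body of the `with` block raises, so a failed dump would not leave the setting changed.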

@mmckerns
Member

mmckerns commented Aug 29, 2022

@cuihantao: Your issue appears to be different than this issue. Please open a new issue and re-post.
Also include your traceback and your version of dill, python, and numpy... as I'm not able to reproduce your results (regardless of the value of dill.settings['recurse']). See the following:

Python 3.7.13 (default, May 10 2022, 11:13:40) 
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import dill
>>> from numpy import real
>>> 
>>> def bar(a):
...     return real(a) / real(a)
... 
>>> dill.detect.globalvars(bar)
{'real': <function real at 0x1124a35f0>}
>>> 
>>> np.__version__
'1.21.6'
>>> dill.__version__
'0.3.6.dev0'
>>> 
