Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cylic data dependencies cause infinite recursion #62

Closed
buckley-w-david opened this issue Jul 2, 2022 · 11 comments · Fixed by #139
Closed

Cylic data dependencies cause infinite recursion #62

buckley-w-david opened this issue Jul 2, 2022 · 11 comments · Fixed by #139
Labels
acknowledged bug Something isn't working enhancement New feature or request low-priority Low Priority

Comments

@buckley-w-david
Copy link

buckley-w-david commented Jul 2, 2022

  • Dataclass Wizard version: 0.22.1
  • Python version: 3.10.5
  • Operating System: Linux 5.18.7-arch1-1

Description

I am modeling some data that is self-referential, as in it may contain references to other object of its own type. In attempting to use dataclass-wizard to parse a dict of this data I received the following error

...
    return typing._eval_type(base_type, base_globals, _TYPING_LOCALS)
  File "/usr/lib/python3.10/typing.py", line 327, in _eval_type
    return t._evaluate(globalns, localns, recursive_guard)
  File "/usr/lib/python3.10/typing.py", line 693, in _evaluate
    type_ = _type_check(
  File "/usr/lib/python3.10/typing.py", line 164, in _type_check
    arg = _type_convert(arg, module=module, allow_special_forms=allow_special_forms)
  File "/usr/lib/python3.10/typing.py", line 141, in _type_convert
    if isinstance(arg, str):
RecursionError: maximum recursion depth exceeded while calling a Python object

What I Did

Here is a simple script to reproduce the issue

from typing import Optional
from dataclasses import dataclass
from dataclass_wizard import asdict, fromdict

@dataclass
class A:
    a: Optional['A'] = None

a = A(a=A(a=A(a=A())))
d = asdict(a)
print(d) # {'a': {'a': {'a': {'a': None}}}}
print(fromdict(A, d))

Any potential cyclic data dependency will trigger this issue, not just strictly self-referential ones.

from typing import Optional
from dataclasses import dataclass
from dataclass_wizard import asdict, fromdict

@dataclass
class A:
    b: Optional['B'] = None

@dataclass
class B:
    a: Optional['A'] = None

a = A(b=B(a=A(b=B(a=None))))
d = asdict(a)
print(d)  # {'b': {'a': {'b': {'a': None}}}}
print(fromdict(A, d))

I did a bit of digging in the code before submitting the issue to get an understanding of why this was happening, and I'm not sure it's an easy fix 😬

Not really expecting speedy resolution assuming you even want to support this kind of use case.

@rnag
Copy link
Owner

rnag commented Aug 6, 2022

Hi @buckley-w-david, thanks for opening this issue. I had thought about it much earlier for a while, like really thought about it, and I'm hopeful because I think the fix should be really simple. I feel like I was overcomplicating it in my head, because I've always struggled with understanding recursion in any form.

Just a stop-gap solution for now, I think once I look into the performance milestones I created, that will involve dynamically generating the (de)serialization code like for from_dict so it might be a little simpler down the road when I get around to tackling that (knock on wood).

The simple workaround I had alluded to involves a few things:

  • add a new Meta flag, recursive_dataclasses or something similar.
  • when this flag is enabled, don't immediately generate and cache the load function for a (nested) dataclass, as this will cause
    recursion if it's self-referential, either at top-level somewhere down the road as mentioned.
  • instead, we want to return a delayed load function (essentially a function to return a function), which will be generated on the first time it's called rather than immediately, and then preferably cached so that subsequent loads are faster.

Sorry if that all sounds confusing, but the good news is that I've effectively put together a hotfix for this in under half an hour (not completely there), so I can confirm that the recursion error is able to be resolved at least. I'll see if I can put together something soon that will hopefully shed a little more light on the direction I had in mind to solve the (tricky) self-referential problem.

@rnag rnag added bug Something isn't working enhancement New feature or request acknowledged labels Aug 6, 2022
@trulite
Copy link

trulite commented Aug 7, 2022

A little bump - I am facing the exact same issue. Happy to test a fix if you have one.
Thanks for an awesome library btw.

@trulite
Copy link

trulite commented Aug 7, 2022

Also a quick question. A work around for me would be to add a custom from_dict for a particular field. I ve read the docs a bit, but am unable to understand how to do this.

@dataclass
class A:
    b: Optional['B'] = None # Can I add a custom from_dict for this property?

@dataclass
class B:
    a: Optional['A'] = None

@rnag
Copy link
Owner

rnag commented Aug 15, 2022

I've checked in a WIP branch with the working changes. I was able to test with enabling a new meta field recursive_classes and confirmed that the use cases above now work as expected when de-serializing with fromdict. I also added new test cases, based on the above.

FWIW, I've also updated the first comment; probably a typo, but I think the expected dict object should be instead:

{'b': {'a': {'b': {'a': None}}}}

@rnag
Copy link
Owner

rnag commented Aug 15, 2022

@trulite I'll have to dig a bit deeper to confirm, but my initial suspicion is that it won't be possible currently to set a per-field parser to use in from_dict. This is probably a good idea for a feature request though - not saying that it would be needed in this particular case, but I imagine there are other situations where it could be quite useful.

@trulite
Copy link

trulite commented Aug 21, 2022

Thanks for now - I am leaving it as a dict and doing a post process to convert to the class I want

@alexander-belikov
Copy link

alexander-belikov commented Dec 19, 2023

Is there any update?

It looks like a basic issue : recursion should stop when there a None or an empty list is encountered.
In the example below for dataclass A the problem can be circumvented by introducing a custom classmethod from_dict(). However it does not catch on for AA that contains A and you would be getting the notorious recursionError: maximum recursion depth exceeded in __instancecheck__ problem.

from dataclasses import dataclass, field
from typing import List, Type

from dataclass_wizard import JSONWizard
from dataclass_wizard.abstractions import W
from dataclass_wizard.type_def import JSONObject


@dataclass
class A(JSONWizard):
    value: int
    children: List["A"] = field(default_factory=list)

    @classmethod
    def from_dict(cls: Type[W], o: JSONObject) -> W:
        value = o.pop("value")
        children = [A.from_dict(oo) for oo in o.pop("children", [])]
        return A(value=value, children=children)


@dataclass
class AA(JSONWizard):
    a: A


nested_dict = {
    "value": 1,
    "children": [{"value": 2, "children": []}, {"value": 3, "children": []}],
}

a_obj = A.from_dict(nested_dict)
aa_obj = AA.from_dict({"a": nested_dict})

print(a_obj)
print(aa_obj)

It would be great to have a solution soon! Do you have any updates @rnag ?

@rnag rnag added the low-priority Low Priority label Nov 4, 2024
@rnag
Copy link
Owner

rnag commented Nov 4, 2024

@alexander-belikov Yes I haven't had time to work on it, but the WIP branch seems to fix this issue. I'll need to revisit it with fresh eyes, see what I was doing, and then create a PR once I have resolved any merge conflicts of course! 😅

@dlenski
Copy link
Contributor

dlenski commented Nov 9, 2024

First of all, thanks for this amazing package! I found my way here thanks to this StackOverflow answer, and have been happily using dataclass-wizard to automate the conversion of JSON schemas to Python dataclasses… until I ran into this issue. 🥲

I checked out the old https://github.com/rnag/dataclass-wizard/tree/WIP-support-cyclic-references branch created by @rnag in 2022.

deleted an inaccurate finding

🆕 also rebased this onto the latest main in https://github.com/dlenski/dataclass-wizard/tree/WIP-support-cyclic-references-new.

@rnag
Copy link
Owner

rnag commented Nov 9, 2024

@dlenski Thanks so much! I will review the changes, and if all looks good, will publish it in the next minor release v0.27.0 along with my updates in #139, to drop support for Python versions that have reached EOL by now.

@rnag
Copy link
Owner

rnag commented Nov 10, 2024

Done! These changes are now live as of v0.27.0, which adds support for cyclic data dependencies. Also see Release Notes. 🙌 🎉

I also updated the documentation on usage, check out the section on Cyclic or “Recursive” Dataclasses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged bug Something isn't working enhancement New feature or request low-priority Low Priority
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants