-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFE: @dataclasses.dataclass(slots=True) doesn't support methods using zero-arg super() #90562
Comments
We've received a report over at the attrs issue tracker about our test suite failing on Python 3.11. Here's the link: python-attrs/attrs#907 It turns out to be an issue with the no-arg
This is a known issue for which we have implemented workarounds. The workarounds aren't effective for 3.11 though. I have implemented a fix in attrs (python-attrs/attrs#910), but I still thought I'd post this here to maybe get the core devs opinion. Dataclasses exhibit the exact same issue when used with A comprehensive description of the issue follows: since it's impossible to add slotness (i.e. set So these closure cells need to be rewritten when the class is replaced. In Python versions prior to 3.11, the closure cells were immutable so extra effort was needed to rewrite them. The functions are here: https://github.com/python-attrs/attrs/blob/9727008fd1e40bc55cdc6aee71e0f61553f33127/src/attr/_compat.py#L145. In 3.11, our old closure cell rewriting doesn't work any more, but closure cells don't appear to be immutable either, so the fix in my attr PR linked above is simple. Still, it's another branch in the code to support a specific version. I don't know if there's anything actionable here for Python, apart from confirming or denying if this behavior is expected. |
bisected to here: 631f993 is the first new commit
|
I guess at least there should be a warning about this in dataclasses docs? import dataclasses
@dataclasses.dataclass(slots=True)
class A:
pass
@dataclasses.dataclass(slots=True)
class B(A):
def test(self):
super() B().test() |
In my opinion, we should keep it simple for attrs and dataclasses to fix closure cells when a class is replaced and therefore it seems to be correct to have it mutable as it currently is in 3.11. My plan is to implement the fix for dataclasses and some tests for these use cases so the behavior should not change in the future. For attrs it means one more branch in the code now but much simpler code when the support for older releases gets dropped. Any other opinions? |
@frenzy: I'm not sure what your fix would do. You could either describe it in rough terms (if you'd like a pre-PR opinion on the approach), or I'm happy to wait to see your PR. |
We have the same problem reported in attrs here in dataclasses and because it's not tested the way to manipulate __closure__ cells changes frequently. My plan is to implement something similar to this into dataclasses: https://github.com/python-attrs/attrs/blob/5c040f30e3e4b3c9c0f27c8ac6ff13d604c1818c/src/attr/_make.py#L895-L916 Basically, when a new dataclass is created (with slots=True), look for references to the original class and fix them. This fixes the problem reported to attrs in dataclasses and when we fix it and add some tests for it, the future behavior should be more stable. What do you think? |
I changed the issue title to focus this issue on enhance dataclasses to support @dataclasses.dataclass(slots=True) on methods using closures: it would be a new Python 3.11 feature. I created https://bugs.python.org/issue47143 "Add functools.copy_class() which updates closures". |
See also bpo-45520: "Frozen dataclass deep copy doesn't work with __slots__" which is related but a different issue. |
I had a go at implementing the fix @frenzymadness mentioned, @ericvsmith @vstinner if you're able to take a look: #104038 Apologies if you were still planning on implementing this @frenzymadness, as you may have been still waiting for a go-ahead. Since this had stalled for a bit I figured a PR was the easiest way to try and push this forwards. |
I updated the issue title to clarify that the problem here is specifically about the One observation about the proposed fix is that (because it modifies the contents of the cell in place) it "breaks" the original class, in case anyone had a reference to it, e.g. something like this:
With the fix proposed above, Perhaps this is acceptable collateral damage, breaking a rare case in order to fix the more common case? But it could break currently-working code. (Edited to add: I think it's quite unlikely in practice that anyone would have an example like this in current code, where they expect Even putting a new cell into the function's closure (rather than updating the cell content in place) wouldn't fix this, since the closure (and the function object itself) are also shared between the original and the new class. The only way to avoid it would be to make a new function object, using the same code object as the original one. But this is quite tricky given that the function object could be wrapped by arbitrary decorators etc. So perhaps we just have to accept that in order to make the new slotified class fully functional, we have to break the old class. If so, it still seems unfortunate that we have to go scrabbling through all the methods, hoping that at least one of them is not wrapped in a decorator we don't know how to introspect, so we can update the single cell that we actually need to update, just because Python hides it from us. I wonder if there's any reason the class cell shouldn't be left exposed as All in all, this issue mostly makes me think that slots support via class-copying was a mistake. FWIW, there would be another option, also ugly in its own way: you can actually add slots to a class post-creation (in C code), but it's only safe to do so if you know that there aren't any instances of the class yet. This is proxy-detectable via the reference count, since every instance holds a reference to its class. So (via a C helper) we could add slots to the existing class, and error out if you try to make a slotified dataclass out of a class you've already somehow created instances of. This limitation seems arguably better than the more complex limitations arising from this issue. |
At the core sprint last year we discussed the "add slots if no instances have been created" proposal. I think it would be the preferred way to go, but I'm concerned about the impact on other python implementations. |
I think the more general concept of "mark a class when its first instance is created" is reasonable to expect other implementations to be able to provide in some way? Even in CPython, if we found the refcount-based approach to be too subtle, we could pretty easily do a more explicit marker via type flag (though we may not prefer to spend a type flag on this, or do one extra check in instance creation.) |
Since dataclasses normally work by modifying the class, and the new class generation is an (albeit leaked and documented) implementation detail, I think it's expected that using the old class isn't really supported. Either way, it's no more broken than the actual dataclass is currently, which I think is a fair trade in brokenness!
Yeah, it's a pretty ugly solution. I think it's an improvement on the status quo, given the cases where it doesn't work should be fairly uncommon, and avoidable, but it's certainly not perfect so better options are definitely worth considering.
I'd also wondered this, although thought it would be a bit of a shame to have to add a new attribute just for this. Maybe it could be useful for other things though.
The no-instances constraint doesn't sound like an issue. I can't comment on impact on other implementations, but this does sound like the neatest solution from a user perspective. |
Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Jelle Zijlstra <[email protected]> Co-authored-by: Erlend E. Aasland <[email protected]> Co-authored-by: Hugo van Kemenade <[email protected]>
) (cherry picked from commit e3ed574) Co-authored-by: Josh Cannon <[email protected]> Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Jelle Zijlstra <[email protected]> Co-authored-by: Erlend E. Aasland <[email protected]> Co-authored-by: Hugo van Kemenade <[email protected]>
) (cherry picked from commit e3ed574) Co-authored-by: Josh Cannon <[email protected]> Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Jelle Zijlstra <[email protected]> Co-authored-by: Erlend E. Aasland <[email protected]> Co-authored-by: Hugo van Kemenade <[email protected]>
…119350) Co-authored-by: Josh Cannon <[email protected]> Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Jelle Zijlstra <[email protected]> Co-authored-by: Erlend E. Aasland <[email protected]> Co-authored-by: Hugo van Kemenade <[email protected]>
…119351) Co-authored-by: Josh Cannon <[email protected]> Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Jelle Zijlstra <[email protected]> Co-authored-by: Erlend E. Aasland <[email protected]> Co-authored-by: Hugo van Kemenade <[email protected]>
I have a code fix that I think addresses this. Not sure I can get it into 3.13, but I'll try. |
Thanks @ericvsmith. Can you let us know if you have started working on this, and which approach you plan on taking or have taken? @carljm has helpfully reasoned through several options. Among them, adding slots to a class post-creation sounded awesome, until I realized that it's technically impossible to do because memory for member descriptors must be allocated at the time of type creation with the type itself as one block of memory to allow offset-based access: Line 3870 in 15d48ae
I therefore think exposing |
I have a mostly working version of a fix. I'll try to find it and post it here. |
Co-authored-by: Adam Turner <[email protected]> Co-authored-by: Jelle Zijlstra <[email protected]> Co-authored-by: Erlend E. Aasland <[email protected]> Co-authored-by: Hugo van Kemenade <[email protected]>
An issue I think is related: dataclasses cannot be generic (pre- or post-695) when using both
If the dataclass is only frozen or only slotted, no issue:
Would this case be covered by your fix @ericvsmith? 🙏 |
I'm not sure, but I'll check. I'm sprinting on it this week and will check this use case. |
@loic-simon : I can't duplicate this in an unmodified 3.12.3: Python 3.12.3 (main, Apr 9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from dataclasses import dataclass
>>> @dataclass(frozen=True, slots=True)
... class C[T]:
... a:T
...
>>> C[str]
__main__.C[str]
>>> C[str]("foo")
C(a='foo') |
Oh, indeed! I just checked in 3.12.2 and 3.12.3, this has apparently been fixed in 3.12.3, possibly by GH-115165? Sorry for the noise anyway! |
No problem! I'll remove my test case for it. |
And to actually answer this question: I'm basically going with the approach in #104038 of fixing up the closure cells. I'm still working through tests and corner cases, and looking at @carljm's comments in #90562 (comment). I talked to @markshannon at this year's PyCon, and he dissuaded me from the "can modify |
gh-124455) Co-authored-by: @wookie184 Co-authored-by: Carl Meyer <[email protected]>
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
Linked PRs
super()
without args calls fordataclasses
with slots #111538super()
in dataclasses whenslots=True
#124692The text was updated successfully, but these errors were encountered: