Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-90562: Improve zero argument support for super() in dataclasses when slots=True #124692

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

Bobronium
Copy link

@Bobronium Bobronium commented Sep 27, 2024

Added two tests for two cases that weren't covered in original PR:

  • Using custom descriptors
  • Wrapping function before making it a property

Copy link

cpython-cla-bot bot commented Sep 27, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

@sobolevn sobolevn requested a review from carljm September 27, 2024 18:10
@Bobronium Bobronium force-pushed the issue-90562-dataclasses-zero-argument-super-fix branch from 2e0ca01 to 3b65acf Compare September 27, 2024 18:10
return True
return False


def _find_inner_functions(obj, _seen=None, _depth=0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the leading underscores on _seen and _depth?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should indicate that these arguments are only used within the function (in recursive calls) and not outside of the function body.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the whole function is private, so it can only be used inside CPython by us :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you've got a point. Removed the underscores.

_seen.add(id(obj))

_depth += 1
if _depth > 2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are depth 1 and 2 special? What is this test (and _depth in general) trying to accomplish? Please add comments explaining what's going on here, it's not at all clear from the code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an explanation and few test cases for this behaviour. Caught two bugs in the process \o/

):
# we don't want to inspect custom descriptors just yet
# there's still a chance we'll encounter a pure function
# or a property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment explaining why encountering a pure function or property means we don't have to inspect custom descriptors.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, there's already an explanation above:

cpython/Lib/dataclasses.py

Lines 1336 to 1339 in 3b65acf

# original class. We can break out of this loop as soon as we
# make an update, since all closures for a class will share a
# given cell. First we try to find a pure function/properties,
# and then fallback to inspecting custom descriptors.

Would you like me to add clarification for this case specifically?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right. Maybe change the comment to "... fallback to inspecting custom descriptors if no pure function or property is found".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I've added clarification in both places to be extra clear.

Lib/dataclasses.py Outdated Show resolved Hide resolved
Copy link
Member

@sobolevn sobolevn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find, thanks a lot for your work! 👍

return True
return False


def _find_inner_functions(obj, _seen=None, _depth=0):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the whole function is private, so it can only be used inside CPython by us :)

Lib/dataclasses.py Outdated Show resolved Hide resolved
return None

for attr in dir(obj):
value = getattr(obj, attr, None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is always complex, since getattr can evaluate properties and other objects. There are two ways of hanling this case:

  1. Allow all exceptions to raise
  2. Hide all exceptions from user here

I am not quite sure which one is better here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing this to my attention!

After some hacking, I've decided to go with the third option: removing any possibility of user defined code execution.

Lib/test/test_dataclasses/__init__.py Outdated Show resolved Hide resolved
Copy link
Member

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks OK to me; I'll let @ericvsmith give it final approval/merge.

I don't love that we've opened the door to an infinite fractal of complexity here; at some point it would probably be better to bite the bullet and introduce a __classcell__ attribute on classes that allows reliably finding this cell in O(1) time.

Lib/dataclasses.py Show resolved Hide resolved
@ericvsmith
Copy link
Member

I agree that the complexity is unfortunate. __classcell__ would be a good change. In the meantime, could inspect.getattr_static or inspect.getmembers_static be used instead of _safe_get_attributes?

This uses builtins.dir, which will trigger custom __getattibute__ if any, and will trigger __get__ on __dict__ descriptor.
@Bobronium
Copy link
Author

In the meantime, could inspect.getattr_static or inspect.getmembers_static be used instead of _safe_get_attributes?

Good idea. I've changed the implementation to use inspect.getmembers_static() instead.

@Bobronium
Copy link
Author

@carljm, friendly ping :)

@carljm
Copy link
Member

carljm commented Nov 4, 2024

As I mentioned above, I would rather have @ericvsmith approve/merge this, since it was his review comments you addressed.

@ericvsmith
Copy link
Member

I've been busy, but it's on my list of things to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants