Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: Support __new__ signatures for classses #320

Open
Viicos opened this issue Aug 30, 2024 · 14 comments
Open

feature: Support __new__ signatures for classses #320

Viicos opened this issue Aug 30, 2024 · 14 comments
Assignees
Labels
feature New feature or request fund Issue priority can be boosted

Comments

@Viicos
Copy link

Viicos commented Aug 30, 2024

Is your feature request related to a problem? Please describe.

The Class.parameters property uses __init__ by default:

griffe/src/_griffe/models.py

Lines 1605 to 1615 in 58eb9f4

def parameters(self) -> Parameters:
"""The parameters of this class' `__init__` method, if any.
This property fetches inherited members,
and therefore is part of the consumer API:
do not use when producing Griffe trees!
"""
try:
return self.all_members["__init__"].parameters # type: ignore[union-attr]
except KeyError:
return Parameters()

It would be nice if __new__ could be supported as well. Behavior when both are defined should probably be considered. The typing specification regarding constructors might help in defining a strategy.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Boost priority

  • Boost priority in our backlog through Polar.sh. Higher pledge, higher priority.
  • Minimum pledge by user/organization is $5, minimum amount for boost is $30.
  • View all issues with pledges.
  • We receive the funds once the issue is completed and confirmed by you.
  • Features with the insiders label are released to sponsors first, and tied to a funding goal.
Fund with Polar
@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

Thanks for the request. Could you give a concrete example where this would be needed/useful?

@Viicos
Copy link
Author

Viicos commented Aug 30, 2024

For instance: https://github.com/pydantic/pydantic-core/pull/1420/files#diff-7e296e0875fa785fb0d0f6904dc84ce9145cdd30e12b96ef6cb37d48752135e1R77.

But more generally, people can define __new__ methods without providing any __init__. Pushing things a bit more, the metaclass' __call__ is the one used first and could be supported as well. The typing specification relies on the runtime behavior, so that's why I linked it as a reference.

The logic to determine the correct signature can get overly complicated. Maybe what could be done for now (and to avoid introducing breaking changes):

If no __init__ is defined (i.e. __init__ is object.__init__ iirc), fallback to a __new__.

This does not reflect the runtime behavior but is probably the past of least resistance here. Wdyt? Happy to push a PR.

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

Thanks 🙂

Sounds like this can quickly become super complicated indeed. How does runtime behave with class inheritance and metaclasses 😵?

From the docs you linked ( 🙏 ):

At runtime, a call to a class’ constructor typically results in the invocation of three methods in the following order:

  1. The __call__ method of the metaclass (which is typically supplied by the type class but can be overridden by a custom metaclass and which is responsible for calling the next two methods)
  2. The __new__ static method of the class
  3. The __init__ instance method of the class

Do I understand correctly that when doing MyClass(...), first the closest metaclass __call__ method (according to MRO) is called, then the closest __new__ method is called, then the closest __init__ method is called?

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

What about the following?

class A:
    def __init__(self, ...): ...

class B(A):
    def __new__(cls, ...): ...

class C(B): ...

Should we use the parameters from B.__new__ or from A.__init__? 🤔

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

>>> class A:
...     def __init__(self, a, b): ...
... 
>>> class B(A):
...     def __new__(cls, c, d): ...
... 
>>> b = B()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: B.__new__() missing 2 required positional arguments: 'c' and 'd'

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

Looks like __new__ always takes precedence if defined:

>>> class A:
...     def __new__(cls, a, b): ...
... 
>>> class B(A):
...     def __init__(self, c, d): ...
... 
>>> b = B()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: A.__new__() missing 2 required positional arguments: 'a' and 'b'

Fortunately that seems to make the change easy: simply try to get params from all_members["__new__"] first, then fallback to __init__, then return empty params.

Metaclasses can wait for now 🙈

@Viicos
Copy link
Author

Viicos commented Aug 30, 2024

Note that default C implementations of __call__, __new__ and __init__ exist at runtime. The typing specification describes in which cases they should be skipped. That's why I also mentioned checking for __init__ is object.__init__.

Fortunately that seems to make the change easy: simply try to get params from all_members["__new__"] first, then fallback to __init__, then return empty params.

Looks good, however, I'm a bit afraid that this may break existing setups. It is common for classes to implement a "catch-all" __new__(*args, **kwargs) to do some special logic, while still having the actual arguments described and typed in __init__. That's why I suggested having a non breaking approach first.


Metaclasses can wait for now

I don't expect anyone to request support for it anyway. Mypy doesn't even support it and it is very uncommon to implement type.__call__.

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

Note that default C implementations of call, new and init exist at runtime. The typing specification describes in which cases they should be skipped. That's why I also mentioned checking for init is object.init.

The thing is, our logic happens after visiting AST or inspecting runtime objects. At that point we don't see any of the default C implementations. We only see what was declared in the loaded modules.

It is common for classes to implement a "catch-all" new(*args, **kwargs) to do some special logic, while still having the actual arguments described and typed in init.

I'm not sure to understand. If you define both __new__ and __init__, only __new__ is called, so why describing parameters in __init__, which isn't called?

>>> class A:
...     def __new__(cls, *args, **kwargs):
...         print("in new")
...         print(args, kwargs)
...     def __init__(self, *args, **kwargs):
...         print("in init")
...         print(args, kwargs)
... 
>>> a = A()
in new
() {}

EDIT: ah, unless we're supposed to explicitly call __init__ from __new__ (as you can see I'm not at all comfortable with __new__ 😅)

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

Ah, nope:

If new() is invoked during object construction and it returns an instance of cls, then the new instance’s init() method will be invoked like init(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to the object constructor.

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

So, yeah, it seems to make sense to use the __init__ signature when it is defined, and the __new__ one otherwise. But that now complicates the logic a bit:

  • if __init__ is declared in this class, use it
  • otherwise if __new__ is declared in this class, use it
  • otherwise if __init__ is declared in a parent class, use it
  • otherwise if __new__ is declared in a parent class, use it

Not even sure about the last two points, as maybe we should loop on the first two while following the MRO...

All this to support the case where devs don't want to duplicate correct signature from __init__ into __new__ 🤔? But yeah, backward compatibility 😅

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

What about the following:

class A:
    def __init__(self, param1: str): ...

class B(A):
    def __new__(cls, *args, **kwargs):
        # some logic
        return super().__new__(cls)

Surely users would like the parameters data to be fetched from A.__init__ and not B.__new__?

Sounds like a case where we won't be able to satisfy everyone. I'm wondering if this should live in an extension, but I'm not sure the extension system will support it in its current state.

@Viicos
Copy link
Author

Viicos commented Aug 30, 2024

I'm not sure to understand. If you define both __new__ and __init__, only __new__ is called, so why describing parameters in __init__, which isn't called?

Usually you call super().__new__ in your user defined __new__ (which will probably be the default C implementation. I can't recall if objec.__new__ is the one responsible for calling __init__, or if this is done by type.__call__ instead).

Sounds like a case where we won't be able to satisfy everyone.

We can check how Sphinx does it. I'm not entirely sure about the current logic, but PR can be found here: sphinx-doc/sphinx#7384. It supports metaclass' __call__, but of course can be skipped.

@pawamoy
Copy link
Member

pawamoy commented Aug 30, 2024

Awesome, thanks for the link!

Looks like they give precedence to __new__, like runtime behavior:

        # Now we check if the 'obj' class has a '__new__' method
        new = get_user_defined_function_or_method(self.object, '__new__')
        if new is not None:
            self.env.app.emit('autodoc-before-process-signature', new, True)
            try:
                return inspect.signature(new, bound_method=True)
            except ValueError:
                pass

        # Finally, we should have at least __init__ implemented
        init = get_user_defined_function_or_method(self.object, '__init__')
        if init is not None:
            ...

If neither __new__ or __init__ are "user-defined", they rely on inspect.getsignature (sphinx-autodoc almost only uses dynamic analysis IIRC).

In Griffe we just need to decide whether we give precedence to __new__ or __init__. I'd be inclined to give precedence to __new__ (like sphinx-autodoc), because it matches runtime behavior, makes for an easy update, and still lets projects that declare __new__(cls, *args, **kwargs) fix their signature (or, well, forces them to do so 😄). Also, it would match the concrete use-case that is described here (the classes/stubs in Pydantic linked above).

I'd be interested to see how common __new__(cls, *args, **kwargs) is. Do you have examples of popular projects that do that?

@Viicos
Copy link
Author

Viicos commented Aug 31, 2024

I'd be interested to see how common __new__(cls, *args, **kwargs) is. Do you have examples of popular projects that do that?

Couldn't find anything relevant in a Github search. But maybe this isn't too much of a big deal, seems like Sphinx introduced the functionality as a non breaking change and no one complained about it :)

@pawamoy pawamoy added the fund Issue priority can be boosted label Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request fund Issue priority can be boosted
Projects
None yet
Development

No branches or pull requests

2 participants