Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corner case: What about built-in generics' subclasses? #9

Open
bswck opened this issue Dec 23, 2023 · 16 comments
Open

Corner case: What about built-in generics' subclasses? #9

bswck opened this issue Dec 23, 2023 · 16 comments

Comments

@bswck
Copy link
Contributor

bswck commented Dec 23, 2023

To ensure full compatibility, I think that

from typing import ForwardRef
from eval_type_backport import eval_type_backport

class DictSubclass(dict):
    pass

eval_type_backport(ForwardRef('DictSubclass[str, int]'), globals())

should not raise an error on Python 3.8, since it doesn't on Python 3.9.

Normally you would re-bind the Generic signature, using

KT = TypeVar('KT')
VT = TypeVar('VT')

class DictSubclass(dict[KT, VT], Generic[KT, VT]):
    ...

but we know that in Python 3.8 it's simply impossible without subclassing typing.Dict.

What do you think?

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

The easiest (and my personal) approach would be to assume the following:

  • If the developer simply wants to use the same type parameters, simply implement the behavior for such cases.
  • If the developer subclasses a built-in generic and changes the generic signature (for example adds a new type parameter), they need to use typing.Dict because it's a rare case and they should explicitly inform what type parameters do what.

We can make the new_generic_types object look up the MRO to find a match, like functools.singledispatch registry (https://github.com/python/cpython/blob/3.8/Lib/functools.py#L770-L794). That should be easy, we could even use the functools.singledispatch API.

@alexmojaki
Copy link
Owner

It's an interesting point that I hadn't thought about. I imagine a fix would have to replace DictSubclass (in DictSubclass[str, int]) with a dynamic class C(DictSubclass, Generic[KT, VT]). That might produce surprising behaviour in some cases, but it's probably OK. You'd need to be careful about not leaking memory by generating classes though.

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

It's an interesting point that I hadn't thought about. I imagine a fix would have to replace DictSubclass (in DictSubclass[str, int]) with a dynamic class C(DictSubclass, Generic[KT, VT]). That might produce surprising behaviour in some cases, but it's probably OK. You'd need to be careful about not leaking memory by generating classes though.

This is a working idea in most cases, but a bit hackish. If we look up the MRO of the origin, like functools.singledispatch does on simple types, there's no need to make a subclass if we assume the type parameters are the same and add an Annotated
pydantic validator that down-casts to the final type.

So, for example:

eval_type_backport(ForwardRef('DictSubclass[str, int]'), globals())

would give

Annotated[typing.Dict[str, int], AfterValidator(DictSubclass)]

@alexmojaki
Copy link
Owner

Replacing DictSubclass[K, V] with typing.Dict[K, V] loses the information about DictSubclass.

You'll have to elaborate your pydantic validator idea. But it sounds like it's adding worrying complexity.

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

You'll have to elaborate your pydantic validator idea. But it sounds like it's adding worrying complexity.

Sorry, I submitted the message before finishing it 😄

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

Replacing DictSubclass[K, V] with typing.Dict[K, V] loses the information about DictSubclass.

Hence the AfterValidator idea. Or maybe it should be BeforeValidator?

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

Alright, I think I have the best idea without Annotated and without dynamic subclasses. Give me 15 minutes.
I can only tell it dynamically changes the MRO of the target class using MRO entries of the typing type alias, typing.Dict in this case.

@alexmojaki
Copy link
Owner

What about list[DictSubclass[str, int]]? Or DictSubclass[str, DictSubclass[str, int]]?

In any case, I think it should simply return an accurate type, not some combination of pieces like this. Even within pydantic, this should work smoothly outside of just validation. And really this package shouldn't know about pydantic at all.

It'd also be acceptable to just raise an informative error in this case. The user just has to fix one or maybe a handful of class definitions, and we can tell them exactly how. The inconvenience generally won't 'scale up' the way it does with annotating each field with Dict vs dict.

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

I found a solution that works for this example, doesn't know pydantic, and scales well.

from __future__ import annotations

import typing
from typing import TYPE_CHECKING, TypeVar

from eval_type_backport.eval_type_backport import new_generic_types

if TYPE_CHECKING:
    from typing import Any

_T = TypeVar('_T')


def get_mro_entries(obj: Any, bases: tuple[type[Any], ...]) -> tuple[type[Any], ...]:
    """Return the MRO entries for an object."""
    if hasattr(obj, '__mro_entries__'):
        return obj.__mro_entries__(bases)
    if isinstance(obj, type):
        return (obj,)
    raise TypeError(f'{obj!r} is not a type')


def backport(subclass: type[_T]) -> type[_T]:
    """Backport a built-in generic type to Python 3.10+."""
    bases = subclass.__bases__
    for base in bases:
        generic_base_name = new_generic_types.get(base)
        if generic_base_name is not None:
            generic_base = getattr(typing, generic_base_name)
            # get_args() and get_origin() are too high-level for this
            subclass.__origin__ = generic_base.__origin__
            subclass.__parameters__ = generic_base.__parameters__
            self_index = bases.index(base)
            new_bases = get_mro_entries(
                generic_base,
                (*bases[:self_index], generic_base, *bases[self_index + 1 :]),
            )
            subclass.__bases__ = new_bases
            break
    return subclass


@backport
class DictSubclass(dict):
    pass


print(DictSubclass.__bases__)  # (<class 'dict'>, <class 'typing.Generic'>)
print(DictSubclass[int, str])  # __main__.DictSubclass[int, str]

Tested on Python 3.8.
The only requirement is to use the @backport decorator. But we can do it implicitly—would it break the user end?

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

In order to support built-in generics outside typing, new_generic_types would have to be changed to store references to classes, not names.

@alexmojaki
Copy link
Owner

It feels dangerous to dynamically modify __bases__ (and more) on a class without the user's knowledge. And it doesn't seem worth it to tell the user to add @backport instead of telling them to add typing.Dict[K, V] to the bases.

This whole library is already questionable on several levels, let's not get carried away on this corner case. I'd rather just produce a helpful error message than try to magically solve every possible problem for the user.

@alexmojaki
Copy link
Owner

Also this program:

class D(dict):
    pass

d: D[str, int] = D()

produces this error on the latest pyright, regardless of Python version: Expected no type arguments for class "D". So there's already pressure to just use typing.Dict to make type checkers happy.

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

I fully agree. Monkey-patching the class isn't good a direction.

Let's discuss how the traceback should look like.

This is my suggestion:

  1. If we hit a match from the user's class MRO in new_generic_types, try to find where the annotation lives and where the generic subclass is defined and the line offset of the declaration of bases.
  2. Produce something more or less like this, if the terminal size allows:
    Traceback (most recent call last):
      ...  # some lines from pydantic & eval_type_backport
      File "user_module.py", line 500, in <Model.ann type hint>
        ann: DictSubclass[str, int]
      File "user_module.py", line 49, in <module>
        class DictSubclass(dict):
                           ^
                           Use typing.Dict here to use type parametrization!
    TypeError: cannot parametrize dict subclass, use typing.Dict as a base
    
    Otherwise just:
    Traceback (most recent call last):
      ...  # some lines from pydantic & eval_type_backport
      File "user_module.py", line 500, in <Model.ann type hint>
        ann: DictSubclass[str, int]
      File "user_module.py", line 49, in <module>
        class DictSubclass(dict):
    TypeError: cannot parametrize dict subclass, use typing.Dict as a base
    

Well, unless the other one isn't clear enough. That would be just a little less work.
But I am strongly opting for altering the traceback to display the class declaration line and the type hint line that triggered it.

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

I am wondering what would happen if the user uses a subclass that comes from somewhere else, for example from some other library... that quickly gets pretty complicated. But this is like a corner case of a corner case... probably?

@bswck
Copy link
Contributor Author

bswck commented Dec 23, 2023

It would be nice to write a troubleshooting document of eval_type_backport and leave a URL to it under the traceback. Adding it to the #5 checklist.

@alexmojaki
Copy link
Owner

Start with a PR that just does the bare minimum, raising an exception with an appropriate message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants