Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update after Erik's feedback #1

Merged
merged 21 commits into from
Aug 4, 2023
Merged
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
183 changes: 116 additions & 67 deletions pep-0722.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
PEP: 722
Title: Stricter Type Guards
Author: Rich Chiodo <rchiodo at microsoft.com>, Eric Traut <eric at traut.com>
Author: Rich Chiodo <rchiodo at microsoft.com>, Eric Traut <erictr at microsoft.com>, Erik De Bonte <erikd at microsoft.com>
rchiodo marked this conversation as resolved.
Show resolved Hide resolved
rchiodo marked this conversation as resolved.
Show resolved Hide resolved
Sponsor: <real name of sponsor>
PEP-Delegate: <PEP delegate's real name>
Discussions-To: https://github.com/python/typing/discussions/1013
Expand All @@ -17,87 +17,136 @@ Resolution:
Abstract
========

This PEP further refines `TypeGuards <typeguards_>`__ to
indicate when negative type narrowing is deemed safe.

[I'd suggest mentioning PEP 647 explicitly here rather than having the opaque link. You can link to a PEP in RST using :pep:`647` ]
[I think more context is needed here for readers to understand what "negative" means. Maybe one sentence explaining what typeguards currently do and then another about the negative issue.]
:pep:`647` created a special return type annotation ``TypeGuard`` that allowed
type checkers to narrow types.
rchiodo marked this conversation as resolved.
Show resolved Hide resolved

This PEP further refines :pep:`647` by allowing type checkers to narrow types
even further when a ``TypeGuard`` function returns false.
rchiodo marked this conversation as resolved.
Show resolved Hide resolved

Motivation
==========

`TypeGuards <typeguards_>`__ are used throughout python
libraries but cannot be used to determine the negative case:
`TypeGuards <typeguards_>`__ are used throughout Python libraries to allow a
type checker to narrow what the type of something is when the ``TypeGuard``
rchiodo marked this conversation as resolved.
Show resolved Hide resolved
returns true.

.. code-block:: python

def is_str(val: str | int) -> TypeGuard[str]:
return isinstance(val, str)

def func(val: str | int):
if is_str(val):
# Type checkers can assume val is a 'str' in this branch

[Again, more context is needed for the user to understand what "the negative case" means.]
[Also what does "determine the negative case" mean? Maybe something like "narrow the type in the negative case" would be more clear? Also see the use of that phrase below the code block.]
[python should be capitalized]
However, in the ``else`` clause, :pep:`647` didn't prescribe what the type might
be:

::
[I'm wondering if `::` is equivalent to `.. code-block:: python` -- You may need the latter to get proper colorization. Check after you build your RST to HTML.]
.. code-block:: python

def is_str(val: str | int) -> TypeGuard[str]:
return isinstance(val, str)

def func(val: str | int):
if is_str(val):
reveal_type(val) # str
# Type checkers can assume val is a 'str' in this branch
else:
reveal_type(val) # str | int
# Type here is not narrowed. It is still 'str | int'


This inability to determine the negative case makes ``TypeGuard`` not as useful as
it could be.
This PEP proposes that when the type argument of the ``TypeGuard`` is a subtype
of the type of the first input argument, then the false case can be further
rchiodo marked this conversation as resolved.
Show resolved Hide resolved
rchiodo marked this conversation as resolved.
Show resolved Hide resolved
narrowed.

This PEP proposes that in cases where the output type is a *strict* subtype of
the input type, the negative case can be computed. This changes the example so
that the ``int`` case is possible:
["output type" -- might need to define this term or use something else. I don't see that term used in PEP 647.]
["This changes the example" -- maybe rephrase this to clarify that the code of the example is unchanged, but type checkers can interpret it differently?]
["is possible" seems pretty vague]
[What does strict subtype mean? And why is it italicized?]
This changes the example above like so:

::
.. code-block:: python

def is_str(val: str | int) -> TypeGuard[str]:
return isinstance(val, str)

def func(val: str | int):
if is_str(val):
reveal_type(val) # str
# Type checkers can assume val is a 'str' in this branch
else:
reveal_type(val) # int
# Type checkers can assume val is an 'int' in this branch
debonte marked this conversation as resolved.
Show resolved Hide resolved

Since the output type is a *strict* subtype of the
input, a type checker can determine that the only possible type in the ``else`` is the
other input type(s).
["the other input type(s)" -- There's only one input type. It's a Union. Suggest rephrasing this. I'm not sure if talking about the types using set theory (input -- output) would make this more clear (or more generic) or worse.]
Since the ``TypeGuard`` type (or output type) is a subtype of the input argument
type, a type checker can determine that the only possible type in the ``else``
is the other type in the Union. In this example, it is safe to assume that if
rchiodo marked this conversation as resolved.
Show resolved Hide resolved
``is_str`` returns false, then type of the ``val`` argument is an ``int``.

If the output type is not a *strict* subtype of the input type,
the negative cannot be assumed to be the intuitive opposite:
["intuitive opposite" -- opposite is the incorrect term here and I think intuition doesn't belong in a PEP :)]
Unsafe Narrowing
rchiodo marked this conversation as resolved.
Show resolved Hide resolved
--------------------

::
There are cases where this further type narrowing is not possible. Here's an
example:

.. code-block:: python

def is_str_list(val: list[int | str]) -> TypeGuard[list[str]]
return all(isinstance(x, str) for x in val)

def func(val: list[int | str]):
if is_str_list(val):
reveal_type(val) # list[str]
# Type checker assumes list[str] here
else:
# Type checker cannot assume list[int] here

Since ``list`` is invariant, it doesn't have any subtypes. This means type
checkers cannot narrow the type to ``list[int]`` in the false case.
``list[str]`` is not a subtype of ``list[str | int]``.
rchiodo marked this conversation as resolved.
Show resolved Hide resolved

Type checkers should not assume any narrowing in the false case when the
``TypeGuard`` type is not a subtype of the input argument type.
rchiodo marked this conversation as resolved.
Show resolved Hide resolved

However, narrowing in the true case is still possible. In the example above, the
type checker can assume the list is a ``list[str]`` if the ``TypeGuard``
function returns true.

Creating invalid narrowing
--------------------------

The new ``else`` case for a ``TypeGuard`` can be setup incorrectly. Here's an example:

.. code-block:: python

def is_positive_int(val: int | str) -> TypeGuard[int]:
return isinstance(val, int) and val > 0

def func(val: int | str):
if is_positive_int(val):
# Type checker assumes int here
else:
# Type checker assumes str here

A type checker will assume for the else case that the value is ``str``. This
is a change in behavior from :pep:`647` but as that pep stated `here <https://peps.python.org/pep-0647/#enforcing-strict-narrowing>`__
there are many ways a determined or uninformed developer can subvert type safety.

A better way to handle this example would be something like so:

.. code-block:: python

PosInt = NewType('PosInt', int)

def is_positive_int(val: PosInt | int | str) -> TypeGuard[PosInt]:
return isinstance(val, int) and val > 0

def func(val: int | str):
if is_positive_int(val):
# Type checker assumes PosInt here
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type checker assumes PosInt here

This is true even though val is int | str (i.e. it doesn't include PosInt)?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's true. This works because PosInt is an int.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to try it yourself in the latest pylance.

Create a pyrightconfig.json with this in it:

{
    "enableExperimentalFeatures" : true
}

Then the TypeGuard will behave as I described here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that that's how Pyright works. I guess I'm wondering:

a) Is this correct?

b) What would be the behavior if PosInt was unrelated to any of the types in the val union? Consider the following example. Within the if block, is val considered to be datetime because is_datetime returned true? Or is it Never because the intersection of datetime and int | str is empty? Oh, I guess this might be the multiple inheritance thing again. Maybe the val passed into func is derived from both int and datetime?

def is_datetime(val: datetime | int | str) -> TypeGuard[datetime]:
    return isinstance(val, datetime)

def func(val: int | str):
    if is_datetime(val):
        # Is val a datetime here or Never?

c) Should the PEP call more attention to these issues?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the TypeGuard pep says that if the TypeGuard returns something nonsensical, that's the user's fault. It will just happily presume that's the actual type if the TypeGuard returns true.

This is exactly what Pyright does at the moment.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else:
reveal_type(val) # list[str | int]
# Type checker assumes str | int here


Since ``list`` is invariant, it doesn't have any subtypes, so type checkers
can't narrow the type in the negative case.

Specification
=============

This PEP requires no new changes to the language. It is merely modifying the
definition of ``TypeGuard`` for type checkers. The runtime should already be
behaving in this way.
["should" -- "The runtime" sounds singular, so if you mean CPython alone, I'd remove "should". If you mean that all Python runtimes should be behaving this way, I'd clarify that.]
definition of ``TypeGuard`` for type checkers. Runtimes are already behaving
in this way.

Existing ``TypeGuard`` usage may change though, as described below.

Expand All @@ -108,49 +157,50 @@ Backwards Compatibility
For preexisting code this should require no changes, but should simplify this
use case here:
rchiodo marked this conversation as resolved.
Show resolved Hide resolved

::
.. code-block:: python

A = TypeVar("A")
B = TypeVar("B")
class A():
pass
class B():
pass

def is_A(x: A | B) -> TypeGuard[A]:
raise NotImplementedError
return is_instance(x, A)


def after_is_A(x: A | B) -> TypeGuard[B]:
return True
def is_B(x: A | B) -> TypeGuard[B]:
return is_instance(x, B)


def test(x: A | B):
if is_A(x):
reveal_type(x)
# Do stuff assuming x is an 'A'
return
assert after_is_A(x)
assert is_B(x)

reveal_type(x)
# Do stuff assuming x is a 'B'
return

["after_is_A" is confusing me -- is there a better name? "is_not_A"?]
[Can/should you use PEP 695 syntax for the TypeVars?]

becomes this instead
["becomes this instead" is not a grammatically correct continuation of the sentence before the first code block. Maybe rephrase the sentence to "Preexisting code should require no changes, but code like this...can be simplified to this:"]
[Add comments in these code blocks showing the expected inferred type as you did above? I think then you won't need the reveal_type calls?]
This use case becomes this instead:
rchiodo marked this conversation as resolved.
Show resolved Hide resolved

::
.. code-block:: python

A = TypeVar("A")
B = TypeVar("B")
class A():
pass
class B():
pass

def is_A(x: A | B) -> TypeGuard[A]:
return isinstance(x, A)
return is_instance(x, A)


def test(x: A | B):
if is_A(x):
reveal_type(x)
# Do stuff assuming x is an 'A'
return
reveal_type(x)

# Do stuff assuming x is a 'B'
return


Expand All @@ -164,8 +214,7 @@ first place. Meaning this change should make ``TypeGuard`` easier to teach.
Reference Implementation
========================

A reference implementation of this idea exists in Pyright.
[Would there be value in pointing the reader to the implementation?]
A reference `implementation <https://github.com/microsoft/pyright/commit/9a5af798d726bd0612cebee7223676c39cf0b9b0>`__ of this idea exists in Pyright.


Rejected Ideas
Expand All @@ -179,8 +228,8 @@ would validate that the output type was a subtype of the input type.
See this comment: `StrictTypeGuard proposal <https://github.com/python/typing/discussions/1013#discussioncomment-1966238>`__

This was rejected because for most cases it's not necessary. Most people assume
the negative case for ``TypeGuard`` anyway, so why not just change the specification
to match their assumptions?
the negative case for ``TypeGuard`` anyway, so why not just change the
specification to match their assumptions?

Footnotes
=========
Expand All @@ -190,4 +239,4 @@ Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.
license, whichever is more permissive.