Merge pull request #1 from rchiodo/rchiodo/pep-722-strictertypeguard
Update after Erik's feedback
rchiodo authored Aug 4, 2023
2 parents 9da9233 + a2e0b22 commit 66bdf14
Showing 1 changed file with 122 additions and 73 deletions.
195 changes: 122 additions & 73 deletions pep-0722.rst
@@ -1,6 +1,6 @@
PEP: 722
Title: Stricter Type Guards
Author: Rich Chiodo <rchiodo at microsoft.com>, Eric Traut <eric at traut.com>
Author: Rich Chiodo <rchiodo at microsoft.com>, Eric Traut <erictr at microsoft.com>, Erik De Bonte <erikd at microsoft.com>
Sponsor: <real name of sponsor>
PEP-Delegate: <PEP delegate's real name>
Discussions-To: https://github.com/python/typing/discussions/1013
@@ -17,140 +17,190 @@ Resolution:
Abstract
========

This PEP further refines `TypeGuards <typeguards_>`__ to
indicate when negative type narrowing is deemed safe.

[I'd suggest mentioning PEP 647 explicitly here rather than having the opaque link. You can link to a PEP in RST using :pep:`647` ]
[I think more context is needed here for readers to understand what "negative" means. Maybe one sentence explaining what typeguards currently do and then another about the negative issue.]
:pep:`647` introduced the concept of ``TypeGuard`` functions which return true
if their input parameter matches their target type. For example, a function that
returns ``TypeGuard[str]`` is assumed to return ``True`` if and only if its
input parameter is a ``str``. This allows type checkers to narrow types in this
positive case.

This PEP further refines :pep:`647` by allowing type checkers to also narrow types
when a ``TypeGuard`` function returns false.

Motivation
==========

`TypeGuards <typeguards_>`__ are used throughout python
libraries but cannot be used to determine the negative case:
`TypeGuards <typeguards_>`__ are used throughout Python libraries to allow a
type checker to narrow the type of something when the ``TypeGuard``
returns true.

[Again, more context is needed for the user to understand what "the negative case" means.]
[Also what does "determine the negative case" mean? Maybe something like "narrow the type in the negative case" would be more clear? Also see the use of that phrase below the code block.]
[python should be capitalized]
However, in the ``else`` clause, :pep:`647` didn't prescribe what the type might
be:

::
[I'm wondering if `::` is equivalent to `.. code-block:: python` -- You may need the latter to get proper colorization. Check after you build your RST to HTML.]
.. code-block:: python

    def is_str(val: str | int) -> TypeGuard[str]:
        return isinstance(val, str)

    def func(val: str | int):
        if is_str(val):
            reveal_type(val) # str
            # Type checkers can assume val is a 'str' in this branch
        else:
            reveal_type(val) # str | int
            # Type here is not narrowed. It is still 'str | int'

This inability to determine the negative case makes ``TypeGuard`` not as useful as
it could be.
This PEP proposes that when the type argument of the ``TypeGuard`` is a subtype
of the type of the first input parameter, then the false return case can be
further narrowed.

This PEP proposes that in cases where the output type is a *strict* subtype of
the input type, the negative case can be computed. This changes the example so
that the ``int`` case is possible:
["output type" -- might need to define this term or use something else. I don't see that term used in PEP 647.]
["This changes the example" -- maybe rephrase this to clarify that the code of the example is unchanged, but type checkers can interpret it differently?]
["is possible" seems pretty vague]
[What does strict subtype mean? And why is it italicized?]
This changes the example above like so:

::
.. code-block:: python

    def is_str(val: str | int) -> TypeGuard[str]:
        return isinstance(val, str)

    def func(val: str | int):
        if is_str(val):
            reveal_type(val) # str
            # Type checkers can assume val is a 'str' in this branch
        else:
            reveal_type(val) # int
            # Type checkers can assume val is an 'int' in this branch

Since the ``TypeGuard`` type (or output type) is a subtype of the first input
parameter type, a type checker can determine that the only possible type in the
``else`` is the other type in the ``Union``. In this example, it is safe to
assume that if ``is_str`` returns false, then the type of the ``val`` argument
is ``int``.


Specification
=============

Since the output type is a *strict* subtype of the
input, a type checker can determine that the only possible type in the ``else`` is the
other input type(s).
["the other input type(s)" -- There's only one input type. It's a Union. Suggest rephrasing this. I'm not sure if talking about the types using set theory (input -- output) would make this more clear (or more generic) or worse.]
This PEP requires no new changes to the language. It is merely modifying the
definition of ``TypeGuard`` for type checkers. Runtimes are already behaving
in this way.


Existing ``TypeGuard`` usage may change though, as described in
`Backwards Compatibility`_.

Unsafe Narrowing
--------------------

If the output type is not a *strict* subtype of the input type,
the negative cannot be assumed to be the intuitive opposite:
["intuitive opposite" -- opposite is the incorrect term here and I think intuition doesn't belong in a PEP :)]
There are cases where this further type narrowing is not possible though. Here's
an example:

::
.. code-block:: python

    def is_str_list(val: list[int | str]) -> TypeGuard[list[str]]:
        return all(isinstance(x, str) for x in val)

    def func(val: list[int | str]):
        if is_str_list(val):
            reveal_type(val) # list[str]
            # Type checker assumes list[str] here
        else:
            reveal_type(val) # list[str | int]
            # Type checker cannot assume list[int] here

Since ``list`` is invariant, it doesn't have any subtypes, so type checkers
can't narrow the type in the negative case.
Since ``list`` is invariant, ``list[str]`` is not a subtype of
``list[str | int]``. This means type checkers cannot narrow the type to
``list[int]`` in the false case.

Specification
=============
Type checkers should not assume any narrowing in the false case when the
``TypeGuard`` type argument is not a subtype of the first input parameter type.

This PEP requires no new changes to the language. It is merely modifying the
definition of ``TypeGuard`` for type checkers. The runtime should already be
behaving in this way.
["should" -- "The runtime" sounds singular, so if you mean CPython alone, I'd remove "should". If you mean that all Python runtimes should be behaving this way, I'd clarify that.]
However, narrowing in the true case is still possible. In the example above, the
type checker can assume the list is a ``list[str]`` if the ``TypeGuard``
function returns true.

Existing ``TypeGuard`` usage may change though, as described below.
User error
--------------------------

The new ``else`` case for a ``TypeGuard`` can be set up incorrectly. Here's an
example:

.. code-block:: python

    def is_positive_int(val: int | str) -> TypeGuard[int]:
        return isinstance(val, int) and val > 0

    def func(val: int | str):
        if is_positive_int(val):
            # Type checker assumes int here
            pass
        else:
            # Type checker assumes str here
            pass

A type checker will assume for the ``else`` case that the value is ``str``. This
is a change in behavior from :pep:`647`, but as that PEP stated `here <https://peps.python.org/pep-0647/#enforcing-strict-narrowing>`__,
there are many ways a determined or uninformed developer can subvert
type safety.

A better way to handle this example would be something like so:

.. code-block:: python

    PosInt = NewType('PosInt', int)

    def is_positive_int(val: PosInt | int | str) -> TypeGuard[PosInt]:
        return isinstance(val, int) and val > 0

    def func(val: int | str):
        if is_positive_int(val):
            # Type checker assumes PosInt here
            pass
        else:
            # Type checker assumes str | int here
            pass

Backwards Compatibility
=======================

For preexisting code this should require no changes, but should simplify this
use case here:
For preexisting code this PEP should require no changes.

However, some use cases such as the one below can be simplified:

::
.. code-block:: python

    A = TypeVar("A")
    B = TypeVar("B")
    class A():
        pass
    class B():
        pass

    def is_A(x: A | B) -> TypeGuard[A]:
        raise NotImplementedError
        return isinstance(x, A)

    def after_is_A(x: A | B) -> TypeGuard[B]:
        return True

    def is_B(x: A | B) -> TypeGuard[B]:
        return isinstance(x, B)

    def test(x: A | B):
        if is_A(x):
            reveal_type(x)
            # Do stuff assuming x is an 'A'
            return
        assert after_is_A(x)
        assert is_B(x)
        reveal_type(x)
        # Do stuff assuming x is a 'B'
        return

["after_is_A" is confusing me -- is there a better name? "is_not_A"?]
[Can/should you use PEP 695 syntax for the TypeVars?]
becomes this instead
["becomes this instead" is not a grammatically correct continuation of the sentence before the first code block. Maybe rephrase the sentence to "Preexisting code should require no changes, but code like this...can be simplified to this:"]
[Add comments in these code blocks showing the expected inferred type as you did above? I think then you won't need the reveal_type calls?]
With this proposed change, the code above continues to work but could be
simplified by removing the assertion that x is of type B in the negative case:

::
.. code-block:: python

    A = TypeVar("A")
    B = TypeVar("B")
    class A():
        pass
    class B():
        pass

    def is_A(x: A | B) -> TypeGuard[A]:
        return isinstance(x, A)

    def test(x: A | B):
        if is_A(x):
            reveal_type(x)
            # Do stuff assuming x is an 'A'
            return
        reveal_type(x)
        # Do stuff assuming x is a 'B'
        return

@@ -164,8 +214,7 @@ first place. Meaning this change should make ``TypeGuard`` easier to teach.
Reference Implementation
========================

A reference implementation of this idea exists in Pyright.
[Would there be value in pointing the reader to the implementation?]
A reference `implementation <https://github.com/microsoft/pyright/commit/9a5af798d726bd0612cebee7223676c39cf0b9b0>`__ of this idea exists in Pyright.


Rejected Ideas
@@ -179,8 +228,8 @@ would validate that the output type was a subtype of the input type.
See this comment: `StrictTypeGuard proposal <https://github.com/python/typing/discussions/1013#discussioncomment-1966238>`__

This was rejected because for most cases it's not necessary. Most people assume
the negative case for ``TypeGuard`` anyway, so why not just change the specification
to match their assumptions?
the negative case for ``TypeGuard`` anyway, so why not just change the
specification to match their assumptions?

Footnotes
=========
@@ -190,4 +239,4 @@ Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.
license, whichever is more permissive.
