Merge pull request #1 from rchiodo/rchiodo/pep-722-strictertypeguard

Update after Erik's feedback
rchiodo · Aug 4, 2023 · 66bdf14
2 parents 9da9233 + a2e0b22
commit 66bdf14
Showing 1 changed file with 122 additions and 73 deletions.
diff --git a/pep-0722.rst b/pep-0722.rst
@@ -1,6 +1,6 @@
 PEP: 722
 Title: Stricter Type Guards
-Author: Rich Chiodo <rchiodo at microsoft.com>, Eric Traut <eric at traut.com>
+Author: Rich Chiodo <rchiodo at microsoft.com>, Eric Traut <erictr at microsoft.com>, Erik De Bonte <erikd at microsoft.com>
 Sponsor: <real name of sponsor>
 PEP-Delegate: <PEP delegate's real name>
 Discussions-To: https://github.com/python/typing/discussions/1013
@@ -17,140 +17,190 @@ Resolution:
 Abstract
 ========
 
-This PEP further refines `TypeGuards <typeguards_>`__ to
-indicate when negative type narrowing is deemed safe.
-
-[I'd suggest mentioning PEP 647 explicitly here rather than having the opaque link. You can link to a PEP in RST using :pep:`647` ]
-[I think more context is needed here for readers to understand what "negative" means. Maybe one sentence explaining what typeguards currently do and then another about the negative issue.]
+:pep:`647` introduced the concept of ``TypeGuard`` functions, which return true
+if their input parameter matches their target type. For example, a function that
+returns ``TypeGuard[str]`` is assumed to return ``True`` if and only if its
+input parameter is a ``str``. This allows type checkers to narrow types in this
+positive case.
 
+This PEP further refines :pep:`647` by allowing type checkers to also narrow types
+when a ``TypeGuard`` function returns false.
 
 Motivation
 ==========
 
-`TypeGuards <typeguards_>`__ are used throughout python
-libraries but cannot be used to determine the negative case:
+`TypeGuards <typeguards_>`__ are used throughout Python libraries to allow a
+type checker to narrow the type of a value when the ``TypeGuard`` function
+returns true.
 
-[Again, more context is needed for the user to understand what "the negative case" means.]
-[Also what does "determine the negative case" mean? Maybe something like "narrow the type in the negative case" would be more clear? Also see the use of that phrase below the code block.]
-[python should be capitalized]
+However, in the ``else`` clause, :pep:`647` didn't prescribe what the type might
+be:
 
-::
-[I'm wondering if `::` is equivalent to `.. code-block:: python` -- You may need the latter to get proper colorization. Check after you build your RST to HTML.]
+.. code-block:: python
 
     def is_str(val: str | int) -> TypeGuard[str]:
         return isinstance(val, str)
 
     def func(val: str | int):
         if is_str(val):
-            reveal_type(val) # str
+            # Type checkers can assume val is a 'str' in this branch
         else:
-            reveal_type(val) # str | int
+            # Type here is not narrowed. It is still 'str | int' 
+
 
-This inability to determine the negative case makes ``TypeGuard`` not as useful as
-it could be.
+This PEP proposes that when the type argument of the ``TypeGuard`` is a subtype
+of the type of the first input parameter, then the false return case can be
+further narrowed. 
 
-This PEP proposes that in cases where the output type is a *strict* subtype of
-the input type, the negative case can be computed. This changes the example so
-that the ``int`` case is possible:
-["output type" -- might need to define this term or use something else. I don't see that term used in PEP 647.]
-["This changes the example" -- maybe rephrase this to clarify that the code of the example is unchanged, but type checkers can interpret it differently?]
-["is possible" seems pretty vague]
-[What does strict subtype mean? And why is it italicized?]
+The code in the example above is unchanged, but type checkers can now interpret it as follows:
 
-::
+.. code-block:: python
 
     def is_str(val: str | int) -> TypeGuard[str]:
         return isinstance(val, str)
 
     def func(val: str | int):
         if is_str(val):
-            reveal_type(val) # str
+            # Type checkers can assume val is a 'str' in this branch
         else:
-            reveal_type(val) # int
+            # Type checkers can assume val is an 'int' in this branch
+
+Since the ``TypeGuard`` type (or output type) is a subtype of the first input
+parameter type, a type checker can determine that the only possible type in the
+``else`` is the other type in the ``Union``. In this example, it is safe to
+assume that if ``is_str`` returns false, then the type of the ``val`` argument is
+``int``.
+
+
+Specification
+=============
 
-Since the output type is a *strict* subtype of the
-input, a type checker can determine that the only possible type in the ``else`` is the
-other input type(s).
-["the other input type(s)" -- There's only one input type. It's a Union. Suggest rephrasing this. I'm not sure if talking about the types using set theory (input -- output) would make this more clear (or more generic) or worse.]
+This PEP requires no changes to the language. It merely modifies the
+definition of ``TypeGuard`` for type checkers. Runtimes already behave
+in this way.
+
+
+Existing ``TypeGuard`` usage may change though, as described in 
+`Backwards Compatibility`_.
+
+Unsafe Narrowing
+--------------------
 
-If the output type is not a *strict* subtype of the input type,
-the negative cannot be assumed to be the intuitive opposite:
-["intuitive opposite" -- opposite is the incorrect term here and I think intuition doesn't belong in a PEP :)]
+There are cases where this further type narrowing is not possible though. Here's
+an example:
 
-::
+.. code-block:: python
 
     def is_str_list(val: list[int | str]) -> TypeGuard[list[str]]:
         return all(isinstance(x, str) for x in val)
 
     def func(val: list[int | str]):
         if is_str_list(val):
-            reveal_type(val) # list[str]
+            # Type checker assumes list[str] here
         else:
-            reveal_type(val) # list[str | int] 
+            # Type checker cannot assume list[int] here
 
-Since ``list`` is invariant, it doesn't have any subtypes, so type checkers 
-can't narrow the type in the negative case.
+Since ``list`` is invariant in its type parameter, ``list[str]`` is not
+a subtype of ``list[str | int]``. This means type checkers cannot narrow the
+type to ``list[int]`` in the false case.
 
-Specification
-=============
+Type checkers should not assume any narrowing in the false case when the
+``TypeGuard`` type argument is not a subtype of the first input parameter type. 
 
-This PEP requires no new changes to the language. It is merely modifying the
-definition of ``TypeGuard`` for type checkers. The runtime should already be
-behaving in this way.
-["should" -- "The runtime" sounds singular, so if you mean CPython alone, I'd remove "should". If you mean that all Python runtimes should be behaving this way, I'd clarify that.]
+However, narrowing in the true case is still possible. In the example above, the
+type checker can assume the list is a ``list[str]`` if the ``TypeGuard``
+function returns true.
 
-Existing ``TypeGuard`` usage may change though, as described below.
+User Error
+--------------------------
+
+The new ``else`` case for a ``TypeGuard`` can be set up incorrectly. Here's an
+example:
+
+.. code-block:: python
+
+    def is_positive_int(val: int | str) -> TypeGuard[int]:
+        return isinstance(val, int) and val > 0
+
+    def func(val: int | str):
+        if is_positive_int(val):
+            # Type checker assumes int here
+        else:
+            # Type checker assumes str here
+
+A type checker will assume for the ``else`` case that the value is a ``str``, even though
+``is_positive_int`` also returns false for non-positive ``int`` values. This is a change in behavior from :pep:`647`, but as that PEP states `here <https://peps.python.org/pep-0647/#enforcing-strict-narrowing>`__,
+there are many ways a determined or uninformed developer can subvert
+type safety.
+
+A better way to handle this example would be something like so:
+
+.. code-block:: python
+
+    PosInt = NewType('PosInt', int)
+
+    def is_positive_int(val: PosInt | int | str) -> TypeGuard[PosInt]:
+        return isinstance(val, int) and val > 0
+
+    def func(val: int | str):
+        if is_positive_int(val):
+            # Type checker assumes PosInt here
+        else:
+            # Type checker assumes str | int here
 
 
 Backwards Compatibility
 =======================
 
-For preexisting code this should require no changes, but should simplify this
-use case here:
+For preexisting code this PEP should require no changes.
+
+However, some use cases such as the one below can be simplified:
 
-:: 
+.. code-block:: python
 
-    A = TypeVar("A")
-    B = TypeVar("B")
+    class A():
+        pass
+    class B():
+        pass
 
     def is_A(x: A | B) -> TypeGuard[A]:
-        raise NotImplementedError
+        return isinstance(x, A)
 
 
-    def after_is_A(x: A | B) -> TypeGuard[B]:
-        return True
+    def is_B(x: A | B) -> TypeGuard[B]:
+        return isinstance(x, B)
 
 
     def test(x: A | B):
         if is_A(x):
-            reveal_type(x)
+            # Do stuff assuming x is an 'A'
             return
-        assert after_is_A(x)
+        assert is_B(x)
 
-        reveal_type(x)
+        # Do stuff assuming x is a 'B'
         return
 
-["after_is_A" is confusing me -- is there a better name? "is_not_A"?]
-[Can/should you use PEP 695 syntax for the TypeVars?]
 
-becomes this instead
-["becomes this instead" is not a grammatically correct continuation of the sentence before the first code block. Maybe rephrase the sentence to "Preexisting code should require no changes, but code like this...can be simplified to this:"]
-[Add comments in these code blocks showing the expected inferred type as you did above? I think then you won't need the reveal_type calls?]
+With this proposed change, the code above continues to work but could be
+simplified by removing the assertion that ``x`` is of type ``B`` in the negative case:
 
-::
+.. code-block:: python
 
-    A = TypeVar("A")
-    B = TypeVar("B")
+    class A():
+        pass
+    class B():
+        pass
 
     def is_A(x: A | B) -> TypeGuard[A]:
-        return isinstance(x, A)
+        return isinstance(x, A)
 
 
     def test(x: A | B):
         if is_A(x):
-            reveal_type(x)
+            # Do stuff assuming x is an 'A'
             return
-        reveal_type(x)
+
+        # Do stuff assuming x is a 'B'
         return
 
 
@@ -164,8 +214,7 @@ first place. Meaning this change should make ``TypeGuard`` easier to teach.
 Reference Implementation
 ========================
 
-A reference implementation of this idea exists in Pyright.
-[Would there be value in pointing the reader to the implementation?]
+A reference `implementation <https://github.com/microsoft/pyright/commit/9a5af798d726bd0612cebee7223676c39cf0b9b0>`__ of this idea exists in Pyright.
 
 
 Rejected Ideas
@@ -179,8 +228,8 @@ would validate that the output type was a subtype of the input type.
 See this comment: `StrictTypeGuard proposal <https://github.com/python/typing/discussions/1013#discussioncomment-1966238>`__
 
 This was rejected because for most cases it's not necessary. Most people assume
-the negative case for ``TypeGuard`` anyway, so why not just change the specification
-to match their assumptions?
+the negative case for ``TypeGuard`` anyway, so why not just change the
+specification to match their assumptions?
 
 Footnotes
 =========
@@ -190,4 +239,4 @@ Copyright
 =========
 
 This document is placed in the public domain or under the CC0-1.0-Universal
-license, whichever is more permissive.
+license, whichever is more permissive.