diff --git a/pep-0653.rst b/pep-0653.rst index 41c540a4816..c2fc756ebfb 100644 --- a/pep-0653.rst +++ b/pep-0653.rst @@ -14,9 +14,9 @@ Abstract This PEP proposes a semantics for pattern matching that respects the general concept of PEP 634, but is more precise, easier to reason about, and should be faster. -The object model will be extended with a special (dunder) attribute, ``__match_kind__``, -in addition to the ``__match_args__`` attribute from PEP 634, to support pattern matching. -The ``__match_kind__`` attribute must be an integer. +The object model will be extended with two special (dunder) attributes, ``__match_container__`` and +``__match_class__``, in addition to the ``__match_args__`` attribute from PEP 634, to support pattern matching. +Both of these new attributes must be integers and ``__match_args__`` is required to be a tuple. With this PEP: @@ -97,29 +97,30 @@ A match statement performs a sequence of pattern matches. In general, matching a 2. When deconstructed, does the value match this particular pattern? 3. Is the guard true? -To determine whether a value can match a particular kind of pattern, we add the ``__match_kind__`` attribute. -This allows the kind of a value to be determined once and in a efficient fashion. +To determine whether a value can match a particular kind of pattern, we add the ``__match_container__`` +and ``__match_class__`` attributes. +This allows the kind of a value to be determined in an efficient fashion. Specification ============= - Additions to the object model ----------------------------- -A ``__match_kind__`` attribute will be added to ``object``. -It should be overridden by classes that want to match mapping or sequence patterns, -or want change the default behavior when matching class patterns. -It must be an integer and should be exactly one of these:: +The ``__match_container__`` and ``__match_class__`` attributes will be added to ``object``.
+``__match_container__`` should be overridden by classes that want to match mapping or sequence patterns. +``__match_class__`` should be overridden by classes that want to change the default behavior when matching class patterns. + +``__match_container__`` must be an integer and should be exactly one of these:: 0 MATCH_SEQUENCE MATCH_MAPPING -bitwise ``or``\ ed with exactly one of these:: +``__match_class__`` must be an integer and should be exactly one of these:: 0 - MATCH_DEFAULT + MATCH_ATTRIBUTES MATCH_SELF .. note:: @@ -127,15 +128,21 @@ bitwise ``or``\ ed with exactly one of these:: Symbolic constants will be provided both for Python and C, and once defined they will never be changed. -Classes inheriting from ``object`` will inherit ``__match_kind__ = MATCH_DEFAULT`` and ``__match_args__ = ()`` +``object`` will have the following values for the special attributes:: + + __match_container__ = 0 + __match_class__= MATCH_ATTRIBUTES + __match_args__ = () + +These special attributes will be inherited as normal. If ``__match_args__`` is overridden, then it is required to hold a tuple of strings. It may be empty. .. note:: ``__match_args__`` will be automatically generated for dataclasses and named tuples, as specified in PEP 634. -The pattern matching implementation is *not* required to check that ``__match_args__`` behaves as specified. -If the value of ``__match_args__`` is not as specified, then +The pattern matching implementation is *not* required to check that any of these attributes behave as specified. +If the value of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is not as specified, then the implementation may raise any exception, or match the wrong pattern. Of course, implementations are free to check these properties and provide meaningful error messages if they can do so efficiently. 
@@ -163,14 +170,13 @@ All additional code listed below that is not present in the original source will Preamble '''''''' -Before any patterns are matched, the expression being matched is evaluated and its kind is determined:: +Before any patterns are matched, the expression being matched is evaluated:: match expr: translates to:: $value = expr - $kind = type($value).__match_kind__ Capture patterns '''''''''''''''' @@ -234,6 +240,7 @@ A pattern not including a star pattern:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_SEQUENCE == 0: FAIL if len($value) != len($VARS): @@ -248,6 +255,7 @@ A pattern including a star pattern:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_SEQUENCE == 0: FAIL if len($value) < len($VARS): @@ -265,6 +273,7 @@ A pattern not including a double-star pattern:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_MAPPING == 0: FAIL if not $value.keys() >= $KEYWORD_PATTERNS.keys(): @@ -281,6 +290,7 @@ A pattern including a double-star pattern:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_MAPPING == 0: FAIL if not $value.keys() >= $KEYWORD_PATTERNS.keys(): @@ -308,7 +318,7 @@ translates to:: .. note:: ``case ClsName():`` is the only class pattern that can succeed if - ``($kind & (MATCH_SELF|MATCH_DEFAULT)) == 0`` + ``($kind & (MATCH_SELF|MATCH_ATTRIBUTES)) == 0`` Class pattern with a single positional pattern:: @@ -317,6 +327,7 @@ Class pattern with a single positional pattern:: translates to:: + $kind = type($value).__match_class__ if $kind & MATCH_SELF: if not isinstance($value, ClsName): FAIL @@ -333,7 +344,8 @@ translates to:: if not isinstance($value, ClsName): FAIL - if $kind & MATCH_DEFAULT: + $kind = type($value).__match_class__ + if $kind & MATCH_ATTRIBUTES: $attrs = ClsName.__match_args__ if len($attr) < len($VARS): raise TypeError(...) 
@@ -355,7 +367,8 @@ translates to:: if not isinstance($value, ClsName): FAIL - if $kind & MATCH_DEFAULT: + $kind = type($value).__match_class__ + if $kind & MATCH_ATTRIBUTES: try: for $KEYWORD in $KEYWORD_PATTERNS: $tmp = getattr($value, QUOTE($KEYWORD)) @@ -375,7 +388,8 @@ translates to:: if not isinstance($value, ClsName): FAIL - if $kind & MATCH_DEFAULT: + $kind = type($value).__match_class__ + if $kind & MATCH_ATTRIBUTES: $attrs = ClsName.__match_args__ if len($attr) < len($VARS): raise TypeError(...) @@ -408,6 +422,7 @@ For example, the pattern:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_SEQUENCE == 0: FAIL if len($value) != 2: @@ -433,45 +448,49 @@ translates to:: FAIL -Non-conforming ``__match_kind__`` +Non-conforming special attributes ''''''''''''''''''''''''''''''''' -All classes should ensure that the the value of ``__match_kind__`` follows the specification. +All classes should ensure that the values of ``__match_container__``, ``__match_class__`` +and ``__match_args__`` follow the specification. Therefore, implementations can assume, without checking, that the following are true:: - (__match_kind__ & (MATCH_SEQUENCE | MATCH_MAPPING)) != (MATCH_SEQUENCE | MATCH_MAPPING) - (__match_kind__ & (MATCH_SELF | MATCH_DEFAULT)) != (MATCH_SELF | MATCH_DEFAULT) + (__match_container__ & (MATCH_SEQUENCE | MATCH_MAPPING)) != (MATCH_SEQUENCE | MATCH_MAPPING) + (__match_class__ & (MATCH_SELF | MATCH_ATTRIBUTES)) != (MATCH_SELF | MATCH_ATTRIBUTES) -Thus, implementations can assume that ``__match_kind__ & MATCH_SEQUENCE`` implies ``(__match_kind__ & MATCH_MAPPING) == 0``, and vice-versa. -Likewise for ``MATCH_SELF`` and ``MATCH_DEFAULT``. +Thus, implementations can assume that ``__match_container__ & MATCH_SEQUENCE`` implies ``(__match_container__ & MATCH_MAPPING) == 0``, and vice-versa. +Likewise for ``__match_class__``, ``MATCH_SELF`` and ``MATCH_ATTRIBUTES``.
-If ``__match_kind__`` does not follow the specification, -then implementations may treat any of the expressions of the form ``$kind & MATCH_...`` above as having any value. +Values of the special attributes for classes in the standard library +-------------------------------------------------------------------- -Implementation of ``__match_kind__`` in the standard library ------------------------------------------------------------- +For the core builtin container classes ``__match_container__`` will be: -``object.__match_kind__`` will be ``MATCH_DEFAULT``. +* ``list``: ``MATCH_SEQUENCE`` +* ``tuple``: ``MATCH_SEQUENCE`` +* ``dict``: ``MATCH_MAPPING`` +* ``bytearray``: 0 +* ``bytes``: 0 +* ``str``: 0 -For common builtin classes ``__match_kind__`` will be: +Named tuples will have ``__match_container__`` set to ``MATCH_SEQUENCE``. -* ``bool``: ``MATCH_SELF`` -* ``bytearray``: ``MATCH_SELF`` -* ``bytes``: ``MATCH_SELF`` -* ``float``: ``MATCH_SELF`` -* ``frozenset``: ``MATCH_SELF`` -* ``int``: ``MATCH_SELF`` -* ``set``: ``MATCH_SELF`` -* ``str``: ``MATCH_SELF`` -* ``list``: ``MATCH_SEQUENCE | MATCH_SELF`` -* ``tuple``: ``MATCH_SEQUENCE | MATCH_SELF`` -* ``dict``: ``MATCH_MAPPING | MATCH_SELF`` +* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_container__`` set to ``MATCH_MAPPING``. +* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_container__`` set to ``MATCH_SEQUENCE``. -Named tuples will have ``__match_kind__`` set to ``MATCH_SEQUENCE | MATCH_DEFAULT``. - -* All other standard library classes for which ``issubclass(cls, collections.abc.Mapping)`` is true will have ``__match_kind__`` set to ``MATCH_MAPPING``. -* All other standard library classes for which ``issubclass(cls, collections.abc.Sequence)`` is true will have ``__match_kind__`` set to ``MATCH_SEQUENCE``. 
+For the following builtin classes ``__match_class__`` will be set to ``MATCH_SELF``: +* ``bool`` +* ``bytearray`` +* ``bytes`` +* ``float`` +* ``frozenset`` +* ``int`` +* ``set`` +* ``str`` +* ``list`` +* ``tuple`` +* ``dict`` Legal optimizations ------------------- @@ -497,9 +516,9 @@ Implementations are allowed to make the following assumptions: * ``isinstance(obj, cls)`` can be freely replaced with ``issubclass(type(obj), cls)`` and vice-versa. * ``isinstance(obj, cls)`` will always return the same result for any ``(obj, cls)`` pair and repeated calls can thus be elided. -* Reading ``__match_args__`` and calling ``__deconstruct__`` are pure operations, and may be cached. -* Sequences, that is any class for which ``MATCH_SEQUENCE`` is true, are not modified by iteration, subscripting or calls to ``len()``, - and thus those operations can be freely substituted for each other where they would be equivalent when applied to an immuable sequence. +* Reading any of ``__match_container__``, ``__match_class__`` or ``__match_args__`` is a pure operation, and may be cached. +* Sequences, that is any class for which ``__match_container__&MATCH_SEQUENCE`` is not zero, are not modified by iteration, subscripting or calls to ``len()``. + Consequently, those operations can be freely substituted for each other where they would be equivalent when applied to an immutable sequence. In fact, implementations are encouraged to make these assumptions, as it is likely to result in signficantly better performance. @@ -631,9 +650,11 @@ Summary of differences between this PEP and PEP 634 The changes to the semantics can be summarized as: -* Selecting the kind of pattern uses ``cls.__match_kind__`` instead of - ``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)`` - and allows classes a bit more control over which kinds of pattern they match. +* Requires ``__match_args__`` to be a *tuple* of strings, not just a sequence. 
+ This makes pattern matching a bit more robust and optimizable as ``__match_args__`` can be assumed to be immutable. +* Selecting the kind of container patterns that can be matched uses ``cls.__match_container__`` instead of + ``issubclass(cls, collections.abc.Mapping)`` and ``issubclass(cls, collections.abc.Sequence)``. +* Allows classes to opt out of deconstruction altogether, if necessary, by setting ``__match_class__ = 0``. * The behavior when matching patterns is more precisely defined, but is otherwise unchanged. There are no changes to syntax. All examples given in the PEP 636 tutorial should continue to work as they do now. @@ -644,7 +665,7 @@ Rejected Ideas Using attributes from the instance's dictionary ----------------------------------------------- -An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``__match_kind__ == MATCH_DEFAULT``. +An earlier version of this PEP only used attributes from the instance's dictionary when matching a class pattern with ``MATCH_ATTRIBUTES``. The intent was to avoid capturing bound-methods and other synthetic attributes. However, this also mean that properties were ignored. For the class:: @@ -659,7 +680,7 @@ For the class:: ... Ideally we would match the attributes "a" and "p", but not "m". -However, there is no general way to do that, so this PEP now follows the semantics of PEP 634 for ``MATCH_DEFAULT``. +However, there is no general way to do that, so this PEP now follows the semantics of PEP 634 for ``MATCH_ATTRIBUTES``. Lookup of ``__match_args__`` on the subject not the pattern ----------------------------------------------------------- @@ -672,6 +693,13 @@ This has been rejected for a few reasons:: * Using the class specified in the pattern has the potential to provide better error reporting is some cases. * Neither approach is perfect, both have odd corner cases. Keeping the status quo minimizes disruption.
+Combining ``__match_class__`` and ``__match_container__`` into a single value +----------------------------------------------------------------------------- + +An earlier version of this PEP combined ``__match_class__`` and ``__match_container__`` into a single value, ``__match_kind__``. +Using a single value has a small advantage in terms of performance, +but is likely to result in unintended changes to container matching when overriding class matching behavior, and vice versa. + Deferred Ideas ============== @@ -706,7 +734,7 @@ Code examples :: class Symbol: - __match_kind__ = MATCH_SELF + __match_class__ = MATCH_SELF .. [2] @@ -716,6 +744,7 @@ This:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_SEQUENCE == 0: FAIL if len($value) != 2: @@ -732,6 +761,7 @@ This:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_SEQUENCE == 0: FAIL if len($value) < 2: @@ -746,6 +776,7 @@ This:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_MAPPING == 0: FAIL if $value.keys() != {"x", "y"}: @@ -763,6 +794,7 @@ This:: translates to:: + $kind = type($value).__match_container__ if $kind & MATCH_MAPPING == 0: FAIL if not $value.keys() >= {"x", "y"}: @@ -782,7 +814,8 @@ translates to:: if not isinstance($value, ClsName): FAIL - if $kind & MATCH_DEFAULT: + $kind = type($value).__match_class__ + if $kind & MATCH_ATTRIBUTES: $attrs = ClsName.__match_args__ if len($attr) < 2: FAIL @@ -804,7 +837,8 @@ translates to:: if not isinstance($value, ClsName): FAIL - lif $kind & MATCH_DEFAULT: + $kind = type($value).__match_class__ + if $kind & MATCH_ATTRIBUTES: try: x = $value.a y = $value.b @@ -824,7 +858,8 @@ translates to:: if not isinstance($value, ClsName): FAIL - if $kind & MATCH_DEFAULT: + $kind = type($value).__match_class__ + if $kind & MATCH_ATTRIBUTES: $attrs = ClsName.__match_args__ if len($attr) < 1: raise TypeError(...)
@@ -844,7 +879,7 @@ translates to:: :: class Basic: - __match_kind__ = MATCH_POSITIONAL + __match_class__ = MATCH_POSITIONAL def __deconstruct__(self): return self._args
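Reviewer's note: the split into ``__match_container__`` and ``__match_class__`` can be sketched in plain Python as a reading aid. This is a minimal emulation under stated assumptions, not the PEP's implementation. The numeric flag values, the helper names ``container_kind`` and ``match_single_positional``, and the classes ``Stack``, ``Point`` and ``Opaque`` are hypothetical. The ``getattr`` defaults stand in for the values the diff assigns to ``object``, and the final ``FAIL`` branch follows one reading of the note that only ``case ClsName():`` can succeed when neither class flag is set.

```python
# Hypothetical flag values: the PEP promises symbolic constants, but the
# numbers used here are chosen only for this sketch.
MATCH_SEQUENCE = 1
MATCH_MAPPING = 2
MATCH_ATTRIBUTES = 1
MATCH_SELF = 2


class Symbol:
    # `case Symbol(x)` binds x to the subject itself.
    __match_class__ = MATCH_SELF


class Point:
    # Positional sub-patterns are looked up through __match_args__.
    __match_class__ = MATCH_ATTRIBUTES
    __match_args__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Opaque:
    # Opts out of deconstruction altogether.
    __match_class__ = 0


class Stack:
    # Hypothetical container that opts in to sequence patterns.
    __match_container__ = MATCH_SEQUENCE


def container_kind(value):
    """Classify a value for container patterns (default 0, as for object)."""
    kind = getattr(type(value), "__match_container__", 0)
    if kind & MATCH_SEQUENCE:
        return "sequence"
    if kind & MATCH_MAPPING:
        return "mapping"
    return "neither"


def match_single_positional(value, cls):
    """Emulate `case cls(x):` and return (matched, captured)."""
    if not isinstance(value, cls):
        return (False, None)
    # Assumed default mirrors object.__match_class__ = MATCH_ATTRIBUTES.
    kind = getattr(type(value), "__match_class__", MATCH_ATTRIBUTES)
    if kind & MATCH_SELF:
        return (True, value)
    if kind & MATCH_ATTRIBUTES:
        attrs = getattr(cls, "__match_args__", ())
        if len(attrs) < 1:
            raise TypeError("too many positional sub-patterns")
        try:
            return (True, getattr(value, attrs[0]))
        except AttributeError:
            return (False, None)
    # Neither flag set: only `case cls():` could succeed.
    return (False, None)
```

Because the two attributes are read independently, changing ``__match_class__`` on ``Opaque`` leaves its container classification untouched, which is the motivation the diff gives for dropping the combined ``__match_kind__``.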