-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid panic in invalid pattern recovery #10966
Conversation
TokenKind::Lpar | TokenKind::Lsqb => { | ||
let pattern = self.parse_parenthesized_or_sequence_pattern(); | ||
|
||
if matches!( | ||
pattern, | ||
Pattern::MatchAs(ast::PatternMatchAs { | ||
pattern: Some(_), | ||
name: Some(_), | ||
.. | ||
}) | ||
) { | ||
// If the pattern is an `as` pattern with both a pattern and a name, | ||
// then it's a parenthesized `as` pattern. For example: | ||
// | ||
// ```python | ||
// match subject: | ||
// case (x as y): | ||
// pass | ||
// ``` | ||
// | ||
// In this case, we should return early as it's not a valid value | ||
// for the class and complex literal pattern. This would lead to | ||
// panic when trying to convert it to an expression because there's | ||
// no way to represent it in the AST. | ||
// | ||
// We exit early in this case, while in others we continue parsing, | ||
// even if it's invalid because we can recover from it. | ||
return pattern; | ||
} | ||
|
||
pattern | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't required now that we don't panic.
// SAFETY: The `parse_match_pattern_lhs` function can only return | ||
// an `as` pattern if it's parenthesized and the recovery context | ||
// makes sure that this closure is never called in that case. | ||
// This is because `(` is not in the `MAPPING_PATTERN_START_SET`. | ||
recovery::pattern_to_expr(pattern) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, we don't panic so no need for the comment.
|
code | total | + violation | - violation | + fix | - fix |
---|---|---|---|---|---|
PLW0642 | 52 | 0 | 52 | 0 | 0 |
E302 | 5 | 5 | 0 | 0 | 0 |
Formatter (stable)
✅ ecosystem check detected no format changes.
Formatter (preview)
✅ ecosystem check detected no format changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind opening an issue for reworking the pattern recovery? I'm okay removing the panic branch for the release but I think we need to rething the pattern recovery holistically and either recover less (e.g. return the left hand side if positioned at a -
or +
and the left isn't a number), or change the AST structure to allow for better recovery.
arguments: Arguments { | ||
range: 492..500, | ||
args: [ | ||
Name( | ||
ExprName { | ||
range: 493..499, | ||
id: "", | ||
ctx: Invalid, | ||
}, | ||
), | ||
], | ||
keywords: [], | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised that the parsing both drops the a
and as b
parts. Is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we're replacing the as
pattern with an invalid expression node when the conversion takes place in pattern error recovery. If not, the parser would panic.
## Summary This PR avoids the panic when trying to recover from an invalid pattern. The panic occurs because there's no way to represent `x as y` syntax as a Python expression. There were few cases which were not accounted for when working on this. The cases where it would panic are the ones where only a subset of pattern is valid in a position but all patterns are valid in a nested position of the mentioned top-level pattern. For example, consider a mapping pattern and the following points: * Key position can only be a `MatchValue` and `MatchSingleton` * `MatchClass` can contain a `MatchAs` as an argument * `MatchAs` pattern with both pattern and name (`a as b`) cannot be converted to a valid expression Considering this, if a mapping pattern has a class pattern with an `as` pattern as one of it's argument, the parser would panic. ```py match subject: case {Foo(a as b): 1}: ... # ^^^^^^^^^^^^^^^^ mapping pattern # ^^^^^^^^^^^ mapping pattern key, class pattern # ^^^^^^ as pattern with both pattern and name ``` The same case is for complex literal patterns as well. I'm not sure what the recovery should look like but for now it's better to avoid the panic and create an invalid expression node in it's place. ## Test Plan Add new test cases and update the snapshots.
## Summary This PR avoids the panic when trying to recover from an invalid pattern. The panic occurs because there's no way to represent `x as y` syntax as a Python expression. There were few cases which were not accounted for when working on this. The cases where it would panic are the ones where only a subset of pattern is valid in a position but all patterns are valid in a nested position of the mentioned top-level pattern. For example, consider a mapping pattern and the following points: * Key position can only be a `MatchValue` and `MatchSingleton` * `MatchClass` can contain a `MatchAs` as an argument * `MatchAs` pattern with both pattern and name (`a as b`) cannot be converted to a valid expression Considering this, if a mapping pattern has a class pattern with an `as` pattern as one of it's argument, the parser would panic. ```py match subject: case {Foo(a as b): 1}: ... # ^^^^^^^^^^^^^^^^ mapping pattern # ^^^^^^^^^^^ mapping pattern key, class pattern # ^^^^^^ as pattern with both pattern and name ``` The same case is for complex literal patterns as well. I'm not sure what the recovery should look like but for now it's better to avoid the panic and create an invalid expression node in it's place. ## Test Plan Add new test cases and update the snapshots.
## Summary This PR avoids the panic when trying to recover from an invalid pattern. The panic occurs because there's no way to represent `x as y` syntax as a Python expression. There were few cases which were not accounted for when working on this. The cases where it would panic are the ones where only a subset of pattern is valid in a position but all patterns are valid in a nested position of the mentioned top-level pattern. For example, consider a mapping pattern and the following points: * Key position can only be a `MatchValue` and `MatchSingleton` * `MatchClass` can contain a `MatchAs` as an argument * `MatchAs` pattern with both pattern and name (`a as b`) cannot be converted to a valid expression Considering this, if a mapping pattern has a class pattern with an `as` pattern as one of it's argument, the parser would panic. ```py match subject: case {Foo(a as b): 1}: ... # ^^^^^^^^^^^^^^^^ mapping pattern # ^^^^^^^^^^^ mapping pattern key, class pattern # ^^^^^^ as pattern with both pattern and name ``` The same case is for complex literal patterns as well. I'm not sure what the recovery should look like but for now it's better to avoid the panic and create an invalid expression node in it's place. ## Test Plan Add new test cases and update the snapshots.
## Summary This PR avoids the panic when trying to recover from an invalid pattern. The panic occurs because there's no way to represent `x as y` syntax as a Python expression. There were few cases which were not accounted for when working on this. The cases where it would panic are the ones where only a subset of pattern is valid in a position but all patterns are valid in a nested position of the mentioned top-level pattern. For example, consider a mapping pattern and the following points: * Key position can only be a `MatchValue` and `MatchSingleton` * `MatchClass` can contain a `MatchAs` as an argument * `MatchAs` pattern with both pattern and name (`a as b`) cannot be converted to a valid expression Considering this, if a mapping pattern has a class pattern with an `as` pattern as one of it's argument, the parser would panic. ```py match subject: case {Foo(a as b): 1}: ... # ^^^^^^^^^^^^^^^^ mapping pattern # ^^^^^^^^^^^ mapping pattern key, class pattern # ^^^^^^ as pattern with both pattern and name ``` The same case is for complex literal patterns as well. I'm not sure what the recovery should look like but for now it's better to avoid the panic and create an invalid expression node in it's place. ## Test Plan Add new test cases and update the snapshots.
(Supersedes #9152, authored by @LaBatata101) ## Summary This PR replaces the current parser generated from LALRPOP to a hand-written recursive descent parser. It also updates the grammar for [PEP 646](https://peps.python.org/pep-0646/) so that the parser outputs the correct AST. For example, in `data[*x]`, the index expression is now a tuple with a single starred expression instead of just a starred expression. Beyond the performance improvements, the parser is also error resilient and can provide better error messages. The behavior as seen by any downstream tools isn't changed. That is, the linter and formatter can still assume that the parser will _stop_ at the first syntax error. This will be updated in the following months. For more details about the change here, refer to the PR corresponding to the individual commits and the release blog post. ## Test Plan Write _lots_ and _lots_ of tests for both valid and invalid syntax and verify the output. ## Acknowledgements - @MichaReiser for reviewing 100+ parser PRs and continuously providing guidance throughout the project - @LaBatata101 for initiating the transition to a hand-written parser in #9152 - @addisoncrump for implementing the fuzzer which helped [catch](#10903) [a](#10910) [lot](#10966) [of](#10896) [bugs](#10877) --------- Co-authored-by: Victor Hugo Gomes <[email protected]> Co-authored-by: Micha Reiser <[email protected]>
Summary
This PR avoids the panic when trying to recover from an invalid pattern.
The panic occurs because there's no way to represent
x as y
syntax as a Python expression. There were few cases which were not accounted for when working on this.The cases where it would panic are the ones where only a subset of pattern is valid in a position but all patterns are valid in a nested position of the mentioned top-level pattern.
For example, consider a mapping pattern and the following points:
MatchValue
andMatchSingleton
MatchClass
can contain aMatchAs
as an argumentMatchAs
pattern with both pattern and name (a as b
) cannot be converted to a valid expressionConsidering this, if a mapping pattern has a class pattern with an
as
pattern as one of it's argument, the parser would panic.The same case is for complex literal patterns as well.
I'm not sure what the recovery should look like but for now it's better to avoid the panic and create an invalid expression node in it's place.
Test Plan
Add new test cases and update the snapshots.