-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python.gram: Incorrect column numbers in SyntaxError #47
Comments
SyntaxError expects 1-indexed column numbers, but tokens are 0-indexed. Fixes we-like-parsers#47
Sorry for dropping the ball here. I tried adding tests against actual Python, and found lots of matches, but also many discrepancies. Here's an example on Python 3.10.0: >>> {**c, a: *b}
File "<stdin>", line 1
{**c, a: *b}
^^
SyntaxError: cannot use a starred expression in a dictionary value By contrast, pegen thinks the error involves one more character: File "<unknown>", line 1
{**c, a: *b}
^^^
SyntaxError: cannot use a starred expression in a dictionary value There's also discrepancies between Python versions. For example, here's a Python 3.9.7 error: >>> {a: 1, c: 2, b}
File "<stdin>", line 1
{a: 1, c: 2, b}
^
SyntaxError: invalid syntax And here's the corresponding Python 3.10.0 error: >>> {a: 1, c: 2, b}
File "<stdin>", line 1
{a: 1, c: 2, b}
^
SyntaxError: ':' expected after dictionary key And here's pegen's error: File "<unknown>", line 1
{a: 1, b}
^^
SyntaxError: ':' expected after dictionary key Overall, there are 68 discrepancies from Python 3.10 (and 72 from Python 3.9). Presumably these are errors that should be fixed, but I'm not sure whether you want to merge in a broken test suite. I could use some guidance as to what's best. Perhaps I should add a pytest command-line option (and possibly mark these tests) and only run them when the command-line option is specified? |
Discrepancies between 3.9 and 3.10 are somewhat expected I think. @pablogsal could confirm it but I think several improvements have been made to error reporting in 3.10. |
Yeah, In general, custom error messages and error positions may change between minor versions as some of them cannot be backported for stability reasons. In general, we should match the latest stable version against pegen |
Python's
tokenize.TokenInfo
structure has 0-indexed column numbers, butSyntaxError
wants 1-indexed column numbers. See this discussion.This code from
python.gram
is thus incorrect:pegen/data/python.gram
Lines 306 to 309 in d130038
On the other hand, this code from
parser.py
is correct:pegen/src/pegen/parser.py
Line 262 in d130038
Could be fixed as part of #41. @MatthieuDartiailh let me know what you prefer; also happy to help add
1+
in the two necessary spots.The text was updated successfully, but these errors were encountered: