f-string position is miscomputed in multiple cases #5004

Closed
Tracked by #4972
addisoncrump opened this issue Jun 10, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@addisoncrump
Contributor

addisoncrump commented Jun 10, 2023

The position of errors is miscomputed in f-strings. Here the } is at column 4, but the error is reported at column 5:

$ ruff --no-cache test2.py --show-source
error: Failed to parse test2.py:1:5: f-string: single '}' is not allowed
test2.py:1:5: E999 SyntaxError: f-string: single '}' is not allowed
  |
1 | f' }'
  |     ^ E999
  |

Found 1 error.

Moreover, the reported position is wrong when lines end in carriage return + newline (CRLF). Here the } is on line 6, but the error is reported at 4:2:

$ cat -e test3.py
f'''^M$
^M$
^M$
^M$
^M$
 }'''^M$
$ ruff test3.py --fix --show-source --no-cache
error: Failed to parse test3.py:4:2: f-string: single '}' is not allowed
test3.py:4:2: E999 SyntaxError: f-string: single '}' is not allowed
  |
4 |   
5 | | 
  | |_^ E999
6 |    }'''
  |

Found 1 error.
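
For reference, the locations one would expect in the two examples above can be recomputed by hand from the byte offset of the }. The sketch below is only an illustration, not ruff's implementation: `line_column` is a hypothetical helper that assumes 1-based lines and columns, columns counted in characters, and `\n` or `\r\n` line endings.

```rust
/// Hypothetical helper (not part of ruff): returns the 1-based (line, column)
/// of the byte `offset` in `source`, counting columns in characters.
/// Returns `None` if `offset` is out of range or not on a char boundary.
/// Assumes lines end in `\n` or `\r\n`; lone `\r` line endings are not handled.
fn line_column(source: &str, offset: usize) -> Option<(usize, usize)> {
    // `str::get` refuses to slice in the middle of a code point.
    let before = source.get(..offset)?;
    // Every `\n` before the offset starts a new line.
    let line = before.matches('\n').count() + 1;
    // The current line starts right after the last `\n`, or at 0 on line 1.
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let column = before[line_start..].chars().count() + 1;
    Some((line, column))
}

fn main() {
    // First example from the report: a single-line f-string.
    let simple = "f' }'";
    let offset = simple.find('}').expect("source contains a closing brace");
    println!("{:?}", line_column(simple, offset));

    // Second example: the same error behind several CRLF-terminated lines.
    let crlf = "f'''\r\n\r\n\r\n\r\n\r\n }'''\r\n";
    let offset = crlf.find('}').expect("source contains a closing brace");
    println!("{:?}", line_column(crlf, offset));
}
```

Under those assumptions this gives 1:4 for the first example (reported as 1:5) and 6:2 for the CRLF example (reported as 4:2).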

In rare cases, this can lead to a panic when multi-byte UTF-8 characters are present (CRLF line endings followed by a multi-byte character before the mismatched }):

$ ruff --no-cache minimized-from-8e22bc8c0f8f652f62a9a858e232b0bbf18e702b
warning: Linting panicked minimized-from-8e22bc8c0f8f652f62a9a858e232b0bbf18e702b: This indicates a bug in `ruff`. If you could open an issue at:

https://github.com/astral-sh/ruff/issues/new?title=%5BLinter%20panic%5D

with the relevant file contents, the `pyproject.toml` settings, and the following stack trace, we'd be very appreciative!

panicked at 'byte index 10 is not a char boundary; it is inside '〈' (bytes 8..11) of `f'''

〈}'''
`'

Locator::after slices the source code at a byte offset. If that offset falls in the middle of a multi-byte code point, the slice would not be valid UTF-8 and the slicing panics; the miscomputed f-string positions produce such offsets. The same problem occurs for multiple f-string error types.
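
The panic can be reproduced in isolation with plain string slicing. The sketch below is a minimal illustration, not ruff's code: `after` and `after_checked` are hypothetical stand-ins, and the source string is reconstructed from the panic message above on the assumption that its line endings are CRLF.

```rust
/// Hypothetical stand-in for a byte-offset slicing helper (not ruff's code).
/// Panics if `offset` is out of range or not on a UTF-8 char boundary.
fn after(source: &str, offset: usize) -> &str {
    &source[offset..]
}

/// Checked variant: returns `None` instead of panicking.
fn after_checked(source: &str, offset: usize) -> Option<&str> {
    source.get(offset..)
}

fn main() {
    // Assumed reconstruction of the minimized input: CRLF line endings
    // followed by a 3-byte character directly before the unmatched `}`.
    let source = "f'''\r\n\r\n〈}'''\r\n";

    // The 3-byte character occupies bytes 8..11, so 10 is inside it.
    for offset in [8, 10, 11] {
        println!(
            "offset {offset}: on char boundary = {}",
            source.is_char_boundary(offset)
        );
    }

    // Slicing at a valid boundary works as expected.
    println!("{:?}", after(source, 11));

    // `after(source, 10)` would panic with "byte index 10 is not a char
    // boundary"; the checked variant reports the problem instead.
    println!("{:?}", after_checked(source, 10));
}
```

A debug assertion on `is_char_boundary` before slicing would surface a bad offset closer to where it is computed, though that only detects the miscomputation and does not fix it.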

I am still searching for a root cause.

addisoncrump changed the title from "f-string single right brace position is miscomputed" to "f-string position is miscomputed" Jun 10, 2023
addisoncrump changed the title from "f-string position is miscomputed" to "f-string position is miscomputed in multiple cases" Jun 10, 2023
dhruvmanila added the bug label Jun 10, 2023
charliermarsh pushed a commit that referenced this issue Jun 12, 2023
Improves the `ruff_parse_simple` fuzz harness by adding checks for
parsed locations to ensure they all lie on UTF-8 character boundaries.
This will allow for faster identification of issues like #5004.

This also adds additional details for Apple M1 users and clarifies the
importance of using `init-fuzzer.sh` (thanks for the feedback,
@jasikpark 🙂).
konstin pushed a commit that referenced this issue Jun 13, 2023
@charliermarsh
Member

Any idea if this is still true?

@MichaReiser
Member

I believe this is fixed. At least the offsets in the playground look correct: https://play.ruff.rs/957af831-fd05-416e-8cff-e23cab5cca2a

@addisoncrump
Contributor Author

I don't have an example testcase presently, but if I come across one I'll reopen this.
