Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F504: Carriage return prevents expression extraction #4899

Closed
Tracked by #4972
addisoncrump opened this issue Jun 6, 2023 · 8 comments · Fixed by #7062
Closed
Tracked by #4972

F504: Carriage return prevents expression extraction #4899

addisoncrump opened this issue Jun 6, 2023 · 8 comments · Fixed by #7062
Labels
bug Something isn't working

Comments

@addisoncrump
Copy link
Contributor

addisoncrump commented Jun 6, 2023

In Python, a standalone carriage return character (0xd or \r) is considered whitespace. One could easily argue that this is quite problematic, given that carriage returns can potentially be used to hide parts of source code. That said, it does trigger a failure in F504 (\r replaced literally):

"" % {
  'test1': '',\r  'test2': '',
}

And a downloadable version to test against.

I would simply mark F504 as sometimes fixable, but this solution doesn't feel right. Instead, this feels indicative of incorrect handling of carriage returns as plain whitespace. At the same time, standalone carriage returns should probably be marked as invalid anyways.

I leave this to the discretion of the developers of Ruff 🙂 We should find a fix one way or another to avoid future issues with this being marked as a failure case in the fuzzer.

Discovered by #4822.

@charliermarsh
Copy link
Member

Perhaps a bug in LibCST, which is failing to parse it?

@addisoncrump
Copy link
Contributor Author

I'm not so clear on that. Is this because libCST is used to parse the expression for repair?

@charliermarsh
Copy link
Member

Yeah

@charliermarsh
Copy link
Member

The LibCST parse call throws an error, but it could be our fault, not theirs. We may need to wrap the expression in parentheses before parsing.

@dhruvmanila
Copy link
Member

We may need to wrap the expression in parentheses before parsing.

I did a quick experiment and this doesn't help either, it still throws an error:

libcst_native::parse_expression(&format!("({})", &contents))
Failed to parse file: Internal error while parsing whitespace: Found newline at (2, 15) but it's not EOL

@MichaReiser
Copy link
Member

MichaReiser commented Aug 21, 2023

This is a bug in LibCST. It seems they don't support mixed newlines:

https://github.com/Instagram/LibCST/blob/c91655fbba7c27db05423673609f10203316b9c0/native/libcst/src/tokenizer/whitespace_parser.rs#L67-L88

Changing the split to split on all new lines seems trivial enough, but I worry that there's other code that simply doesn't work if you use \r

For example:

https://github.com/Instagram/LibCST/blob/c91655fbba7c27db05423673609f10203316b9c0/native/libcst/src/tokenizer/whitespace_parser.rs#L90-L94

@MichaReiser
Copy link
Member

MichaReiser commented Aug 30, 2023

Upstream fix in LibCST Instagram/LibCST#1007

@MichaReiser
Copy link
Member

The upstream fix is merged. All that we need to do now is to update libcst 🫣

dhruvmanila added a commit that referenced this issue Sep 3, 2023
## Summary

This PR updates the revision of `LibCST` dependency to Instagram/LibCST@9c263aa
inorder to fix #4899

## Test Plan

The test case including the carriage return (`\r`) character was added for
`F504` and then `cargo test`.

fixes: #4899
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants