Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve multiline strings in code generator #10000

Closed
wants to merge 3 commits into from

Conversation

sanxiyn
Copy link
Contributor

@sanxiyn sanxiyn commented Feb 15, 2024

This is a draft PR for #7799 posted for discussion. Minimally tested with cargo dev round-trip.

Extending ruff_python_literal to know more about Python string literals (raw, triple, etc.) should be uncontroversial. Changes here are incomplete (no raw, universal newline, bytes...) but completing shouldn't be difficult.

I am less sure about ruff_python_codegen changes. AST doesn't have enough information. Instead of adding needed information (after all, it's AST not CST), in this PR Generator has an optional Locator. When Locator is available, string kind information is taken from original source and passed to ruff_python_literal. Behavior is unchanged if Locator is unavailable.

Copy link
Contributor

github-actions bot commented Feb 15, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@sanxiyn sanxiyn force-pushed the codegen-multiline-string branch from bd55daf to 3f64ab2 Compare February 15, 2024 15:31
@AlexWaygood
Copy link
Member

@sanxiyn -- in #10298, I modified ruff's AST so that we now track many stylistic details in the AST that were previously not tracked:

  • Whether a string is raw or not
  • The quoting style (double or single quotes)
  • Whether a string is triple-quoted or not

Would you be interested in updating your PR to use the information newly available in our AST? If not, I can pick this up and list you as a co-author

@AlexWaygood AlexWaygood force-pushed the codegen-multiline-string branch from dfc8fe9 to ca44a83 Compare March 13, 2024 17:47
@MichaReiser
Copy link
Member

Using the Locator in the Generator is clever to avoid tracking additional data in the AST. However, it won't work for AST nodes with no corresponding element in the source (AST nodes that have been built or modified manually). We could dedect such nodes by testing if the range is 0..0, but I'm somewhat hesitant about making the Generator depend on the Locator.

@charliermarsh
Copy link
Member

I'm going to close this PR for now as @AlexWaygood is planning to take a different approach by leveraging the augmented AST.

@charliermarsh
Copy link
Member

Thanks @sanxiyn!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants