Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pyupgrade] Do not report when a UTF-8 comment is followed by a non-UTF-8 one (UP009) #14728

Merged
merged 13 commits into from
Dec 11, 2024

Conversation

InSyncWithFoo
Copy link
Contributor

Summary

Resolves #14704.

Test Plan

cargo nextest run and cargo insta test.

Copy link
Contributor

github-actions bot commented Dec 2, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should address the case when there are multiple coding comments (that might also be on lines other than the first two)

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks for implementing the count_lines method!

There's one more edge case that needs handling

#! /usr/bin/env/python
# -*- coding: utf-8 -*-
# -*- coding: ascii -*-

The second comment gets flagged as unnecessary, which is incorrect because removing it would change the encoding to ascii

@InSyncWithFoo
Copy link
Contributor Author

Cases to handle, as a table:

1 2 3 E
- - -
- - UTF-8
- - Other
- UTF-8 - x
- UTF-8 UTF-8 x
- UTF-8 Other
- Other -
- Other UTF-8
- Other Other
UTF-8 - - x
UTF-8 - UTF-8 x
UTF-8 - Other x
UTF-8 UTF-8 - x
UTF-8 UTF-8 UTF-8 x
UTF-8 UTF-8 Other x
UTF-8 Other -
UTF-8 Other UTF-8
UTF-8 Other Other
Other - -
Other - UTF-8
Other - Other
Other UTF-8 -
Other UTF-8 UTF-8
Other UTF-8 Other
Other Other -
Other Other UTF-8
Other Other Other

@MichaReiser
Copy link
Member

@InSyncWithFoo what's the status of this PR? Is it ready for re-review?

@InSyncWithFoo
Copy link
Contributor Author

@MichaReiser It is. Please do.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the feedback.

The rule currently incorrectly flags the following comments:

x = 10
# -*- coding: utf-8 -*-
# -*- coding: utf-8 -*-
x = 10
# -*- coding: utf-8 -*-

I think we need to make the comment_ranges iterator more stateful. It should never return Some after it has seen any non-comment. It might be worth implementing a custom iterator and tracking some internal state.

@MichaReiser MichaReiser added rule Implementing or modifying a lint rule bug Something isn't working labels Dec 11, 2024
@MichaReiser MichaReiser enabled auto-merge (squash) December 11, 2024 10:25
@MichaReiser MichaReiser merged commit c8d505c into astral-sh:main Dec 11, 2024
20 checks passed
@InSyncWithFoo InSyncWithFoo deleted the UP009 branch December 11, 2024 12:52
TheBits pushed a commit to TheBits/ruff that referenced this pull request Dec 11, 2024
…-UTF-8 one (`UP009`) (astral-sh#14728)

## Summary

Resolves astral-sh#14704.

## Test Plan

`cargo nextest run` and `cargo insta test`.

---------

Co-authored-by: Micha Reiser <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UP009 fix changes a file from UTF-8 to a different declared encoding
3 participants