Implement many-to-one mapping between codes and rules #2517

not-my-profile · 2023-02-03T05:56:39Z

Implements #2186.

charliermarsh · 2023-02-03T21:13:37Z

This is really exciting.

I've read through some of the code and commits but not the entirety of the change (yet). From my testing, it seems like every rule has one "primary" code. So useless-object-inheritance is mapped to PLR0205, PIE792, and UP004. And no matter how you --select it, it's always reported as PIE792. Similarly, you have to use PIE792 to # noqa it. I think it could be confusing to --select UP and see PIE792 violations. How do you view it? Is that an accurate description of current behavior?

charliermarsh · 2023-02-03T21:24:25Z

I'm not sure what I would expect as a user. I could imagine a few things:

- If I --select UP --select PIE, I see the error twice, once under both codes. (If I --select UP, I see the error once, under UP004; if I --select PIE, I see the error once, under PIE792.)
- If I --select UP --select PIE, I see the error once, under the "first" (?) code, or under the preferred code as is implicitly implemented here. (If I --select UP, I see the error once, under UP004; if I --select PIE, I see the error once, under PIE792.)

I don't know how I'd expect noqa violations to work: should any matching code ignore the violation despite the selector that was used to enable it? Or should I be required to add a noqa for every matching code individually?

not-my-profile · 2023-02-04T03:20:36Z

> From my testing, it seems like every rule has one "primary" code. I think it could be confusing to --select UP and see PIE792 violations. How do you view it? Is that an accurate description of current behavior?

Yes, that's how it currently works. I'd expect us to switch to human-friendly rule names (#1773), so in the future, no matter which code you used to select a rule, ruff will always report only the human-friendly rule name. Until we make that switch the behavior will be a bit confusing, but I don't see a good way around that.

> see the error twice

I am positive that we absolutely do not want to report any error multiple times under any circumstance.

Note that it's entirely possible for a rule to be enabled via multiple codes. I don't think we want to report multiple codes for a single violation, so we have to pick one ... until we have come up with rule naming guidelines and renamed our rules accordingly.

> should any matching code ignore the violation despite the selector that was used to enable it?

Yes. A rule can be identified via multiple codes. Which code you used to enable a rule doesn't matter.
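The behavior described in this exchange can be sketched roughly as follows. This is a minimal Python model, not ruff's implementation: `CODE_TO_RULE`, `NOQA_CODE`, `select_rules`, and `is_suppressed` are hypothetical names, and the three aliases are the ones from the testing example earlier in the thread.

```python
# Hypothetical model of a many-to-one mapping; not ruff's actual data structures.
CODE_TO_RULE = {
    "PLR0205": "useless-object-inheritance",
    "PIE792": "useless-object-inheritance",
    "UP004": "useless-object-inheritance",
}

# The single code a violation is reported under (the "primary" code).
NOQA_CODE = {"useless-object-inheritance": "PIE792"}


def select_rules(selectors):
    """Return the set of rules enabled by a list of code prefixes."""
    return {
        rule
        for code, rule in CODE_TO_RULE.items()
        if any(code.startswith(selector) for selector in selectors)
    }


def is_suppressed(rule, noqa_codes):
    """A violation is ignored if any code that maps to its rule is listed."""
    return any(CODE_TO_RULE.get(code) == rule for code in noqa_codes)


# Selecting through two aliases still enables the rule only once,
# so it is reported once, under its primary code.
enabled = select_rules(["UP", "PIE"])
reported = [NOQA_CODE[rule] for rule in sorted(enabled)]
```

Because selection resolves to a set of rules rather than a list of codes, the "report twice" option cannot occur in this model, and a `# noqa` naming any alias suppresses the violation regardless of how the rule was enabled.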

charliermarsh · 2023-02-04T03:46:24Z

I worry that running --select UP and seeing a PIE792 violation is a confusing enough experience that I'd want to really consider whether we enable these at all prior to migrating to human-friendly rule names.

charliermarsh · 2023-02-12T05:22:12Z

I'm hesitant to merge this as-is due to some of the confusing behaviors that users will experience around aliasing (e.g., the --select UP-to-PIE scenario described above).

I don't have great answers for them yet, but I know that if we ship this as-is, it will be confusing for users, and we'll get a lot of feedback and questions stemming from that confusion.

If we want to merge this while giving ourselves time to figure out the best solution for aliasing, what we could do is merge this change but not yet implement any of the actual aliases (apart from, perhaps, deprecating rules, like BLE001).

not-my-profile · 2023-02-12T06:04:41Z

Right, that makes sense. I think the course of action is:

1. Rename our rules to match the naming convention ... this does entail the merging of several rules (see #2714).
   (We do this first since after the next step renaming rules will be a breaking change.)
2. Allow rules to be selected by their name and report violations by their name.
3. Enable the many-to-one mapping.

I already rebased this PR yesterday, which was quite work-intensive since #2583 had landed in between. Further rebasing shouldn't take that much effort, so I'd be alright with leaving this PR open ... but yeah it would be nice to merge this just so that I don't have to deal with merging other changes that may crop up.

I could add an assert statement to the map_codes proc macro to assert that at most one code is mapped to one rule ... then we could merge this without any UX changes¹ and simply remove the assert statements once we get to the above mentioned 3rd step.

¹ Aside from the C, C9, T, T1, T2 prefix deprecations in the 7th commit.
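The proposed guard could look roughly like this. It is a Python sketch of the check only, assuming the macro sees a flat list of (code, rule) pairs; the actual map_codes proc macro is Rust, and `assert_one_code_per_rule` is a hypothetical name.

```python
from collections import defaultdict


def assert_one_code_per_rule(mapping):
    """Raise AssertionError if any rule is reachable via more than one code.

    `mapping` is a list of (code, rule) pairs, a stand-in for the entries
    the proc macro would process.
    """
    codes_by_rule = defaultdict(list)
    for code, rule in mapping:
        codes_by_rule[rule].append(code)
    for rule, codes in codes_by_rule.items():
        assert len(codes) <= 1, f"{rule} is mapped to multiple codes: {codes}"


# Passes: the rule has exactly one code.
assert_one_code_per_rule([("UP004", "useless-object-inheritance")])
```

Adding a second pair such as `("PIE792", "useless-object-inheritance")` would trip the assertion, which is what keeps the UX unchanged until the check is removed.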

sbrugman · 2023-02-13T18:45:44Z

(Heads-up: I will rename the pathlib rules in #2348 to os-path - the thing we detect)

not-my-profile · 2023-02-14T04:44:51Z

I think landing this PR will take some coordination: it should be merged before any other registry.rs changes, since it changes the format of the frequently edited registry.rs file.

We want to remove the variants denoting whole Linters from the RuleCodePrefix enum, so we have to introduce a new RuleSelector::Linter variant.

Post this commit series several codes can be mapped to a single rule, this commit therefore renames Rule::code to Rule::noqa_code, which is the code that --add-noqa will add to ignore a rule.

charliermarsh · 2023-02-14T05:02:22Z

👍 I’m happy to merge this next assuming it doesn’t change the UX right now — is that the case? I have to re-read the code but I’ll review tomorrow after the JetBrains webinar, and hopefully can merge tomorrow afternoon or evening.

not-my-profile · 2023-02-14T05:06:05Z

Ok great. Yes, once I have dropped the last commit demonstrating the mapping and added the aforementioned assert statement, there shouldn't be any UX changes.

Rule::noqa_code previously returned a single &'static str, which was possible because we had one enum listing all rule code prefixes. This commit series will however split up the RuleCodePrefix enum into several enums ... so we'll end up with two &'static str ... this commit wraps the return type of Rule::noqa_code into a newtype so that we can easily change it to return two &'static str in the 6th commit of this series.
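The idea behind the newtype can be illustrated in Python. This is only an analogy for the Rust change: `NoqaCode` and its field names are hypothetical, and the sketch assumes the two strings are a linter prefix and a code suffix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NoqaCode:
    """Hypothetical stand-in for the newtype returned by Rule::noqa_code."""

    linter_prefix: str  # e.g. "UP"
    code_suffix: str  # e.g. "004"

    def __str__(self):
        # Callers only format the value, so how it is stored stays an
        # internal detail that can change without touching call sites.
        return self.linter_prefix + self.code_suffix


code = NoqaCode("UP", "004")
line = f"x = 1  # noqa: {code}"
```

Since callers go through the wrapper's formatting rather than handling a raw string, switching the internals from one string to two does not ripple through the code base.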

Same reasoning as for the previous commit ... one &'static str becomes two &'static str because we split the RuleCodePrefix enum. Note that the .unwrap() we have to add now, will actually be removed in the 6th commit.

Currently the define_rule_mapping! macro generates both the Rule enum as well as the RuleCodePrefix enum and the mapping between the two. After this commit series the macro will only generate the Rule enum and the RuleCodePrefix enum and the mapping will be generated by a new map_codes proc macro, so we rename the macro now to fit its new purpose.

# This commit was generated by running the following Python code:
# (followed by `sed -Ei 's/(mod registry;)/\1mod codes;/' crates/ruff/src/lib.rs`
# and `cargo fmt`).

import json
import re
import subprocess


def parse_registry():
    file = open('crates/ruff/src/registry.rs')

    rules = []

    while next(file) != 'ruff_macros::register_rules!(\n':
        continue

    while (line := next(file)) != ');\n':
        line = line.strip().rstrip(',')
        if line.startswith('//') or line.startswith('#['):
            rules.append(line)
            continue
        code, path = line.split(' => ')
        name = path.rsplit('::')[-1]
        rules.append((code, name))

    while (line := next(file)) != 'pub enum Linter {\n':
        continue

    prefixes = []
    prefix2linter = []

    while (line := next(file).strip()) != '}':
        if line.startswith('//'):
            continue
        if line.startswith('#[prefix = '):
            prefixes.append(line.split()[-1].strip('"]'))
        else:
            for prefix in prefixes:
                prefix2linter.append((prefix, line.rstrip(',')))
            prefixes.clear()

    prefix2linter.sort(key = lambda t: len(t[0]), reverse=True)

    return rules, prefix2linter


rules, prefix2linter = parse_registry()


def parse_code(code):
    prefix = re.match('[A-Z]+', code).group()
    if prefix in ('E', 'W'):
        return 'Pycodestyle', code

    for prefix, linter in prefix2linter:
        if code.startswith(prefix):
            return linter, code[len(prefix) :]

    assert False


text = '''
use crate::registry::{Linter, Rule};

pub fn code_to_rule(linter: Linter, code: &str) -> Option<Rule> {
    #[allow(clippy::enum_glob_use)]
    use Linter::*;

    Some(match (linter, code) {
'''

for entry in rules:
    if isinstance(entry, str):
        if entry.startswith('//'):
            text += '\n' + entry
        else:
            text += entry
    else:
        namespace, code = parse_code(entry[0])
        text += f'({namespace}, "{code}") => Rule::{entry[1]},'
    text += '\n'

text += '''
        _ => return None,
    })
}
'''

with open('crates/ruff/src/codes.rs', 'w') as f:
    f.write(text)
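The script sorts prefix2linter by prefix length, longest first, before matching. A small self-contained sketch shows why that matters when one prefix is the start of another. The table below is an illustrative subset with assumed linter names, not the list parsed from registry.rs, and `split_code` is a hypothetical simplification of parse_code.

```python
# Illustrative subset only; the linter names here are assumptions.
prefix2linter = [
    ("S", "Flake8Bandit"),
    ("SIM", "Flake8Simplify"),
    ("UP", "Pyupgrade"),
]

# Longest prefixes first, so "SIM" is tried before "S".
prefix2linter.sort(key=lambda t: len(t[0]), reverse=True)


def split_code(code):
    """Split a full code into (linter, remainder) using the longest matching prefix."""
    for prefix, linter in prefix2linter:
        if code.startswith(prefix):
            return linter, code[len(prefix):]
    raise ValueError(f"no linter prefix matches {code!r}")


sim = split_code("SIM104")
bandit = split_code("S101")
```

Without the sort, "SIM104" would match the shorter "S" prefix first and be split into the wrong linter with the remainder "IM104".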

This commit was generated by running: fastmod --accept-all '[A-Z]+[0-9]+ => ' '' crates/ruff/src/registry.rs
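To see what that pattern removes, here is the same substitution applied with Python's `re` module. The two registry entries are made-up examples of the `CODE => path` format that the script above parses, and fastmod uses Rust regex syntax, though this simple pattern should behave the same in both engines.

```python
import re

# Hypothetical registry.rs entries in the old "CODE => path" format.
before = (
    "    E501 => rules::pycodestyle::rules::LineTooLong,\n"
    "    UP004 => rules::pyupgrade::rules::UselessObjectInheritance,\n"
)

# Same pattern as the fastmod invocation, with an empty replacement.
after = re.sub(r"[A-Z]+[0-9]+ => ", "", before)
```

Each entry keeps only its rule path, which fits the goal of moving the code-to-rule mapping out of registry.rs.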

charliermarsh · 2023-02-14T20:43:14Z

(Returning to this now.)

not-my-profile · 2023-02-14T20:54:55Z

Let me know if you have questions.

charliermarsh · 2023-02-14T21:10:38Z

Cool this looks good to me -- merging...

not-my-profile force-pushed the many-to-one branch 2 times, most recently from 6dd9837 to 94cbacf Compare February 3, 2023 06:28

ngnpope mentioned this pull request Feb 3, 2023

rule command: show all code aliases for a rule #2531

Open

ngnpope mentioned this pull request Feb 9, 2023

Human-friendly rule names #1773

Open

1 task

charliermarsh mentioned this pull request Feb 10, 2023

Merge rules checking the same violation #2714

Open

7 tasks

not-my-profile force-pushed the many-to-one branch 6 times, most recently from c7f549a to 00b2bd7 Compare February 10, 2023 21:17

sbrugman mentioned this pull request Feb 11, 2023

[flake8-use-pathlib] autofix and new rules #2348

Closed

not-my-profile mentioned this pull request Feb 12, 2023

ruff autocompletion for --select rules #2808

Closed

sbrugman mentioned this pull request Feb 13, 2023

Implement SIM104 as an alias of UP028 #2866

Closed

not-my-profile added 2 commits February 14, 2023 05:54

many-to-one 0/9: Introduce RuleSelector::Linter variant

016e3b0

We want to remove the variants denoting whole Linters from the RuleCodePrefix enum, so we have to introduce a new RuleSelector::Linter variant.

many-to-one 1/9: Rename Rule::code to Rule::noqa_code

8f90f95

Post this commit series several codes can be mapped to a single rule, this commit therefore renames Rule::code to Rule::noqa_code, which is the code that --add-noqa will add to ignore a rule.

not-my-profile added 4 commits February 14, 2023 06:57

many-to-one 3/9: Update RuleSelector::short_code

ce488b1

Same reasoning as for the previous commit ... one &'static str becomes two &'static str because we split the RuleCodePrefix enum. Note that the .unwrap() we have to add now, will actually be removed in the 6th commit.

not-my-profile added 4 commits February 14, 2023 06:57

many-to-one 6/9: Implement ruff_macros::map_codes

54bf613

many-to-one 7/9: Update JSON schema

796c9d0

many-to-one 8/9: Drop codes from registry

95087ed

This commit was generated by running: fastmod --accept-all '[A-Z]+[0-9]+ => ' '' crates/ruff/src/registry.rs

many-to-one 9/9: Update table generation

0d5ed29

not-my-profile force-pushed the many-to-one branch from 00b2bd7 to 9b92e55 Compare February 14, 2023 06:12

Disable many-to-one mapping for now

fdc71d0

not-my-profile force-pushed the many-to-one branch from 9b92e55 to fdc71d0 Compare February 14, 2023 06:13

charliermarsh reviewed Feb 14, 2023

ruff.schema.json

charliermarsh reviewed Feb 14, 2023

crates/ruff_cli/src/printer.rs

charliermarsh reviewed Feb 14, 2023

crates/ruff/src/settings/pyproject.rs

charliermarsh merged commit 3179fc1 into astral-sh:main Feb 14, 2023
