-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong replacement of "complies" (v. comply) with "compiles" (v. compile) #2273
Comments
Yes, I wish we had a dictionary that has these types of terms but isn't utilized by default when running codespell. |
@peternewman I've encountered the false positive more than actual hits. But I still think this is valid. But it doesn't belong in code dictionary. Thoughts? |
So this is the same issue as #2281 . I'd prefer not to put them in the main code dictionary, as currently all the "typos" in there are invalid dictionary words, so we can validate that the typo doesn't exist in the dictionary at least. We talked about splitting rare into rare and really rare, depending on whether the typo exists in the core aspell dictionaries or something, and hence how likely the typo is to be the intended word, but these would still match that (so maybe a new dictionary like that should be disabled by default as there's a good chance you actually meant the typo, compared to the more obscure stuff in the really rare dictionary). Although I guess the difference is in a code context, there's a good chance crate is what you meant (for rust at least), whereas for this, most people probably mean compiles in a code context. i.e. whether you want the typo in the dictionary or not is opposite for the two. Perhaps the better way to think is because they're "popular" rare words, we can be less confident that they're actually typos and so the dictionary shouldn't be enabled by default (for corrections at least, but possibly generally). |
Also see valid word |
Today codespell check started complaining about the |
same here on the Apache License the suggestion to stop using the "rare" dictionary should work, but it seems like it should also be removed from the default dictionaries for general usability |
See codespell-project/codespell#2273 Signed-off-by: Sunjay Bhatia <[email protected]>
Okay so clearly the licence thing blows my comment above out of the water regarding code. Which leaves us with:
Suggestions very much welcome for names and descriptions for the two rare dictionaries. Running the split along aspell dictionary lines seems like the obvious thing as it makes the CI easy to implement and manage what lives in each (although I guess ideally a more restricted/core aspell dictionary than we might want for corrections. |
I honestly think that suggesting to replace one legitimate word with another without analyzing the context is not correct. |
My 1st suggestion. My 2nd suggestion in plain English description My 3rd suggestion in plain English description |
I'm not well versed into how dictionaries are setup but if the problem we're all having is just with complies/compiles replacement, and nobody claims that this replacement has been useful to them, I'd suggest just removing it. I don't see a need for disabling other useful replacements by defaults just because one of them is often faulty. The faulty one should just be removed (or can be moved to a dedicated dictionary disabled by default, but it should be one specifically with well known potential false positives). |
@peternewman @larsoner @luzpaz |
IMHO most people will appreciate a spell checker flagging chancel->cancel for example, although there will be a tiny proportion who will be annoyed it complains when they write about part of a church. As always it's a case of where you draw the line! In an ideal world we'd either come up with an alternative correction too, or perhaps better yet flag those dictionaries to not be auto-fixed even if they have single corrections so complies never silently becomes compiles.
I quite like these two options, we'll just need some clear labelling as to whether rare words or rare typos go in the rare dictionary (and vice versa).
At least one person must have found it likely as it's made it into the dictionary in the first place. Indeed it looks like it was @luzpaz in #2012 .
There's no requirement for that anyway, the one they don't want can be ignored on the command line or in a word list file.
I would suggest we still need to fix this properly, or at least not close this issue until it's fixed. Unless the word appears in one of our dictionaries, someone will just re-add it and we'll be back at the start again. I'd also strongly encourage people to run their CI against a git checkout of our latest, rather than releases (or at least our latest dictionary), I've been doing that for ages, it means you catch more typos quicker and we avoid big bangs like this. I'd also like to note that ironically complie->compile is missing from our dictionaries, despite it being the one genuine only typo amongst them all! |
I vote for quick fix and figure it out in a less time-sensitive manner. |
This is a real problem for people so I'll remove it today and cut a 2.2.1, but we should sort this out soon so I agree about not closing the issue |
See codespell-project/codespell#2273 Signed-off-by: Sunjay Bhatia <[email protected]>
whats left here ? |
@larsoner I think this issue can be closed. |
I don't believe so. @larsoner removed complies->compiles etc in #2462 as a temporary measure, but they need reintroducing to a dictionary (possibly a new one). Alternatively if its decided we don't want them anywhere due to issues around code, then we really need to tackle the slightly harder issue of a block list, so the word complies as a typo can't ever get into our dictionaries again, or issues like this will keep repeating themselves. I'd also assume this is still true, but I haven't checked:
|
Seen while testing e47c98c on the https://github.com/godotengine/godot codebase:
While it's true that "compiles" is probably more common than "complies" in codebases, the verb "comply" is still commonly used when talking about specifications. Maybe this hit should be downgraded to a hint of potential typo instead of an autofix one?
The text was updated successfully, but these errors were encountered: