Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong replacement of "complies" (v. comply) with "compiles" (v. compile) #2273

Open
akien-mga opened this issue Feb 10, 2022 · 18 comments
Open
Labels
dictionary Changes to the dictionary

Comments

@akien-mga
Copy link
Contributor

Seen while testing e47c98c on the https://github.com/godotengine/godot codebase:

diff --git a/doc/classes/EditorPaths.xml b/doc/classes/EditorPaths.xml
index c4d4c92afe..afd55d15f3 100644
--- a/doc/classes/EditorPaths.xml
+++ b/doc/classes/EditorPaths.xml
@@ -6,7 +6,7 @@
	<description>
		This editor-only singleton returns OS-specific paths to various data folders and files. It can be used in editor plugins to ensure files are saved in the correct location on each operating system.
		[b]Note:[/b] This singleton is not accessible in exported projects. Attempting to access it in an exported project will result in a script error as the singleton won't be declared. To prevent script errors in exported projects, use [method Engine.has_singleton] to check whether the singleton is available before using it.
-		[b]Note:[/b] Godot complies with the [url=https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html]XDG Base Directory Specification[/url] on [i]all[/i] platforms. You can override environment variables following the specification to change the editor and project data paths.
+		[b]Note:[/b] Godot compiles with the [url=https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html]XDG Base Directory Specification[/url] on [i]all[/i] platforms. You can override environment variables following the specification to change the editor and project data paths.
	</description>
	<tutorials>
		<link title="File paths in Godot projects">https://docs.godotengine.org/en/latest/tutorials/io/data_paths.html</link>

While it's true that "compiles" is probably more common than "complies" in codebases, the verb "comply" is still commonly used when talking about specifications. Maybe this hit should be downgraded to a hint of potential typo instead of an autofix one?

@luzpaz
Copy link
Collaborator

luzpaz commented Feb 11, 2022

Yes, I wish we had a dictionary that has these types of terms but isn't utilized by default when running codespell.

@luzpaz
Copy link
Collaborator

luzpaz commented Mar 2, 2022

@peternewman I've encountered the false positive more than actual hits. But I still think this is valid. But it doesn't belong in code dictionary. Thoughts?

@peternewman peternewman added the dictionary Changes to the dictionary label Mar 3, 2022
@peternewman
Copy link
Collaborator

So this is the same issue as #2281 .

I'd prefer not to put them in the main code dictionary, as currently all the "typos" in there are invalid dictionary words, so we can validate that the typo doesn't exist in the dictionary at least.

We talked about splitting rare into rare and really rare, depending on whether the typo exists in the core aspell dictionaries or something, and hence how likely the typo is to be the intended word, but these would still match that (so maybe a new dictionary like that should be disabled by default as there's a good chance you actually meant the typo, compared to the more obscure stuff in the really rare dictionary).

Although I guess the difference is in a code context, there's a good chance crate is what you meant (for rust at least), whereas for this, most people probably mean compiles in a code context. i.e. whether you want the typo in the dictionary or not is opposite for the two.

Perhaps the better way to think is because they're "popular" rare words, we can be less confident that they're actually typos and so the dictionary shouldn't be enabled by default (for corrections at least, but possibly generally).

@vikivivi
Copy link
Contributor

Also see valid word complies in Apache License.
https://www.apache.org/licenses/LICENSE-2.0
distribution of the Work otherwise complies with the conditions stated in this License.

@TheMostDiligent
Copy link

Today codespell check started complaining about the complies in the license files.

@sunjayBhatia
Copy link

same here on the Apache License

the suggestion to stop using the "rare" dictionary should work, but it seems like it should also be removed from the default dictionaries for general usability

@peternewman
Copy link
Collaborator

whereas for this, most people probably mean compiles in a code context. i.e. whether you want the typo in the dictionary or not is opposite for the two.

Okay so clearly the licence thing blows my comment above out of the water regarding code.

Which leaves us with:

We talked about splitting rare into rare and really rare, depending on whether the typo exists in the core aspell dictionaries or something, and hence how likely the typo is to be the intended word, but these would still match that (so maybe a new dictionary like that should be disabled by default as there's a good chance you actually meant the typo, compared to the more obscure stuff in the really rare dictionary).

Perhaps the better way to think is because they're "popular" rare words, we can be less confident that they're actually typos and so the dictionary shouldn't be enabled by default (for corrections at least, but possibly generally).

Suggestions very much welcome for names and descriptions for the two rare dictionaries. Running the split along aspell dictionary lines seems like the obvious thing as it makes the CI easy to implement and manage what lives in each (although I guess ideally a more restricted/core aspell dictionary than we might want for corrections.

@TheMostDiligent
Copy link

I honestly think that suggesting to replace one legitimate word with another without analyzing the context is not correct.

@vikivivi
Copy link
Contributor

vikivivi commented Aug 18, 2022

Suggestions very much welcome for names and descriptions for the two rare dictionaries.

My 1st suggestion.
Simply rename rare dictionary to likely and disable it by default.

My 2nd suggestion in plain English description
likely1: valid words that are likely to be errors (default: enabled)
likely2: valid words that are likely to be errors (default: disabled)

My 3rd suggestion in plain English description
likely: valid words that are likely to be errors (default: enabled)
plausible: valid words that seems likely to be errors (default: disabled)

@akien-mga
Copy link
Contributor Author

akien-mga commented Aug 18, 2022

I'm not well versed into how dictionaries are setup but if the problem we're all having is just with complies/compiles replacement, and nobody claims that this replacement has been useful to them, I'd suggest just removing it.

I don't see a need for disabling other useful replacements by defaults just because one of them is often faulty. The faulty one should just be removed (or can be moved to a dedicated dictionary disabled by default, but it should be one specifically with well known potential false positives).

@vikivivi
Copy link
Contributor

@peternewman @larsoner @luzpaz
A quick fix is most likely needed by removing complies->compiles in dictionary_rare.txt and make a new release quickly.
Yesterday, a new codespell stable release v2.2.0 was made. Many CI/CD checks is installing the latest stable release.
This new release leads to many users start reporting complies issue.

@peternewman
Copy link
Collaborator

I honestly think that suggesting to replace one legitimate word with another without analyzing the context is not correct.

IMHO most people will appreciate a spell checker flagging chancel->cancel for example, although there will be a tiny proportion who will be annoyed it complains when they write about part of a church. As always it's a case of where you draw the line!

In an ideal world we'd either come up with an alternative correction too, or perhaps better yet flag those dictionaries to not be auto-fixed even if they have single corrections so complies never silently becomes compiles.

Suggestions very much welcome for names and descriptions for the two rare dictionaries.

My 1st suggestion. Simply rename rare dictionary to likely and disable it by default.
My 3rd suggestion in plain English description likely: valid words that are likely to be errors (default: enabled) plausible: valid words that seems likely to be errors (default: disabled)

I quite like these two options, we'll just need some clear labelling as to whether rare words or rare typos go in the rare dictionary (and vice versa).

I'm not well versed into how dictionaries are setup but if the problem we're all having is just with complies/compiles replacement, and nobody claims that this replacement has been useful to them, I'd suggest just removing it.

At least one person must have found it likely as it's made it into the dictionary in the first place. Indeed it looks like it was @luzpaz in #2012 .

I don't see a need for disabling other useful replacements by defaults just because one of them is often faulty. The faulty one should just be removed (or can be moved to a dedicated dictionary disabled by default, but it should be one specifically with well known potential false positives).

There's no requirement for that anyway, the one they don't want can be ignored on the command line or in a word list file.

@peternewman @larsoner @luzpaz A quick fix is most likely needed by removing complies->compiles in dictionary_rare.txt and make a new release quickly. Yesterday, a new codespell stable release v2.2.0 was made. Many CI/CD checks is installing the latest stable release. This new release leads to many users start reporting complies issue.

I would suggest we still need to fix this properly, or at least not close this issue until it's fixed. Unless the word appears in one of our dictionaries, someone will just re-add it and we'll be back at the start again.

I'd also strongly encourage people to run their CI against a git checkout of our latest, rather than releases (or at least our latest dictionary), I've been doing that for ages, it means you catch more typos quicker and we avoid big bangs like this.

I'd also like to note that ironically complie->compile is missing from our dictionaries, despite it being the one genuine only typo amongst them all!

@luzpaz
Copy link
Collaborator

luzpaz commented Aug 18, 2022

I vote for quick fix and figure it out in a less time-sensitive manner.

@larsoner
Copy link
Member

I would suggest we still need to fix this properly, or at least not close this issue until it's fixed.

This is a real problem for people so I'll remove it today and cut a 2.2.1, but we should sort this out soon so I agree about not closing the issue

@larsoner
Copy link
Member

#2462

skriss pushed a commit to projectcontour/contour-operator that referenced this issue Aug 18, 2022
@luzpaz
Copy link
Collaborator

luzpaz commented Sep 26, 2022

whats left here ?

@kurtmckee
Copy link
Contributor

@larsoner I think this issue can be closed.

@peternewman
Copy link
Collaborator

I don't believe so. @larsoner removed complies->compiles etc in #2462 as a temporary measure, but they need reintroducing to a dictionary (possibly a new one).

Alternatively if its decided we don't want them anywhere due to issues around code, then we really need to tackle the slightly harder issue of a block list, so the word complies as a typo can't ever get into our dictionaries again, or issues like this will keep repeating themselves.

I'd also assume this is still true, but I haven't checked:

I'd also like to note that ironically complie->compile is missing from our dictionaries, despite it being the one genuine only typo amongst them all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dictionary Changes to the dictionary
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants