The whitelist does not work when words have mixed case #1860

Closed
xsmq opened this issue Jan 21, 2021 · 5 comments

@xsmq

xsmq commented Jan 21, 2021

1. Without a whitelist

$ codespell -q 7 mindspore/mindspore/lite/tools/converter/parser/tf
mindspore/mindspore/lite/tools/converter/parser/tf/tf_merge_parser.cc:39: MergeT ==> merge
mindspore/mindspore/lite/tools/converter/parser/tf/tf_activation_parser.cc:66: shoud ==> should

2. Add whitelist

(1) MergeT

$ cat codespell.txt
MergeT
shoud

$ codespell -q 7 -I codespell.txt mindspore/mindspore/lite/tools/converter/parser/tf
mindspore/mindspore/lite/tools/converter/parser/tf/tf_merge_parser.cc:39: MergeT ==> merge

(2) nNumber

$ cat codespell.allow
MergeT
nNumber

$ codespell -q 7 -I codespell.allow mindspore/mindspore/ccsrc/minddata/dataset/engine/connector.h
mindspore/mindspore/ccsrc/minddata/dataset/engine/connector.h:151: nNumber ==> number
mindspore/mindspore/ccsrc/minddata/dataset/engine/connector.h:152: nNumber ==> number

(3) REALEASE

$ cat codespell.allow
pyhton
REALEASE

$ codespell -q 7 -I codespell.allow mindspore/mindspore/lite/examples/train_lenet/README.md
mindspore/mindspore/lite/examples/train_lenet/README.md:67: REALEASE ==> RELEASE
mindspore/mindspore/lite/examples/train_lenet/README.md:68: betweeen ==> between
mindspore/mindspore/lite/examples/train_lenet/README.md:72: followings ==> following
mindspore/mindspore/lite/examples/train_lenet/README.md:78: paramaters ==> parameters

@r3econ

r3econ commented Apr 26, 2021

I'm experiencing the same problem. It makes using this tool in production impossible.

@peternewman
Collaborator

Hi @xsmq,

As mentioned to @r3econ in codespell-project/actions-codespell#29, the main codespell help (https://github.com/codespell-project/codespell#readme) says:

Important note: The list passed to -I is case-sensitive based on how it is listed in the codespell dictionaries.

2. Add whitelist

(1) MergeT

$ grep -iIR merget codespell_lib/data/
codespell_lib/data/dictionary.txt:merget->merge

(2) nNumber

$ cat codespell.allow
MergeT
nNumber

$ grep -iIR nnumber codespell_lib/data/
codespell_lib/data/dictionary.txt:nnumber->number

(3) REALEASE

$ cat codespell.allow
pyhton
REALEASE

$ grep -iIR ^realease codespell_lib/data/
codespell_lib/data/dictionary.txt:realease->release
codespell_lib/data/dictionary.txt:realeased->released
codespell_lib/data/dictionary.txt:realeases->releases

So you want these in your codespell.allow:

merget
nnumber
realease
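
The same lookup can be scripted instead of running grep by hand. The following is only a rough sketch, not part of codespell: the helper names `dictionary_spellings` and `normalize_allow_list` are hypothetical, and the sample entries are just the lines from the grep output above, assuming the `misspelling->correction` layout they show. It rewrites each allow-list word to the case used in the dictionary and reports words it cannot find.

```python
# Sample entries copied from the grep output above; the real
# codespell_lib/data/dictionary.txt is much larger.
SAMPLE_DICTIONARY = """\
merget->merge
nnumber->number
realease->release
realeased->released
realeases->releases
"""


def dictionary_spellings(dictionary_text):
    """Map each lower-cased misspelling to the spelling used in the dictionary.

    Hypothetical helper; assumes one 'misspelling->correction' entry per line.
    """
    spellings = {}
    for line in dictionary_text.splitlines():
        if "->" not in line:
            continue
        misspelling = line.split("->", 1)[0]
        spellings[misspelling.lower()] = misspelling
    return spellings


def normalize_allow_list(words, dictionary_text):
    """Rewrite allow-list words to the dictionary's case; collect unknown words."""
    spellings = dictionary_spellings(dictionary_text)
    normalized, unknown = [], []
    for word in words:
        key = word.strip().lower()
        if not key:
            continue
        if key in spellings:
            normalized.append(spellings[key])
        else:
            unknown.append(word.strip())
    return normalized, unknown


normalized, unknown = normalize_allow_list(
    ["MergeT", "nNumber", "REALEASE", "pyhton"], SAMPLE_DICTIONARY
)
print(normalized)  # ['merget', 'nnumber', 'realease']
print(unknown)     # ['pyhton'] (only because the sample above omits it)
```

Against the full dictionary file, pass its contents as `dictionary_text` instead of the sample string.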

@xsmq
Author

xsmq commented Jun 28, 2021

OK, thanks.

@peterjc
Contributor

peterjc commented Aug 29, 2021

I just struggled with this with an all capital term, having initially expected this to mean case sensitive to match the input file. That does seem a more intuitive behaviour - although a change to the tool.

xsmq closed this as completed Nov 30, 2021
@peternewman
Collaborator

> I just struggled with this with an all capital term, having initially expected this to mean case sensitive to match the input file.

@peterjc, we'd welcome suggestions for how to make the existing help text clearer and less confusing:

Important note: The list passed to -I is case-sensitive based on how it is listed in the codespell dictionaries.

> That does seem a more intuitive behaviour - although a change to the tool.

If it had to match the input file, then if I had this input:

SPELING TOOLS
There are lots of tools to catch spelings available

Then I'd need to add two ignore entries for the two cases that appear in the input.
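
To make the trade-off concrete, here is a toy model of the two matching rules. It is a simplification and not codespell's actual implementation: both function names are hypothetical, and it assumes every dictionary key is lower case, which is true of the entries shown in this thread but is not guaranteed for the whole dictionary.

```python
def ignored_by_dictionary_case(word, ignore_words):
    """Simplified model of the documented rule.

    The word from the input is lower-cased to find its dictionary entry, and
    that entry's spelling is what gets compared with the ignore list.
    Assumption: dictionary keys are all lower case.
    """
    return word.lower() in ignore_words


def ignored_by_input_case(word, ignore_words):
    """Alternative rule: the ignore list must match the input's exact case."""
    return word in ignore_words


# Case variants of one misspelling, as they might appear in an input file.
variants = ["SPELING", "Speling", "speling"]

# Dictionary-case rule: one lower-case entry covers every variant.
print([w for w in variants if ignored_by_dictionary_case(w, {"speling"})])
# ['SPELING', 'Speling', 'speling']

# Input-case rule: one entry covers only the variant spelled the same way.
print([w for w in variants if ignored_by_input_case(w, {"SPELING"})])
# ['SPELING']

# The original report: "MergeT" in the ignore list does not match "merget".
print(ignored_by_dictionary_case("MergeT", {"MergeT"}))  # False
print(ignored_by_dictionary_case("MergeT", {"merget"}))  # True
```

Under the input-case rule, each case variant that appears in the input needs its own ignore entry.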

I'm not entirely sure of the history of why we don't just do a case-insensitive comparison, but it would prevent some classes of typos being entered and therefore detected (e.g. names being in lower case).
