I understand the word frequency method doesn't do so well with common misspellings (since it is likely the source data is contaminated with common typos) but is there any way to add to a 'blacklist' of common misspellings, easily sourced from: https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines
For example, `taht` and `tiem` are both not flagged as misspellings by pyspellchecker.
I found a fairly straightforward workaround: turn Wikipedia's list of common misspellings into a dictionary and check each word against it after running pyspellchecker. It would still be nice if this were incorporated into pyspellchecker itself.
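For anyone wanting to try the same workaround, here is a minimal sketch. It assumes the Wikipedia page is saved as plain text with one `misspelling->correction` entry per line and that multiple corrections are comma-separated; the function names and the two sample lines are made up for illustration, not copied from the page.

```python
def parse_misspellings(text):
    """Build {misspelling: [corrections]} from 'wrong->right' lines.

    Assumed format: one 'misspelling->correction' pair per line, with
    multiple corrections separated by commas. Other lines are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if "->" not in line:
            continue
        wrong, _, right = line.partition("->")
        wrong = wrong.strip().lower()
        corrections = [c.strip() for c in right.split(",") if c.strip()]
        if wrong and corrections:
            table[wrong] = corrections
    return table


def flag_known_misspellings(words, table):
    """Return the words found in the misspelling table, with their corrections."""
    return {w: table[w.lower()] for w in words if w.lower() in table}


# Illustrative sample data (hypothetical, not taken verbatim from Wikipedia).
sample = "taht->that\ntiem->time\n"
table = parse_misspellings(sample)
flagged = flag_known_misspellings(["taht", "tiem", "word"], table)
print(flagged)
```

The flagged words can then be combined with whatever pyspellchecker reports on its own, so that common typos that slipped into the frequency data are still caught.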
There is a way to fix these issues in future builds of the dictionary. Words added to scripts/data/{lang}_exclude.txt are removed from the next build of the dictionaries.
As always, PRs or code to generate the list of common typos to add to this file are welcome. Thanks!
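A rough sketch of how such a list could be merged into an exclude file follows. It assumes the exclude file holds one lowercase word per line, which this thread does not confirm, and `merge_exclude_words` is a hypothetical helper rather than part of pyspellchecker's build scripts.

```python
def merge_exclude_words(existing_text, new_words):
    """Merge new typos into the exclude list, deduplicated and sorted.

    Assumes one word per line in the exclude file (an unverified
    assumption about the file's format).
    """
    existing = {
        line.strip().lower()
        for line in existing_text.splitlines()
        if line.strip()
    }
    added = {w.strip().lower() for w in new_words if w.strip()}
    return "\n".join(sorted(existing | added)) + "\n"


# Illustrative input: the current file content plus typos to add.
updated = merge_exclude_words("teh\n", ["taht", "tiem", "teh"])
print(updated, end="")
```

The returned text could be written back to the exclude file for the relevant language before the next dictionary build. Some entries on a misspelling list may be valid words in other contexts, so the merged list is worth reviewing by hand before opening a PR.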