🚧 Ignore escaped characters #1422
On a real code base this PR takes my run from 1.097 sec to 1.508 sec. If I change your function to:
    for key, rep in find_dict.items():
        text = text.replace(key, rep)
    return text
it only increases to 1.161 sec. So maybe regex is less than ideal here.
Can you check on PyVista and see if you observe the same?
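For anyone wanting to reproduce the comparison, here is a minimal sketch of the two approaches side by side. The contents of `find_dict`, the regex variant, and the `compare` helper are assumptions made for illustration; the PR's actual regex code is not shown in this thread, and timings will vary by machine and input.

```python
import re
import timeit

# Hypothetical substitution table: escape sequences mapped to a space.
find_dict = {"\\n": " ", "\\t": " ", "\\r": " "}


def replace_loop(text, find_dict):
    """Apply each substitution with str.replace, as suggested in the review."""
    for key, rep in find_dict.items():
        text = text.replace(key, rep)
    return text


# Assumed regex variant: one compiled alternation over all escaped keys.
_pattern = re.compile("|".join(re.escape(key) for key in find_dict))


def replace_regex(text, find_dict):
    """Apply all substitutions in a single regex pass."""
    return _pattern.sub(lambda match: find_dict[match.group(0)], text)


def compare(text, number=1000):
    """Return (loop_seconds, regex_seconds) for `number` repetitions."""
    loop = timeit.timeit(lambda: replace_loop(text, find_dict), number=number)
    regex = timeit.timeit(lambda: replace_regex(text, find_dict), number=number)
    return loop, regex
```

Calling `compare(open(path).read())` on a representative source file gives a rough per-file comparison. The two functions only agree when no replacement text can itself form a new key, which holds for the table above.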
Same problem here. I've noticed that |
We're now using replace. As for making the substitution dictionary customizable, would you prefer to have a file containing the key/value pairs, or a string that they read in?
I find putting characters like |
Agreed, I was encountering that issue as well when testing it out. I'm thinking simple key value pairs in a csv, where the pairs are separated by line breaks.
Might make more sense to match the dictionary format thing->replacement
As we’re going to have to use spaces, are quotes permitted in your dictionary format?
I would just make it that anything to the left of
Should be good to review now.
Looks good. One last thing: we probably want to provide and use a default file, just like we do for the dictionary. Maybe it should just contain |
Perhaps |
Fine with me
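The `thing->replacement` file format discussed above could be read along these lines. This is only a sketch: the function name `parse_sub_pairs` is hypothetical, and splitting on the first `->` while keeping the right-hand side verbatim (so that a single space can be a replacement) is an assumption, not the behaviour confirmed in this thread.

```python
def parse_sub_pairs(lines):
    """Parse 'thing->replacement' lines into a dict.

    Assumptions (not confirmed by the PR): the first '->' on a line is the
    separator, only the line ending is stripped so that whitespace
    replacements survive, and lines without '->' are skipped.
    """
    pairs = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if "->" not in line:
            continue
        key, replacement = line.split("->", 1)
        pairs[key] = replacement
    return pairs
```

With a file on disk, this would be called as `parse_sub_pairs(open(path, encoding="utf-8"))`, since iterating a text file yields its lines.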
@@ -232,6 +232,10 @@ def parse_options(args):
                        help='Comma separated list of words to be ignored '
                        'by codespell. Words are case sensitive based on '
                        'how they are written in the dictionary file')
    parser.add_argument('-P', '--sub-pairs', type=str, metavar='FILE',
                        help='Custom substitution text file that contains '
I'm a bit unclear from the help detail what this is for. Is it to "fix up" the dictionary to deal with it matching escape sequences, to actually do sed-type runs on my codebase, or something else?
Is this linked to #233 ?
Also in the related PR #174 it was noted that the |
Why the close? I see this is quite old 🤔
Just cleaning up old PRs. I'd like to work on this, but this project hasn't been updated in quite some time (last release 6 months ago).
Hi @akaszynski, despite the lack of releases, coding is still happening (although mostly in the dictionary); also see #1923. You don't seem to have responded to the review comments that @larsoner and I left, in case you were expecting it to have been merged.
True, there's still work to be done. I'll work on this.
Today, codespell looks at "\tRead" which is "tab" followed by "Read" as "tRead", and flags this as a spelling mistake. See: codespell-project/codespell#1422 which was closed without merging. Once that is resolved upstream - we can undo this. Signed-off-by: Robin Getz <[email protected]>
This takes a stab at ignoring escaped characters. It's a basic (and probably inefficient) implementation, so if there's a better way to get this done, please let me know.
Also, the escape dictionary should be modifiable by the user.
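To make the problem concrete, the sketch below shows how a naive word tokenizer turns `"\tRead"` in source text into `tRead`, and how substituting the escape sequence first avoids that. The regex `WORD_RE` and the `ESCAPES` table are simplified stand-ins chosen for illustration; they are not codespell's actual word pattern or this PR's default pairs.

```python
import re

# Simplified stand-in for a spell checker's word pattern (an assumption).
WORD_RE = re.compile(r"[A-Za-z']+")

# Hypothetical default escape table: each sequence is replaced by a space.
ESCAPES = {"\\t": " ", "\\n": " ", "\\r": " "}


def words(line, escapes=None):
    """Return the words found in `line`, optionally after masking escapes."""
    if escapes:
        for key, replacement in escapes.items():
            line = line.replace(key, replacement)
    return WORD_RE.findall(line)


source_line = 'printf("\\tRead");'   # the file contains backslash + t
print(words(source_line))            # ['printf', 'tRead']
print(words(source_line, ESCAPES))   # ['printf', 'Read']
```

Passing a user-supplied dict in place of `ESCAPES` is one way the table could be made modifiable, in line with the note above.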