Default word regex and snake_case checking #2979

DimitriPapadopoulos · 2023-07-28T08:11:12Z

The underscore (_) is part of \w. From https://docs.python.org/3/library/re.html#regular-expression-syntax:

\w

For Unicode (str) patterns:
Matches Unicode word characters; this includes alphanumeric characters (as defined by str.isalnum()) as well as the underscore (_). If the ASCII flag is used, only [a-zA-Z0-9_] is matched.

For 8-bit (bytes) patterns:
Matches characters considered alphanumeric in the ASCII character set; this is equivalent to [a-zA-Z0-9_]. If the LOCALE flag is used, matches characters considered alphanumeric in the current locale and the underscore.

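Is there an easy way to get \w without the underscore (_) in the non-ASCII (Unicode) case? It would help with checking snake_case identifiers, because each component could then be checked as a separate word.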

codespell/codespell_lib/_codespell.py

Line 31 in ec0a5b9

word_regex_def = "[\\w\\-'’`]+"
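One possible workaround, shown here only as a sketch and not as anything codespell currently does: the negated class `[^\W_]` matches every character that is neither a non-word character nor an underscore, which is `\w` minus `_`, and it works for Unicode patterns. The names `word_regex_no_underscore` and `split_words` below are hypothetical and exist only for illustration.

```python
import re

# Default pattern quoted above from codespell_lib/_codespell.py.
word_regex_def = "[\\w\\-'’`]+"

# Hypothetical variant: [^\W_] is "\w without the underscore".
# A negated class cannot be nested inside a positive one, so the extra
# punctuation characters are added through an alternation instead.
word_regex_no_underscore = "(?:[^\\W_]|[\\-'’`])+"


def split_words(text, pattern=word_regex_no_underscore):
    """Return all matches of the pattern in text (illustrative helper)."""
    return re.findall(pattern, text)


if __name__ == "__main__":
    sample = "snake_case_naïve don't"
    print(split_words(sample, word_regex_def))
    print(split_words(sample))
```

With the default pattern the whole snake_case identifier is one token, while the variant yields its components separately, including non-ASCII letters such as `ï`.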


DimitriPapadopoulos · 2023-07-28T08:21:34Z

Unicode regexes with set operations might help, but they are not available in Python yet. From https://docs.python.org/3/library/re.html#regular-expression-syntax:

Support of nested sets and set operations as in Unicode Technical Standard #18 might be added in the future. This would change the syntax, so to facilitate this change a FutureWarning will be raised in ambiguous cases for the time being. That includes sets starting with a literal '[' or containing literal character sequences '--', '&&', '~~', and '||'. To avoid a warning escape them with a backslash.
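The behaviour described in that passage can be observed directly. The following is a minimal sketch, assuming CPython 3.7 or later; the helper name `compile_and_collect` is made up for this example. It compiles a set containing a literal `&&` and records the warning categories raised, then does the same with the characters escaped.

```python
import re
import warnings


def compile_and_collect(pattern):
    """Compile pattern and return (compiled, list of warning categories)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not suppress repeated warnings
        compiled = re.compile(pattern)
    return compiled, [w.category for w in caught]


# '&&' inside a set is ambiguous with a possible future intersection syntax.
ambiguous, ambiguous_warnings = compile_and_collect("[a&&b]")

# Escaping the characters keeps the same meaning without the ambiguity.
escaped, escaped_warnings = compile_and_collect("[a\\&\\&b]")

if __name__ == "__main__":
    print(ambiguous_warnings)
    print(escaped_warnings)
```

Both patterns currently match the same characters (`a`, `&`, `b`); only the unescaped form is flagged as potentially changing meaning.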

DimitriPapadopoulos · 2023-07-28T08:29:22Z

DimitriPapadopoulos · 2023-07-28T11:47:23Z

Duplicate of #2730.

DimitriPapadopoulos mentioned this issue Jul 28, 2023

Fix the name of the extra word lists we load #2976

Merged

DimitriPapadopoulos changed the title ~~Default word regex and snake_case checking~~ Default word regex and **snake_case** checking Jul 28, 2023

DimitriPapadopoulos changed the title ~~Default word regex and **snake_case** checking~~ Default word regex and snake_case checking Jul 28, 2023

DimitriPapadopoulos added enhancement duplicate and removed duplicate labels Jul 28, 2023

DimitriPapadopoulos closed this as completed Jul 28, 2023
