Feature request: specifying allowed characters when fuzzy-matching #338
Comments
Original comment by Matthew Barnett (Bitbucket: mrabarnett, GitHub: mrabarnett). Could you mock up an example?
Original comment by Peter Holm (Bitbucket: [Peter Holm](https://bitbucket.org/Peter Holm)). Say I have a large corpus of news articles, blog posts, etc., and I am looking for misspellings of "matthew mcconaughey". His last name is notoriously difficult to get right, so a lot of variations are expected to pop up. With a regex like |
Original comment by Matthew Barnett (Bitbucket: mrabarnett, GitHub: mrabarnett). Your example looks wrong because it has |
Original comment by Peter Holm (Bitbucket: [Peter Holm](https://bitbucket.org/Peter Holm)). Sorry, you’re correct. I’ve edited it now :)
Original comment by Matthew Barnett (Bitbucket: mrabarnett, GitHub: mrabarnett). Done in regex 2019.08.19. I've chosen to use ":" instead of a "|", so your example would be |
Original comment by Peter Holm (Bitbucket: [Peter Holm](https://bitbucket.org/Peter Holm)). Awesome! I like that solution a lot.
Original report by Anonymous.
It would be a neat feature if, in addition to the type of fuzziness, one could specify which character classes are allowed when performing a fuzzy match. For instance, when fuzzy-matching a word with spelling errors, it hardly makes sense to allow the introduction of [^a-z] characters.
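To make the request concrete, here is a minimal pure-Python sketch of the idea: an edit distance in which inserted or substituted characters must come from an allowed set. It is an illustration only, not how the regex module implements fuzzy matching; the names `constrained_edit_distance`, `fuzzy_equal`, `allowed`, and `max_errors` are hypothetical, and the default allowed set (lowercase ASCII letters) is an assumption taken from the `[^a-z]` remark above.

```python
import string

# Assumption: only lowercase ASCII letters may be introduced by an error.
LOWERCASE = frozenset(string.ascii_lowercase)


def constrained_edit_distance(pattern, text, allowed=LOWERCASE):
    """Levenshtein distance from pattern to text, where every character
    introduced by an insertion or substitution must be in `allowed`.
    Deletions are always permitted. Returns None if no alignment exists."""
    inf = float("inf")
    rows, cols = len(pattern) + 1, len(text) + 1
    # dist[i][j] = cheapest way to turn pattern[:i] into text[:j]
    dist = [[inf] * cols for _ in range(rows)]
    dist[0][0] = 0
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0:
                # Deletion: a pattern character is missing from the text.
                best = min(best, dist[i - 1][j] + 1)
            if j > 0 and text[j - 1] in allowed:
                # Insertion: an extra, allowed character appears in the text.
                best = min(best, dist[i][j - 1] + 1)
            if i > 0 and j > 0:
                if pattern[i - 1] == text[j - 1]:
                    # Exact match costs nothing, whatever the character is.
                    best = min(best, dist[i - 1][j - 1])
                elif text[j - 1] in allowed:
                    # Substitution by an allowed character.
                    best = min(best, dist[i - 1][j - 1] + 1)
            dist[i][j] = best
    result = dist[-1][-1]
    return None if result == inf else result


def fuzzy_equal(pattern, text, max_errors=2, allowed=LOWERCASE):
    """True if text is within max_errors constrained edits of pattern."""
    distance = constrained_edit_distance(pattern, text, allowed)
    return distance is not None and distance <= max_errors


print(fuzzy_equal("mcconaughey", "mcconaghey"))   # one deletion
print(fuzzy_equal("mcconaughey", "mcconau9hey"))  # would need a digit
```

With this restriction, a misspelling such as "mcconaghey" is accepted, while "mcconau9hey" is rejected even though it is only one unrestricted edit away, because the edit would have to introduce a character outside the allowed set.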