Fix RegEx search_all for zero length matches/lookahead #85783
Damn, just experimented and noticed that this doesn't handle zero length lookbehinds very well. To be fair, the previous code also breaks on zero length lookbehinds; my code just breaks in a different (and better?) way. I'll see if I can amend this PR to take that into account, along with some more unit tests.
41edc0c to 7ff4163
And pushed. Managed to find a solution that works for zero length results, whether lookahead or lookbehind. Added lookbehind tests too. While the original bug report didn't explicitly state lookbehind as an issue, it still falls under the same root issue that this PR is trying to fix (i.e. zero length results not being handled gracefully by 'search_all').
7ff4163 to 9b1ffb2
Came across another bug that this PR already fixes (this one: #73920). Gonna add another test case to this PR that covers the circumstance the bug describes.
8279da2 to 6534f7d
Tested all the test cases with https://regex101.com/. It matches.
And I approve the change of logic of RegEx::search_all(), it makes more sense now and it shows how the previous logic was flawed. It could only return 1 result if it got any zero-length match.
Should the documentation for RegEx be amended to clarify the change in behavior?
No? From what I read, the documentation is OK, it's just that it was bugged for some use cases, which this PR fixes. I'll take a look at whether it's possible to cherry-pick this PR for 4.1.
6534f7d to 7b2fd34
Thanks!
hey thanks for the fix! 🎉
Cherry-picked for 4.2.2.
Fixes #85605, fixes #73920
The RegEx 'search_all' function wasn't handling zero length matches well. When one was found, it would stop and not search for any more matches. My fix simply increments the position in the string after a zero length match and proceeds as normal otherwise.
This fixes the lookahead problem in the linked issue, as lookaheads are one way of having zero length matches that still return a value.
Of note, I've also attached a few extra RegEx tests that check lookahead patterns, including the specific example in the linked issue.
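For readers who want to see the idea in isolation, here is a minimal sketch of the "advance past a zero length match" loop described above. It is written in Python against the standard `re` module purely for illustration; it is not the engine's C++ code, and this `search_all` helper is a hypothetical stand-in, not the actual `RegEx.search_all` implementation.

```python
import re


def search_all(pattern, subject):
    """Collect every match of pattern in subject, including zero length ones.

    Illustrative sketch only: a hand-rolled loop that mirrors the idea
    described in this PR, not the engine's real implementation.
    """
    compiled = re.compile(pattern)
    results = []
    pos = 0
    while pos <= len(subject):
        match = compiled.search(subject, pos)
        if match is None:
            break
        results.append(match)
        if match.end() == match.start():
            # Zero length match: step forward one character so the next
            # search does not land on the same empty match again.
            pos = match.end() + 1
        else:
            # Normal match: continue right after it.
            pos = match.end()
    return results


if __name__ == "__main__":
    # Lookahead with a capture group: every match is zero length,
    # but group 1 still carries a value.
    for m in search_all(r"(?=(\d\d))", "1234"):
        print(m.start(), m.group(1))
```

Passing the offset to `search` instead of slicing the string matters for lookbehinds: the pattern can still inspect characters before `pos`, which is why the lookbehind case works with the same loop.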