fix mishandling of ^ inside an expression #61

Open
wants to merge 1 commit into
base: master

@marler8997 marler8997 commented Mar 6, 2021

Currently, if a ^ character appears inside a regex pattern and is neither the first character of the expression nor the first character in a character class, it is "compiled" to a BEGIN object but is then interpreted by the pattern matching code (in the matchone function) as a CHAR object. This means the matcher compares the text against the BEGIN object's u.ch field, which is uninitialized memory.
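To make the mismatch concrete, here is a toy Python model of it. This is only a sketch, not the library's code: `Node`, `compile_pattern`, `match_one_buggy`, `match_one_fixed`, `match_prefix` and the `garbage` parameter are all hypothetical names, and `garbage` merely stands in for whatever happens to be in the uninitialized field. The "fixed" variant is one way to get the behaviour described in this PR, not necessarily the actual one-line change.

```python
from dataclasses import dataclass
from typing import Callable, List

BEGIN, CHAR = "BEGIN", "CHAR"


@dataclass
class Node:
    kind: str
    ch: str  # only meaningful for CHAR nodes


def compile_pattern(pattern: str, garbage: str) -> List[Node]:
    """Toy compiler: every '^' becomes BEGIN, wherever it appears.

    `garbage` simulates the uninitialized character field of a BEGIN node.
    """
    return [Node(BEGIN, garbage) if c == "^" else Node(CHAR, c) for c in pattern]


def match_one_buggy(node: Node, c: str) -> bool:
    # No BEGIN case: a BEGIN node falls through to the CHAR comparison
    # and is compared against whatever its character field contains.
    return node.ch == c


def match_one_fixed(node: Node, c: str) -> bool:
    # A BEGIN node in the middle of a pattern never matches a character.
    if node.kind == BEGIN:
        return False
    return node.ch == c


def match_prefix(nodes: List[Node], text: str,
                 match_one: Callable[[Node, str], bool]) -> bool:
    # A leading BEGIN is a real anchor; this toy matcher is anchored at
    # index 0 anyway, so it is simply skipped.
    if nodes and nodes[0].kind == BEGIN:
        nodes = nodes[1:]
    if len(text) < len(nodes):
        return False
    return all(match_one(n, c) for n, c in zip(nodes, text))


nodes = compile_pattern("a^", garbage="x")
print(match_prefix(nodes, "ax", match_one_buggy))  # True
print(match_prefix(nodes, "ax", match_one_fixed))  # False
```

In the buggy variant the result depends entirely on the simulated garbage value, which mirrors why the real behaviour is unpredictable.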

For the fix, I checked what the Python regex implementation does in this case. It appears that it does not match the ^ character inside an expression; here's some Python code that demonstrates this:

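import re

# "^" in the middle of a pattern is still a start anchor in Python, so "a^" can never match.
r = re.compile("a^")
for i in range(256):  # every single-byte character value, 0 through 255
    text = "a" + chr(i)
    print("checking {!r}".format(text))
    assert not r.match(text)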

Along with the fix I also added a regression test that ensures that a^ never matches any sequence of two characters starting with a.
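For reference, a few more data points on how Python's `re` module treats `^` in different positions. This is a small sketch of the reference behaviour only; the helper name `caret_cases` is made up for illustration, and nothing here says this library has to agree with Python on every case.

```python
import re


def caret_cases() -> dict:
    """Collect how Python's re module treats '^' in a few positions."""
    return {
        # Unescaped '^' after 'a' is an anchor, not a literal caret.
        "anchor_mid_pattern": re.search("a^", "a^") is not None,
        # Escaping it makes it a literal caret.
        "escaped_literal": re.search(r"a\^", "a^") is not None,
        # Inside a character class, but not first, it is a literal caret.
        "literal_in_class": re.fullmatch("[a^]", "^") is not None,
        # First in a character class, it negates the class.
        "negated_class": re.fullmatch("[^a]", "a") is not None,
        # With re.MULTILINE, a mid-pattern '^' can match right after a newline.
        "multiline": re.search("a\n^b", "a\nb", re.MULTILINE) is not None,
        # Without re.MULTILINE it only matches at the very start of the string.
        "no_multiline": re.search("a\n^b", "a\nb") is not None,
    }


for name, matched in caret_cases().items():
    print(name, matched)
```

The `re.MULTILINE` case is the one situation where a mid-pattern `^` can succeed in Python; it does not affect `a^`, because the position right after `a` is never the start of a line.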

@xavieryin xavieryin mentioned this pull request Mar 8, 2021

marler8997 commented Apr 13, 2021

ping @kokke: if you have time, I'd like to get some of the trivial PRs reviewed. This one is a single-line bug fix to the implementation, along with some additional testing. Another trivial bug fix is #64.

If you review/merge these two PRs, I can then try to identify the other bugs that are preventing the random tests from working, so that we can enforce that they pass on the CI (see #66). Getting the tests working again is the first thing I'd like to help with, since it's the surefire way to enforce that any other changes are correct.

matyalatte added a commit to matyalatte/tiny-str-match that referenced this pull request Jul 4, 2024