fix mishandling of ^ inside an expression #61
Currently, if a `^` character appears inside a regex pattern and is neither the first character of the expression nor the first character in a character class, it is "compiled" to a `BEGIN` object but interpreted by the pattern matching code as a `CHAR` object (in the `matchone` function). This means the matcher tries to match the text against the `BEGIN` object's `u.ch` field, which is uninitialized memory.

For the fix, I checked what the Python regex implementation does in this case. It appears that it does not match the `^` character inside an expression.

Along with the fix, I also added a regression test that ensures that `a^` never matches any sequence of two characters starting with `a`.
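
The Python behavior mentioned above can be checked with a short sketch like the following. This is an illustrative reconstruction using the standard `re` module, and the helper name `matches` is an assumption made for the example.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if `pattern` matches anywhere in `text` (hypothetical helper)."""
    return re.search(pattern, text) is not None


# Outside a character class, ^ stays an anchor wherever it appears,
# so it is never matched as a literal caret.
print(matches(r"a^", "a^"))    # False
print(matches(r"a^", "ab"))    # False

# An escaped caret, or a caret that is not first in a character class,
# is an ordinary literal character.
print(matches(r"a\^", "a^"))   # True
print(matches(r"[a^]", "^"))   # True

# A leading caret still anchors to the start of the text.
print(matches(r"^a", "ab"))    # True
print(matches(r"^a", "ba"))    # False
```

In other words, `a^` can never succeed in Python's default mode, because the anchor would have to hold after an `a` has already been consumed. That is the behavior the regression test asserts.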