Fix PCRE crashing on invalid UTF-8 #13240
I'll be a broken record, but I think it's best to leave PCRE alone and only introduce the change in PCRE2.
EDIT: This bug is also present in PCRE, even if it's only reported in PCRE2 (thx Johannes!)
It seems the generated compiler is not able to compile regexes, if I'm understanding the errors correctly.
No the segfault like https://github.com/crystal-lang/crystal/actions/runs/4554373556/jobs/8032148978?pr=13240 happens already in |
The libpcre2 version in the build image is 10.34 while locally I'm using 10.42. The same memory error happens with 10.34 on Alpine. I added a workaround to detect the library version and skip this spec when it's below 10.35 🤞 (I was planning to add a way to query the version at runtime anyway in a follow-up, so this is not as much overhead for this simple spec as it may seem)
Next problem: So I guess we'll need a version guard for using |
Looks as good as it can, given the constraints 👍
It seems the 1.7.3 CI is now permanently stuck on generating the docs.
This reverts commit ec097ca.
Fixes #13237
- Raise `Regex::Error` when `pcre_exec` errors. This was previously ignored in the PCRE bindings and the methods just returned `false`.
- Remove `NO_UTF_CHECK` for both compile and exec/match functions in PCRE and PCRE2.
- Add `MATCH_INVALID_UTF` for PCRE2 (this is unavailable in PCRE).

Due to `MATCH_INVALID_UTF` in PCRE2, the two engine versions behave differently when the subject string contains invalid UTF-8. PCRE will always raise. PCRE2 will try to match, but invalid bytes can never be part of a match.
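
As a rough illustration of the difference described above, the following Crystal sketch builds a subject string containing an invalid byte and tries to match against it. The bytes and the pattern are made up for this example and are not taken from the PR's specs. Which branch runs depends on the engine the program is built against, and on libpcre2 versions older than 10.35 the behaviour may differ, as discussed in the conversation.

```crystal
# Hypothetical subject: "a", then 0xFF (never valid in UTF-8), then "b".
subject = String.new(Bytes[0x61, 0xff, 0x62])

begin
  # With PCRE2 and MATCH_INVALID_UTF, matching is attempted and the valid
  # "b" after the invalid byte can still be found.
  # With PCRE, the description above says the call raises instead.
  matched = /b/.matches?(subject)
  puts "matched: #{matched}"
rescue ex : Regex::Error
  puts "raised: #{ex.message}"
end
```

A pattern that would have to consume the invalid byte itself (for example `/a.b/`) is not expected to match on PCRE2, since invalid bytes can never be part of a match.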