PCRE2 JIT compilation causes invalid memory access #13013
Comments
I could reproduce this by simply matching many regexes: `5000.times { "".match Regex.new("a") }`. The same snippet produces SIGSEGV even on Linux and macOS. The crash goes away if the GC is disabled, or if the
module A
  @[ThreadLocal]
  @@x = Bytes.new(256, &.to_u8!)
  # this should trigger a GC cycle (explicit `GC.collect` doesn't work)
  10000.times do
    Bytes.new(64)
  end
  puts @@x.hexdump
end
The only other place that uses
module A
  @[ThreadLocal]
  @@x = Bytes.new(256, &.to_u8!)
  # for each thread???
  LibGC.add_roots(pointerof(@@x), pointerof(@@x) + 1)
end
module Regex::PCRE2
  @[ThreadLocal]
  class_getter jit_stack : LibPCRE2::JITStack do
    jit_stack = LibPCRE2.jit_stack_create(32_768, 1_048_576, Regex::PCRE2.general_context)
    if jit_stack.null?
      raise "Error allocating JIT stack"
    end
    jit_stack
  end
  # for each thread???
  LibGC.add_roots(pointerof(@@jit_stack), pointerof(@@jit_stack) + 1)
end
This indeed fixes the crash, but apparently this is not the only way; Nim had this issue too and they turned on some kind of TLS emulation.
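The reproduction and the GC observation from this comment can be combined into one small script. This is only a sketch: the helper `match_many` and the `DISABLE_GC` environment variable are hypothetical names introduced here, not part of Crystal or of the original report. On an affected build the script is expected to crash partway through unless the GC is disabled.

```crystal
# Hypothetical helper for this sketch: compiles and matches a fresh regex
# `count` times and returns how many matches succeeded. No match is
# expected, because the subject string is empty.
def match_many(count : Int32) : Int32
  hits = 0
  count.times do
    hits += 1 if "".match(Regex.new("a"))
  end
  hits
end

# Hypothetical switch (not a Crystal feature): run with DISABLE_GC=1 to
# check the observation that the crash goes away when the GC is off.
GC.disable if ENV["DISABLE_GC"]? == "1"

puts match_many(5000)
```

Running it once normally and once with `DISABLE_GC=1` gives a quick way to tell whether a given build is affected.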
Where are we using ThreadLocal? I think a bunch of us knew it doesn't work well and we avoid it. There's another mechanism to have thread-local stuff, I think Crystal::ThreadLocalVar or something. We only use ThreadLocal in a single place in Crystal because it's generally broken.
Right, that's what I meant. We use My question is: are we using
There's this: af251b5
If you change
That said, I can't remember why using
I think we can remove the recursion too if we use
...or maybe not, or at least we have to use a plain array rather than a
But I don't think there's an issue with using Could you try it?
I tried it, and indeed I don't think there is an easy way out for that
Maybe we should also move that annotation inside the
Alternatively, clearly document what it does and what it does not.
Actually, it's defined by the compiler and part of the interface for stdlib. It's not in the stdlib API.
I think we remove it from the language reference: Its only use is in
JIT compilation with PCRE2 can cause an invalid memory access. This was discovered while running stdlib specs with PCRE2 (https://github.com/crystal-lang/crystal/actions/runs/4008539255/jobs/6883750327).
I'm not entirely sure what exactly triggers the error condition. So far I have only been able to reproduce it by running the entire `std_spec` suite. The error happens pretty much immediately after starting the spec run, before any output is printed (even with `--verbose`).
The invalid memory access happens somewhere inside `pcre2_match`.
I have established that there are indeed a couple of regular expression matches happening before it errors.
The first match that failed was the equivalent of `Regex.new("^(-?)0x([0-9A-Fa-f]+)(?:\\.([0-9A-Fa-f]+))?p([+-]?)([0-9]+)$").match("0x1.FFFFFEp+62")`. But when skipping that one, a completely different regex fails. So I assume the error condition is not directly related to the specific match environment, and is probably more systematic. I suppose there could be something wrong with how we're setting up the JIT compilation option (#12866).
I have not been able to reproduce the error with smaller sets of specs yet.
On all other platforms, JIT seems to be working well (or we're failing to hit some error conditions 🙈).
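For reference, this is what the first failing match looks like when it succeeds. The sketch below only uses the regex and subject string quoted above; the variable names are illustrative. It may help to distinguish a crash inside `pcre2_match` from a match that merely returns wrong captures.

```crystal
# Regex and subject copied from the report; variable names are illustrative.
pattern = Regex.new("^(-?)0x([0-9A-Fa-f]+)(?:\\.([0-9A-Fa-f]+))?p([+-]?)([0-9]+)$")

# `match` returns nil when there is no match, so `try` keeps this nil-safe.
captures = pattern.match("0x1.FFFFFEp+62").try(&.captures)

# Groups: sign, integer part, fraction part, exponent sign, exponent.
p captures # => ["", "1", "FFFFFE", "+", "62"]
```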