Instance::run panics if another instance trapped in the same thread #323
The issue disappears if every instance is created in a separate thread.
Hi @roman-kashitsyn, thanks for the report! Unfortunately I'm so far unable to replicate this issue, at least on my Linux machine. Can you share more about your OS environment? Also, the "0.1.1" versions of the Lucet packages on crates.io are pretty old at this point; it may also be worth trying the current versions from the repository.
Sorry, I forgot to mention that I observe the issue on macOS 10.14 and 10.15. I'll try to reproduce it with the current version.
I got access to a Mac to test this out, and can confirm that it's also an issue with the current version. It appears there's something strange going on with the semantics of alternate signal handler stacks on Mac. When a Lucet guest traps, a signal handler runs on the alternate signal stack, and we only swap that stack out after jumping back to the host context.
I'll check if the problem is still reproducible in that case. It's weird that the problem is not reproducible on Linux: the man page says that modifying the alternate signal stack while it is active should fail.
As far as I understand, that should mean that it's illegal to change the alternate signal stack if it has already been used to handle a signal.
We should never be swapping out the alt stack while the signal handler is running, only after we've swapped back to the host context. Linux must correctly recognize when we've jumped away from the handler, whereas Mac OS still believes it's running.
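One way to make this visible is to ask the kernel directly whether it still considers the thread to be executing on the alternate stack: calling sigaltstack with a NULL new stack only queries the current state, and SS_ONSTACK in ss_flags reflects the kernel's view. Below is a minimal sketch along those lines; it is not code from this thread or from Lucet, and the names G_alt_stack, G_flags_in_handler, and query_altstack_flags are made up for the illustration.

#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>

static jmp_buf jump_buffer;
static char G_alt_stack[SIGSTKSZ];
static volatile int G_flags_in_handler = -1;

static int query_altstack_flags(void) {
    /* Passing NULL as the new stack only queries the current state. */
    stack_t current;
    if (sigaltstack(NULL, &current) < 0) {
        perror("sigaltstack(query)");
        exit(EXIT_FAILURE);
    }
    return current.ss_flags;
}

static void handler(int sig, siginfo_t *_info, void *_arg) {
    G_flags_in_handler = query_altstack_flags();
    longjmp(jump_buffer, sig); /* jump out of the handler instead of returning */
}

int main(void) {
    stack_t ss;
    ss.ss_sp = &G_alt_stack;
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) < 0) {
        perror("sigaltstack");
        exit(EXIT_FAILURE);
    }
    struct sigaction sa;
    sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
    sa.sa_sigaction = &handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, NULL) < 0) {
        perror("sigaction");
        exit(EXIT_FAILURE);
    }
    if (setjmp(jump_buffer) == 0)
        raise(SIGILL);
    printf("SS_ONSTACK inside handler:        %s\n",
           (G_flags_in_handler & SS_ONSTACK) ? "set" : "clear");
    printf("SS_ONSTACK after longjmp'ing out: %s\n",
           (query_altstack_flags() & SS_ONSTACK) ? "set" : "clear");
    return 0;
}

If the discussion above is right, the second line would read "clear" on Linux but "set" on macOS after jumping out of the handler.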
I tried to construct a small C program reproducing the macOS issue with sigaltstack:

#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
static volatile int G_caught_sig = 0;
static char G_sig_stack_1[SIGSTKSZ];
static char G_sig_stack_2[SIGSTKSZ];
void handler(int sig, siginfo_t *_info, void *_arg) {
G_caught_sig = sig;
}
int main() {
/* set up and install the first alternate signal stack */
stack_t ss, old_ss;
ss.ss_size = SIGSTKSZ;
ss.ss_flags = 0;
ss.ss_sp = &G_sig_stack_1;
if (sigaltstack(&ss, &old_ss) < 0) {
perror("sigaltstack");
exit(EXIT_FAILURE);
}
struct sigaction sa;
sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
sa.sa_sigaction = &handler;
sigemptyset(&sa.sa_mask);
if (sigaction(SIGILL, &sa, NULL) < 0) {
perror("sigaction");
exit(EXIT_FAILURE);
}
/* deliver SIGILL; the handler runs on the alternate stack and returns normally */
raise(SIGILL);
printf("Caught signal = %d\n", G_caught_sig);
/* after the handler has returned, try to install a different alternate stack */
ss.ss_sp = &G_sig_stack_2;
if (sigaltstack(&ss, &old_ss) < 0) {
perror("sigaltstack");
exit(EXIT_FAILURE);
}
return 0;
}

UPD: One interesting difference I observed is that the syscall trace of this small C program contains
This is more evidence that Mac OS behaves differently when jumping out of a signal handler, rather than returning. Unfortunately, many of the signals that a guest raises when faulting can't be immediately resolved, so it would just keep reraising the same signal if we returned from the handler. I wonder if there's a syscall similar to
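To illustrate why returning from the handler only works when the fault can first be resolved, here is a minimal sketch that is separate from the programs in this thread: the handler makes the faulting page writable with mprotect before returning, so the kernel's retry of the store succeeds; without that mprotect, the same signal would be delivered again immediately. The names G_page, G_faults, and handler are made up for this illustration.

#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/mman.h>

static volatile char *G_page = NULL;
static volatile sig_atomic_t G_faults = 0;

static void handler(int sig, siginfo_t *_info, void *_arg) {
    G_faults++;
    /* Resolve the fault before returning: the kernel restarts the faulting
       store and it succeeds. Without this mprotect, returning would just
       cause the same signal to be delivered again. */
    if (mprotect((void *)G_page, getpagesize(), PROT_READ | PROT_WRITE) < 0) {
        _exit(EXIT_FAILURE);
    }
}

int main(void) {
    /* map a page with no access permissions; any write to it will fault */
    G_page = mmap(0, getpagesize(), PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (G_page == MAP_FAILED) {
        perror("mmap");
        exit(EXIT_FAILURE);
    }
    struct sigaction sa;
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = &handler;
    sigemptyset(&sa.sa_mask);
    /* the access below is reported as SIGBUS on macOS and SIGSEGV on Linux */
    if (sigaction(SIGBUS, &sa, NULL) < 0 || sigaction(SIGSEGV, &sa, NULL) < 0) {
        perror("sigaction");
        exit(EXIT_FAILURE);
    }
    G_page[0] = 'z'; /* faults once; the handler fixes the protection and the write is retried */
    printf("store succeeded after %d fault(s)\n", (int)G_faults);
    return 0;
}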
Yes, that makes a lot of sense. I wrote another small C program that uses longjmp to jump out of the signal handler instead of returning from it:

#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <unistd.h>
#include <sys/mman.h>
static jmp_buf jump_buffer;
static volatile int G_caught_sig = 0;
static volatile char *G_memory = NULL;
static char G_sig_stack_1[SIGSTKSZ];
static char G_sig_stack_2[SIGSTKSZ];
void handler(int sig, siginfo_t *_info, void *_arg) {
G_caught_sig = sig;
/* jump out of the signal handler instead of returning from it */
longjmp(jump_buffer, sig);
}
int main() {
/* map a page with no access permissions; any write to it will fault */
G_memory = mmap(0, getpagesize(), PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
if (G_memory == MAP_FAILED) {
perror("mmap");
exit(EXIT_FAILURE);
}
stack_t ss, old_ss;
ss.ss_size = SIGSTKSZ;
ss.ss_flags = 0;
ss.ss_sp = &G_sig_stack_1;
if (sigaltstack(&ss, &old_ss) < 0) {
perror("sigaltstack");
exit(EXIT_FAILURE);
}
struct sigaction sa;
sa.sa_flags = SA_ONSTACK | SA_SIGINFO;
sa.sa_sigaction = &handler;
sigemptyset(&sa.sa_mask);
if (sigaction(SIGBUS, &sa, NULL) < 0) {
perror("sigaction");
exit(EXIT_FAILURE);
}
while (setjmp(jump_buffer) == 0) {
printf("Setting memory at %p\n", G_memory);
/* this write faults; the handler longjmps back to setjmp, ending the loop */
G_memory[0] = 'z';
}
printf("Caught signal = %d\n", G_caught_sig);
/* after jumping out of the handler, try to install a different alternate stack */
ss.ss_sp = &G_sig_stack_2;
if (sigaltstack(&ss, &old_ss) < 0) {
perror("sigaltstack");
exit(EXIT_FAILURE);
}
return 0;
}

The program terminates successfully. The syscall trace still contains
Ah, it looks like

Again, we probably won't be able to dive deeply into this in the short term, but this is very helpful diagnostic information for us. Thank you!
It looks like thread-local state becomes infected once an instance traps (e.g. by executing unreachable), and when the next instance is created in the same thread, it panics with the following backtrace:

The panic comes from this place:
You can reproduce the issue using the following small crate:
src/main.rs
Cargo.toml