Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

thread_local macro stability precludes safe async signal handling #30003

Open
mahkoh opened this issue Nov 23, 2015 · 10 comments
Open

thread_local macro stability precludes safe async signal handling #30003

mahkoh opened this issue Nov 23, 2015 · 10 comments
Labels
C-bug Category: This is a bug. I-needs-decision Issue: In need of a decision. P-low Low priority T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@mahkoh
Copy link
Contributor

mahkoh commented Nov 23, 2015

The thread_local! macro accepts arbitrary (non-Sync) objects to be put into thread local storage. It is not hard to construct a case where this causes signal handlers to observe inconsistent state:

extern {
    fn signal(num: i32, handler: extern fn(i32)) -> extern fn(i32);
}

use std::cell::{RefCell};

/// First and second value always the same.
thread_local!(static X: RefCell<(usize,usize)> = RefCell::new((0,0)));

extern fn handler(_: i32) {
    X.with(|x| {
        let x = x.borrow();
        println!("{:?}", *x);
    });
}

fn main() {
    unsafe { signal(2, handler); }
    X.with(|x| {
        let mut x = x.borrow_mut();
        x.0 += 1;
        // raise(2)
        x.1 += 1;
    });
}

RefCell is not signal safe. A mutable borrow will not mark the RefCell as being borrowed in this case. This can be simulated as follows:

  • Compile with -O -C lto
  • In gdb, step to the instruction that stores the second value.
  • signal 2

Expected result: panic/abort/segfault or similar. Actual result: (1, 0) is printed.

Fixing RefCell by adding a memory barrier does not fix the problem since there might be many other non-Sync types that were not written with signal handling in mind and that use unsafe constructs. For correctness, TLS would have to be restricted to types that are async safe via a new marker trait. With such a trait, signal handling would be safe by default in all rust code and all signals handlers could call arbitrary rust functions (as long as said functions don't call non-rust code which might not be async safe.)


This concerns me because adding a signal handler is a safe operation in lrs and all #[thread_local] objects that require mutation are wrapped in a single threaded mutex implementation with interior mutability. And if it is decided that async signal handling is never safe in rust, then #[thread_local] might be stabilized and might also start to accept arbitrary objects which would practically force me to create a full compiler fork for the sake of safety. The current implementation in lrs is already unsafe because the single threaded mutex implementation has to be marked Sync to be placed in a #[thread_local]. For correctness, there would have to be the above mentioned marker trait that restricts what can be put in a #[thread_local].

@eefriedman
Copy link
Contributor

There are other substantial problems with asynchronous signal handlers... for example, malloc() isn't signal-safe. It might be possible to deal with that... but it would be a lot of work, a lot of if in_signal_hander { signal_handler_abort() }, and a bunch of changes to crates outside the standard library which wrap C libraries. Even if we did manage to pull it off, signal handlers would still be a minefield of aborts and/or deadlocks, so making them safe doesn't actually gain much.

Given all of that, it's extremely unlikely the standard library will ever provide safe async signal handlers.


#[thread_local] is not going to be stabilized as-is; it's possible that accessing a raw thread_local global will be considered unsafe in the final design.

@mahkoh
Copy link
Contributor Author

mahkoh commented Nov 23, 2015

There are other substantial problems with asynchronous signal handlers... for example, malloc() isn't signal-safe.

It is certainly possible to write a signal-safe malloc and it is also possible to write programs that don't call any C functions. Furthermore, there are probably also malloc implementations out there that are not thread-safe. Yet this does not stop the language from requiring statics to be Sync.

@brson brson added P-low Low priority T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue. I-needs-decision Issue: In need of a decision. and removed T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 4, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-bug Category: This is a bug. label Jul 24, 2017
@steveklabnik
Copy link
Member

Triage: this is still marked as needs-decision for @rust-lang/lang, but i'm not sure there was ever a discussion on it.

Personal opinion: the root issue here is the unsafe, and so it's not likely to be fixed.

@mkroening
Copy link
Contributor

I just published InterruptRefCell, which uses a compiler fence to synchronize with signal handlers and interrupt handlers to share a RefCell with them. In the original example at the top, the signal would be deferred until after the RefCell is not accessed from the main thread anymore. Of course, one could explicitly enable signals while accessing an InterruptRefCell, though.

@RalfJung
Copy link
Member

RalfJung commented Oct 8, 2023

In the original example at the top, the signal would be deferred until after the RefCell is not accessed from the main thread anymore.

Which mechanism achieves that?

@mkroening
Copy link
Contributor

Which mechanism achieves that?

Ah, sorry. I mask (disable) all signals when borrowing InterruptRefCell and restore the previous mask once the borrow is done. When masked, incoming signals are deferred.

@RalfJung
Copy link
Member

RalfJung commented Oct 8, 2023

Ah okay, so that way the RefCell is "thread-safe" for signals. Makes sense.

Enabling signals while having the InterruptRefCell open should even be fine since other "threads" will see borrow flag and avoid giving a 2nd borrow, right? The concurrency due to signals/interrupts only matters during borrow/borrow_mut to avoid race conditions. (And there might have to be some fences as well to avoid reordering.)

@mkroening
Copy link
Contributor

Yeah, enabling them should be "fine" but unwanted, since trying to borrow mutably in a signal handler while the main context holds another reference would not work.

So losing mutual exclusion with signal handlers on RefCells is safe but bad, similar to deadlocks and mutexes, I would say.

@mkroening
Copy link
Contributor

And there might have to be some fences as well to avoid reordering.

Yeah, compiler fences are the base for this.

@RalfJung
Copy link
Member

RalfJung commented Oct 9, 2023

Note that you absolutely need to have interrupts disabled whenever the RefCell's borrow flags are accessed. This is crucial for soundness. So doing just "best-effort" interrupt disabling is unsound. See mkroening/interrupt-ref-cell#5.

So losing mutual exclusion with signal handlers on RefCells is safe but bad, similar to deadlocks and mutexes, I would say.

No, it's unsound. You get a data race between the thread releasing its hold of the RefCell and the handler trying to acquire it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-needs-decision Issue: In need of a decision. P-low Low priority T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

7 participants