What is the opsem of signal/interrupt handlers. #444

Open
chorman0773 opened this issue Aug 8, 2023 · 18 comments

Comments

@chorman0773
Contributor

chorman0773 commented Aug 8, 2023

Signal Handlers (and interrupt handlers, which are basically equivalent) are a fun thing to specify.

Signal handlers have been modeled as separate threads, but this is incorrect: because the interrupted code is running on the same thread, you cannot, e.g., acquire a mutex or perform I/O from a signal handler. (C, C++, and POSIX each individually maintain a more comprehensive list of what you can and cannot do: https://en.cppreference.com/w/cpp/utility/program/signal, https://en.cppreference.com/w/c/program/signal, https://man7.org/linux/man-pages/man7/signal-safety.7.html).
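A minimal sketch of why the separate-thread model breaks down for locks. The names `LOG`, `handler_body`, and `interrupted_while_locked` are hypothetical, and the "handler" is called directly on the same thread instead of being registered with the OS, so this only simulates a signal arriving while the lock is held. `try_lock` is used purely to observe the state without blocking; it is not being suggested as async-signal-safe.

```rust
use std::sync::{Mutex, TryLockError};

// Hypothetical shared state, named for illustration only.
static LOG: Mutex<u32> = Mutex::new(0);

/// Stands in for a handler body. Returns true if the lock was free,
/// false if taking it would have blocked.
fn handler_body() -> bool {
    match LOG.try_lock() {
        Ok(mut guard) => {
            *guard += 1;
            true
        }
        // A real handler calling `lock()` here could never make progress:
        // the only code able to release the lock is the code it interrupted.
        Err(TryLockError::WouldBlock) => false,
        Err(TryLockError::Poisoned(_)) => false,
    }
}

/// Simulates a signal arriving while the interrupted code holds the lock.
fn interrupted_while_locked() -> bool {
    let _guard = LOG.lock().unwrap();
    // The simulated "signal" is delivered here, on the same thread.
    handler_body()
}
```

With a genuinely separate thread, blocking on the lock would only be a delay, because the owner keeps running and eventually unlocks. On the same thread the owner is suspended until the handler returns.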

Some things that are clear, at least to me:

  • non-mut, impl Freeze statics should be accessible and yield proper values when read,
  • non-mut statics with Atomic* types should be accessible and be usable as normal,
  • You can call any function that C allows you to call in a signal handler.
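The first two points can be sketched as follows. All names (`MAX_COUNT`, `SEEN`, `COUNT`, `on_signal`) are hypothetical, and registering the function with the OS is deliberately left out; the signal number is typed as `i32` on the assumption that C's `int` is 32 bits on the target.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// A non-mut static without interior mutability: constant-initialized and
// never written afterwards.
static MAX_COUNT: usize = 3;

// Non-mut statics of atomic type.
static SEEN: AtomicBool = AtomicBool::new(false);
static COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical handler: it only reads the immutable static and performs
/// lock-free atomic operations.
extern "C" fn on_signal(_signum: i32) {
    let limit = MAX_COUNT;
    // Increment the counter, saturating at `limit`.
    let _ = COUNT.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |n| {
        if n < limit {
            Some(n + 1)
        } else {
            None
        }
    });
    SEEN.store(true, Ordering::Release);
}
```

The interrupted code can later poll `SEEN` and `COUNT` with ordinary atomic loads.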

Note: This question only tracks asynchronous signals and asynchronous interrupts. I'd assume the answer to "What can a signal caused by raise do" is "Anything raise could do if it was an arbitrary opaque function".

Further sub-questions:

@RalfJung
Member

RalfJung commented Aug 8, 2023

You said in the meetings there are examples where treating a signal handler as an independent thread causes not just liveness issues but UB?

@bjorn3
Member

bjorn3 commented Aug 8, 2023

non-mut, impl Freeze statics should be accessible and yield proper values when read,

In C you did have to use volatile reads and writes, sig_atomic_t or atomic_signal_fence for that, right?

@RalfJung
Member

RalfJung commented Aug 8, 2023

In C you did have to use volatile reads and writes, sig_atomic_t or atomic_signal_fence for that, right?

Do you? Even in C concurrent non-atomic reads of the same data from multiple threads are allowed. So I would expect "concurrent" non-atomic reads of the same data from a thread and its signal handler to be equally permitted.

@chorman0773
Contributor Author

You said in the meetings there are examples where treating a signal handler as an independent thread causes not just liveness issues but UB?

The most direct examples are "Call an AS-unsafe function". I don't know of a specific pure-Rust example, but I bet I could come up with one.

In C you did have to use volatile reads and writes, sig_atomic_t or atomic_signal_fence for that, right?

The C equivalent would be (at file scope, or static at function scope)

const int X = 0;

Do you? Even in C concurrent non-atomic reads of the same data from multiple threads are allowed. So I would expect "concurrent" non-atomic reads of the same data from a thread and its signal handler to be equally permitted.

You would think. Technically, it would have an unspecified value. I'd be quite shocked if that value was any different from its initialized value, though.

@chorman0773
Contributor Author

chorman0773 commented Aug 8, 2023

C++17's definition does fix that, by redefining its access model in terms of its MT Memory Model (and constant initialization happens-before every evaluated statement in the entire program).

@bjorn3
Member

bjorn3 commented Aug 8, 2023

In C you did have to use volatile reads and writes, sig_atomic_t or atomic_signal_fence for that, right?

Do you? Even in C concurrent non-atomic reads of the same data from multiple threads are allowed. So I would expect "concurrent" non-atomic reads of the same data from a thread and its signal handler to be equally permitted.

For a non-atomic value you would have a data race without volatile or a signal fence. With an atomic value you would miss synchronization between the store and the start of the signal handler (so the signal handler could run before the store), or between the end of the signal handler and the load (so the signal handler could run after the load), unless you explicitly synchronize in some way for an async signal handler, as far as I understand. For atomics this is UB-free, but likely not what you want. On the other hand, for sync signals I would expect synchronization to happen between the function raising the signal and the signal handler itself, but it seems not to be guaranteed.

https://en.cppreference.com/w/c/program/signal

On entry to the signal handler, the state of the floating-point environment and the values of all objects is unspecified, except for

On return from a signal handler, the value of any object modified by the signal handler that is not volatile sig_atomic_t or lock-free atomic(since C11) is undefined.
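For comparison, a sketch of how the explicit synchronization described above might look in Rust, using `std::sync::atomic::compiler_fence`, which the standard library documents as the counterpart of C's `atomic_signal_fence`. The names `PAYLOAD`, `READY`, `publish`, and `handler_body` are hypothetical, and the handler body is an ordinary function here, not something registered with the OS.

```rust
use std::sync::atomic::{compiler_fence, AtomicBool, AtomicUsize, Ordering};

static PAYLOAD: AtomicUsize = AtomicUsize::new(0);
static READY: AtomicBool = AtomicBool::new(false);

/// Runs in the interrupted thread.
fn publish(value: usize) {
    PAYLOAD.store(value, Ordering::Relaxed);
    // Keeps the compiler from moving the payload store past the flag store.
    // No CPU fence is needed: the handler runs on the same thread.
    compiler_fence(Ordering::Release);
    READY.store(true, Ordering::Relaxed);
}

/// Stands in for the handler body: returns the payload only if the flag
/// shows it was published before the interruption point.
fn handler_body() -> Option<usize> {
    if READY.load(Ordering::Relaxed) {
        compiler_fence(Ordering::Acquire);
        Some(PAYLOAD.load(Ordering::Relaxed))
    } else {
        None
    }
}
```

If the handler runs before `READY` is set, it sees `None` and no stale payload. Both variables are atomics, so the sketch has no data race even without the fences; the fences only establish the ordering.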

@chorman0773
Contributor Author

Do keep in mind that my statement was about non-mut impl Freeze statics.
They only have constant initialization, and cannot be modified during execution.

That being said, your analysis is correct as to C (though, as I mentioned, C++17 would allow the C++ equivalent).

@RalfJung
Member

RalfJung commented Aug 8, 2023

@bjorn3 We are specifically talking about the read-only case. There are no stores. Without stores, there cannot be data races nor missed synchronization.

@chorman0773
Contributor Author

chorman0773 commented Aug 8, 2023

@bjorn3 is correct though, that in C it would have an unspecified value absent actual synchronization with a proper fence or atomic operation. I think it's reasonable to adopt something closer to the C++ definition, though.

@RalfJung
Member

RalfJung commented Aug 8, 2023

That seems like an oversight that was fixed by C++17 properly integrating this into their concurrency model.

@chorman0773
Contributor Author

chorman0773 commented Aug 8, 2023

Perhaps. I haven't heard of a proposal fixing it for C23, though, so the earliest it gets fixed in C is going to be 2029.

For Rust, though, I'd definitely go with C++'s model.

Frankly, seems easier to make operational before even considering what it implies.

@RalfJung
Member

Another interesting question for signal handlers is the FP environment. I think we want to allow inline assembly blocks to change the FP unit to a different state, perform some operations, and then change it back. However, that means we must declare float operations as not-async-signal-safe, since the signal handler might encounter the FP environment in a non-default state, which leads to Rust floating-point operations executing incorrectly.

(Thanks to @bjorn3 and @digama0 for pointing that out.)

@RalfJung
Member

Or is that handled by the register save/restore logic when "context-switching" between the actual thread and the signal handler?

@comex

comex commented Aug 13, 2023

Based on some testing, it varies:

  • Linux modifies the floating point mode on signal entry and exit (setting it to a default value on entry and restoring the previous value on exit).
  • macOS modifies the floating point mode only on exit (restoring it to the previous value, which is also seen inside the signal handler).
  • Windows does not modify the floating point mode on either entry or exit (at least for this method of raising a signal).

So indeed, a portable signal handler should assume the FP environment is in an unknown state, and make sure it's still in that state before returning. After all, it's not just inline assembly that might temporarily change the FP environment, but also external code called through FFI.
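One way to stay within that constraint is to keep the handler integer-only and do any floating-point work in ordinary code. A minimal sketch, with hypothetical names (`TICK_MILLIS`, `TICKS`, `on_timer_signal`, `elapsed_seconds`), registration with the OS omitted, and `i32` assumed to match C's `int`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Assumed tick period in milliseconds, chosen for illustration only.
const TICK_MILLIS: usize = 250;

static TICKS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical handler: it never touches the FP unit, so it neither
/// depends on nor disturbs whatever FP state the interrupted code set up.
extern "C" fn on_timer_signal(_signum: i32) {
    TICKS.fetch_add(1, Ordering::Relaxed);
}

/// Runs in ordinary (non-handler) code, where the FP environment is in the
/// state Rust expects, so the float conversion happens here.
fn elapsed_seconds() -> f64 {
    let millis = TICKS.load(Ordering::Relaxed) * TICK_MILLIS;
    millis as f64 / 1000.0
}
```

This sidesteps the question and does not answer it: the handler simply never depends on the FP state.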

@Lokathor
Contributor

I suspect that counting interrupt handlers and signal handlers as just one category is too much of a simplification.

Putting it perhaps overly broadly, interrupt handlers are usually used in a no_std environment when a hardware situation occurs, while signal handlers are usually used in a std-like environment when a software situation occurs. I have less experience with signal handlers, but it seems just from reading this thread like they have more restrictions than what an interrupt handler has to think about.

Should we try to split up the situation?

@comex

comex commented Aug 14, 2023

I'd argue the restrictions have less to do with the actual differences between interrupt handlers and signal handlers, and more to do with the fact that C and POSIX specs try to specify signal handlers semi-formally and portably at the C level, while interrupt handlers are non-portable and are only specified at the assembly level. For example:

  • In both signal handlers and interrupt handlers, you want to avoid entering the memory allocator, since the interrupted code might already be in the middle of allocating something, resulting in a deadlock or UB depending on how the allocator is implemented. But for signal handlers, this is specified in the form "you can only call this specific list of system library functions" (and all other functions can potentially allocate or do other unsafe things). For interrupt handlers in no_std code, you're bringing your own allocator and your own 'system library functions', so it just depends on how that code works.

  • Most of the above discussion is about memory ordering. At the assembly level, memory ordering works the same way in signal handlers and interrupt handlers: you're guaranteed to observe memory writes performed on the same thread/core prior to the interrupted instruction. But the C spec tries to define how this works at the C level (and maybe messes it up, which is what most of the above discussion was about), while for interrupt handlers it's defined at the assembly level, and how that translates to the C level is only defined informally.

However, since we're trying to define high-level language semantics, and aren't focusing on a specific architecture, I don't think there's much advantage in splitting up the two topics.
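As an illustration of the allocator point, here is a sketch of a handler that records events into storage reserved up front and never enters the allocator, leaving the allocation to the interrupted code. The same shape would apply to a signal handler or an interrupt handler. The names `CAPACITY`, `SLOTS`, `NEXT`, `on_signal`, and `drain` are hypothetical, and the signal number is typed as `i32` on the assumption that C's `int` is 32 bits on the target.

```rust
use std::sync::atomic::{AtomicI32, AtomicUsize, Ordering};

const CAPACITY: usize = 8;

// Fixed storage reserved at compile time, so the handler never allocates.
const EMPTY: AtomicI32 = AtomicI32::new(0);
static SLOTS: [AtomicI32; CAPACITY] = [EMPTY; CAPACITY];
static NEXT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical handler: claims a slot with one atomic add and stores the
/// signal number there. Events beyond CAPACITY are dropped.
extern "C" fn on_signal(signum: i32) {
    let index = NEXT.fetch_add(1, Ordering::Relaxed);
    if index < CAPACITY {
        SLOTS[index].store(signum, Ordering::Release);
    }
}

/// Runs in ordinary code, where allocating a Vec is fine. It does not reset
/// the buffer and may see a slot that a handler has claimed but not yet
/// filled, so it is only a starting point.
fn drain() -> Vec<i32> {
    let recorded = NEXT.load(Ordering::Acquire).min(CAPACITY);
    (0..recorded)
        .map(|i| SLOTS[i].load(Ordering::Acquire))
        .collect()
}
```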

@ChrisDenton
Member

ChrisDenton commented Aug 14, 2023

Windows does not modify the floating point mode on either entry or exit (at least for this method of raising a signal).

Side note: Windows doesn't have signals or anything really equivalent. The C standard library does emulate them to the extent necessary but that's all it is.

@chorman0773
Contributor Author

Another interesting question for signal handlers is the FP environment. I think we want to allow inline assembly blocks to change the FP unit to a different state, perform some operations, and then change it back. However, that means we must declare float operations as not-async-signal-safe, since the signal handler might encounter the FP environment in a non-default state, which leads to Rust floating-point operations executing incorrectly.

Both C and C++ say that the floating-point environment is unspecified upon entry to a signal handler, and don't make changing it signal-safe.

I think that the only option is indeed to say that floating-point operations are UB inside an asynchronous signal handler.
