Signals abstraction #321

Closed
Tracked by #388
haesbaert opened this issue Sep 20, 2022 · 9 comments

@haesbaert
Contributor

As discussed in #301 we might need to abstract Signals.

I've got a working prototype that works on uring and libuv but we have to define some things before I go forward.

1. Do we want to export more signals, fewer, or the same as Sys?

Sys is a bit conservative and I think we should export more. For example, it lacks SIGWINCH and SIGINFO (Linux doesn't have SIGINFO, but it's common on other Unix systems).

2. Do we want to restrict what we export?

For example, type signum = Sigint | Sigwhatever. Or we just keep accepting an int, so a user who wants to use some other signal still can.
I think we should keep taking an int, but do some discovery ourselves (for the value of SIGWINCH, for example) and export that as
val sigwinch : int
Worth noting that the default OCaml signal interface accepts arbitrary integers, so that would be no issue; it's also not an issue with Luv.
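
To make the two options concrete, here is a minimal sketch. The Signum module, the to_int helper and the hard-coded value 28 for SIGWINCH are assumptions for illustration only (28 is the usual number on Linux x86 and the BSDs); a real implementation would discover the value per platform at build time.

```ocaml
(* Option A: a closed variant. Only the listed signals can be named. *)
type signum = Sigint | Sigwinch

(* Option B: plain integers, with discovered values exported as constants. *)
module Signum : sig
  val sigint : int
  val sigwinch : int
end = struct
  let sigint = Sys.sigint
  (* ASSUMPTION: hard-coded for illustration; real code would discover
     this value per platform at configure time. *)
  let sigwinch = 28
end

(* With option A, each backend still needs a mapping back to integers. *)
let to_int = function
  | Sigint -> Signum.sigint
  | Sigwinch -> Signum.sigwinch
```

With option B the user can pass any other integer straight through, which is what the stdlib's Sys.set_signal already allows.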

3. What do we do with signals that are not supported?

Do we want to fail hard, fail silently, or provide a "button" to control the behavior? This is likely very relevant for Windows, which has only a small set of signals.

@haesbaert haesbaert added the api API design decision label Sep 20, 2022
@haesbaert haesbaert self-assigned this Sep 20, 2022
@haesbaert
Contributor Author

haesbaert commented Sep 23, 2022

Progress

I have something that is not yet in a reviewable state, but I think I now understand all the details we need in order to decide what to do:
https://github.com/haesbaert/eio/tree/signal

The runtime processes signals in a more native way than libuv. I'll try to describe how both work so we can decide what we want; the differences are particularly important for multiple domains.

OCaml runtime signals + Domains

The runtime doesn't do a lot: signals are recorded in a bitmap and processed at some later point, outside the trampoline.
The signal bitmap is shared between domains, and the code is careful when accessing it concurrently, so all good.
This also means we fall back to standard pthread signal semantics: it's unspecified which pthread (and therefore which Domain) gets the signal.

This makes signals pretty hard to use with multiple domains. If we wanted to cancel some Fiber, for example, we might have to cross Domains, depending on where the signal was delivered.
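
One common way to cope with these semantics is to have the handler only record the signal and let the interested domain pick it up later. A minimal sketch using only the stdlib; the names sigint_pending, on_sigint and take_sigint are hypothetical and not part of Eio:

```ocaml
(* Shared flag: safe to set from whichever domain runs the handler. *)
let sigint_pending = Atomic.make false

(* The handler does no real work; it only records that SIGINT arrived. *)
let on_sigint (_signum : int) = Atomic.set sigint_pending true

let () = Sys.set_signal Sys.sigint (Sys.Signal_handle on_sigint)

(* Called by the domain that owns the fiber to cancel.
   Returns true at most once per recorded signal. *)
let take_sigint () = Atomic.exchange sigint_pending false
```

The owning domain still has to poll take_sigint (or be woken some other way), which is the cross-domain step described above.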

Libuv is a Good Boi (tm)

Libuv handles signals differently: it has its own signal dispatch mechanism (and therefore also runs handlers outside the trampoline). Each signal handle is associated with a loop, and each loop is associated with a Domain (in our case).
One signal can have multiple handles, each associated with a loop, so in libuv you can have multiple handlers for SIGINT, one per pthread, and each can do something with it, or not.

This is more useful than what the runtime offers: you can establish a handler that will be called from the Domain you are in, so you can cancel some Fiber or whatever. Or maybe you actually want every domain to receive the signal and do some cleanup before exit.

Opinion

I believe we should have a mechanism similar to libuv's. This would involve creating the dispatcher in the uring backend, which shouldn't be very hard. It also allows us to be clearer about the semantics: "signals get processed in the domain where they were installed" instead of "signals get processed on a random Domain, rejoice".
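
A rough model of such a dispatcher, to illustrate the intended semantics. Everything here is hypothetical (install, deliver, run_pending, integer domain ids) and deliberately single-threaded: a real version would need synchronisation and a way to wake each domain's event loop.

```ocaml
(* Hypothetical model of a libuv-style dispatcher; not Eio's API. *)
type handler = { domain_id : int; callback : int -> unit }

(* Handlers installed for each signal number. *)
let handlers : (int, handler list) Hashtbl.t = Hashtbl.create 8

(* One queue of pending callbacks per domain. *)
let pending : (int, (unit -> unit) Queue.t) Hashtbl.t = Hashtbl.create 8

let queue_of domain_id =
  match Hashtbl.find_opt pending domain_id with
  | Some q -> q
  | None ->
    let q = Queue.create () in
    Hashtbl.add pending domain_id q;
    q

(* Register a callback that must run in the given domain. *)
let install ~domain_id ~signum callback =
  let existing = Option.value ~default:[] (Hashtbl.find_opt handlers signum) in
  Hashtbl.replace handlers signum ({ domain_id; callback } :: existing)

(* Called when the signal arrives: fan out to every installing domain,
   but do not run any callback yet. *)
let deliver signum =
  Hashtbl.find_opt handlers signum
  |> Option.value ~default:[]
  |> List.iter (fun h ->
         Queue.add (fun () -> h.callback signum) (queue_of h.domain_id))

(* Called from the event loop of one domain: run only the callbacks
   queued for that domain and return how many ran. *)
let run_pending ~domain_id =
  let q = queue_of domain_id in
  let n = Queue.length q in
  for _i = 1 to n do
    (Queue.pop q) ()
  done;
  n
```

The point of the split between deliver and run_pending is that a callback only ever executes in the domain that installed it, wherever the OS happened to deliver the signal.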

My tree

In my tree I implemented something similar to the runtime semantics:

  • uring is like the runtime
  • luv pins the signal to the first Domain it was installed on

This is mostly to get things going. It might be a first step towards defining the API and getting something in that is at least useful for single domains.

On supported signals

I have added a discovery/config thingy to find the values of signals, as Sys is very conservative about the number of signals it exports.
I'd like signals whose existence we can't guarantee on some platform (looking at you, Windows) to have an option type. If you check the code, you'll see that Config.Signum.siginfo_opt is an int option, and on Linux it's actually None.
I think this is good enough, since it forces users to acknowledge that the signal they are using may not be present, instead of assuming it is.
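
A small sketch of how such an option-typed signal number reads at the call site. The Signum module below is a stand-in written for this example (the thread only states that Config.Signum.siginfo_opt is an int option and is None on Linux), and install_optional is a hypothetical helper:

```ocaml
(* Stand-in for the discovered configuration; values are illustrative. *)
module Signum = struct
  let sigint : int = Sys.sigint

  (* ASSUMPTION: modelled as absent, as it would be on Linux. *)
  let siginfo_opt : int option = None
end

(* Install a handler only if the platform has the signal.
   Returns whether a handler was installed. *)
let install_optional (signum_opt : int option) (handler : int -> unit) : bool =
  match signum_opt with
  | None -> false
  | Some signum ->
    Sys.set_signal signum (Sys.Signal_handle handler);
    true
```

The caller has to pattern-match or go through a helper like this one, so a missing signal cannot be used by accident.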

@talex5
Collaborator

talex5 commented Sep 26, 2022

On Linux, should we be using signalfd(2) instead? Then they'll get processed in the domain that read the FD.

@haesbaert
Contributor Author

> On Linux, should we be using signalfd(2) instead? Then they'll get processed in the domain that read the FD.

Oh that's nice, I didn't know about it, I think that simplifies a lot.

@haesbaert haesbaert mentioned this issue Nov 29, 2022
@talex5 talex5 mentioned this issue Dec 14, 2022
25 tasks
@avsm
Contributor

avsm commented Dec 15, 2022

At a high level, do we have a sense of what modern consumers of signals actually expect? The low-level interface is extremely difficult to get right in the presence of multiple threads, processes and runtimes.

Is there anything useful beyond:

  • SIGINT: ctrl-c should work!
  • SIGTERM: just exit the program
  • SIGUSR1/2: callbacks to a single domain that set them up

If that's it, then we could just expose those behaviours from eio and actually have a hope of making them work in a cross-platform way :-)

@haesbaert
Contributor Author

> At a high level, do we have a sense of what modern consumers of signals actually expect? The low-level interface is extremely difficult to get right in the presence of multiple threads, processes and runtimes.
>
> Is there anything useful beyond:
>
>   • SIGINT: ctrl-c should work!
>   • SIGTERM: just exit the program
>   • SIGUSR1/2: callbacks to a single domain that set them up

SIGWINCH for terminal resizing, SIGINFO for polling program behaviour, SIGHUP for daemon reconfig.

> If that's it, then we could just expose those behaviours from eio and actually have a hope of making them work in a cross-platform way :-)

@avsm
Contributor

avsm commented Dec 21, 2022

I'm in favour of just supporting OCaml callbacks for those use cases specifically, tied to the main domain, as none of them actually needs multiple domains as far as I can tell.

  • SIGINT: trigger cancellation of all the fibres
  • SIGTERM: just exit
  • SIGWINCH/INFO/USR1/2: user callbacks
  • SIGHUP: I don't think we should support this; it's hard to do in the presence of sandboxing (pledge, privsep) without a re-exec.

@haesbaert
Contributor Author

SIGTERM for multiple-domain cleanup is a fairly common thing.
Not supporting SIGHUP seems unnecessary; I don't see why we should take this away from the user.
Cancelling fibers with SIGINT should be on a per-fiber basis and not a global thing (hence why I wrote sigbox).
SIGWINCH/INFO/USR1/2 can be done as callbacks, but I still think it's clumsy, as users will need to do the signaling themselves and will very likely do it wrong, due to the issues already discussed in the PR for signals.

But I'm not gonna insist :)

@haesbaert
Contributor Author

haesbaert commented Dec 21, 2022

> SIGTERM for multiple-domain cleanup is a fairly common thing. Not supporting SIGHUP seems unnecessary; I don't see why we should take this away from the user. Cancelling fibers with SIGINT should be on a per-fiber basis and not a global thing (hence why I wrote sigbox). SIGWINCH/INFO/USR1/2 can be done as callbacks, but I still think it's clumsy, as users will need to do the signaling themselves and will very likely do it wrong, due to the issues already discussed in the PR for signals.
>
> But I'm not gonna insist :)

Oh, and I mean SIGTERM is even more crucial, since you might need to be able to do things for a proper cleanup.
For example, mDNS should send a goodbye message unpublishing its records, and any other non-trivial software may need to do some cleanup too.
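
For a single domain, this kind of SIGTERM cleanup can be sketched with the stdlib alone. The registry below (cleanups, on_cleanup, run_cleanups) is hypothetical and not part of Eio; it does not address notifying other domains, which is the harder part discussed in this thread.

```ocaml
(* Hypothetical cleanup registry; not part of Eio. *)
let cleanups : (unit -> unit) list ref = ref []

(* Register an action to run before the program terminates. *)
let on_cleanup f = cleanups := f :: !cleanups

(* Run the registered actions once, most recently registered first. *)
let run_cleanups () =
  let fs = !cleanups in
  cleanups := [];
  List.iter (fun f -> f ()) fs

let () =
  at_exit run_cleanups;
  (* exit runs the at_exit functions before terminating, so SIGTERM
     triggers the same cleanup path as a normal exit. *)
  Sys.set_signal Sys.sigterm (Sys.Signal_handle (fun _ -> exit 0))
```

A service would register something like its goodbye message with on_cleanup at startup; the action then runs on both normal exit and SIGTERM.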

@talex5
Collaborator

talex5 commented Feb 6, 2023

Closed by #436.

@talex5 talex5 closed this as completed Feb 6, 2023
@balat balat added this to the Eio 1.0 milestone May 12, 2023