Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signal #374

Closed
wants to merge 2 commits into from
Closed

Signal #374

wants to merge 2 commits into from

Conversation

haesbaert
Copy link
Contributor

This implements Signals in a publish/subscriber pattern, the main goal is to
give a bit better semantics than pure POSIX signals while still being able to
work in cenarios where a native signal is needed (think a debugger and whatnot).

Rationale

POSIX semantics for multithreading are very weak, signal dipositions continue to
be process-wide while signal masking is pthread-wide, this means when a signal
arrives to the process PID, any thread can end up handling the signal, this
together with the normal semantics of signals of being processed in whatever
altstack and interrupting code make it difficult to make proper use of it in
places like EIO.

I propose a publish/subscriber way of dealing with signals, a bit inspired by
how Libuv already handles it, users can subscribe to a signal number in a signal
box, this box can be waited on signals and integrates well with EIO Fibers,
blocking and so on.

A signal box (subscription) is always linked to a Switch.t in order to guarantee
the subscription is removed when the handler goes out of scope.

Multithreading is allowed, any Domain can create subscriptions, this makes it
interesting for things like every domain installing a handler for SIGTERM and
then doing a proper finishing when needed, or maybe handling a SIGHUP and
parking everyone until the main domain reconfigures and whatnot.

Signal Numbers

The signals from Sys are in a negative form and the runtime translates them to
the system signal number, this would be ok if Sys exported all signals we want,
but it doesnt, so we have to discover our signal numbers. Sys actually works
with its own idea of a signal number (the negative) as well as the positive (the
system), with some minor inconveniences.

Since not every system has every signal, signals that are not guaranteed to
exist are exposed as an optional type, this forces the user to acknowledge "ok
this is not portable" as well as preventing him from using an invalid number.
The actual number is hidden from the user as a private int, but can be acquired
via Signal.signum_to_int.

Libuv/Luv Implementation

Libuv provides pretty much the publish/subscriber mechanism already, so we just
make use of that, in luv signals are not really processed in the signal handler,
they just schedule something to be processed in the main loop.

Uring Implementation

In uring we implement the publish/subscriber mechanism ourselves, since a signal
can have multiple listeners, we have to be careful to install a new handler when
it's the first and restore the original one when it's the last one being
removed. We do not make sigprocmask dances as they become a nightmare due to
inheritance.

The signal handler in uring must wake the signalbox (an Eio.Stream.t), which
means it must go into the scheduler, which means if we just handled it from the
signal handler, we would end up recursing, which means we wouldn't be able to
grab locks, which means we wouldn't be able to do Eio.Stream.add.

We solve this by using the old pipe trick, the signal handler just writes a byte
to a pipe via Unix.write (must be outside of EIO). We then have an extra Fiber
(like the eventfd listener) running by default that reads from the other end of
the pipe, gets the signal and publishes to all listeners via Eio.Stream.add.

Why Not signalfd

I've actually written one version with signalfd, but it's a pretty terrible API
if you need multithreading, to make usage of it you basically have to mask all
signals on every Domain (which means you have to decide which ones you will mask
beforehand, possibly killing uses of catching SIGSEGV/SIGSTOP). Doing the pipe
trick gives us what we wanted from signalfd and is also portable if we want it
in other backends. I still have the signalfd code in a branch if anyone wants
it.
There's a half-decent rant here:
https://ldpreload.com/blog/signalfd-is-useless

Why Stream.addcanfail

I've added addcanfail to Stream since signals can be lost, we don't want to
block the sender, we also don't want to be forking senders and/or buffering
signals indefinately. Actual POSIX signals can only be pending, there are never
"two SIGINT" to be processed, two SIGINT would be coalesced into one. We do a
bit more, when we introduce the pipe trick we end up buffering signals, which
means two SIGINTS would be two SIGINTS, so we're bound by the pipe buffer size.
Since draining the pipe is not dependant on the user calling anything, it's
virtually impossible to saturate it.

We still need addcanfail since the pipe reader cannot block on publish, imagine
you have multiple signals but the user never reads from one of the boxes, if the
pipe reader would block trying to send, it would not publish anything else.

Worth noting that libuv as well as libevent do the exact same, as it's an old
solution.

Pokes at #321

This implements Signals in a publish/subscriber pattern, the main goal is to
give a bit better semantics than pure POSIX signals while still being able to
work in cenarios where a native signal is needed (think a debugger and whatnot).

Rationale
~~~~~~~~~

POSIX semantics for multithreading are very weak, signal dipositions continue to
be process-wide while signal masking is pthread-wide, this means when a signal
arrives to the process PID, any thread can end up handling the signal, this
together with the normal semantics of signals of being processed in whatever
altstack and interrupting code make it difficult to make proper use of it in
places like EIO.

I propose a publish/subscriber way of dealing with signals, a bit inspired by
how Libuv already handles it, users can subscribe to a signal number in a signal
box, this box can be waited on signals and integrates well with EIO Fibers,
blocking and so on.

A signal box (subscription) is always linked to a Switch.t in order to guarantee
the subscription is removed when the handler goes out of scope.

Multithreading is allowed, any Domain can create subscriptions, this makes it
interesting for things like every domain installing a handler for SIGTERM and
then doing a proper finishing when needed, or maybe handling a SIGHUP and
parking everyone until the main domain reconfigures and whatnot.

Signal Numbers
~~~~~~~~~~~~~~

The signals from Sys are in a negative form and the runtime translates them to
the system signal number, this would be ok if Sys exported all signals we want,
but it doesnt, so we have to discover our signal numbers. Sys actually works
with its own idea of a signal number (the negative) as well as the positive (the
system), with some minor inconveniences.

Since not every system has every signal, signals that are not guaranteed to
exist are exposed as an optional type, this forces the user to acknowledge "ok
this is not portable" as well as preventing him from using an invalid number.
The actual number is hidden from the user as a private int, but can be acquired
via Signal.signum_to_int.

Libuv/Luv Implementation
~~~~~~~~~~~~~~~~~~~~~~~~

Libuv provides pretty much the publish/subscriber mechanism already, so we just
make use of that, in luv signals are not really processed in the signal handler,
they just schedule something to be processed in the main loop.

Uring Implementation
~~~~~~~~~~~~~~~~~~~~

In uring we implement the publish/subscriber mechanism ourselves, since a signal
can have multiple listeners, we have to be careful to install a new handler when
it's the first and restore the original one when it's the last one being
removed. We do not make sigprocmask dances as they become a nightmare due to
inheritance.

The signal handler in uring must wake the signalbox (an Eio.Stream.t), which
means it must go into the scheduler, which means if we just handled it from the
signal handler, we would end up recursing, which means we wouldn't be able to
grab locks, which means we wouldn't be able to do Eio.Stream.add.

We solve this by using the old pipe trick, the signal handler just writes a byte
to a pipe via Unix.write (must be outside of EIO). We then have an extra Fiber
(like the eventfd listener) running by default that reads from the other end of
the pipe, gets the signal and publishes to all listeners via Eio.Stream.add.

Why Not signalfd
~~~~~~~~~~~~~~~~

I've actually written one version with signalfd, but it's a pretty terrible API
if you need multithreading, to make usage of it you basically have to mask all
signals on every Domain (which means you have to decide which ones you will mask
beforehand, possibly killing uses of catching SIGSEGV/SIGSTOP). Doing the pipe
trick gives us what we wanted from signalfd and is also portable if we want it
in other backends. I still have the signalfd code in a branch if anyone wants
it.
  There's a half-decent rant here:
  https://ldpreload.com/blog/signalfd-is-useless

Why Stream.addcanfail
~~~~~~~~~~~~~~~~~~~~~

I've added addcanfail to Stream since signals can be lost, we don't want to
block the sender, we also don't want to be forking senders and/or buffering
signals indefinately. Actual POSIX signals can only be pending, there are never
"two SIGINT" to be processed, two SIGINT would be coalesced into one. We do a
bit more, when we introduce the pipe trick we end up buffering signals, which
means two SIGINTS would be two SIGINTS, so we're bound by the pipe buffer size.
Since draining the pipe is not dependant on the user calling anything, it's
virtually impossible to saturate it.

We still need addcanfail since the pipe reader cannot block on publish, imagine
you have multiple signals but the user never reads from one of the boxes, if the
pipe reader would block trying to send, it would not publish anything else.

Worth noting that libuv as well as libevent do the exact same, as it's an old
solution.
@haesbaert
Copy link
Contributor Author

Also cc #301

@talex5
Copy link
Collaborator

talex5 commented Dec 2, 2022

Why Not signalfd

Well, that's annoying! The signalfd man-page even admits this:

Be aware, however, that this won't be possible in the case of a helper program spawned behind the scenes by any library function that the program may call. In such cases, one must fall back to using a traditional signal handler that writes to a file descriptor monitored by select(2), poll(2), or epoll(7).

This PR seems a bit ad-hoc, though. Perhaps we can start with something simpler. The basic building block we need is the ability to schedule something to run in the main event loop. e.g. for Eio_linux we should provide a way to get this enqueue function (e.g. via env or by performing an effect):

let enqueue fn =
  Lf_queue.push st.run_q (Fn fn);
  if Atomic.get st.need_wakeup then wakeup st

where Fn is handled in schedule with:

  | Some Fn fn -> fn (); schedule st

Then you could handle signals like this:

open Eio.Std

let () =
  Eio_main.run @@ fun env ->
  let enqueue = (* get it somehow *) in
  let handler (_ : int) =
    (* (running in signal handler) *)
    enqueue @@ fun () ->
    (* (called from event loop) *)
    traceln "Interrupt!"
  in
  Sys.(set_signal sigint) (Sys.Signal_handle handler);
  traceln "Sleeping for 5 seconds...";
  Eio.Time.Mono.sleep env#mono_clock 5.

That would give people the flexibility to try different schemes. e.g. they could set a flag and signal a condition if it was previously clear.

Possibly we should store enqueue in domain-local storage, so it's always available. That would also be useful for GC finalizers.

(note: wakeup is not quite signal safe currently, as it takes a mutex and might deadlock if sent from the same domain, but the pipes thing has the same problem where a write to the pipe races with closing the pipe)

@haesbaert
Copy link
Contributor Author

Why Not signalfd

Well, that's annoying! The signalfd man-page even admits this:

Be aware, however, that this won't be possible in the case of a helper program spawned behind the scenes by any library function that the program may call. In such cases, one must fall back to using a traditional signal handler that writes to a file descriptor monitored by select(2), poll(2), or epoll(7).

This PR seems a bit ad-hoc, though. Perhaps we can start with something simpler. The basic building block we need is the ability to schedule something to run in the main event loop. e.g. for Eio_linux we should provide a way to get this enqueue function (e.g. via env or by performing an effect):

What do you mean by ad-hoc specifically ?

let enqueue fn =
  Lf_queue.push st.run_q (Fn fn);
  if Atomic.get st.need_wakeup then wakeup st

where Fn is handled in schedule with:

  | Some Fn fn -> fn (); schedule st

This could also be done with a Stream and it was one of the ideas, but it's a bit tricky, you either have to fork a fiber indefinately so to allow the handler to block freely, or you would constrain the handler to not being able to block, which is kinda bad. I like the idea of a generic "please domain X run Y for me" but the semantics need to be defined and it's not something users should have to learn.

Most of my design is to be able to free the user from any constraint and also to make sure we're dealing with being bound and not forking things indefinately. I don't like the idea of the user having to learn different contexts, most of EIO is waiting for things, so why not channel signals through the same way he is already used to ?

Then you could handle signals like this:

open Eio.Std

let () =
  Eio_main.run @@ fun env ->
  let enqueue = (* get it somehow *) in
  let handler (_ : int) =
    (* (running in signal handler) *)
    enqueue @@ fun () ->
    (* (called from event loop) *)
    traceln "Interrupt!"
  in
  Sys.(set_signal sigint) (Sys.Signal_handle handler);
  traceln "Sleeping for 5 seconds...";
  Eio.Time.Mono.sleep env#mono_clock 5.

That would give people the flexibility to try different schemes. e.g. they could set a flag and signal a condition if it was previously clear.

Not sure I follow this, but I assume you don't mean this:

let rec handler =
  Signal.wait sigint_box;
  set_flag foo 

Possibly we should store enqueue in domain-local storage, so it's always available. That would also be useful for GC finalizers.

(note: wakeup is not quite signal safe currently, as it takes a mutex and might deadlock if sent from the same domain, but the pipes thing has the same problem where a write to the pipe races with closing the pipe)

That's not entirely true as the pipe is only closed after all fibers have finished, basically when domain exits, also, the closing of the pipe would just trigger an error on the write, which would not lead to any damage.

The recursive mutex is a real issue (wakeup recursing into mutex) and I don't see how we could start with that.

I fail to see what's wrong with the pipe, but the proper solution if that is not wanted is an Atomic.t that is modified by the signal handler and pokes the mainloop via eventfd, and we would check that Atomic.t at every loop iteration, I prefer the pipe though.

@haesbaert
Copy link
Contributor Author

Not sure if I was clear but you can do things like

(* wait for 5 seconds or a sigint *)
Fiber.first (fun () -> Eio.Time.Mono.sleep env#mono_clock 5)
            (fun () -> Signal.wait_one sigint))

@talex5
Copy link
Collaborator

talex5 commented Dec 5, 2022

the closing of the pipe would just trigger an error on the write, which would not lead to any damage.

It can, though, because another domain could open a different file reusing the FD. Then writing to the pipe will corrupt that file. This is why wakeup takes a mutex.

The recursive mutex is a real issue (wakeup recursing into mutex) and I don't see how we could start with that.

There are other ways of solving this problem. I think the easiest is to keep a free list of eventfds instead of closing them. Then you might just get a spurious wakeup, which doesn't matter.

What do you mean by ad-hoc specifically ?

I mean it's doing a load complicated stuff that seems only useful for some specific use. Ideally, I think it would look like this:

open Eio.Std

let () =
  Eio_main.run @@ fun _ ->
  let interrupted = Eio.Condition.create () in
  Sys.(set_signal sigint) (Sys.Signal_handle (fun _ -> Eio.Condition.broadcast interrupted));
  Fiber.first
    (fun () -> Eio.Condition.await_no_mutex interrupted; traceln "Interrupted!")
    (fun () ->
       traceln "Running job (Ctrl-C to cancel)";
       Fiber.await_cancel ()
    );
  traceln "Done"

Note that the above doesn't require any new APIs at all. It almost works already; the only problem is that Eio.Condition.broadcast isn't signal-safe because it takes a mutex (so in theory it could deadlock). But we want to make that lock-free anyway.

@haesbaert
Copy link
Contributor Author

haesbaert commented Dec 5, 2022

the closing of the pipe would just trigger an error on the write, which would not lead to any damage.

It can, though, because another domain could open a different file reusing the FD. Then writing to the pipe will corrupt that file. This is why wakeup takes a mutex.

Ack, true I was mixing up with channels where there is a flag for closed, but my point stands: see below.

The recursive mutex is a real issue (wakeup recursing into mutex) and I don't see how we could start with that.

There are other ways of solving this problem. I think the easiest is to keep a free list of eventfds instead of closing them. Then you might just get a spurious wakeup, which doesn't matter.

It's actually simpler than that, the pipe is only open/read/written/closed from within the same domain, so just setting a flag that the fd is closed is enough, then if you happen to get a signal during domain exit, it just become a nop.
That was the intention but I screwed up, I want the handler to always write on "its own pipe".

What do you mean by ad-hoc specifically ?

I mean it's doing a load complicated stuff that seems only useful for some specific use. Ideally, I think it would look like this:

I don't see it as the specific use tbh, I see is as the very general way of handling signals, it escapes exposing anything that is not safe to the user while profiting from the same patterns he would use to wait for IO.

open Eio.Std

let () =
  Eio_main.run @@ fun _ ->
  let interrupted = Eio.Condition.create () in
  Sys.(set_signal sigint) (Sys.Signal_handle (fun _ -> Eio.Condition.broadcast interrupted));
  Fiber.first
    (fun () -> Eio.Condition.await_no_mutex interrupted; traceln "Interrupted!")
    (fun () ->
       traceln "Running job (Ctrl-C to cancel)";
       Fiber.await_cancel ()
    );
  traceln "Done"

Note that the above doesn't require any new APIs at all. It almost works already; the only problem is that Eio.Condition.broadcast isn't signal-safe because it takes a mutex (so in theory it could deadlock). But we want to make that lock-free anyway.

Not only that, Condition is a bit more complex because you'd either have to have one condition per listener, or keep flipping the state condition state all the time, would each listener flip its own condition to wait for the next signal ? would we accumulate ? (I need to read Condition better but so far that's what I see). IIRC our Condition implementation is more suitable for "I'm gonna wait for this to happen and I'm done".

But do you want to expose all that to the user, or that would be hidden ? There is virtually nothing he can do Eio wise that is safe from an actual signal handler, I see it as a step backwards to expose something he will shoot himself in the foot.

…on, to make

the pipe always local to the Domain.

This also fixes the race between the domain exiting and a handler writing to the
output pipe, as the handler will now match and get 'None' instead of seeing the
opened pipe.
@talex5
Copy link
Collaborator

talex5 commented Dec 5, 2022

Condition is a bit more complex because you'd either have to have one condition per listener, or keep flipping the state condition state all the time, would each listener flip its own condition to wait for the next signal ? would we accumulate ?

I'm not sure what you mean. Any number of fibers can wait on a condition. e.g. with two fibers:

  Fiber.both
    (fun () ->
       Fiber.first
         (fun () -> Eio.Condition.await_no_mutex interrupted; traceln "Interrupted 1!")
         (fun () ->
            traceln "Running job 1 (Ctrl-C to cancel)";
            Fiber.await_cancel ()
         )
    )
    (fun () ->
       Fiber.first
         (fun () -> Eio.Condition.await_no_mutex interrupted; traceln "Interrupted 2!")
         (fun () ->
            traceln "Running job 2 (Ctrl-C to cancel)";
            Fiber.await_cancel ()
         )
    );

I get:

+Running job 1 (Ctrl-C to cancel)
+Running job 2 (Ctrl-C to cancel)
^C+Interrupted 1!
+Interrupted 2!
+Done

Also, it doesn't matter if the domain where the signal was registered exits or is busy when the signal is delivered.

(the condition we're waiting for is "more interrupts have been received than when we started waiting", but we don't need to keep a count or check it, since the condition will always be true when the await returns)

@haesbaert
Copy link
Contributor Author

Condition is a bit more complex because you'd either have to have one condition per listener, or keep flipping the state condition state all the time, would each listener flip its own condition to wait for the next signal ? would we accumulate ?

I'm not sure what you mean. Any number of fibers can wait on a condition. e.g. with two fibers:

  Fiber.both
    (fun () ->
       Fiber.first
         (fun () -> Eio.Condition.await_no_mutex interrupted; traceln "Interrupted 1!")
         (fun () ->
            traceln "Running job 1 (Ctrl-C to cancel)";
            Fiber.await_cancel ()
         )
    )
    (fun () ->
       Fiber.first
         (fun () -> Eio.Condition.await_no_mutex interrupted; traceln "Interrupted 2!")
         (fun () ->
            traceln "Running job 2 (Ctrl-C to cancel)";
            Fiber.await_cancel ()
         )
    );

I get:

+Running job 1 (Ctrl-C to cancel)
+Running job 2 (Ctrl-C to cancel)
^C+Interrupted 1!
+Interrupted 2!
+Done

Also, it doesn't matter if the domain where the signal was registered exits or is busy when the signal is delivered.

(the condition we're waiting for is "more interrupts have been received than when we started waiting", but we don't need to keep a count or check it, since the condition will always be true when the await returns)

I was referring to keeping the "interrupt signaled state somewhere", since you don't want to lose signals while there are no listeners.

In your example (IIUC), if the broadcast happens before the waiter is on await_no_mutex, the signal is lost. The second issue is that it doesn't buffer subsequent signals, usually you can be processing SIGFOO while there is a subsequent SIGFOO pending. In other words, signals should be buffered (at least one).

@talex5
Copy link
Collaborator

talex5 commented Dec 5, 2022

I was referring to keeping the "interrupt signaled state somewhere", since you don't want to lose signals while there are no listeners.

Well, I do want that. Ctrl-C should cancel the jobs that are running when I press it, not any future jobs I start later. If I press Ctrl-C three times trying to kill one job, it shouldn't then kill the next two as well.

Of course, if I wanted that then I could use a stream instead.

The second issue is that it doesn't buffer subsequent signals, usually you can be processing SIGFOO while there is a subsequent SIGFOO pending. In other words, signals should be buffered (at least one).

There might be cases where you want that, but it's not wanted in this job example. Pressing Ctrl-C a second time while one job is cancelling will cancel any new jobs started since then, but not jobs started after the second Ctrl-C.

The point is, the user needs to be able to choose the handling schema that makes sense for their use.

@haesbaert
Copy link
Contributor Author

I was referring to keeping the "interrupt signaled state somewhere", since you don't want to lose signals while there are no listeners.

Well, I do want that. Ctrl-C should cancel the jobs that are running when I press it, not any future jobs I start later. If I press Ctrl-C three times trying to kill one job, it shouldn't then kill the next two as well.

That's why my buffering is done per box, and not per signal. A wait_one call would be the sigint case:

(* a sigint arriving before the Fiber.first call is not seen by waitone *)
Fiber.first (fun () -> Signal.(waitone sigint))
            (fun () -> read_stdin_or_whatever)
(* here the signal disposition is restored, if there was only one handler, then sigint terminates the program *)

(* second case *)
let reconfig_daemon () =
  let rec loop hup =
    Signal.wait hup;
    do_some_whatever_reconfig ();
    (* if subsequent SIGHUP arrives here, it will be seen on next loop  *)
    (* as in, you don't need to be waiting to guarantee you'll see the signal, as the buffering is per box *)
    loop hup
  in
  Switch.run @@ fun sw ->
  loop (Signal.(subscribe ~sw sighup))

Of course, if I wanted that then I could use a stream instead.

The second issue is that it doesn't buffer subsequent signals, usually you can be processing SIGFOO while there is a subsequent SIGFOO pending. In other words, signals should be buffered (at least one).

There might be cases where you want that, but it's not wanted in this job example. Pressing Ctrl-C a second time while one job is cancelling will cancel any new jobs started since then, but not jobs started after the second Ctrl-C.

The point is, the user needs to be able to choose the handling schema that makes sense for their use.

Yes, but I don't see how my scheme doesn't provide him with both schemas

@talex5 talex5 mentioned this pull request Dec 5, 2022
@talex5
Copy link
Collaborator

talex5 commented Dec 5, 2022

BTW, here's the config-reload example using Eio.Condition (^C to reload, for easier testing):

open Eio.Std

let () =
  Eio_main.run @@ fun _ ->
  let interrupted = Eio.Condition.create () in
  Sys.(set_signal sigint) (Sys.Signal_handle (fun _ -> Eio.Condition.broadcast interrupted));
  while true do
    Fiber.both
      (fun () -> Eio.Condition.await_no_mutex interrupted)
      (fun () ->
         traceln "Reading configuration...";
         Eio_unix.sleep 2.0;
         traceln "Finished reading configuration";
      )
  done

(we reload the configuration when we've both finished the previous load and received a signal since that started)

+Reading configuration...
+Finished reading configuration

^C+Reading configuration...
+Finished reading configuration

^C+Reading configuration...
^C+Finished reading configuration
+Reading configuration...
+Finished reading configuration

^C+Reading configuration...
^C^C^C^C^C^C^C^C^C+Finished reading configuration
+Reading configuration...
+Finished reading configuration

(line-breaks added for clarity)

@haesbaert
Copy link
Contributor Author

BTW, here's the config-reload example using Eio.Condition (^C to reload, for easier testing):

open Eio.Std

let () =
  Eio_main.run @@ fun _ ->
  let interrupted = Eio.Condition.create () in
  Sys.(set_signal sigint) (Sys.Signal_handle (fun _ -> Eio.Condition.broadcast interrupted));
  while true do
    Fiber.both
      (fun () -> Eio.Condition.await_no_mutex interrupted)
      (fun () ->
         traceln "Reading configuration...";
         Eio_unix.sleep 2.0;
         traceln "Finished reading configuration";
      )
  done

(we reload the configuration when we've both finished the previous load and received a signal since that started)

+Reading configuration...
+Finished reading configuration

^C+Reading configuration...
+Finished reading configuration

^C+Reading configuration...
^C+Finished reading configuration
+Reading configuration...
+Finished reading configuration

^C+Reading configuration...
^C^C^C^C^C^C^C^C^C+Finished reading configuration
+Reading configuration...
+Finished reading configuration

(line-breaks added for clarity)

Ack, that works, but my points still stand:

  • It's not safe as we mentioned since broadcast recurses into the mutex, which sure is fixable by changing broadcast, but what about all the rest of the API ?
  • The handler can never block since it recurses into the scheduler.
  • There's virtually nothing the user can do in a signal handler except somehow deferring whatever he wants to do to a sane fiber context, that's the whole point of my design, to cut the middle ground.
  • The user also now needs to be carefull to always keep a fiber awaiting it, and even so there is a window where he misses the next signal, a sensical, yet wrong version of your solution would be:
while true do
      Eio.Condition.await_no_mutex interrupted;
      traceln "Reading configuration...";
      Eio_unix.sleep 2.0;
      traceln "Finished reading configuration";

Which is wrong as he will only see interrupts every two seconds. I don't see why we should have to burden the user with a semantic that is complex (imho) and more error prone.

This is how it looks currently:

let case_one () =
  Eio_main.run @@ fun _ ->
  Eio.Switch.run @@ fun sw -> 
  traceln "case_one";
  let interrupted = Eio.Signal.(subscribe ~sw sigint) in
  while true do
    Eio.Signal.wait interrupted;
    traceln "Reading configuration...";
    Eio_unix.sleep 2.0;
    traceln "Finished reading configuration";
  done

let case_two () =
  Eio_main.run @@ fun env ->
  traceln "case_two";
  Eio.Fiber.first
    (fun () -> Eio.Flow.single_read (Eio.Stdenv.stdin env) (Cstruct.create 1) |> ignore)
    (fun () -> Eio.Signal.(wait_one sigint))
  (* as a bonus this is clean, the handler is not installed in the system anymore *)
  • It's correct, as in there is no mutex recursion from broadcast (or whatever the user wants to do in the future).
  • It's pretty hard for the user to get it wrong, since there are no hidden/obsucre semantics he must understand beforehand.
  • Signal is "buffered" when I want (case_one), ephemeral and cleans up when I want (case_two).
  • It might be worth mentioning that all those "event framework apis" don't expose the real handler to the user, that is true for libevent(3), libev(3), libuv(3)

I don't wanna be a pain in the ass (although I am and I blame my parents), but I fail to see how your model is preferrable.

The recent PR #381 is nice and I can cut the pipe, although I'll have to check_signals () on the scheduler loop, or something like it. I still stand by the API and the (future) user experience.

@talex5
Copy link
Collaborator

talex5 commented Dec 14, 2022

There's virtually nothing the user can do in a signal handler except somehow deferring whatever he wants to do to a sane fiber context, that's the whole point of my design, to cut the middle ground.

Sometimes though there are other things you want to do, such as updating an atomic counter, or dumping debug info (which might not be completely safe, but still good enough for debugging why your signal handler isn't waking the main loop). But whether we provide a high-level API or not, we can still implement it this way without needing special support in the backends.

The user also now needs to be carefull to always keep a fiber awaiting it, and even so there is a window where he misses the next signal

I don't think there is. Fiber.both runs the first fiber (that subscribes to the signal) first, and only runs the second branch once that's safely suspended. If a signal arrives before the Eio.Condition.await_no_mutex then loading the configuration is certain to happen in the future anyway.

You can certainly argue that's a bit subtle and should be wrapped in something. However, if so, we should be providing an API that wraps Eio.Condition (not just signals). There are other places where we need to recalculate something when a condition changes. For example, OCurrent needs to recalculate when it receives a web-hook, including if it gets another web-hook event while recalculating.

It's pretty hard for the user to get it wrong, since there are no hidden/obsucre semantics he must understand beforehand.

I disagree with that. My version uses only OCaml signals and Eio.Condition, which are both things you might know anyway. Sigbox is a new abstraction, used only for signals. And I think it is quite surprising:

  1. Any domain can register for signals, but processing them requires action from whichever domain registered first.
  2. If that domain is slow, handling the signal will be slow for all domains.
  3. If that domain exits, further signals will be lost (I think?).

@haesbaert
Copy link
Contributor Author

haesbaert commented Dec 15, 2022

There's virtually nothing the user can do in a signal handler except somehow deferring whatever he wants to do to a sane fiber context, that's the whole point of my design, to cut the middle ground.

Sometimes though there are other things you want to do, such as updating an atomic counter, or dumping debug info (which might not be completely safe, but still good enough for debugging why your signal handler isn't waking the main loop). But whether we provide a high-level API or not, we can still implement it this way without needing special support in the backends.

The user also now needs to be carefull to always keep a fiber awaiting it, and even so there is a window where he misses the next signal

I don't think there is. Fiber.both runs the first fiber (that subscribes to the signal) first, and only runs the second branch once that's safely suspended. If a signal arrives before the Eio.Condition.await_no_mutex then loading the configuration is certain to happen in the future anyway.

You can certainly argue that's a bit subtle and should be wrapped in something. However, if so, we should be providing an API that wraps Eio.Condition (not just signals). There are other places where we need to recalculate something when a condition changes. For example, OCurrent needs to recalculate when it receives a web-hook, including if it gets another web-hook event while recalculating.

I think there is, remember, the broadcast happens from any domain, and since there is no state being kept, lost wakeups[1] can occur at anytime the signal is broadcasted before or after someone wait_no_mutex for them. This happens in two places in your example, right after the handler is installed and between each loop interaction (I marked as three just for visualization).

let () =
  Eio_main.run @@ fun _ ->
  let interrupted = Eio.Condition.create () in
  Sys.(set_signal sigint) (Sys.Signal_handle (fun _ -> Eio.Condition.broadcast interrupted));
-----> signal broadcasted here is lost since there is no one waiting
  while true do
-----> signal broadcasted here is lost since there is no one waiting
    Fiber.both
      (fun () -> Eio.Condition.await_no_mutex interrupted)
      (fun () ->
         traceln "Reading configuration...";
         Eio_unix.sleep 2.0;
         traceln "Finished reading configuration";
      )
-----> signal broadcasted here is lost since there is no one waiting
  done

The wait/broadcast dance needs to keep state, and not just get a signal when it's waiting. That's what I meant earlier with having to keep state, you need pretty much the example that is documented in condition.mli (adapted to Mutex), that correctly keeps state (x) and checks it while interlocked (p).

   {[
      let x = ref 0
      let cond = Eio.Condition.create ()
      let mutex = Eio.Mutex.create ()

      let set_x value =
        Eio.Mutex.use_rw ~protect:false mutex (fun () -> x := value);
        Eio.Condition.broadcast cond

      let await_x p =
        Eio.Mutex.use_ro mutex (fun () ->
           while not (p !x) do                  (* [x] cannot change, as mutex is locked. *)
             Eio.Condition.await ~mutex cond    (* Mutex is unlocked while suspended. *)
           done
        )
    ]}

It's pretty hard for the user to get it wrong, since there are no hidden/obsucre semantics he must understand beforehand.

I disagree with that. My version uses only OCaml signals and Eio.Condition, which are both things you might know anyway. Sigbox is a new abstraction, used only for signals. And I think it is quite surprising:

  1. Any domain can register for signals, but processing them requires action from whichever domain registered first.

This is incorrect. Each Fiber can register for signals with a sigbox, this Fiber can be in whatever Domain. The first sigbox that is created installs the hard handler, which doesn't require any action from the the user code.

  1. If that domain is slow, handling the signal will be slow for all domains.

This is incorrect, each sigbox is independantly buffered, the hard handler never blocks on write, or waits for anyone, if the user creates a sigbox and never waits for it, nothing happens with further signals, of any kind, on any other box.
Writing to the pipe and draining the pipe is never dependant on user calling wait (that's why the initial pipe draining fiber is there, without user interaction)

  1. If that domain exits, further signals will be lost (I think?).

This is incorrect. If the switch containing the sigbox ends, that sigbox is destroyed, it has no impact on other sigbox or further signals of that type. If this was the last sigbox, the hard handler is unregistered.

[1] https://docs.oracle.com/cd/E19120-01/open.solaris/816-5137/sync-30/index.html

@talex5
Copy link
Collaborator

talex5 commented Dec 15, 2022

This happens in two places in your example, right after the handler is installed and between each loop interaction (I marked as three just for visualization).

It doesn't matter if the signal is lost in any of the places you marked. If we're at any of those points then we haven't yet started reading the configuration, so we will read the configuration after the signal was sent, as required.

let () =
  Eio_main.run @@ fun _ ->
  let interrupted = Eio.Condition.create () in
  Sys.(set_signal sigint) (Sys.Signal_handle (fun _ -> Eio.Condition.broadcast interrupted));
-----> signal broadcasted here is lost since there is no one waiting
-----> doesn't matter; we're about to enter the loop and read the configuration anyway
  while true do
-----> signal broadcasted here is lost since there is no one waiting
-----> doesn't matter; we're about to start reading anyway
    Fiber.both
      (fun () -> Eio.Condition.await_no_mutex interrupted)
      (fun () ->
         traceln "Reading configuration...";
         Eio_unix.sleep 2.0;
         traceln "Finished reading configuration";
      )
-----> signal broadcasted here is lost since there is no one waiting
-----> doesn't matter; we're about to loop and read the config again anyway
  done

Any domain can register for signals, but processing them requires action from whichever domain registered first.

This is incorrect. Each Fiber can register for signals with a sigbox, this Fiber can be in whatever Domain. The first sigbox that is created installs the hard handler, which doesn't require any action from the the user code.

I'm looking at this code in this PR:

 let subscribe signum box =
    match Domain.DLS.get signal_pipe_key with
    | None -> ()
    | Some pipe ->
      with_signal (fun () ->
          let ol = Atomic.get sigmap.(signum) in
          let first = ol = [] in
          Atomic.set sigmap.(signum) (box :: ol);
          if first then
            Sys.set_signal signum
              (Sys.Signal_handle
                 (fun _sysnum ->
                    let b = Bytes.create 1 in
                    Bytes.set_uint8  b 0 signum;
                    let n = Unix.single_write (snd pipe) b 0 1 in
                    assert (n = 1))))

Let's say:

  1. Domain 1 registers for signal 1.
  2. This gets the pipe to domain 1's event loop and installs a handler to write to it.
  3. Domain 2 registers for signal 1. This isn't first, so it just adds it to the list.
  4. Domain 1 is now busy doing some CPU stuff. Signal 1 occurs. The handler runs and sends a wake-up to domain 1's event loop. This won't be processed until domain 1 is ready, even though domain 2 is ready now.

@haesbaert
Copy link
Contributor Author

haesbaert commented Dec 15, 2022

This happens in two places in your example, right after the handler is installed and between each loop interaction (I marked as three just for visualization).

It doesn't matter if the signal is lost in any of the places you marked. If we're at any of those points then we haven't yet started reading the configuration, so we will read the configuration after the signal was sent, as required.

I have to disagree here, it does matter, the configuration file changed again and the signal happens while you were in the blank spots, you miss the event, that's the whole point of signals being leveled and not edge triggered. Your code makes them all edge triggered.

let () =
  Eio_main.run @@ fun _ ->
  let interrupted = Eio.Condition.create () in
  Sys.(set_signal sigint) (Sys.Signal_handle (fun _ -> Eio.Condition.broadcast interrupted));
-----> signal broadcasted here is lost since there is no one waiting
-----> doesn't matter; we're about to enter the loop and read the configuration anyway
  while true do
-----> signal broadcasted here is lost since there is no one waiting
-----> doesn't matter; we're about to start reading anyway
    Fiber.both
      (fun () -> Eio.Condition.await_no_mutex interrupted)
      (fun () ->
         traceln "Reading configuration...";
         Eio_unix.sleep 2.0;
         traceln "Finished reading configuration";
      )
-----> signal broadcasted here is lost since there is no one waiting
-----> doesn't matter; we're about to loop and read the config again anyway
  done

Any domain can register for signals, but processing them requires action from whichever domain registered first.

This is incorrect. Each Fiber can register for signals with a sigbox, this Fiber can be in whatever Domain. The first sigbox that is created installs the hard handler, which doesn't require any action from the the user code.

I'm looking at this code in this PR:

 let subscribe signum box =
    match Domain.DLS.get signal_pipe_key with
    | None -> ()
    | Some pipe ->
      with_signal (fun () ->
          let ol = Atomic.get sigmap.(signum) in
          let first = ol = [] in
          Atomic.set sigmap.(signum) (box :: ol);
          if first then
            Sys.set_signal signum
              (Sys.Signal_handle
                 (fun _sysnum ->
                    let b = Bytes.create 1 in
                    Bytes.set_uint8  b 0 signum;
                    let n = Unix.single_write (snd pipe) b 0 1 in
                    assert (n = 1))))

Let's say:

  1. Domain 1 registers for signal 1.
  2. This gets the pipe to domain 1's event loop and installs a handler to write to it.
  3. Domain 2 registers for signal 1. This isn't first, so it just adds it to the list.
  4. Domain 1 is now busy doing some CPU stuff. Signal 1 occurs. The handler runs and sends a wake-up to domain 1's event loop. This won't be processed until domain 1 is ready, even though domain 2 is ready now.

Yes, if you mean that signal can be delayed until the receiving Domain blocks on a Fiber, yes it is delayed, if this is a problem, then the whole assumption of a concurrent scheduler is broken and we should delete it all and make it all preemptive through SIGALARM tricks :).

@talex5
Copy link
Collaborator

talex5 commented Dec 15, 2022

I have to disagree here, it does matter, the configuration file changed again and the signal happens while you were in the blank spots, you miss the event, that's the whole point of signals being leveled and not edge triggered.

Perhaps you could give an example sequence of actions where it doesn't work? If the user changes the config file and then we reach one of the spots you indicated, we will certainly read the updated configuration (in fact, we'll do that even if the user forgets to send a signal at all, in those specific cases).

The only difficult case is where a signal occurs after we start reading the configuration and before we start watching for the signal. But that can't happen because we have already started watching for the signal before starting to read the config.

Domain 1 is now busy doing some CPU stuff. Signal 1 occurs. The handler runs and sends a wake-up to domain 1's event loop. This won't be processed until domain 1 is ready, even though domain 2 is ready now.

Yes, if you mean that signal can be delayed until the receiving Domain blocks on a Fiber, yes it is delayed, if this is a problem, then the whole assumption of a concurrent scheduler is broken and we should delete it all and make it all preemptive through SIGALARM tricks :).

I would expect this to work (and it will, with Eio.Condition and lock-free waiters). In the example I gave:

  1. Domain 2 registered for a signal.
  2. The signal happened.
  3. Domain 2 is ready to process it, but it doesn't get delivered because an unrelated domain happens to be busy.

@haesbaert
Copy link
Contributor Author

haesbaert commented Dec 15, 2022

I have to disagree here, it does matter, the configuration file changed again and the signal happens while you were in the blank spots, you miss the event, that's the whole point of signals being leveled and not edge triggered.

Perhaps you could give an example sequence of actions where it doesn't work? If the user changes the config file and then we reach one of the spots you indicated, we will certainly read the updated configuration (in fact, we'll do that even if the user forgets to send a signal at all, in those specific cases).

I can't, but you end up processing the file twice for one signal, no ?

1 - user changed the configuration
2 - user sent a SIGHUP
3 - you caught the SIGHUP, so you will read the new file
(here you're not watching for the new signal, Fiber1 is done and you're in Fiber2)
4 - you start reading the new file
5 - a signal is broadcasted and lost
6 - Fiber2 finishes updating configuration
7 - your loop restart
8 - Fiber1 starts watching for signals, there is none, so it just yields
(you're now watching for further signals)
9 - Fiber2 reads the configuration again <---

The only difficult case is where a signal occurs after we start reading the configuration and before we start watching for the signal. But that can't happen because we have already started watching for the signal before starting to read the config.

Domain 1 is now busy doing some CPU stuff. Signal 1 occurs. The handler runs and sends a wake-up to domain 1's event loop. This won't be processed until domain 1 is ready, even though domain 2 is ready now.

Yes, if you mean that signal can be delayed until the receiving Domain blocks on a Fiber, yes it is delayed, if this is a problem, then the whole assumption of a concurrent scheduler is broken and we should delete it all and make it all preemptive through SIGALARM tricks :).

I would expect this to work (and it will, with Eio.Condition and lock-free waiters). In the example I gave:

  1. Domain 2 registered for a signal.
  2. The signal happened.
  3. Domain 2 is ready to process it, but it doesn't get delivered because an unrelated domain happens to be busy.

@haesbaert
Copy link
Contributor Author

What I might be missing, is that in your example, after start, one Fiber has always finished, but the one waiting the signal did not, so Fiber.both is always half complete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants