Fix pipe hang (issue #319) #327

Merged on Oct 21, 2022 (1 commit)

haesbaert
Contributor

@haesbaert haesbaert commented Sep 26, 2022

Turns out the kernel needs to handle blocking file descriptors differently: it needs to do thread-pooling, which is a different (and slower) code path.

It also seems that this path is buggy in some cases, which makes sense, as I believe most people don't test with blocking FDs. I believe we should make all our FDs non-blocking as a precaution.

There is a program listed in issue #319 that replicates this. I could hang it in < 5 seconds, and it has now been running for 10 minutes with the fix, so kaboom.
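
For readers unfamiliar with the precaution suggested above, here is a minimal sketch of what "make the FDs non-blocking" means at the syscall level. It uses Python's `os` module on a plain pipe rather than Eio or io_uring, and the helper name `make_nonblocking_pipe` is made up for illustration; it is not the change made in this PR.

```python
import os

def make_nonblocking_pipe():
    """Create a pipe and set O_NONBLOCK on both ends (hypothetical helper)."""
    r, w = os.pipe()
    os.set_blocking(r, False)  # equivalent to fcntl(F_SETFL, flags | O_NONBLOCK)
    os.set_blocking(w, False)
    return r, w

r, w = make_nonblocking_pipe()
print(os.get_blocking(r), os.get_blocking(w))  # prints: False False
os.close(r)
os.close(w)
```

With both ends non-blocking, a read on an empty pipe fails with EAGAIN instead of parking the caller.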

@haesbaert
Contributor Author

haesbaert commented Sep 26, 2022

There is a second hang in the direct_copy test. I'm not sure it is related now, but I could reproduce it before as well; this fixes the first one at least.

It's also likely I'm just hiding the original bug by changing some timing.
Hold this for a while until I figure it out.

@talex5
Collaborator

talex5 commented Sep 26, 2022

Is there an upstream bug report for this somewhere (e.g. on https://github.com/axboe/liburing)?

@haesbaert
Contributor Author

haesbaert commented Sep 26, 2022

Is there an upstream bug report for this somewhere (e.g. on https://github.com/axboe/liburing)?

I have to check deeper. I found a bunch of reports related to bugs with pipes, and the commit adding splice support for nonblocking pipes is from June this year, so it's all very recent (https://lore.kernel.org/all/[email protected]/).

Basically, setting the pipes to nonblocking fixes the first case, but in the second case the splice call returns EAGAIN, and restarting/handling it is not enough: it ends up hanging at some point.

If I make both pipes nonblocking and disable splicing, everything works.

I need to isolate things better before opening something upstream, but it all seems a bit dicey and immature. I'm writing C equivalents to make sure we're not missing something.
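
To illustrate the "both pipes nonblocking, splicing disabled" configuration described above, here is a rough sketch of a copy loop that uses only read and write and waits for readiness on EAGAIN. It runs on plain syscalls through Python's `os` and `select` modules, not through io_uring, and `copy_without_splice` is a hypothetical name, not a function from Eio.

```python
import os
import select

def copy_without_splice(src, dst, chunk_size=4096):
    """Copy from src to dst until EOF using read/write only.

    Both FDs may be non-blocking; on EAGAIN we wait for readiness and retry.
    Returns the number of bytes copied.
    """
    total = 0
    while True:
        try:
            data = os.read(src, chunk_size)
        except BlockingIOError:
            select.select([src], [], [])  # wait until src is readable
            continue
        if not data:  # EOF: the write end of src was closed
            return total
        view = memoryview(data)
        while view:
            try:
                n = os.write(dst, view)
            except BlockingIOError:
                select.select([], [dst], [])  # wait until dst is writable
                continue
            view = view[n:]  # handle short writes
            total += n

a, b = os.pipe()
c, d = os.pipe()
for fd in (a, b, c, d):
    os.set_blocking(fd, False)
os.write(b, b"foobar")
os.close(b)
print(copy_without_splice(a, d), os.read(c, 16))  # prints: 6 b'foobar'
```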

@haesbaert
Contributor Author

As noted in #319 (comment), I can reproduce the behaviour in that C program.

@anmonteiro
Contributor

I believe we should make all our FDs non blocking as a precaution.

After reading through this PR and its comments, I did this for my application, which solved the hangs I was seeing under eio_linux for eio-ssl.

@bikallem
Contributor

Probably outside the scope of this PR, but shouldn't we make the sockets non-blocking by default too, if they are not already? I was under the impression that the uring and luv backends both set socket FDs to be non-blocking by default. Is this not the case?

@haesbaert
Contributor Author

So I've started poking uring with perf(1), testing a pipe in blocking and non-blocking mode. I can confirm that the blocking case uses the worker thread while the non-blocking case does not:

-N is nonblocking

sam:tmp: sudo perf stat -a -e io_uring:io_uring_poll_arm,io_uring:io_uring_queue_async_work -- ./uring_tests
pipe_bug blocking=1
read(A) cqe->res = 6 (buf=foobar)
read(B) cqe->res = 0

 Performance counter stats for 'system wide':

                 0      io_uring:io_uring_poll_arm
                 3      io_uring:io_uring_queue_async_work

       0.001237004 seconds time elapsed

sam:tmp: sudo perf stat -a -e io_uring:io_uring_poll_arm,io_uring:io_uring_queue_async_work -- ./uring_tests -N
pipe_bug blocking=0
read(A) cqe->res = 6 (buf=foobar)
read(B) cqe->res = 0

 Performance counter stats for 'system wide':

                 0      io_uring:io_uring_poll_arm
                 0      io_uring:io_uring_queue_async_work

       0.001059103 seconds time elapsed

@haesbaert
Contributor Author

haesbaert commented Oct 19, 2022

I've run some more tests today and this is where we are:

  • The fix has been committed to mainline Linux: torvalds/linux@46a525e
  • We can't expect users to be on kernel 6.1.
  • The fix we can do now is to use pipes as non-blocking and disable splice; I couldn't reproduce the bug with this.
  • The kernel fix also fixes the splice case, so it's the same bug, and in the future we can use splice again.
  • There are more things that need to be changed in uring, but I wouldn't address them now; for now I'm concerned with just providing something that won't hang. So I think we should set pipes to non-blocking and disable splice.
  • Even with the kernel fix, splice can return EAGAIN, and we have to handle it.
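
As a rough illustration of the last point, the sketch below retries a splice call when it fails with EAGAIN. It is an assumption-laden stand-in: it calls splice(2) directly through Python's `os.splice` (Linux only, Python 3.10+) rather than submitting it through io_uring, and `splice_retrying` is a made-up helper name. On the unpatched kernels discussed here, retrying under io_uring was reported as not sufficient, so this only shows the restart pattern itself.

```python
import os
import select

def splice_retrying(src, dst, count):
    """Move up to `count` bytes from src to dst with splice(2).

    On EAGAIN (src empty or dst full, with non-blocking FDs) wait for
    readiness and restart the call. Returns the number of bytes moved.
    """
    while True:
        try:
            return os.splice(src, dst, count)
        except BlockingIOError:
            select.select([src], [], [])  # wait for data in src
            select.select([], [dst], [])  # wait for room in dst

if hasattr(os, "splice"):  # os.splice needs Linux and Python 3.10 or later
    a, b = os.pipe()
    c, d = os.pipe()
    for fd in (a, b, c, d):
        os.set_blocking(fd, False)
    os.write(b, b"foobar")
    moved = splice_retrying(a, d, 6)
    print(moved, os.read(c, 16))
```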

Collaborator

@talex5 talex5 left a comment

I rebased it on main to fix the CI error.

@talex5
Collaborator

talex5 commented Oct 20, 2022

Is this ready to merge, or does it need splice to be disabled first?

@haesbaert
Contributor Author

Is this ready to merge, or does it need splice to be disabled first?

Apologies, I should have been clearer: it's not ready, I was mostly checking whether this approach would be OK. I also wanted to hunt down the kernel versions where the patch has been applied and maybe disable splice conditionally, but I didn't get to it.

@talex5 talex5 marked this pull request as draft October 20, 2022 19:21
This issue turned out to be a kernel bug which has been fixed in:
torvalds/linux@46a525e

Reported here:
axboe/liburing#665 (comment)

We can work around it by making sure pipes are non-blocking. uring considers
pipes unbounded work and relies on uring worker threads if they are blocking;
the bug is only triggered in that case, so force them to be non-blocking.

If we use splice, the splice call itself needs the worker threads and the bug
surfaces again (I verified this with perf probes), so disable splice for now.
Even with the fix, it's desirable to keep pipes as non-blocking to avoid thread
pooling.

The splice call can return EAGAIN in uring. This happens even with the kernel
patched, so handle it for the future.

We can tune this better by disabling splice only for the unpatched kernels.
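
One possible shape for that tuning, sketched under assumptions: gate splice on the running kernel's release string, treating 6.1 (the release the discussion expects to carry the fix) as the threshold. The names `FIXED_IN`, `parse_kernel_release` and `splice_allowed` are hypothetical, and a version comparison is only a heuristic, since distribution kernels may backport the patch to older versions.

```python
import os
import re

# Assumption taken from the discussion: the fix is expected in Linux 6.1.
FIXED_IN = (6, 1)

def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.15.0-52-generic'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))

def splice_allowed(release=None):
    """Return True if the kernel looks new enough to re-enable splice."""
    if release is None:
        release = os.uname().release  # release string of the running kernel
    return parse_kernel_release(release) >= FIXED_IN

print(splice_allowed())  # result depends on the running kernel
```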
@haesbaert haesbaert marked this pull request as ready for review October 20, 2022 20:19
@haesbaert
Contributor Author

This should be good to go, I updated the commit message as well.

@haesbaert haesbaert merged commit 63f5896 into ocaml-multicore:main Oct 21, 2022
talex5 added a commit to talex5/opam-repository that referenced this pull request Dec 7, 2022
CHANGES:

API changes:

- Unify IO errors as `Eio.Io` (@talex5 ocaml-multicore/eio#378).
  This makes it easy to catch and log all IO errors if desired.
  The exception payload gives the type and can be used for matching specific errors.
  It also allows attaching extra information to exceptions, and various functions were updated to do this.

- Add `Time.Mono` for monotonic clocks (@bikallem @talex5 ocaml-multicore/eio#338).
  Using the system clock for timeouts, etc can fail if the system time is changed during the wait.

- Allow datagram sockets to be created without a source address (@bikallem @haesbaert ocaml-multicore/eio#360).
  The kernel will allocate an address in this case.
  You can also now control the `reuse_addr` and `reuse_port` options.

- Add `File.stat` and improve `Path.load` (@haesbaert @talex5 ocaml-multicore/eio#339).
  `Path.load` now uses the file size as the initial buffer size.

- Add `Eio_unix.pipe` (@patricoferris ocaml-multicore/eio#350).
  This replaces `Eio_linux.pipe`.

- Avoid short reads from `getrandom(2)` (@haesbaert ocaml-multicore/eio#344).
  Guards against buggy user code that might not handle this correctly.

- Rename `Flow.read` to `Flow.single_read` (@talex5 ocaml-multicore/eio#353).
  This is a low-level function and it is easy to use it incorrectly by ignoring the possibility of short reads.

Bug fixes:

- Eio_luv: Fix non-tail-recursive continue (@talex5 ocaml-multicore/eio#378).
  Affects the `Socket_of_fd` and `Socketpair` effects.

- Eio_linux: UDP sockets were not created close-on-exec (@talex5 ocaml-multicore/eio#360).

- Eio_linux: work around io_uring non-blocking bug (@haesbaert ocaml-multicore/eio#327 ocaml-multicore/eio#355).
  The proper fix should be in Linux 6.1.

- `Eio_mock.Backend`: preserve backtraces from `main` (@talex5 ocaml-multicore/eio#349).

- Don't lose backtrace in `Switch.run_internal` (@talex5 ocaml-multicore/eio#369).

Documentation:

- Use a proper HTTP response in the README example (@talex5 ocaml-multicore/eio#377).

- Document that read_dir excludes "." and ".." (@talex5 ocaml-multicore/eio#379).

- Warn about both operations succeeding in `Fiber.first` (@talex5 ocaml-multicore/eio#358, reported by @iitalics).

- Update README for OCaml 5.0.0~beta2 (@talex5 ocaml-multicore/eio#375).

Backend-specific changes:

- Eio_luv: add low-level process support (@patricoferris ocaml-multicore/eio#359).
  A future release will add Eio_linux support and a cross-platform API for this.

- Expose `Eio_luv.Low_level.Stream.write` (@patricoferris ocaml-multicore/eio#359).

- Expose `Eio_luv.Low_level.get_loop` (@talex5 ocaml-multicore/eio#371).
  This is needed if you want to create resources directly and then use them with Eio_luv.

- `Eio_linux.Low_level.openfile` is gone (@talex5 ocaml-multicore/eio#378).
  It was just left-over test code.