eio_posix: yield on IO #477

Merged: 1 commit, Mar 29, 2023

talex5 (Collaborator) commented Mar 28, 2023

Before, a read would only yield if it needed to block. If data was always available, this could prevent other fibers from running at all, and it also meant the read never checked for cancellation.
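
To illustrate the starvation problem, here is a minimal sketch that uses Eio's public `Fiber` API rather than the backend's internals. The "read" is simulated by a loop that never blocks, and `reader` and `run_demo` are hypothetical names introduced only for this example. It assumes OCaml 5 with the `eio_main` package installed.

```ocaml
(* Sketch only: simulates a reader whose data is always available.
   [reader] and [run_demo] are hypothetical names, not part of Eio.
   Build with the [eio_main] package (OCaml 5). *)
open Eio.Std

(* Pretend to read [n] chunks that never block. If [yield_on_io] is set,
   give other fibers a turn before each "read". [Fiber.yield] also checks
   for cancellation before resuming. *)
let reader ~yield_on_io ~log name n =
  for i = 1 to n do
    if yield_on_io then Fiber.yield ();
    log := Printf.sprintf "%s%d" name i :: !log
  done

(* Run two readers concurrently and return the order of their reads. *)
let run_demo ~yield_on_io =
  Eio_main.run @@ fun _env ->
  let log = ref [] in
  Fiber.both
    (fun () -> reader ~yield_on_io ~log "a" 3)
    (fun () -> reader ~yield_on_io ~log "b" 3);
  List.rev !log

let () =
  (* Without yielding, "a" finishes all its reads before "b" starts. *)
  print_endline (String.concat " " (run_demo ~yield_on_io:false));
  (* With yielding, "b" gets to run before "a" has finished. *)
  print_endline (String.concat " " (run_demo ~yield_on_io:true))
```

In the non-yielding run the second fiber cannot start until the first has finished, which is the behaviour this change removes for reads in the backend.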

Also, for operations on paths, run the operation in a system thread.
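
As a rough illustration of the same idea from user code, the sketch below wraps a blocking path operation with `Eio_unix.run_in_systhread`. This is an assumption about how one might do it with the public API, not the backend's own implementation, and `file_size` is a hypothetical helper. It assumes the `eio_main`, `eio.unix` and `unix` libraries.

```ocaml
(* Sketch only: [file_size] is a hypothetical helper, not Eio's code.
   Build with the [eio_main], [eio.unix] and [unix] libraries. *)

(* [Unix.stat] may block while the kernel resolves the path, so run it
   in a system thread. The calling fiber is suspended and other fibers
   in the domain keep running until the result is ready. *)
let file_size path =
  Eio_unix.run_in_systhread (fun () -> (Unix.stat path).Unix.st_size)

let () =
  Eio_main.run @@ fun _env ->
  Eio.traceln "size of %s: %d bytes" Sys.executable_name
    (file_size Sys.executable_name)
```

The trade-off is the cost of handing work to another thread, which fits the remark that this could be optimised later.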

This could be optimised a bit, but this makes it correct at least. I tried it on the wrk benchmark and this change made it faster for some reason!

Part of #456.

talex5 added the enhancement (New feature or request) label Mar 28, 2023
talex5 mentioned this pull request Mar 28, 2023
anuragsoni commented

> I tried it on the wrk benchmark and this change made it faster for some reason!

Just guessing, but if the wrk benchmark was run with a large number of clients maybe the better numbers were because yielding the fiber allowed the runtime to switch to accepting more sockets?

talex5 force-pushed the posix-yield branch 2 times, most recently from aad97b1 to 609b13c, March 29, 2023 16:19
talex5 merged commit 7c625fa into ocaml-multicore:main Mar 29, 2023
talex5 deleted the posix-yield branch March 29, 2023 17:00
talex5 (Collaborator, Author) commented Mar 29, 2023

> Just guessing, but if the wrk benchmark was run with a large number of clients maybe the better numbers were because yielding the fiber allowed the runtime to switch to accepting more sockets?

Possibly, but in the past I've noticed you can cheat at the benchmark by servicing fewer sockets, as wrk only counts the total responses, without caring if they're distributed fairly! It might also (just guessing here) be that it makes the server switch to another fiber after doing a write, rather than immediately queuing up the next read (which would certainly block).

anuragsoni commented

> I've noticed you can cheat at the benchmark by servicing fewer sockets, as wrk only counts the total responses, without caring if they're distributed fairly!

Ah! I did not know this!

> it makes the server switch to another fiber after doing a write, rather than immediately queuing up the next read (which would certainly block).

This is quite possible! I've definitely struggled at times to come up with a generic plan for interleaving yields and IO when writing shuttle. I was able to punt on the issue a bit because Async is not eager when binding on a filled promise (I think this in turn acts as a yield, since the event loop will switch control to another task after performing IO).

I've yet to spend time with Eio, but I'm guessing part of the problem here might be the lack of a bind/await, etc., that would allow automatically switching control to another fiber?

anuragsoni commented

That said, I don't want to derail the discussion! It's been nice to follow the developments here, and this diff should indeed resolve the problem of accidentally blocking the event loop because a flow always has more data to read.

talex5 (Collaborator, Author) commented Mar 30, 2023

> I've yet to spend time with Eio, but I'm guessing part of the problem here might be the lack of a bind/await, etc., that would allow automatically switching control to another fiber?

Yes. Lwt doesn't yield on bind unless it has to, while I think Async does, so there are two options when you have bind.

I think always switching on IO makes sense (and the eio_linux backend has to do that anyway due to how uring works).

However, in Eio, `Promise.await` (for example) does not yield if the promise is already resolved. I think I prefer to keep it that way because otherwise using resolved promises can be fairly expensive, as you switch to another fiber (bad for the cache), and you can always put a yield in manually if you need one.
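
A minimal sketch of that trade-off, assuming OCaml 5 with the `eio_main` package: two fibers repeatedly await a promise that is already resolved, with and without a manual `Fiber.yield`. The names `consumer` and `run_demo` are hypothetical and exist only for this example.

```ocaml
(* Sketch only: [consumer] and [run_demo] are hypothetical names.
   Build with the [eio_main] package (OCaml 5). *)
open Eio.Std

(* Await an already-resolved promise [n] times, logging each step. *)
let consumer ~manual_yield ~log (p : unit Promise.t) name n =
  for i = 1 to n do
    Promise.await p;   (* [p] is resolved, so this returns without switching *)
    log := Printf.sprintf "%s%d" name i :: !log;
    (* Opt in to fairness only where it is wanted. *)
    if manual_yield then Fiber.yield ()
  done

(* Run two consumers concurrently and return the order of their steps. *)
let run_demo ~manual_yield =
  Eio_main.run @@ fun _env ->
  let log = ref [] in
  let p = Promise.create_resolved () in
  Fiber.both
    (fun () -> consumer ~manual_yield ~log p "a" 3)
    (fun () -> consumer ~manual_yield ~log p "b" 3);
  List.rev !log

let () =
  print_endline (String.concat " " (run_demo ~manual_yield:false));
  print_endline (String.concat " " (run_demo ~manual_yield:true))
```

Without the manual yield the first fiber runs to completion before the second starts, which keeps resolved promises cheap at the cost of fairness.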

talex5 added a commit to talex5/opam-repository that referenced this pull request Apr 11, 2023
CHANGES:

New features:

- Add eio_posix backend (@talex5 @haesbaert ocaml-multicore/eio#448 ocaml-multicore/eio#477, reviewed by @avsm @patricoferris @polytypic).
  This replaces eio_luv on all platforms except Windows (which will later switch to its own backend). It is a lot faster, provides access to more modern features (such as `openat`), and can safely share OS resources between domains.

- Add subprocess support (@patricoferris @talex5 ocaml-multicore/eio#461 ocaml-multicore/eio#464 ocaml-multicore/eio#472, reviewed by @haesbaert @avsm).
  This is the low-level API support for eio_linux and eio_posix. A high-level cross-platform API will be added in the next release.

- Add `Fiber.fork_seq` (@talex5 ocaml-multicore/eio#460, reviewed by @avsm).
  This is a light-weight alternative to using a single-producer, single-consumer, 0-capacity stream, similar to a Python generator function.
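
As an illustration of the `Fiber.fork_seq` entry above, the following sketch builds a generator-style sequence. The names `squares` and `collect` are hypothetical, and the example assumes OCaml 5 with the `eio_main` package.

```ocaml
(* Sketch only: [squares] and [collect] are hypothetical names.
   Build with the [eio_main] package (OCaml 5). *)
open Eio.Std

(* A producer fiber that hands the squares of 1..n to its consumer,
   one item per request, much like a Python generator. *)
let squares ~sw n =
  Fiber.fork_seq ~sw (fun yield ->
      for i = 1 to n do
        yield (i * i)
      done)

(* Consume the whole sequence into a list. The sequence ends when the
   producer function returns. *)
let collect n =
  Eio_main.run @@ fun _env ->
  Switch.run @@ fun sw ->
  List.of_seq (squares ~sw n)

let () =
  collect 4 |> List.map string_of_int |> String.concat ", " |> print_endline
```

The result is an ordinary `Seq.t`, so it can be passed to standard `Seq` functions, but it should be consumed only once and in order.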

Bug fixes:

- eio_linux: make it safe to share FDs across domains (@talex5 ocaml-multicore/eio#440, reviewed by @haesbaert).
  It was previously not safe to share file descriptors between domains because if one domain used an FD just as another was closing it, and the FD got reused, then the original operation could act on the wrong file.

- eio_linux: release uring if Linux is too old (@talex5 ocaml-multicore/eio#476).
  Avoids a small resource leak.

- eio_linux: improve error handling creating pipes and sockets (@talex5 ocaml-multicore/eio#474, spotted by @avsm).
  If we get an error (e.g. too many FDs) then report it to the calling fiber, instead of exiting the event loop.

- eio_linux: wait for uring to finish before exiting (@talex5 ocaml-multicore/eio#470, reviewed by @avsm).
  If the main fiber raised an exception then it was possible to exit while a cancellation operation was still in progress.

- eio_main: make `EIO_BACKEND` handling more uniform (@talex5 ocaml-multicore/eio#447).
  Previously this environment variable was only used on Linux. Now all platforms check it.

- Tell dune about `EIO_BACKEND` (@talex5 ocaml-multicore/eio#442).
  If this changes, dune needs to re-run the tests.

- eio_linux: add some missing close-on-execs (@talex5 ocaml-multicore/eio#441).

- eio_linux: `read_exactly` fails to update file offset (@talex5 ocaml-multicore/eio#438).

- Work around dune `enabled_if` bug on non-Linux systems (@polytypic ocaml-multicore/eio#475, reviewed by @talex5).

- Use raw system call of `getrandom` for glibc versions before 2.25 (@zenfey ocaml-multicore/eio#482).

Documentation:

- Add `HACKING.md` with hints for working on Eio (@talex5 ocaml-multicore/eio#443, reviewed by @avsm @polytypic).

- Improve worker pool example (@talex5 ocaml-multicore/eio#454).

- Add more Conditions documentation (@talex5 ocaml-multicore/eio#436, reviewed by @haesbaert).
  This adds a discussion of conditions to the README and provides examples using them to handle signals.

- Condition: fix the example in the docstring (@avsm ocaml-multicore/eio#468).

Performance:

- Add a network benchmark using an HTTP-like protocol (@talex5 ocaml-multicore/eio#478, reviewed by @avsm @patricoferris).

- Add a benchmark for reading from `/dev/zero` (@talex5 ocaml-multicore/eio#439).

Other changes:

- Add CI for macOS (@talex5 ocaml-multicore/eio#452).

- Add tests for `pread`, `pwrite` and `readdir` (@talex5 ocaml-multicore/eio#451).

- eio_linux: split into multiple files (@talex5 ocaml-multicore/eio#465 ocaml-multicore/eio#466, reviewed by @avsm).

- Update Dockerfile (@talex5 ocaml-multicore/eio#471).

- Use dune.3.7.0 (@patricoferris ocaml-multicore/eio#457).

- Mint exclusive IDs across domains (@TheLortex ocaml-multicore/eio#480, reported by @haesbaert, reviewed by @talex5).
  The tracing currently only works with a single domain anyway, but this will change when OCaml 5.1 is released.
talex5 added a commit to talex5/opam-repository that referenced this pull request Jun 2, 2023