-
-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC 0007: Event loop refactor #7
Conversation
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/rfc-0007-event-loop-refactor/6812/1 |
text/0007-even_loop-refactor.md
Outdated
|
||
Should events from the Crystal runtime be part of the event loop as well? | ||
|
||
- Fiber: sleep |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, under the assumption that it is sleep with argument and not endless sleep. At least for io_uring the easiest implementation is to just emit the appropriate op (ie timeout
) and it will just work in a way that is extremely straightforward (also note the even simpler implementation of Fiber#yield
a few lines beneath it. That is the territory of the scheduler, but event loop and schedulers are not necessarily all that separate, in practice..).
text/0007-even_loop-refactor.md
Outdated
- **Generic API**: Independent of a specific event loop design | ||
- **Black Box**: No interference with internals of the event loop | ||
- **Pluggable**: It must be possible to compile multiple event loop implementations and choose at runtime. | ||
Only one type of implementation will typically be active (it would be feasible to have different event loop implementations in different execution contexts, if there's a use case for that). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Feasible, but problematic). Some default values may be wanted to be different when using file descriptors in different event loops, like for example if sockets are opened in nonblocking mode or not. This is also quite problematic for the pluggable-at-runtime switching. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To properly solve this, I suppose it would require the opening of file descriptors to be passed to the event loop as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deferring it to the event loop help a little but doesn't really solve it as it means fds cannot be passed around. So for example, it would not work to open the socket in one context, and then start to listen on it and spawn handlers in another. And if descriptors are not setup properly, the error symptoms would be weird. Not having that confusion and that footgun around is worth more than what it would fix.
My preference for linux in the really long term would be to keep the default to nonblocking until there is no longer any support for any linux version that doesn't have a sufficiently modern io_uring, and then switch the default once we have an implementation we are happy with. The latter can be implemented using poll ops until the switch (similar to the uring branch) - less neat and does more work than necessary, but it would work. But at some time it would be nice to switch as the polling adds a little overhead both to code and performance.
(FWIW, note that there is still cases where wait_readable
and wait_writable
is useful even with io_uring. perhaps their fate needs to be a separate discussion as they don't provide value/make sense on certain other platforms)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think it makes sense to move open fds between different event loop implementations. That's just going to be chaos. And I'm not sure it would even be very useful anyway.
However I could potentially see use cases for using one kind of event loop implementation for a part of an application, and another one in another part. Not saying that this definitely makes sense, but it might.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think it makes sense to move open fds between different event loop implementations.
So what you are saying is that each event loop need to open their own copy of std_* and keep track of them separately? And that it shouldn't be possible to safely pass file descriptors around without restrictions? That seems a lot more chaotic to me. The gains for adding restrictions like that on the user seems very tiny.
At least on linux/mac. No idea what would be the best choice on windows.
edit: I am not against allowing having loop-private fds, but I don't think it make sense as a default choice.
edit2: It could also make sense to have it unspecified. Allowing globality where it make sense (due to the fds actually being global like the linux default), or local (when they are not global, eg handles on windows or privately registered fds which is something uring can do).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a use case I was imagining a set of file descriptors being handled by a separate event loop from the rest of the application. This could be a set of sockets for some specific communication purpose. There doesn't need to be any standard IO on such an event loop. Or any other interference with file descriptors that are not part of that set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly we want to have multiple event loop implementations compiled together (i.e. have io_uring + epoll on Linux) but only one implementation will be activated after a runtime check for the whole program.
Couldn't the event loop implementation tell whether it wants O_NONBLOCK for example?
text/0007-even_loop-refactor.md
Outdated
|
||
### Optional event loop features | ||
|
||
Some activities are managed on the event loop on one platform but not on others. Example would be `Process#wait` which goes through IOCP on Windows but on Unix it’s part of signal handling. (Note: Perhaps we could try to get that on the event loop on Unix as well? **🤔** But there are other examples of system differences) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
linux has signalfd
which could help, but I am not aware if mac/bsd have any good solutions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BSD and Darwin have kqueue
that can receive signals (and many other things).
Recent Linux kernels have pidfd
that look even more lovely than signalfd
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good to know we have the option to put signals on the event loop. It's a different question whether we'd want that. It could be useful as it would help simplify the implementation, I imagine?
But this is just an example. The real question here is how to handle events that need to be on the event loop on one system and we cannot put them on on another one. Not sure if there's any currently relevant use case except signals, but it's something to ponder for future extensibility.
text/0007-even_loop-refactor.md
Outdated
|
||
### Bulk events without fibers | ||
|
||
For some applications it might be useful to interact with the event loop directly, being able to push operations in bulk without having to spawn (and wait) a fiber for each one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One could also think of the reverse. For example io_uring supports multishot accept
, ie it is not necessary to rearm the op after getting one without emitting more events, so a usage of it could either spawn new fibers or reuse a set of existing fibers with each trigger.
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/curious-about-the-eventloop-updates/6825/3 |
text/0007-event_loop-refactor.md
Outdated
# Opens a connection on *socket* to the target *address* and continues fiber | ||
# when the connection has been established. | ||
# Returns `IO::Error` but does not raise. | ||
abstract def connect(socket : ::Socket, address : ::Socket::Addrinfo | ::Socket::Address, timeout : ::Time::Span?) : IO::Error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The address
type could perhaps be abstracted. We need to support both types, though they both boil down to the same interface: #to_unsafe
returning a pointer to a LibC::Sockaddr
and #size
returning its size in bytes.
So it could be an option to define an interface for that and use that as type restriction. But we cannot actually make it strictly type-safe because the return type of #to_unsafe
is Void*
and we cannot enforce a pointer to LibC::Sockaddr
(so Array
would implement this interface for example).
Alternatively, we could remove the type restriction entirely because any type that fulfills the implicit interface should work.
I think it's better to be explicit, either with ::Socket::Addrinfo | ::Socket::Address
or an abstract module interface.
Introducing a new model would also solve the issue that the Socket
name space is not in core lib (see https://github.com/crystal-lang/rfcs/blob/rfc/0005/text/0007-event_loop-refactor.md#socket).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually hard to stub out ::Socket::Addrinfo
and ::Socket::Address
because that would require defining the Socket
namespace, which is a class that inherits IO
and that's so much complexity we could just include socket.cr
.
I've considered introducing a module which could be included in both types. But that's actually not a great solution for when we return an address.
So I think it's probably best to introduce an internal struct type which holds a reference to LibC::Sockaddr
and its size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think it's probably best to introduce an internal struct type which holds a reference to LibC::Sockaddr and its size.
Or to leave it as the union?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those types are not in core lib. Their existence depends on require "socket"
. So leaving the union won't work as is. Trying to see if we can stub it out (or replace with an internal type) is one possible way to deal with that.
But I think a better approach is described in #7 (comment)
Here's a good link on CancelToken instead of looking for Timeouts or even signals - use a CancelToken and have those as subclasses of it https://vorpus.org/blog/timeouts-and-cancellation-for-humans/ |
text/0007-event_loop-refactor.md
Outdated
#### Socket | ||
|
||
One instance of this problem shows already in the core features: The event loop interface has type restrictions of the `Socket` namespace in abstract defs, but `Socket` is not in the core lib. | ||
|
||
Options: | ||
|
||
- Omit those abstract defs (dilutes the interface, so not ideal) | ||
- Split `EventLoop` interface and add parts of it only with `require “socket”` | ||
- Add stub declarations for the involved types (`Socket::Handle` and `Socket::Address` - `Socket` itself is only used as parameter type which is technically okay for abstract methods) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered option 3 (stubbing the affected types in core lib) as maybe the most approachable solution, but I don't think it is.
We technically don't need to stub Socket
because it's only used as type restriction in the parameters which are not validated by the abstract method checker. However, I believe this should eventually be covered as well, which would break this solution in the future. And I think it's generally better to be strict about abstract method implementations, even if the compiler won't complain.
Either way, I don't think it makes much sense to stub that many types (2 or 3) for the event loop API. It would probably be easier to just use the actual types directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering that we may need to handle optional event loop features eventually (see previous section), it might be a good exercise to start with sockets.
Here's how this could work:
- Define a module
Crystal::System::EventLoop::Socket
- This module is empty by default.
module Crystal::System::EventLoop::Socket
# empty
end
- In
crystal/system/socket.cr
, add abstract defs toCrystal::System::EventLoop::Socket
(this could probably be in a different file).
module Crystal::System::EventLoop::Socket
abstract def close(socket : ::Socket) : Nil
# etc.
end
- All current event loop implementations implement the socket interface, so they can unconditionally include
Crystal::System::EventLoop::Socket
. There's no need to hide the implementation when sockets are not used.
We can even include it inCrystal::System::EventLoop
(and pull it out when we ever get an event loop implementation that doesn't implement sockets).
class Crystal::LibEvent::EventLoop
include Crystal::EventLoop::Socket
def close(socket : ::Socket) : Nil
# ...
end
# etc.
end
- In a program where sockets are not used, the abstract defs are not checked because the module is empty. The implementation methods are never called, so their type restrictions don't matter.
- In a program where sockets are used, the module has abstract defs and their implementation will be checked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I was an advocate for this style, but now that I see it I'm having a little itch. How would it look like with a split interface for the EventLoop? Say, EventLoop::SocketInterface
, EventLoop::FileInterface
, ... and then let each EL implementation include the module interfaces they comply to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's basically what this is.
However, there's the problem that at the event loop implementation we can't know whether sockets are used in the program (i.e. if Socket
is defined) or not. The event loop gets required as part of the prelude while require "socket"
will only be in user code.
So the trick here is to have the socket interface empty by default and only fill it with abstract defs if sockets are used.
That makes it possible for the event loop implementation to always include the socket interface. We get abstract def checks if sockets are used.
text/0007-event_loop-refactor.md
Outdated
# Opens a connection on *socket* to the target *address* and continues fiber | ||
# when the connection has been established. | ||
# Returns `IO::Error` but does not raise. | ||
abstract def connect(socket : ::Socket, address : ::Socket::Addrinfo | ::Socket::Address, timeout : ::Time::Span?) : IO::Error? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think it's probably best to introduce an internal struct type which holds a reference to LibC::Sockaddr and its size.
Or to leave it as the union?
text/0007-event_loop-refactor.md
Outdated
#### Socket | ||
|
||
One instance of this problem shows already in the core features: The event loop interface has type restrictions of the `Socket` namespace in abstract defs, but `Socket` is not in the core lib. | ||
|
||
Options: | ||
|
||
- Omit those abstract defs (dilutes the interface, so not ideal) | ||
- Split `EventLoop` interface and add parts of it only with `require “socket”` | ||
- Add stub declarations for the involved types (`Socket::Handle` and `Socket::Address` - `Socket` itself is only used as parameter type which is technically okay for abstract methods) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I was an advocate for this style, but now that I see it I'm having a little itch. How would it look like with a split interface for the EventLoop? Say, EventLoop::SocketInterface
, EventLoop::FileInterface
, ... and then let each EL implementation include the module interfaces they comply to.
Based on comment in the read_evented issue, I'm noting that support for As these cannot be implemented on windows, but is at the same time very convenient to have access to where it is possible, I wonder if there should be something written about them in the rfc. |
Damn, those are valid use cases. |
Obviously the event loop implementations for libevent, kqueue, epoll, io_uring etc. will always require something like But it probably fits best as a feature of the event loop implementation, not the file descriptor model. The former is platform-specific, while the latter is (supposed to be) platform-independent. |
@straight-shoota that's what I was thinking: an interface for the event loop to implement. Windows would raise NotImplementedError (until we can figure something, like running a WSAPoll in its own thread?) |
After implementing the Epoll (and Kqueue) event loops:
|
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/new-event-loop-unix-call-for-reviews-tests/7207/1 |
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/new-event-loop-unix-call-for-reviews-tests/7207/16 |
This pull request has been mentioned on Crystal Forum. There might be relevant details there: https://forum.crystal-lang.org/t/charting-the-route-to-multi-threading-support/7320/1 |
I have updated the RFC text to reflect the implemented status. So I think it's ready to be merged. There are a number of Unresolved questions but I don't think they need to be resolved in the scope of this RFC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The last bit can be refined now to explain or at least comment what we ended up doing
We can place them under "future possibilities". In particular, the |
This PR has been merged. The RFC text is available at: https://github.com/crystal-lang/rfcs/blob/main/text/0007-event_loop-refactor.md
This RFC is very much work in progress and has a lot of unresolved questions which I intend to discuss here.
Preview: https://github.com/crystal-lang/rfcs/blob/rfc/0005/text/0007-event_loop-refactor.md