[RFC] Fiber preemption, blocking calls and other concurrency issues #1454
Comments
Not at the moment. We first plan to support fiber concurrency, and after that we can figure out whether we really need this feature. It's quite a difficult feature to implement, and any ideas are welcome. Go is a language that inspires us in many areas, and goroutines are not fully preemptive: golang/go#11462
We can choose to use other system calls that do not block. Currently most IO is running through the event loop (and you did help a lot with this :-) ).
Yes, and if there is an alternative library call that can work asynchronously or with a callback, it should be used instead.
If the library makes use of thread-local data, it will be hard to use in an environment where you don't control where the fiber is executed (and resumed, if preempted). We might think of pinning fibers to threads or having manual threads available for these cases. Do you have a concrete example of a library that relies on thread-local data between calls? Not every library may be suitable for use with Crystal anyway, and I don't think we need to design the language around that.
Yes, don't use them. You can use channels to communicate between fibers instead. Channels will support calls from multiple threads when threads are implemented in the language. We might implement mutexes suitable for fibers in the future. |
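The "channels instead of mutexes" advice above can be illustrated with a minimal sketch. It is written in Go rather than Crystal, since Go's scheduler is the reference point of this thread; the function name `sumWithChannel` is hypothetical and only serves the illustration.

```go
package main

import "fmt"

// sumWithChannel (hypothetical name) adds the numbers 1..workers without a
// mutex: each worker goroutine sends its value over a channel, and only the
// calling goroutine ever touches the running total.
func sumWithChannel(workers int) int {
	results := make(chan int)
	for i := 1; i <= workers; i++ {
		go func(n int) {
			results <- n // hand the value over instead of locking shared state
		}(i)
	}
	total := 0
	for i := 0; i < workers; i++ {
		total += <-results // exactly one receive per worker
	}
	return total
}

func main() {
	fmt.Println(sumWithChannel(10)) // prints 55
}
```

Because the total has a single owner, there is no lock that could still be held across a context switch.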
You could attempt Erlang-style preemption. All operations have a cost and fibers have priorities.
Source: http://jlouisramblings.blogspot.com/2013/01/how-erlang-does-scheduling.html |
I fixed the waitpid issue in #1295. It just needs to be merged. Pthread mutexes are used in MANY libraries. I don't think you can avoid them completely. If a library holds a mutex and then invokes a callback (with the mutex still held) and the context is switched, the program may deadlock if another fiber uses the library. Library calls may take an unknown amount of time to return, with no possibility of preemption. There may not be callbacks in the API. More threads is the general solution. Google is working on an application-controlled context-switching thread solution for Linux. I can't find a link. It's similar to Windows user-mode threading and gives fiber-like context switch performance without the drawbacks of fibers. |
LibC.errno is an example of thread local data that can be destroyed after a fiber context switch. If there is preemption some method of tracking thread local data would need to be used. Auto converting to fiber local, automatically identifying thread local variable usage, or anti preemption directives are some possible solutions. |
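As a rough illustration of the thread-pinning idea raised earlier in the thread, here is a sketch in Go, whose runtime offers `runtime.LockOSThread` for this purpose. The helper name `runPinned` is hypothetical, and this only shows the pinning pattern; it is not a proposal for how Crystal should implement it.

```go
package main

import (
	"fmt"
	"runtime"
)

// runPinned (hypothetical helper) runs work on a goroutine that stays locked
// to a single OS thread until work returns. While locked, no other goroutine
// is scheduled on that thread, so thread-local state used by C libraries
// (errno is the example above) is not disturbed in between.
func runPinned(work func() int) int {
	done := make(chan int)
	go func() {
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		done <- work()
	}()
	return <-done
}

func main() {
	fmt.Println(runPinned(func() int { return 6 * 7 })) // prints 42
}
```

The trade-off is that a pinned task occupies a whole OS thread for its lifetime, which is why it suits a few library-bound tasks rather than every fiber.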
|
Preemption of Erlang processes is different because it occurs inside a VM. And even Erlang is affected by long native library calls, which are discouraged because they affect the scheduler. Libraries like PCRE have been adapted to cooperate with the scheduler (https://github.com/erlang/otp/tree/maint/erts/emulator/pcre), but that doesn't hold for any random library that you want to use in your program. I haven't seen your waitpid fix yet. I'll review it soon. Thanks! I think this is the link that you're looking for? https://www.youtube.com/watch?v=KXuZi9aeGTw I'm probably more inclined to Go style, where it doesn't fully preempt at any random place. Instead, it leaves the long call running while it reschedules the runnable coroutines in a new thread. That might make the design a lot simpler while avoiding preemption locking or thread pinning. Not a final choice here... but I'd consider that option when the time to work on the multithreaded scheduler finally arrives. |
A VM isn't necessary to account for runtime or operations performed. Count the number of primitive operations (or estimate them) and accumulate them in a fiber-local variable. Add preempt checks inside loops and at function boundaries. External C calls could be timed and fast ones annotated with a fixed cost. This could put Crystal ahead of Go for soft realtime computing. That's the video. The Go GC thread sets a preempt atomic every 100us. I'm not sure where that's checked currently, but it used to be checked on memory allocation or channel calls. I think it's checked in more places now. |
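The accounting scheme described above can be sketched as follows. This is a hand-written approximation in Go of what a compiler would insert automatically; the `budget` type, its fields, and the quantum of 1000 operations are assumptions made for illustration, not part of any existing runtime.

```go
package main

import (
	"fmt"
	"runtime"
)

// budget is a hypothetical per-task operation counter, standing in for the
// "fiber-local variable" described above.
type budget struct {
	remaining int // operations left before the next voluntary yield
	quantum   int // value remaining is reset to after each yield
	yields    int // how many times the task has yielded (for inspection)
}

// charge subtracts the cost of an operation and yields to the scheduler once
// the budget is used up. A compiler would emit this check at loop back-edges
// and function boundaries.
func (b *budget) charge(cost int) {
	b.remaining -= cost
	if b.remaining <= 0 {
		b.yields++
		b.remaining = b.quantum
		runtime.Gosched() // cooperative yield point
	}
}

// sumTo is a tight loop that would otherwise never reach a yield point.
func sumTo(n int, b *budget) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i
		b.charge(1) // fixed cost of one unit per iteration
	}
	return total
}

func main() {
	b := &budget{remaining: 1000, quantum: 1000}
	fmt.Println(sumTo(10000, b), b.yields) // prints 50005000 10
}
```

The extra decrement and branch on every iteration is the overhead questioned in the next comment; it bounds how long a single task can monopolize a thread, but does nothing for a call that blocks inside a library.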
But... I'm confused. Accounting is not enough to avoid blocking system or library calls. And all that accounting I think will add some overhead that I'm not sure we want to pay. We need to think about all these things more deeply. |
Different topics. Accounting avoids starvation in For blocking system/library calls there is little that can be done other than let them run on threads. |
Goroutines can be preempted at any non-inlined function call, and in the future tight loops will also be preemptible via a counter check inserted by the compiler in each loop. See golang/go#10958 I'm a newbie to Crystal: when can a fiber be automatically yielded? Only when IO blocks and on sleep? |
golang/go#10958 looks like a bad proposal without hints to the compiler, due to performance issues. Ideally the compiler needs to know the duration of every function call. Here are some old benchmarks from 2016: golang/go#10958 (comment) So it would be better to allow an explicit inline call like yield fiber and maybe something like |
I'd really like to have some fruitful discussion here to learn more from awesome people 😄 |
Leaving this here for posterity: I was just bitten by an IoT program stalling randomly for several seconds as soon as it lost internet and its normal MQTT connection. I did all the MQTT handling on a separate fiber, but the MQTT library uses TCPSocket, and the culprit apparently was that when internet goes away the reconnect code calls So please add LibC.getaddrinfo to your list of calls that need to be made non-blocking. As always, thanks for a great language! |
@atlantis nice point 👀 I wonder if this should be handled sooner rather than later, as this might be an easy fix that will save a lot of headache down the road |
@bararchy My temporary fix was to use 636f7374/durian.cr and the following monkey patch to fix HTTP::Client by default (which also fixes Crest), but this seems pretty fragile and it would be great if it worked out of the box someday!
|
This is an interesting article about how Golang implements fiber preemption: https://unskilled.blog/posts/preemption-in-go-an-introduction/ |