Commands and Reactors #13
Comments
Why isn't Reactor just a WASM library used on top of a Command?
You could create such a library, and it could be useful. But I think there's still value in having a reactor-type model explicitly modeled by WASI; in particular because, as mentioned, the main thread on the web inherently has this model, and its event loop is baked into the platform. That means that if WASI mandates a command-style model, it will become much more awkward and less useful on the web.
As someone who has written such a main loop abstraction, it is very hard to coerce all the platforms and implementations into the same behavior. I think there is negative value in trying to implement a main loop into WASI.
A relevant thought experiment would be the following use case: How do you handle a grep program? On the one hand, it definitely has the semantics of a command: it has a main function, reads input files sequentially and prints lines sequentially into its output file descriptor. It's very deterministic and doesn't "react" to anything happening around it. On the other hand, a wasm grep implementation could definitely benefit from async/await and an event loop. The programs tend to be I/O bound, and async/await allows them to process text while the next block is being loaded in memory, without relying on multithreading or additional logic. Using the guidelines @sunfishcode suggested above:
I'm not really sure what the right semantics are, but I think that grep should still be classified as a command, provided commands have a way to keep executing past blocking syscalls. They could still be more limited than reactors (e.g., be allowed only one call stack), but I'm not sure what benefits these limitations would actually bring.
Also, another question: what effect do you expect the Commands/Reactors distinction to have on the type system and library interoperability? In the rationale, you propose:
Which means that every syscall would have two signatures: Wouldn't that split the library ecosystem in two? With every I/O library developer either picking a side, or making a I think a better solution might be to have a common function signature for Reactor-mode and Command-mode read: This would, of course, require That way, developers could write libraries that can be used interchangeably in Commands and Reactors. Ideally, you'd also want a
Somehow I still can't grasp what would be the reason for such (weird) differentiation.
Sounds to me like a disguised attempt to get at least some finer grained capability support than the current coarse grained one - see #1 .
Again the same - it sounds to me like a disguised attempt to get at least some finer grained capability support than the current coarse grained one - see #1 . @kentonv what do you think?
Compared e.g. to the approach @npmccallum proposed above, there could be some minor gain, but only for one thread out of N (nowadays from hundreds up to many thousands) and only in case the Reactor runs in the very same thread as the main loop (which is not necessarily the case and is becoming less and less probable over time, as we have in our pockets HW capable of running tens and hundreds of threads in parallel and software trying to accommodate...). Also I'm pretty confident that optimization of switching JIT tiers is possible (and not much more difficult) also without knowing about the distinction between Reactors and Commands in advance at compile time. But maybe there are other reasons for such distinction and I'm missing them...
@dumblob Hmm, not sure what you mean -- I'm not seeing the connection between this and fine-grained capabilities. My opinion, FWIW (disclaimer: I don't know a whole lot about WASI and I'm probably missing important context): In the context of both capability systems and the Web platform, event loops and Promises have "won". async/await has provided a finishing blow by making async code almost as easy to write as sync code, while keeping all the advantages of async. It seems like WASI should focus on async as the main way it expects to be used. That said, supporting legacy C/C++/etc. code written in synchronous style seems like a desirable goal. I'd argue for creating only one set of I/O APIs that operates in an async way. All such calls should return a "promise descriptor". A special call, In order to support legacy synchronous code, you could then offer a special You might then consider offering a You might also consider supporting Promise Pipelining: Consider an I/O call which eventually produces a file descriptor as its result, such as |
I quote |
The Reactor concept here is largely motivated by thinking about whether WASI could be used within Web browsers on the main thread, and then, if we design a way to make that work, would it be usable in other contexts as well? The Web browser main thread environment imposes some constraints, such as the constraint that synchronous I/O is not available (ignoring sync xhr, proxying to Workers, etc.). This is what motivates the suggestion of limiting APIs available to Reactors. And it probably limits the options concerning blocking until an event arrives, or pipelining, or other things. Maybe having a Reactor concept doesn't mean that all event-based I/O needs to use it, and we still need an epoll-like way for applications to build their own custom event loops. Or perhaps we should have multiple variants of the Reactor concept, to address different use cases. |
@sunfishcode FWIW I had the browser main thread use case in mind when writing my comment. I don't think there's any conflict with pipelining... FWIW, as someone who has built several custom event loops (using poll, epoll, kqueue, Windows IOCP)... I think I would prefer a built-in event loop. All these interfaces are a PITA and all I really want to do is have the OS call some callback when the event happens (but only one callback at a time). |
@kentonv As much as I like Cap'n Proto (and I think it has potential as a cross-language layer above wasm, similar to WebIDL bindings), I don't think Promise pipelining is something the processor should worry about. E.g., an implementation of Cap'n Proto could just have its RPC methods return plain old structs, with both the promise descriptor and the RPC "token" as members. Then wasm would use the promise descriptor, while promise pipelining would use the token.
Designing APIs is hard and the sync/async debate has a long history. It's hard, there are lots of perspectives, so I don't enjoy making overly definitive statements...nevertheless for brevity, I'll state that synchronous I/O is a fantastically straightforward and successful programming model that has been employed in many, many contexts. Straight-line control flow is as simple as it gets. And blocking I/O follows very naturally from straight-line control flow. Of course, it works best when execution stacks are plentiful and cheap, and task switching is similarly cheap, like many threads in C or many goroutines. Callbacks and promises and async functions are all far more complicated than simple blocking I/O APIs. They haven't "won" so much as they are the only choice for the web platform, which historically has lacked plentiful execution stacks and cheap task switching (and is also tied to an inherently single-threaded primary programming language). The web's mostly accidental evolution into a programming platform where UI reactivity, I/O, and computation are multiplexed in userland onto a single underlying execution thread, complete with jank galore, should not be emulated here, IMHO.
@PoignardAzur Yeah honestly I was only half-seriously suggesting it. Promise pipelining could be a win when operating on a remote filesystem, but probably doesn't really benefit anything operating on local resources, which is probably the vast majority of what WASI does. So probably not worth the effort here, but fun to think about. @titzer Fair enough. But are you saying that the specific design of having async API calls plus a separate |
I don't think that having a separate |
@titzer Sorry, I probably created some confusion by calling it "await". I wasn't suggesting that this call would be used to implement language-level async/await; instead, it would be used to implement traditional blocking I/O at the language level. The alternative seems to be to have two parallel calls for every I/O operation, one blocking and one non-blocking -- or two modes for every call, as in POSIX -- which seems comparatively ugly. |
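The shape being discussed here (one set of non-blocking calls that return a promise descriptor, plus a single call that waits on a descriptor) can be sketched roughly as follows. This is only an illustration under stated assumptions: `Host`, `PromiseDescriptor`, `read_async`, `block_on`, and `read_blocking` are hypothetical names that do not come from WASI, and the "host" is an in-memory stand-in that resolves every operation immediately.

```rust
use std::collections::HashMap;

/// Hypothetical handle to a pending I/O operation (a "promise descriptor").
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PromiseDescriptor(u32);

/// Toy stand-in for the runtime. Not a real WASI interface.
pub struct Host {
    next_id: u32,
    pending: HashMap<PromiseDescriptor, Vec<u8>>,
}

impl Host {
    pub fn new() -> Self {
        Host { next_id: 0, pending: HashMap::new() }
    }

    /// The only read call: it never blocks and hands back a descriptor.
    /// `source` simulates the bytes the operation will eventually produce.
    pub fn read_async(&mut self, source: &[u8]) -> PromiseDescriptor {
        let pd = PromiseDescriptor(self.next_id);
        self.next_id += 1;
        self.pending.insert(pd, source.to_vec());
        pd
    }

    /// The single waiting primitive: consume the result of a descriptor.
    /// Returns `None` for an unknown or already consumed descriptor.
    pub fn block_on(&mut self, pd: PromiseDescriptor) -> Option<Vec<u8>> {
        self.pending.remove(&pd)
    }
}

/// A traditional blocking read, built in library code from the two primitives,
/// so the host does not need a second, blocking variant of every call.
pub fn read_blocking(host: &mut Host, source: &[u8]) -> Option<Vec<u8>> {
    let pd = host.read_async(source);
    host.block_on(pd)
}
```

In this sketch a synchronous program would only ever call `read_blocking`, while event-driven code would keep the descriptor and wait on it later; a host that cannot block (such as a browser main thread) would simply not offer `block_on`.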
@kentonv Yeah, this is what I was describing as well. Although I wonder how much a language can do with an await instruction but without an event loop. I'm guessing "not much", because in most situations, there isn't much your code can do locally while awaiting I/O data; the operations that can go on in the background are going to be on different stack frames entirely. |
I like this distinction. I think it's useful regardless of the web and its main thread concepts. This approach would solve #24 and I was already about to propose the same thing in #48. The fact that Emscripten tries to merge these two concepts with its "EXIT_RUNTIME" configuration is the source of much confusion. I don't see a compelling case for allowing the same app to be used as a Command (a single main function) and then as a library. These things seem fundamentally different. Am I missing something?
Just two more points to keep in mind: a) It seems to me that "reactors" are, apart from standalone use, basic building blocks for "commands". Can "reactors" be used inside of "commands" (I think this will inevitably be needed)? What about the case with different "limitations" (as mentioned in #13 (comment) ) contradicting such a use case? b) We shouldn't forget that the future lies within massively parallel computation, and thus languages like ParaSail, which execute "each line of code" in parallel (not just concurrently!), might collide with the concept of having one main event loop (which seems to be the sole reason for the differentiation between "commands" and "reactors"). It seems to me the current proposal assumes the future lies within concurrent, but not parallel, computing.
This adds support for a new experimental "Reactor" executable model. The "Commands" and "Reactors" concepts are introduced here: WebAssembly/WASI#13 A companion Clang patch, which just consists of using the new reactor-crt1.o and Reactor-specific entry point name, is here: https://reviews.llvm.org/D62922 Instead of an entrypoint named "_start", which calls "main", which then scopes the lifetime of the program, Reactors have a "__wasi_unstable_reactor_start" function, which calls "reactor_setup". When "reactor_setup" exits, the intention is that the program should persist and be available for calling. At present, the main anticipated use for this is in environments like Node, where WASI-using modules can be imported and don't necessarily want the semantics of a "main" function. The "unstable" in "__wasi_unstable_reactor_start" reflects that this Reactor concept is not yet stable, and likely to evolve.
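As a rough illustration of how a host might tell the two models apart from a module's export names, here is a minimal sketch. The entry point names `_start` and `__wasi_unstable_reactor_start` are taken from the commit message above; the `ExecModel` type and `classify` function are hypothetical, and a real host would read the export list from the parsed module rather than from a slice of strings.

```rust
/// Which execution model a module appears to follow.
#[derive(Debug, PartialEq, Eq)]
pub enum ExecModel {
    Command,
    Reactor,
    Unknown,
}

/// Classify a module by its exported function names (hypothetical helper).
/// A module exporting both entry points, or neither, is reported as Unknown.
pub fn classify(exports: &[&str]) -> ExecModel {
    let is_command = exports.contains(&"_start");
    let is_reactor = exports.contains(&"__wasi_unstable_reactor_start");
    match (is_command, is_reactor) {
        (true, false) => ExecModel::Command,
        (false, true) => ExecModel::Reactor,
        _ => ExecModel::Unknown,
    }
}
```

A Command would then be run once via its entry point and torn down when it returns, whereas a Reactor would be set up once and kept alive so its other exports can be called afterwards.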
As an update here, I and others have been continuing to discuss Commands and Reactors. I expect these concepts will end up being defined by interface types, as the main distinction between a Command and a Reactor is how you interface with a module from the outside. Thinking about @PoignardAzur's observation about |
…ichton Emit a reactor for cdylib target on wasi
Fixes rust-lang#79199, and relevant to rust-lang#73432
Implements wasi reactors, as described in WebAssembly/WASI#13 and [`design/application-abi.md`](https://github.com/WebAssembly/WASI/blob/master/design/application-abi.md)
Empty `lib.rs`, `lib.crate-type = ["cdylib"]`:
```shell
$ cargo +reactor build --release --target wasm32-wasi
   Compiling wasm-reactor v0.1.0 (/home/coolreader18/wasm-reactor)
    Finished release [optimized] target(s) in 0.08s
$ wasm-dis target/wasm32-wasi/release/wasm_reactor.wasm >reactor.wat
```
`reactor.wat`:
```wat
(module
 (type $none_=>_none (func))
 (type $i32_=>_none (func (param i32)))
 (type $i32_i32_=>_i32 (func (param i32 i32) (result i32)))
 (type $i32_=>_i32 (func (param i32) (result i32)))
 (type $i32_i32_i32_=>_i32 (func (param i32 i32 i32) (result i32)))
 (import "wasi_snapshot_preview1" "fd_prestat_get" (func $__wasi_fd_prestat_get (param i32 i32) (result i32)))
 (import "wasi_snapshot_preview1" "fd_prestat_dir_name" (func $__wasi_fd_prestat_dir_name (param i32 i32 i32) (result i32)))
 (import "wasi_snapshot_preview1" "proc_exit" (func $__wasi_proc_exit (param i32)))
 (import "wasi_snapshot_preview1" "environ_sizes_get" (func $__wasi_environ_sizes_get (param i32 i32) (result i32)))
 (import "wasi_snapshot_preview1" "environ_get" (func $__wasi_environ_get (param i32 i32) (result i32)))
 (memory $0 17)
 (table $0 1 1 funcref)
 (global $global$0 (mut i32) (i32.const 1048576))
 (global $global$1 i32 (i32.const 1049096))
 (global $global$2 i32 (i32.const 1049096))
 (export "memory" (memory $0))
 (export "_initialize" (func $_initialize))
 (export "__data_end" (global $global$1))
 (export "__heap_base" (global $global$2))
 (func $__wasm_call_ctors
  (call $__wasilibc_initialize_environ_eagerly)
  (call $__wasilibc_populate_preopens)
 )
 (func $_initialize
  (call $__wasm_call_ctors)
 )
 (func $malloc (param $0 i32) (result i32)
  (call $dlmalloc
   (local.get $0)
  )
 )
 ;; lots of dlmalloc, memset/memcpy, & libpreopen code
)
```
I went with repurposing cdylib because I figured that it doesn't make much sense to have a wasi shared library that can't be initialized, and even if someone was using it adding an `_initialize` export is a very small change.
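For contrast with the empty `lib.rs` above, a non-empty reactor library might look roughly like the sketch below. It assumes a crate with `lib.crate-type = ["cdylib"]` as in the commit message; the exported function `record_event` and the `CALLS` counter are hypothetical names chosen for illustration. The point is that there is no `main`: the instance stays alive after initialization and its state persists across calls from the host.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// State that lives as long as the instance does. In a Command this would be
// torn down when `main` returns; in a reactor it persists between calls.
static CALLS: AtomicU32 = AtomicU32::new(0);

/// Hypothetical export: records one event and returns the running total.
/// The host is expected to have run the module's initializer before this
/// is called (an assumption of this sketch).
#[no_mangle]
pub extern "C" fn record_event() -> u32 {
    CALLS.fetch_add(1, Ordering::SeqCst) + 1
}
```

A host embedding this module would instantiate it once and then call `record_event` whenever something happens, which is the usage pattern the Reactor model is meant to support.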
As another update here, the basic concepts of commands and libraries (reactors) are now being defined by the component model. As presented in the WASI Preview2 presentation, commands are conceptually components that have value imports for inputs, and value exports for exports, and which run their code from the wasm start function. |
There seem to be two distinct modes of program execution that applications broadly fit into: Commands and Reactors.
Reactors could run in the main thread of a browser, but they may also have uses in a variety of settings where applications will primarily be responding to external events. Putting the event loop in the runtime gives the runtime the flexibility to use a single event loop for multiple purposes.
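To make the two modes concrete, here is a minimal simulation of a runtime that owns the event loop. Everything in it is an assumption made for illustration: `Event`, `Reactor`, `Runtime`, `Logger`, and `run_command` are hypothetical names and not part of any WASI interface. A Command is modeled as a function that runs to completion, while a Reactor is set up once and then called back by the runtime, one event at a time.

```rust
use std::collections::VecDeque;

/// Events the runtime can deliver (invented for this sketch).
#[derive(Debug)]
pub enum Event {
    Tick,
    Message(String),
}

/// A Reactor is set up once, then reacts to events until the host drops it.
pub trait Reactor {
    fn setup(&mut self);
    fn on_event(&mut self, event: &Event);
}

/// The runtime owns the queue, so one loop can serve several purposes.
pub struct Runtime {
    queue: VecDeque<Event>,
}

impl Runtime {
    pub fn new() -> Self {
        Runtime { queue: VecDeque::new() }
    }

    pub fn post(&mut self, event: Event) {
        self.queue.push_back(event);
    }

    /// Deliver queued events in order, one callback at a time.
    /// Returns how many events were dispatched.
    pub fn dispatch_all(&mut self, reactor: &mut dyn Reactor) -> usize {
        let mut dispatched = 0;
        while let Some(event) = self.queue.pop_front() {
            reactor.on_event(&event);
            dispatched += 1;
        }
        dispatched
    }
}

/// A Command, by contrast, is just "main": it runs once and yields an exit code.
pub fn run_command(main: impl FnOnce() -> i32) -> i32 {
    main()
}

/// Example reactor that records what it has seen.
#[derive(Default)]
pub struct Logger {
    pub lines: Vec<String>,
}

impl Reactor for Logger {
    fn setup(&mut self) {
        self.lines.push("ready".to_string());
    }

    fn on_event(&mut self, event: &Event) {
        match event {
            Event::Tick => self.lines.push("tick".to_string()),
            Event::Message(text) => self.lines.push(format!("msg: {}", text)),
        }
    }
}
```

The design choice illustrated is that the reactor never blocks or spins its own loop; it only returns to the runtime, which decides when the next callback runs.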
(I briefly mentioned these ideas here, but I want to give them more visibility here.)
Emscripten provides a kind of hybrid approach, where instances can stick around after "main" exits to allow them to be called by callbacks. For WASI, it may be useful to make an explicit static distinction between Commands and Reactors, because:
Currently, WASI programs combined with clang fit the Command model, with `_start` being the "main" function. It might make sense to rename it to `__wasi_command_main` or something. Programs could declare themselves to be Reactors by exporting a function named something like `__wasi_reactor_setup` or so.
Some interesting questions: