-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WasmFS] Async proxied JS backend #16229
Conversation
Ok, this should now be mostly cleaned up and ready for review. @rstz This now uses |
(top comment has been updated to give an overview) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to be able to have async backends without them necessarily also being JS backends, but I agree that that seems complicated to put together and not too useful in practice.
#if ASSERTIONS | ||
assert(wasmFS$backends[backend]); | ||
#endif | ||
{{{ runtimeKeepalivePush() }}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this pushing and popping necessary? Is there a way to make it less manual, like a withRuntimeKeptAlive(...)
wrapper function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sbc100 , did we consider a wrapper function? It's not less code, and only works in some situations, but might be nice.
Another option, for places where we can use await
, is to have a macro {{{ makeAwait }}}
perhaps that would put the push/pop around it?
system/lib/wasmfs/fetch_backend.cpp
Outdated
namespace wasmfs { | ||
|
||
class FetchBackend : public Backend { | ||
emscripten::SyncToAsync proxy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems strange to have a proxying object outside of the virtual proxying backend. Could we use that virtual backend here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that we need that proxy object to call the constructor. I guess we could add a layer of indirection with a function that creates it, and call that, like we did elsewhere, would you prefer that?
(But not sure what you mean by "use that virtual backend here" - maybe you have another idea?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we can't use the proxying backend here because the proxied operations are not the normal backend operations, but rather async versions of the normal operations. The proxying backend would be expecting to call the normal operations, not async operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly.
It is odd though to have a Proxy created here, though, so I refactored some now to avoid that. The FetchBackend class is now entirely gone, so creating more backends like this will have less boilerplate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM besides these last comments.
{{{ runtimeKeepalivePush() }}} | ||
await wasmFS$backends[backend].allocFile(file); | ||
{{{ runtimeKeepalivePop() }}} | ||
{{{ makeDynCall('vi', 'fptr') }}}(arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will vi
handle wasm64? Is there some form of vp
we should use instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. It doesn't look like we have support for that atm, so it's another limitation of wasm64. I'll add a comment here at least to make it easier to fix up later when we do.
buf, | ||
len, | ||
offset, | ||
[](CppCallbackState* state) { (*state->resume)(); }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe there's any need to dereference the function pointer before calling it.
[](CppCallbackState* state) { (*state->resume)(); }, | |
[](CppCallbackState* state) { state->resume(); }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you're thinking of a C function pointer (where the deref is optional - kind of weird btw...), but in C++ that doesn't seem to work:
proxied_async_js_impl_backend.h:111:54: error: called object type 'emscripten::SyncToAsync::Callback' (aka 'function<void ()> *') is not a function or function pointer
[](CppCallbackState* state) { (state->resume)(); },
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I thought that emscripten::SyncToAsync::Callback
was a function pointer, but I guess it is a pointer to std::function
.
return state.result; | ||
} | ||
|
||
void flush() override {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we abort with some sort of "unimplemented" message here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This matches js_impl_backend
, but fair point, maybe we should do the same in both. OTOH, I think a default of "sync does nothing" is reasonable. I guess we should decide between those eventually, and if we do go with the latter, we should probably put the default in File and not here and in JSImpl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, leaving this for now seems fine.
&state); | ||
}); | ||
|
||
return state.offset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be worth having different callback state types for the different function signatures, or even for each individual method? On the one hand it would be more types to keep track of, but on the other hand there would be no ambiguity about which fields to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that adds complexity and I'm not sure what ambiguity is removed? offset
is, by its type, the proper field to use when a call returns an offset. I guess if we had a call that returns two offsets we'd have a problem - but I don't think we have such a syscall?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we can keep it in mind as an option if we do run into any ambiguity.
The key file
src/library_wasmfs_fetch.js
here shows a simpleasync JS backend that depends on pthreads proxying. That is, the main code
looks sync as usual, and we proxy to a dedicated thread which does the async
operation, here, a network
fetch()
.This is a "hello world" backend, the minimal one I can think of that is async. But
it may still be useful - we used to have a LazyFile option in the old FS, which this
is very close to, that is, on the first read of the data we fetch it from the network,
and then it is cached like a normal JS file.
To implement this, add a new
ProxiedAsyncJSImplFile
in C++. This combinesthe matters of proxying and the target being async. In theory we could add two
layers, here, first a C++ File that is async (perhaps using C++ futures?), but I
think that might be over-engineering, since we don't really want the async
aspect for C++ - it's very specific to JS. However, those are internal details, and
we could refactor the code later to add such laying if we wanted.
The JS side is the important part here. Basically each JS backend would define
a bunch of JS hooks that return Promises, and everything else is taken care of
automatically.