Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WasmFS] Async proxied JS backend #16229

Merged
merged 65 commits into from
Feb 10, 2022
Merged

[WasmFS] Async proxied JS backend #16229

merged 65 commits into from
Feb 10, 2022

Conversation

kripken
Copy link
Member

@kripken kripken commented Feb 9, 2022

The key file src/library_wasmfs_fetch.js here shows a simple
async JS backend that depends on pthreads proxying. That is, the main code
looks sync as usual, and we proxy to a dedicated thread which does the async
operation, here, a network fetch().

This is a "hello world" backend, the minimal one I can think of that is async. But
it may still be useful - we used to have a LazyFile option in the old FS, which this
is very close to, that is, on the first read of the data we fetch it from the network,
and then it is cached like a normal JS file.

To implement this, add a new ProxiedAsyncJSImplFile in C++. This combines
the matters of proxying and the target being async. In theory we could add two
layers, here, first a C++ File that is async (perhaps using C++ futures?), but I
think that might be over-engineering, since we don't really want the async
aspect for C++ - it's very specific to JS. However, those are internal details, and
we could refactor the code later to add such laying if we wanted.

The JS side is the important part here. Basically each JS backend would define
a bunch of JS hooks that return Promises, and everything else is taken care of
automatically.

@kripken kripken requested a review from tlively February 9, 2022 19:36
@kripken kripken marked this pull request as ready for review February 9, 2022 19:37
@kripken kripken requested a review from sbc100 February 9, 2022 19:37
@kripken
Copy link
Member Author

kripken commented Feb 9, 2022

Ok, this should now be mostly cleaned up and ready for review.

@rstz This now uses await as you suggested. First time I tried that, looks like it works! Nicer code too.

@kripken
Copy link
Member Author

kripken commented Feb 9, 2022

(top comment has been updated to give an overview)

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to be able to have async backends without them necessarily also being JS backends, but I agree that that seems complicated to put together and not too useful in practice.

#if ASSERTIONS
assert(wasmFS$backends[backend]);
#endif
{{{ runtimeKeepalivePush() }}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this pushing and popping necessary? Is there a way to make it less manual, like a withRuntimeKeptAlive(...) wrapper function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sbc100 , did we consider a wrapper function? It's not less code, and only works in some situations, but might be nice.

Another option, for places where we can use await, is to have a macro {{{ makeAwait }}} perhaps that would put the push/pop around it?

src/library_wasmfs_fetch.js Show resolved Hide resolved
src/struct_info_internal.json Show resolved Hide resolved
namespace wasmfs {

class FetchBackend : public Backend {
emscripten::SyncToAsync proxy;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange to have a proxying object outside of the virtual proxying backend. Could we use that virtual backend here instead?

Copy link
Member Author

@kripken kripken Feb 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that we need that proxy object to call the constructor. I guess we could add a layer of indirection with a function that creates it, and call that, like we did elsewhere, would you prefer that?

(But not sure what you mean by "use that virtual backend here" - maybe you have another idea?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can't use the proxying backend here because the proxied operations are not the normal backend operations, but rather async versions of the normal operations. The proxying backend would be expecting to call the normal operations, not async operations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly.

It is odd though to have a Proxy created here, though, so I refactored some now to avoid that. The FetchBackend class is now entirely gone, so creating more backends like this will have less boilerplate.

Base automatically changed from wfjsbs to main February 9, 2022 23:24
@kripken kripken changed the title [WasmFS] [DRAFT] Async proxied JS backend [WasmFS] Async proxied JS backend Feb 9, 2022
Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM besides these last comments.

{{{ runtimeKeepalivePush() }}}
await wasmFS$backends[backend].allocFile(file);
{{{ runtimeKeepalivePop() }}}
{{{ makeDynCall('vi', 'fptr') }}}(arg);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will vi handle wasm64? Is there some form of vp we should use instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. It doesn't look like we have support for that atm, so it's another limitation of wasm64. I'll add a comment here at least to make it easier to fix up later when we do.

buf,
len,
offset,
[](CppCallbackState* state) { (*state->resume)(); },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe there's any need to dereference the function pointer before calling it.

Suggested change
[](CppCallbackState* state) { (*state->resume)(); },
[](CppCallbackState* state) { state->resume(); },

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you're thinking of a C function pointer (where the deref is optional - kind of weird btw...), but in C++ that doesn't seem to work:

proxied_async_js_impl_backend.h:111:54: error: called object type 'emscripten::SyncToAsync::Callback' (aka 'function<void ()> *') is not a function or function pointer
        [](CppCallbackState* state) { (state->resume)(); },

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought that emscripten::SyncToAsync::Callback was a function pointer, but I guess it is a pointer to std::function.

return state.result;
}

void flush() override {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we abort with some sort of "unimplemented" message here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches js_impl_backend, but fair point, maybe we should do the same in both. OTOH, I think a default of "sync does nothing" is reasonable. I guess we should decide between those eventually, and if we do go with the latter, we should probably put the default in File and not here and in JSImpl.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, leaving this for now seems fine.

&state);
});

return state.offset;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth having different callback state types for the different function signatures, or even for each individual method? On the one hand it would be more types to keep track of, but on the other hand there would be no ambiguity about which fields to use.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that adds complexity and I'm not sure what ambiguity is removed? offset is, by its type, the proper field to use when a call returns an offset. I guess if we had a call that returns two offsets we'd have a problem - but I don't think we have such a syscall?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, we can keep it in mind as an option if we do run into any ambiguity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants