Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WasmFS] Add JSImplBackend #16209

Merged
merged 17 commits into from
Feb 9, 2022
Merged

[WasmFS] Add JSImplBackend #16209

merged 17 commits into from
Feb 9, 2022

Conversation

kripken
Copy link
Member

@kripken kripken commented Feb 4, 2022

This refactors the existing JSFileBackend to use a new model for creating JS
backends. A JSImplBackend is defined, which adds a layer of indirection in
JS, allowing a backend to be implemented almost 100% in JS. That is,
JSImplBackend is a new backend that forwards operations to an implemenation
in JS.

See full docs in js_impl_backend.h. Note that the diffs here may not be as
useful as reading the files in their final state. In particular, look at the how the
JSFileBackend is now implemented in a minimal way in C++ in
js_file_backend.cpp.

This is a step towards it being practical to make backends like Node and
OPFS.

None of the actual JSFileBackend logic is changed, just moved around.

@kripken kripken requested review from sbc100 and tlively February 4, 2022 23:18
@kripken
Copy link
Member Author

kripken commented Feb 5, 2022

This hits a minor but interesting issue. The JSImplFile does getBackend() to find out the backend, which it then informs JS about. But in the case of a ProxiedBackend on top of a JSImplFile, the actual backend is the ProxiedBackend - so the backend ID does not match what the JSBackend expects. I guess files will need to store their "immediate" backend, but another way might be to send JS the entire chain of backends, if there's a use case for that..?

#if ASSERTIONS
assert(wasmFS$backends[backend]);
#endif
return wasmFS$backends[backend].constructor(file);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will probably need to differentiate DataFile constructors and destructors from those of Directory and Symlink.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. So we'll want something like [new|delete][File|Directory|Symlink]? And then a JS backend would implement just the ones it needs to just newFile/deleteFile for a basic JS File.

Thoughts on new/delete? This is about the lifecycle of the C++ File object, so I think new/delete might make sense. But could also be alloc/free or something else. free might be less likely to be confused with removing a file from the actual filesystem which delete might...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we want to expose the lifecycle of a conceptual file rather than the underlying C++ object. So instead of a destructor, maybe we should expose onUnlink, onClose, and onDelete. The C++ destructor would call onDelete only if the file is actually being deleted and not when the program is shutting down.

But a problem with that is that backends may actually want to know when the underlying C++ object is being destructed, perhaps because they own other C++ objects that should also be destructed at that point. So maybe we need a destructor hook in addition to the file lifecycle hooks? In that case, it is probably sufficient for this PR to just have the destructor hook.

In that case, I would want to avoid using delete to describe the destructor hook, since it would be ambiguous whether delete referred to the in-memory resources or the file itself, as you mentioned.

newFile/freeFile or createFile/freeFile or constructFile/destructFile might be good names. I don't feel strongly, except that I would avoid delete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. How about allocFile/freeFile?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM 👍

'$wasmFS$backends',
'$wasmFS$JSMemoryFiles',
],
_wasmfs_backend_add_js_file: function(backend) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be add_js_backend rather than add_js_file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to be consistent, wasmfs_backend_* are APIs to add various backends. This backend is called the "js file backend".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe we should change the name of the backend, or maybe change the naming convention. _wasmfs_add_js_file_backend would be much clearer that we are adding a backend rather than a File.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about wasmfs_backend_add_jsfile? The lack of a _ makes it clear that "jsfile" is a "thing".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that seems ok as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is the internal API, so we might as well change it as much as we want I guess. Also we have a public API wasmfs_create_js_file_backend so I'll make it more similar to that, and follow the convention of duplicating the public C API + adding _js to the end.

extern "C" backend_t wasmfs_create_js_file_backend() {
return wasmFS.addBackend(std::make_unique<JSFileBackend>());
backend_t backend = wasmFS.addBackend(std::make_unique<JSImplBackend>());
_wasmfs_backend_add_js_file(backend);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to get this to be executed on the correct thread would be to call it lazily the first time a backend operation is called, since those operations will be proxied to the worker thread.

An even better way might be to have the proxying backend constructor take a function that creates a backend as its argument rather than having it take a handle to a backend that has already been created. Then the proxying backend constructor can proxy the creation of the underlying backend to the worker thread and the underlying backend can perform initialization in its constructor, as I would have expected.

No matter what we do there, this call should be somewhere in the JSImplBackend so that the C++ API (even if it's private) is able to fully initialize the backend without needing to be called through the C wrapper.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I like the idea for making the proxied backend get a constructor function. But how would we do that in the C API? Literally a backend_t (*)()..? That seems unoptimal but I'm not sure how else.

Comment on lines 32 to 34
// 2. Add a cpp file for the new backend, and implement the C function from 1,
// which should create it on both the C++ (using JSImplBackend) and JS
// sides.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C layer should be a thin wrapper around the underlying C++ and should not perform any important initialization itself.

We can also improve this workflow so that JS backends can be defined completely in JS userspace with no need to modify the WasmFS source. All we need is a JS API function that creates an instance of the JSImplBackend and allows users to install methods on it and perform whatever other instantiation they want, all from JS. The JSImplBackend constructor should probably install the backend on the JS side then call a user-installable JS function to perform custom initialization.

Adding a small C wrapper to emscripten/wasmfs.h can be a step on top of that workflow for Emscripten developers creating "official" backends, so I would expect to see that documented as an optional last step rather than as the first step.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C layer should be a thin wrapper around the underlying C++ and should not perform any important initialization itself.

Added a comment now, thanks.

We can also improve this workflow so that JS backends can be defined completely in JS userspace with no need to modify the WasmFS source.

Yes, I think that can work. But it does mean a forced export which increases code size, both of the backend-generating code itself, and also the JSImplBackend itself. Those would end up bundled in all builds (unless metadce gets lucky and manages to remove parts). But I agree there are nice benefits to a pure JS approach. Perhaps that should be an option, say if the full JS API is enabled (FORCE_FILESYSTEM) - if we go that route, then user JS backends would need no C but perhaps our own would still have C files (so that they work without FORCE_FILESYSTEM).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps that should be an option, say if the full JS API is enabled (FORCE_FILESYSTEM) - if we go that route, then user JS backends would need no C but perhaps our own would still have C files (so that they work without FORCE_FILESYSTEM).

That makes sense to me.

@kripken
Copy link
Member Author

kripken commented Feb 9, 2022

Any remaining comments here @tlively ?

@kripken kripken merged commit 22b499d into main Feb 9, 2022
@kripken kripken deleted the wfjsbs branch February 9, 2022 23:24
],
_wasmfs_create_js_file_backend_js: function(backend) {
wasmFS$backends[backend] = {
alloc_file: function(file) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use arrow function for this type of thing these days.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, and looks like that's possible with async functions too. Added to #16229

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants