Synchronous JS API callback for memory growth? #1296
Thanks @Macil! The first is the pthreads+growth issue, which is very related, but distinct. (In particular, a synchronous event is not enough there, as you mention, but it would be sufficient here.) The second does indeed look like it arrives at this proposal, so it's too bad I didn't find it when I searched earlier... but the discussion there seems to have stalled, and the proposal isn't clearly focused on the necessary synchronous aspect. So I'd suggest keeping this issue open for discussion?
Adding read/write methods on wasm memories (as a substitute for ArrayBuffer views) might serve this use case too, but would still mean that ArrayBuffer views could never be reliably used on wasm memories, which would be an unfortunate wart.
What actually happens right now (or is supposed to happen) if a JavaScript thread takes a reference to a WASM memory's ArrayBuffer and is reading/writing to it while a separate WASM thread grows the memory? I assume one of these two things happens:
If it's 1, then it seems like a synchronous JS API callback for memory growth would be fine. The callback wouldn't actually happen synchronously with the grow call (which happened in another thread), but it would happen synchronously with the point in time where the existing ArrayBuffer gets invalidated in the current JS context.
IIRC, what should happen is sort of like (1). A thread obtains the memory.buffer value. This has a length field that never changes. The thread will be able to write into the buffer up to that length, even if another thread grows the memory concurrently. The extracted buffer value remains valid indefinitely, but if the thread reads memory.buffer again (even during the same turn) it may observe that the length has changed, and using this new buffer it can write up to the new length.
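To make that concrete, here is a minimal single-thread sketch of those semantics (assuming an environment where shared WebAssembly memory is enabled); the cross-thread case behaves the same way, just with the second read happening on another thread:

```js
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4, shared: true });

const before = memory.buffer;           // SharedArrayBuffer covering 1 page (65536 bytes)
memory.grow(1);                         // grow by one page

console.log(before.byteLength);         // 65536 - the buffer we already hold keeps its old length
console.log(memory.buffer.byteLength);  // 131072 - a fresh read of memory.buffer observes the growth
```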
Any new developments here?
@kripken This all seems most cleanly solved by making sure the capabilities of
Depends what you mean by "this" - do you mean the entire issue? Then IIUC
Yeah, I mean the entire issue.
That's what I'm proposing for
Maybe I'm misunderstanding you, but I think If I'm missing something, what would invoke the JS that calls
wasm's memory.grow would be explained in terms of realloc. If ArrayBuffers can be resized in place, there is no need to do the manual work of updating the views IIUC. It's technically a backwards breaking change, in that growing memory would no longer result in distinct ArrayBuffers on the JS side, but I feel like that won't actually break anything.
Thanks, now I see! Yes, if realloc is done automatically, so ArrayBuffers and views just automatically grow with the wasm memory, then that would solve the issue here. Could this also happen with SharedArrayBuffers, so JS is updated across threads? That would solve a very big problem too (right now we have to manually poll in JS to check if the memory grew).
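For reference, here is a sketch of the kind of in-place growth being discussed, written against the shape of the later resizable-ArrayBuffer API (so it assumes an engine that implements ArrayBuffer resize and length-tracking views; it is not how Wasm memories behave today):

```js
const buf = new ArrayBuffer(16, { maxByteLength: 1024 });

// A view constructed without an explicit length tracks the buffer's length.
const view = new Uint8Array(buf);
console.log(view.length);  // 16

buf.resize(64);            // grows in place: no detach, no new ArrayBuffer object
console.log(view.length);  // 64 - the same view now covers the grown region
```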
How are we planning to handle growing a SAB across threads? Are we expecting the SAB and WebAssembly.Memory to be atomically updated together? It might be very hard to make sure that every JS thread sees a consistent size for memory between WASM and JS. e.g. A WASM

Is the expectation that memory growth in WASM will suspend all threads and atomically update their buffers? If so, that could be expensive depending on how often people want to grow memory. It's possible this doesn't matter in practice though.
We determined during previous discussions that the propagation of bounds updates between Wasm threads as a result of

It's still guaranteed that length updates from growth will be an "atomic counter", so other threads will never see a truly inconsistent view of memory; they just won't have freshly-grown memory reliably accessible unless they synchronize with the latest growing thread. I'd hope JavaScript could use the same semantics, so long as

EDIT: perhaps, analogous to
From my brief chats with @binji about the matter, shared memory would have different realloc restrictions than non-shared memory, and I don't really see another way than following what wasm already does.
To wit, it seems very, very problematic to shrink SABs in-place. Doesn't seem so bad to shrink ABs in place?
It seems like that might be desirable. Even if it causes some performance headaches...
Seems pretty reasonable to me to say that multiple threads witnessing the length should be synchronizing. Other restrictions included that for SABs to be resizable at all, the
I don't have much of an opinion on multiple threads yet, but within one thread it should probably be synchronizing. Otherwise, you have the problem where your length magically shrinks if you ask a different API, because those numbers are stored at different addresses in the VM.
Within one thread your semantics are constrained by agent-order anyway (or program order, as it's normally known), which is stronger than synchronizing seqcst events.
That's only true if all the various internal slots and whatever WASM uses are the same. Assuming the slots are different, then they'd be effectively relaxed, no?
I think I'd have to see the particular example you're thinking of. I don't know what synchronizing accesses in a single thread means. In my mind there shouldn't be any staleness issues intra-thread, for the same reason regular
Currently, if a SAB is backed by a Wasm memory which is grown, the SAB stays attached, but with the same original length (so the newly grown memory isn't accessible, due to the JS bounds check). So I think @kmiller68 is thinking of things through this lens, where the "length" of a SAB is a distinct field on the object which isn't necessarily kept in sync with the length of the underlying memory (which could have been concurrently grown in Wasm).

If I'm reading #1296 (comment) correctly, @kripken is rooting for a change to this, so that growth of a SAB in one thread (either through a Wasm

EDIT: my comment #1296 (comment) was written presupposing that this change would be accomplished by making SABs act exactly like first-class Wasm memories, so bounds/length checks are no longer against a distinct thread/object-local "length" field, but are implemented the same way as Wasm.
@conrad-watt Yeah, that's likely the most sensible way to specify it. I.e. something like: if a JS SAB or AB object is backed by wasm, then reach into however wasm memory holds the length and return that value (times the WASM page size in bytes, I guess). That way, you don't have two independent places holding the length.

Alternatively, we could go the other direction too, where WASM gets its length from JS, but that's likely more difficult as the core wasm spec has no real concept of a host hook right now, IIRC.
Ah, I see. You mean "synchronizing" in the sense of the AB/SAB length vs the backing store length going out of sync. I agree with @conrad-watt's assumption: that we're working off of the backing store. That's the only reasonable thing to do IMO; otherwise we'd need to constantly be synchronizing each per-SAB wrapper's length with the single backing store, which is pretty weird.
Rediscovering problems that came up the first time something like this was talked about at Mozilla with asm.js: growable

Hmm...
FWIW, this idea has been discussed before in this issue as well: #1210

With a synchronous event, any code that potentially allocates (and therefore potentially grows memory) can potentially run JS, which can then do basically anything, including re-enter Wasm again. Most code that calls

With

Is this approach untenable for others?
Yes, this is my exact concern for the synchronous callback.
Ah, I had naively thought that typed arrays could also be changed to define their length in terms of the underlying buffer, but yeah, at the very least that doesn't cover the case where the length is explicitly fixed.

Looking back through the previously linked discussions, there was this suggestion here to just expose Wasm-style accesses in JS via the Wasm memory object: https://gist.github.com/kripken/949eab99b7bc34f67c12140814d2b595

Would this solve the problem?
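To make the idea concrete, such accessors might look roughly like this (purely illustrative method names, not a real API today, and not necessarily the exact shape the gist proposes), with every access bounds-checked against the memory's current length so that no ArrayBuffer view ever has to be recreated after growth:

```js
const memory = new WebAssembly.Memory({ initial: 1 });
const ptr = 1024;

memory.writeUint32(ptr, 42);       // hypothetical accessor
const n = memory.readUint32(ptr);  // hypothetical accessor
```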
I think it would for the wasm use case but sets an undesirable precedent. It would encourage further divergence between wasm and JS on accessing buffers. I'm still rooting for a more integrated wasm-JS interface story.
I think it's a possibility, yes. However, I do think it'll be noticeably slower in some cases: you basically need to poll in JS whether memory changed each time you cross the wasm/js boundary. If you call a small getter repeatedly that can be painful. It's faster to avoid polling and get notified when a growth occurs, which is a rare event.
I think I don't understand the concern here. Yes, in theory such code could do horrible things. But the only thing we need that code to do is exactly recreate the views, and nothing more. That's a very simple operation - just a few lines of JS, basically a few

If a toolchain misuses this callback to do weird things, that would be a bug there - but any mechanism can be misused?
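As a concrete illustration of "recreate the views", the entire body of such a callback could be something like the following sketch (the HEAP* names mirror Emscripten's conventions; nothing here is specific to any particular callback API):

```js
// Recreate the typed-array views over the (possibly new) buffer after growth.
let HEAP8, HEAPU8, HEAP32, HEAPF64;

function updateMemoryViews(memory) {
  const b = memory.buffer;
  HEAP8   = new Int8Array(b);
  HEAPU8  = new Uint8Array(b);
  HEAP32  = new Int32Array(b);
  HEAPF64 = new Float64Array(b);
}
```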
Part of my concern is that implicit re-entrancy points are in general a Pandora's box, even if their intended use case is well understood and simple. This smells like a signal handler.
Exactly.
Oh, yeah, I agree there. This is really a signal handler, and that feels a little odd to me too. I was worried about this before opening this issue, in fact, but some people thought it would be acceptable; see their reasons there.
I wrote some benchmarks to check the size of the polling overhead. It can be as high as 50% in some cases that I see, but it definitely varies a lot and depends on the VM! One example is

```js
// 'module' stands for the compiled module's JS glue object, assumed to expose
// the memory's buffer, a HEAP8 view over it, and a 'copy' export.
var result = 0;
for (var i = 0; i < 500 * 1000 * 1000;) {
  // poll to see if the heap grew, and if so, realloc the view
  if (module.HEAP8.length > 20000000) {
    module.HEAP8 = new Int8Array(module.buffer);
  }
  // wasm function that just returns the input
  result = result | module.copy(i++);
  // ..4 more unrolled copies..
  // ..it may also be good to read from HEAP8 here for realism..
  if (i == 1000) {
    // will force one realloc in the next polling check
    module.HEAP8 = new Int8Array(20000000 + 1);
  }
}
```

To see the overhead most clearly, unrolling helps to avoid the loop overhead, and at least on some VMs the last 4 lines make a big difference (they force one of the polling checks to actually be true for once, but then it goes back to being false again). 50% is probably an unlikely amount of overhead, but it's easy to see 10% or so. OTOH, maybe it's not common for people to call small getters across the js/wasm boundary (but I've seen plenty of examples of people doing it).

In any case, manually polling in JS does add some overhead that a non-polling approach can avoid. I think it would be good to avoid that overhead, but I definitely agree with you and others, both here and in the other issue, that the possible solutions have downsides too. I can try to summarize them:
I don't have a good idea for a way forward here myself, given the (very reasonable) objections. We may end up with the status quo, that is, toolchains can either
(Separately, in both options, growth + pthreads requires unavoidable polling.)
Curious if it is possible to … ?
After experimenting with dynamic memory growth within

If an event occurs between

While the event is synchronous, the user could still define the handler as an asynchronous function, or call

Within the context of audio processing, by the time the event handler is fired, the time between reading that memory-growth information and the next audio processing event can render the event notification moot, as the code has already proceeded with the next operation.
for dynamic memory growth performed within the JavaScript context for audio processing, here, to avoid race conditions. Keep track of write and read offsets
For dynamic memory growth within WASM code, a shallow copy of that memory instance could expose only the number of pages currently set, as read-only properties on the object, for users to get the current page size.
Background: WebAssembly/WASI#82
In brief, JavaScript needs to know when wasm memory grows, because it needs to update JS typed array views. In emscripten we have things set up in sbrk etc. to call into JS to do that when memory grows. However, in wasi the current approach is to just call the memory grow wasm instruction inside the wasm module, so JS is not updated.
JS can check if memory grew every time wasm code runs (when we arrive in JS, return from wasm, etc.). Emscripten does something similar for pthreads+memory growth, but it's unpleasant, adds overhead, and is error-prone.
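Roughly, that polling approach means wrapping every path from wasm back into JS with something like the following sketch (here `wasmInstance` is assumed to be the instantiated module, and `HEAP8` stands in for the toolchain's typed-array views):

```js
// Assumed context: the instantiated module exports its (non-shared) memory.
const memory = wasmInstance.exports.memory;
let HEAP8 = new Int8Array(memory.buffer);

// Wrap a wasm export so that after every call we check whether the memory was
// grown inside wasm (which replaces memory.buffer and detaches the old one),
// and if so recreate the view before any JS code uses it.
function wrapExport(fn) {
  return (...args) => {
    const result = fn(...args);
    if (HEAP8.buffer !== memory.buffer) {
      HEAP8 = new Int8Array(memory.buffer);
    }
    return result;
  };
}
```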
Another option is to add an API in wasi to update the system on memory growth.
Yet another option is to extend the JS wasm API to get a callback when memory growth occurs. That is what this issue is about.
It might look something like this:
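(A hypothetical sketch only; `ongrow` and its argument are illustrative names, not a concrete API proposal.)

```js
const memory = new WebAssembly.Memory({ initial: 1, maximum: 16 });
let HEAP8 = new Int8Array(memory.buffer);

// Hypothetical hook: the engine would call this synchronously whenever this
// memory grows, while the growing wasm call is still on the stack.
memory.ongrow = (newPageCount) => {
  HEAP8 = new Int8Array(memory.buffer);  // recreate views over the larger buffer
};
```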
The callback would be synchronous, that is, when wasm memory grows, all such callbacks are immediately executed, while the wasm is still on the stack etc. This is necessary because if it's an event that happens in the next JS event loop then that is too late - we may execute JS before that, and it would use the old obsolete views.
Thoughts?