[js-api] Integration with the ResizableArrayBuffer and GrowableSharedArrayBuffer proposal #1292

syg opened this issue Mar 2, 2021 · 27 comments · Fixed by #1300

@syg
Contributor

syg commented Mar 2, 2021

The ResizableArrayBuffer and GrowableSharedArrayBuffer proposal is a TC39 proposal that is currently stage 2.

There was some discussion of how the proposal might integrate with the Wasm/JS API. It seems better to move the discussion here to get a concrete plan and to get more Wasm eyes on the issue.

As a starting point, I suggest the following changes to the WebAssembly.Memory API:

  • Memory will have a new resizableBuffer attribute.
    • Returns a ResizableArrayBuffer or GrowableSharedArrayBuffer instance.
    • If an ArrayBuffer was vended by the buffer attribute on the same instance, that ArrayBuffer instance will be detached. SharedArrayBuffers and GrowableSharedArrayBuffers will alias the same memory.
    • If the Memory was created with a maximum size, that size will be reflected on the ResizableArrayBuffer or GrowableSharedArrayBuffer instance.
    • Otherwise, the maximum size is 2^32.
  • Memory's buffer attribute will detach any ResizableArrayBuffer buffers vended by the resizableBuffer attribute.
  • ResizableArrayBuffers gotten via Memory's resizableBuffer attribute will:
    • Throw on transfer(), since they can't be detached.
    • Throw on resize() calls that are shrinks.
    • Throw on resize() calls that are not in multiples of the wasm page size. (Thanks @conrad-watt)
  • GrowableSharedArrayBuffers gotten via Memory's resizableBuffer attribute will:
    • Throw on grow() calls that are not in multiples of the wasm page size.
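
For concreteness, a rough sketch of what these bullets might look like from JS. This is purely an editor's illustration of the suggestion above, not spec text; the attribute names and throwing behavior are the proposal, and nothing here is implemented anywhere.

```js
const memory = new WebAssembly.Memory({ initial: 1, maximum: 16 }); // sizes in 64 KiB pages

const rab = memory.resizableBuffer;   // proposed: a ResizableArrayBuffer
// rab.maxByteLength would reflect the Memory's maximum: 16 * 65536 bytes.

rab.resize(2 * 65536);                // ok: grows by a whole number of wasm pages
// rab.resize(65536 + 1);             // would throw: not a multiple of the page size
// rab.resize(65536);                 // would throw: shrinking is not allowed
// rab.transfer();                    // would throw: wasm-backed memory can't be detached by JS

const ab = memory.buffer;             // per the bullets above, this access detaches rab
```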

Thoughts welcome.

(Also, what is the process for bringing such a "normative PR" that's not a full proposal to the CG? Is it just an agenda item?)

@syg
Contributor Author

syg commented Mar 2, 2021

cc @kripken for thoughts on the design for emscripten-generated code.

@conrad-watt
Contributor

The fields of the memoryDescriptor currently correspond precisely to the components of the type of the memory in core Wasm. Would it be acceptable for all memories to have a working resizableBuffer attribute added, rather than keying it off a new flag in the memoryDescriptor? Otherwise, there'd be an issue of how to handle memories that aren't directly created in JS, but are instead exported from a Wasm instance.

Also, there was some discussion that grow/resize would need to throw if grown by amounts which are not multiples of the Wasm page size, which seems fine.

It's great that this feature is being added to JS! To add more context for people here, I've linked below some pretty big toolchain/performance limitations Wasm has been dealing with which would be significantly helped by Resizable/GrowableArrayBuffer integration.
WebAssembly/design#1271
WebAssembly/design#1296

@syg
Contributor Author

syg commented Mar 2, 2021

Would it be acceptable for all memories to have a working resizableBuffer attribute added, rather than keying it off a new flag in the memoryDescriptor? Otherwise, there'd be an issue of how to handle memories that aren't directly created in JS, but are instead exported from a Wasm instance.

Ah, good point about non-JS-created memories. I don't have a strong opinion, it's more that I'm not sure how an always-available resizableBuffer attribute works. Vending a ResizableArrayBuffer/GrowableSharedArrayBuffer is mutually exclusive with vending a plain ArrayBuffer/SharedArrayBuffer. So if resizableBuffer is always there, is it a "treat whichever attribute is accessed first as the intended Buffer class to use"? That is, if I access resizableBuffer first, then afterwards accessing buffer would throw, and vice versa. That's a pretty weird API, but maybe it's fine?
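
For illustration, the "first access wins" behavior being questioned would look something like this (purely hypothetical):

```js
const m = new WebAssembly.Memory({ initial: 1 });

m.resizableBuffer;  // first access decides: this Memory now vends resizable buffers
m.buffer;           // ...so this would throw (and vice versa if buffer were accessed first)
```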

@syg
Contributor Author

syg commented Mar 2, 2021

A third path would be another way to signal intent for what kind of buffer you want that's not on the MemoryDescriptor. I'm not fully clued in on the intent of the MemoryDescriptor, so that might be the wrong place to signal intent to begin with.

@conrad-watt
Contributor

conrad-watt commented Mar 2, 2021

Vending a ResizableArrayBuffer/GrowableSharedArrayBuffer is mutually exclusive with vending a plain ArrayBuffer/SharedArrayBuffer.

Does this have to be the case? Having the same Wasm memory backing both (e.g.) an ArrayBuffer and a ResizableArrayBuffer doesn't instinctively seem weirder to me than having an ArrayBuffer backed by a Wasm memory which is separately manipulated by Wasm operations. Presumably the ArrayBuffer would detach if the ResizableArrayBuffer ever calls resize, just as is already done for memory.grow in Wasm.

@syg
Contributor Author

syg commented Mar 3, 2021

I feel somewhat strongly that aliasing buffers like that on the JS side is a bad idea.

  1. You're right that the buffer aliasing already exists for wasm-manipulated memory, but I feel like that's a pretty coarse-grained aliasing. It is still an invariant on the JS side that for non-shared buffers, if ta1.buffer !== ta2.buffer, their accesses can't alias. I'm pretty hesitant to break this invariant. It seems broadly useful for reasoning and source code transforms.

  2. As you point out, resizes would require detaching all aliased non-resizable buffers (I guess there would just be a single one). This is extra complexity in implementations and spec that I think would be harmful.

Edit: Contra to #1292 (comment), I guess I do have a strong opinion if the two attributes are supposed to vend aliased-backing-store buffers. :P

@conrad-watt
Contributor

conrad-watt commented Mar 3, 2021

In that case something like #1292 (comment) does seem like the immediate path of least resistance, although if buffer aliasing were to sneak into JS through some other avenue, I'd hope we could relax the throwing behaviour.

@conrad-watt
Contributor

conrad-watt commented Mar 3, 2021

Alternatively, maybe the first vend of a ResizableArrayBuffer could permanently detach the associated ArrayBuffer (including subsequent results of .buffer), while vending a GrowableSharedArrayBuffer is just allowed to alias with the associated SharedArrayBuffer, since consecutive accesses to .buffer of a shared memory can already give aliasing buffers (e.g. if there is a concurrent growth).
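
A sketch of that alternative, purely for illustration (none of this is settled):

```js
// Non-shared memory: the first vend of a ResizableArrayBuffer permanently
// detaches the plain ArrayBuffer side.
const m = new WebAssembly.Memory({ initial: 1 });
const rab = m.resizableBuffer;
m.buffer.byteLength === 0;        // true: .buffer now only ever returns a detached buffer

// Shared memory: both kinds of buffer may coexist and alias the same storage,
// since shared buffers can't be detached anyway.
const sm = new WebAssembly.Memory({ initial: 1, maximum: 4, shared: true });
const gsab = sm.resizableBuffer;  // GrowableSharedArrayBuffer
const sab = sm.buffer;            // SharedArrayBuffer over the same memory
```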

@syg
Contributor Author

syg commented Mar 3, 2021

Cool, I like the suggestion in #1292 (comment).

@conrad-watt
Contributor

conrad-watt commented Mar 3, 2021

With reference to the OP...

  • If the Memory was created with a maximum size, that size will be reflected on the ResizableArrayBuffer or GrowableSharedArrayBuffer instance.
  • Otherwise, some implementation-defined max size will be reflected.

If I'm interpreting the JS proposal correctly, it's mandatory to provide a maximum size, but grow/resize is still allowed to fail arbitrarily even before reaching the max size, analogous to our memory.grow. In Wasm, a memory's max size is optional, but there's still a hard limit of 2^32 bytes, so from a purely semantic standpoint it would make sense for this to be the observable max in JS for a Wasm memory with no explicit max, rather than leaving it to be implementation-defined (especially if JS were to ever allow buffers larger than this in other contexts).

Since growth can fail even for sizes below this, there's no issue in the abstract, but would this approach mess with any engine/code heuristics?

EDIT: there was a time when engines aligned on a hard limit of 2^31 bytes for memories, below the theoretical maximum. I interpreted the "implementation-defined" line of the OP more broadly, but is the intention of this line just to allow for a static limit like this to be exposed?
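
Concretely, the reflection being discussed would look something like this (attribute name as in the OP; the no-explicit-max case uses the 2^32 hard limit settled on below):

```js
const withMax = new WebAssembly.Memory({ initial: 1, maximum: 256 });
withMax.resizableBuffer.maxByteLength === 256 * 65536;  // explicit maximum reflected

const noMax = new WebAssembly.Memory({ initial: 1 });
noMax.resizableBuffer.maxByteLength === 2 ** 32;        // spec limit: 65536 pages of 65536 bytes
```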

@syg
Contributor Author

syg commented Mar 3, 2021

so from a purely semantic standpoint it would make sense for this to be the observable max in JS for a Wasm memory with no explicit max, rather than leaving it to be implementation-defined (especially if JS were to ever allow buffers larger than this in other contexts).

That makes sense to me, I'll update the OP.

Since growth can fail even for sizes below this, there's no issue in abstract, but would this approach mess with any engine/code heuristics?

The engine implication to me is that Wasm memory-backed buffers need special casing. But they already do for special detachment semantics, special resize semantics, etc., so it doesn't seem to me that this will add much more.

@conrad-watt
Contributor

conrad-watt commented Mar 4, 2021

That makes sense to me, I'll update the OP.

Just to be clear, after some pondering I think that it makes sense for the length in this case to be implementation defined up to any static limit that engines set on memory size which is lower than the spec max. For example, this used to be 2^31, is now moving to 2^32 (in line with the spec limit), and when we get 64-bit memories, may be somewhat lower than 2^64.

Generally we've tried to have all Web engines agree to align on such a value, where it's necessary, but it's not documented as part of the "core spec".

EDIT: the value is in fact fixed by the JS API, so that should be the number used for the max field. No need to say it's implementation defined after all!

@syg
Contributor Author

syg commented Mar 4, 2021

when we get 64-bit memories, may be somewhat lower than 2^64.

Oh, that's gonna be a tricky thing on the JS side. ArrayBuffer sizes aren't BigInts, nor can BigInts be property keys currently.

@syg
Contributor Author

syg commented Mar 4, 2021

EDIT: the value is in fact fixed by the JS API, so that should be the number used for the max field. No need to say it's implementation defined after all!

I'm reading that as 2^32 (max 65536 pages * 65536 byte page size), yeah?

@conrad-watt
Contributor

Oh, that's gonna be a tricky thing on the JS side. ArrayBuffer sizes aren't BigInts, nor can BigInts be property keys currently.

AFAIK we haven't explicitly discussed JS integration for the memory64 proposal yet, but that might be something that @aardappel can comment further on at some point.

I'm reading that as 2^32 (max 65536 pages * 65536 byte page size), yeah?

That's my reading 🎉

(Also, what is the process for bringing such a "normative PR" that's not a full proposal to the CG? Is it just an agenda item?)

BigInt integration (which IIUC was also purely a JS-API change) was handled as a regular proposal, but IIRC it moved through the phases very quickly because there were no core spec changes. Unless anyone comes up with objections here (@kripken dropped a 👍 above) it probably makes sense to add this to the agenda as a poll for phase 1. BigInt integration was proposed quite early on in the life of the committee so people may want to bikeshed a revised procedure for JS-API changes going forward.

@conrad-watt
Contributor

conrad-watt commented Mar 4, 2021

Oh, wrt the new OP edit

Memory's buffer attribute will detach any ResizableArrayBuffer buffers vended by the resizableBuffer attribute.

The inner bullet just after that line says that the ResizableArrayBuffer can never be detached. In my head, once the .resizableBuffer attribute is accessed for the first time, subsequently attempting to access the .buffer attribute would for evermore give you back a detached buffer (you'd need to continue accessing .resizableBuffer to get a "good" buffer).

EDIT: Alternatively, it could be a straight exception to subsequently access .buffer, or double alternatively is it preferable to really detach the ResizableArrayBuffer (and edit the bullet about it not being detachable)?

@syg
Contributor Author

syg commented Mar 4, 2021

The inner bullet just after that line says that the ResizableArrayBuffer can never be detached. In my head, once the .resizableBuffer attribute is accessed for the first time, subsequently attempting to access the .buffer attribute would for evermore give you back a detached buffer (you'd need to continue accessing .resizableBuffer to get a "good" buffer).

Oh I see, I misunderstood your suggestion. Let me back up and be very explicit:

My understanding of the current situation:

  • ArrayBuffers backed by Wasm memories can't be detached by JS, e.g. by postMessage.
  • But they can be detached by Wasm when Wasm itself does a memory.grow.

Given that, what I thought you were suggesting:

  • Both buffer and resizableBuffer can't be detached by JS, e.g. by postMessage and transfer.
  • The buffer and resizableBuffer attributes become side-effectful: upon each access, they ensure that if there's a competing buffer, that buffer is detached by Wasm. That way, an application can keep switching buffer attributes, just with disastrous performance and ergonomics consequences.
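
For illustration, that reading would behave something like this (hypothetical, not agreed semantics):

```js
const m = new WebAssembly.Memory({ initial: 1 });

const ab1 = m.buffer;           // plain ArrayBuffer
const rab = m.resizableBuffer;  // a competing buffer exists, so ab1 gets detached
ab1.byteLength === 0;           // true

const ab2 = m.buffer;           // switching back detaches rab and vends a fresh ArrayBuffer
// Legal, but every switch invalidates outstanding views, hence the
// "disastrous performance and ergonomics consequences".
```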

Of course, many other avenues are open, like the ones you suggested in #1292 (comment). I'll be happy to let the CG hash it out.

@conrad-watt
Contributor

Makes sense!

@aardappel

@conrad-watt from my very minimal understanding of JS, Memory64 doesn't change anything, in the sense that we already had the possibility of i64's on the boundary, just now there will be a lot more of them.

They always get turned into BigInts (a decision which I personally find unfortunate), which is causing a ton of churn in e.g. the Emscripten JS bindings, and I suspect elsewhere as well.

@syg
Contributor Author

syg commented Mar 4, 2021

@aardappel But how do you address very large indices? Not all i64s fit inside of a float64.

Edit: pesky double negatives

@aardappel

@syg which "very large indices" for example?

53 bits is... more than a million 4GB memories all stuffed into one. If we assume that most i64 args are indices (or pointers) that fit inside of that, we will not have a need for bigints for a veeeery long time.

And this is a dynamically typed language after all, so emitting bigints for those rare >53-bit uses wouldn't be the end of the world. A pragmatic trade-off if you ask me.
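
Rough numbers behind that claim, for reference:

```js
Number.MAX_SAFE_INTEGER === 2 ** 53 - 1;  // exact integers fit in a float64 up to 2^53
(2 ** 53) / (4 * 2 ** 30);                // 2097152: roughly two million 4 GB memories
```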

@conrad-watt
Contributor

conrad-watt commented Mar 4, 2021

We could probably make 2^53 (or lower) the new Web limit for (64-bit) Wasm memories (ref #1292 (comment)).

@syg
Contributor Author

syg commented Mar 4, 2021

@aardappel @conrad-watt Thanks for the clarification. Sounds good to me.

@littledan
Collaborator

About the idea in #1292 (comment) : I'm a little uneasy about that API surface suggestion, as it seems to contradict the concept that "Getters must not have any observable side effects.". Another design (probably huge overkill) would be to have a custom section indicating how Wasm-allocated memories should be surfaced in JS.

Overall, it feels like we're still somewhat early in the design stage here. It would be great if someone could draft a concrete specification PR and rope in other implementers for review. Maybe this is the kind of thing that would benefit from an explainer as well, so it's easier to figure out what's happening without having to trace the arguments in several issue threads.

@conrad-watt
Contributor

conrad-watt commented Mar 8, 2021

@littledan instead of an accessor property, what about an explicit method like .getResizableBuffer() which detaches the buffer(s) linked to the .buffer attribute? To preserve @syg's point that an application may want to switch, we could add the symmetric .getBuffer(), which detaches any extant resizableBuffer and updates .buffer.

EDIT: bikeshedding... maybe the new method puts the resizableBuffer in .buffer, and we call it as/toResizableBuffer()?
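
A sketch of what the method-based shape could look like (method names are the bikeshed above, nothing is specified):

```js
const m = new WebAssembly.Memory({ initial: 1 });

const rab = m.getResizableBuffer();  // explicit call; detaches whatever .buffer last vended
// ... use rab ...
const ab = m.getBuffer();            // symmetric call; detaches rab, .buffer is usable again
```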

It would be great if someone could draft a concrete specification PR and rope in other implementers for review.

If such a spec PR wouldn't be ready for the next meeting, it might be best to make a phase 1 proposal ASAP anyway, to attract more eyeballs, and then present the PR as part of a phase 2 poll in the subsequent meeting.

@syg
Contributor Author

syg commented Mar 11, 2021

Unfortunately I didn't add it to the next meeting's agenda before it filled up. I've added it to https://github.com/WebAssembly/meetings/blob/master/main/2021/CG-03-30.md, which should also give enough time to make a spec PR.

@guest271314

My use case is live streaming media.

Given a ReadableStreamDefaultReader.read() call which results in a Uint8Array that can potentially have an odd length (streaming STDOUT from a shell script with WebTransport), that case can have consequences for performance and audio output: the Uint8Array is passed to an Int16Array, which expects the input buffer to have an even length, and we then process further to create Float32Arrays for use with the Web Audio API, Media Capture Insertable Streams (a.k.a. Breakout Box), WebCodecs, WebRTC, etc.; see WebAudio/web-audio-api-v2#97 (comment).

I initially tried to "carry over" the odd float value to the next Uint8Array; however, that requires creating multiple new TypedArrays.

I then tried WebAssembly.Memory.grow() to write the values to a single block of memory, which does produce the expected output. However, WebAssembly.Memory.grow() has restrictions on Chrome and Chromium (https://bugs.chromium.org/p/v8/issues/detail?id=7881), which negates the ability to create an infinite stream without using a ring or circular buffer. Ideally we should not need to create an object to store values at all; rather, we would simply stream the contents of the Uint8Array as the data arrives, without creating and maintaining additional storage objects (WebAssembly.Memory) containing values that are not used after the current frame of the live stream has been rendered to the screen and/or audio output devices.

Is it possible to clear memory allocated to a SharedArrayBuffer (WebAssembly.Memory) instance?

What I am uncertain about is whether writing values from an existing TypedArray to a WebAssembly.Memory instance points to the same memory, or whether memory allocation can grow in proportion to the number of TypedArrays written to a single WebAssembly.Memory instance.
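
For what it's worth, with today's APIs a TypedArray view constructed over memory.buffer aliases the Memory's storage, so set() copies bytes into that same backing store rather than allocating a new block per chunk. A minimal sketch, with a hypothetical append helper for the streaming case described above:

```js
const memory = new WebAssembly.Memory({ initial: 1 });
let view = new Uint8Array(memory.buffer);
let offset = 0;

// Hypothetical helper: copy each incoming chunk into the single Memory-backed buffer.
function append(chunk /* Uint8Array from reader.read() */) {
  const needed = offset + chunk.length;
  if (needed > view.length) {
    memory.grow(Math.ceil((needed - view.length) / 65536)); // grow in 64 KiB pages
    view = new Uint8Array(memory.buffer); // the old buffer is detached after grow()
  }
  view.set(chunk, offset);                // writes into the Memory's storage, no new backing allocation
  offset += chunk.length;
}
```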

(If this is not the thread to ask the above questions, kindly refer me to where to ask the questions).
