mem.init and mem.drop appear to break single-pass validation #27
Yes, I see what you mean. Though you can still do it in one pass; you'd just have to remember the maximum data segment index and defer the validation error until you know how many data segments there are. I don't know of a good way around this, unfortunately.
The init/drop ops are also verified to only use passive segments, so deferring the validation requires keeping track of all expected passive segment indices. IMO it's better to just make an out-of-range or active segment a runtime trap instead of a validation error.
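To make the bookkeeping concrete, here's a minimal sketch (illustrative Python, not from any real implementation) of the deferred check both comments describe: record every segment index used by an init/drop op, then check the whole set once the data section has been parsed and the count and passive/active flags are known:

```python
class DeferredSegmentChecks:
    """Sketch of single-pass validation with a deferred segment-index
    check. Names are illustrative, not taken from any engine."""

    def __init__(self):
        self.used = set()  # segment indices referenced by init/drop ops

    def note_use(self, index):
        # Called while validating a mem.init / mem.drop immediate,
        # before the data section has been seen.
        self.used.add(index)

    def finish(self, num_segments, passive_indices):
        # Called once the data section has been parsed and the segment
        # count and passive/active flags are finally known.
        for index in sorted(self.used):
            if index >= num_segments:
                raise ValueError("segment index %d out of range" % index)
            if index not in passive_indices:
                raise ValueError("segment %d is not passive" % index)
```

Note that because the ops must reference passive segments, the deferred state is a set of indices rather than the single maximum index suggested above.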
What was the original reason that init/drop cannot be used with active segments? Is there an expectation that implementations might want to optimise the representation of the module/instantiation process? Or just general silliness of dropping an active segment? I could imagine legitimate reasons for wanting init. Our decision on this issue/PR (WebAssembly/threads#94) / (WebAssembly/spec#820) moves active segments towards being passive segments + an init-esque syntactic sugar for the start function. So maybe this could be revisited, and then a single index really would be sufficient for deferred validation.
The original reason came from @lars-t-hansen in this comment which I've inlined here:
Thanks! Since this part of instantiation is now going to precisely share runtime semantics with mem.init, and therefore gets mem.init's failure semantics in the drop case for free, the spec point seems less forceful now. One of the motivations I originally gave for the decision I linked above was that this alignment of instantiation and mem.init would reduce the overhead of spec'ing failure cases in a future-proof way (a hypothetical mem.protect in my original example). I might not fully understand the point about GC though. From my perspective, even with purely passive data segments, if mem.drop is a hint to the GC to collect a segment, correctly handling a concurrent mem.init is a concern.
I think the references to GC are superfluous. The goal is to be able to free the memory used by active segments after instantiation. If implementations are allowed to free that memory, then init from such a segment has to be either a validation error or a runtime trap. The concurrent semantics of dropping a segment are a separate question. Since the memory is freed either way, a runtime trap seems like the natural choice.
Index spaces are static and all references to indices must be static. Otherwise we'd break the ability to renumber indices (a.k.a. alpha renaming) in code transformations. Consider e.g. tools like module mergers. (Mapping dynamic to static indices is what Wasm has tables for.) To be clear, that is a separate question from making the validity check for the index static. But it would be a very odd disconnect from the rest of the language to not validate the use of a static index statically in this one case. I don't think that is sufficiently justified by minor implementation inconvenience.
I will concede your point about static indices.
It would stand out, but so does adding a validation dependency on a later section. Another option would be to add a section that must occur before the code section to forward declare passive data segments.
My proposal isn't motivated by convenience. Implementing this stuff is far less effort than to try to effect change in the spec.
If mem.init were to be allowed for active segments, then obviously any segment used this way couldn't be GC'd after instantiation, and would have to be explicitly dropped to enable GC. I don't think non-determinism is the concern. However, this might make my interpretation of active segments a bad idea, because it might affect expectations existing code could have about GC behaviour of data segments. EDIT: Ah, actually, JIT and so on maybe makes this more non-trivial. And doing it during validation means you have to keep track of indices anyway. Ok - I think my idea doesn't help this issue, unless it's not a big deal that active data segments wouldn't be GC'd as eagerly as they are now.
@conrad-watt, I thought the idea for explaining active segments and their memory usage behaviour accurately would be that they correspond to a start function doing an init followed by a drop.
EDIT: After some discussion over email, I've realised this isn't correct, see below.
I think there is some confusion in this (and perhaps previous) discussion regarding concurrency and dropping segments. To be precise, what's being dropped are the segment instances, which are internal to a single module instance (see @binji's spec draft). Since module instances cannot be shared across threads, neither can segment instances. So GC is involved in so far as an implementation may choose (and is intended) to share the representation of a module's memory segments with their instances. In such an implementation, some concurrent synchronisation is necessary for dropping them. Note that an implementation can only share memory segments. Table segments contain unsharable references that are generally fresh per instance. So I see no issue with explaining active segments as passive ones with an implicit init+drop.
A consequence of this would be that mem.init on an active segment could become a runtime trap (same as a passive segment that's dropped), as @AndrewScheidecker floated at one point. This would solve the individual index-tracking issue. I was nurturing some pretty severe misconceptions about how segments are shared and GC'd between instances.
Looping back to the original point on single-pass validation:
It seems a bit heavyweight to add a new section for this, IMO. As an alternative we could require that all passive segments are defined after all active segments. That way you could just remember the range of referenced segment indexes and make sure it is fully contained within the range of passive segment indexes that is defined later.
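Under that ordering constraint, the deferred state shrinks to a pair of integers. A sketch, assuming segments `first_passive..num_segments-1` are the passive ones (both names are hypothetical):

```python
def check_deferred_refs(min_ref, max_ref, first_passive, num_segments):
    """Deferred check at the end of the module, assuming all passive
    segments are required to come after all active ones. min_ref and
    max_ref are the smallest and largest segment indices referenced by
    init/drop ops during the code pass (None if none were seen)."""
    if min_ref is None:
        return  # no init/drop ops, nothing to check
    if min_ref < first_passive or max_ref >= num_segments:
        raise ValueError("segment reference outside the passive range")
```

The whole referenced range `[min_ref, max_ref]` must sit inside the passive range `[first_passive, num_segments)`, so two comparisons suffice.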
Having now implemented this in SpiderMonkey, it didn't turn out to be much of a big deal. The compiler merely has some extra state ("DeferredValidationState") which accumulates enough information about segment uses to enable them to be checked at the end of the module's bytestream. In practice it also seemed prudent to collect enough information to produce reasonable error messages if required. If we're processing a module with thousands of segments on a memory constrained device, I can see that the additional memory overhead might be a disadvantage.
Our current implementation has a dynamic error if a passive op attempts to reference an active segment, so we just track the maximum index used. This may be a bit weird given that the index is static and so one might otherwise expect a static validation error. While our current solution is totally manageable and pretty well abstracted, it is a bit of a pain and an outlier in the presence of our parallel compilation pipeline, so I wonder if an up-front section that declares the "type" (active/passive, init-expr, size) of all the data segments would be, in total, less complexity and symmetric with what we've already done for the function/code section split. This would also make it trivial to have static validation errors for passive ops using an active segment. Thoughts?
If you haven't already, take a look at #30. It adds an optional data declaration section that must be used to declare passive segments before the code section. The difference from what you're suggesting is that it just contains a flag byte for each data segment that determines whether it's passive or active, and then the encoding of each segment in the following data section is dependent on whether it was declared to be passive or not. It would also be possible to redundantly encode the flags in the data section, and if the data segment declarations are present, require that the flags match between the two. After writing up #30, I think the redundant approach may be a little nicer, so the data section encoding doesn't diverge from the elem section encoding.
Ah, thanks, I didn't see you already had #30. I don't have a strong opinion on what precisely goes in the up-front declaration; but having a symmetry between the function/code sections is attractive.
One thing that came up in previous discussions is that active segments could be a design mistake. We have the opportunity as part of the bulk memory proposal to make active segments precisely a special case of passive segments (immediately init + drop at start time). This would mean a runtime trap at mem.init (it's exactly like a dropped passive segment at that point). Adding a new section with the type of the segment would more permanently privilege active segments in the spec.
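A sketch of the desugaring being proposed (illustrative Python, not spec formalism): an active segment behaves like a passive one that is init'd and then dropped during instantiation, so a later mem.init on it traps exactly as it would on a dropped passive segment:

```python
def instantiate(memory, segments):
    """Instantiation under the desugaring: every active segment is
    copied into memory and then immediately dropped. `segments` is a
    simplified list of dicts; `memory` is a bytearray."""
    dropped = set()
    for i, seg in enumerate(segments):
        if seg["active"]:
            dest = seg["offset"]
            memory[dest:dest + len(seg["data"])] = seg["data"]
            dropped.add(i)  # active == passive + implicit drop
    return dropped

def mem_init(memory, segments, dropped, index, dest, src, n):
    """mem.init with the runtime-trap semantics discussed here: a
    dropped (including any active) segment traps."""
    if index in dropped:
        raise RuntimeError("trap: segment dropped")
    data = segments[index]["data"]
    memory[dest:dest + n] = data[src:src + n]
```

With these semantics there is nothing for validation to distinguish: active vs. passive only affects what happens at instantiation time.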
Yes, good point about being able to desugar active segments into passive ops (like we do with the start function). With those runtime semantics, though, we could still have the up-front data segment declarations, as a pure binary format detail (just like, iirc, the function section).
The function section is forward declaring a type, whereas I'm viewing the active/passive segment distinction as just a way of compactly marking (in one bit) "do this very common pattern in the start function". The image I had in my head was that in the streaming compilation case, there's only one way to validate/compile an init instruction, which doesn't care whether the segment is active or passive, so there would be no advantage in forward declaring it. If active/passive is being forward declared in its own section, it becomes more like a type, and it would make sense to make init on an active segment a validation error, since the information is there anyway. As @binji said, this seems more heavy-weight.
I think the general argument in favor of load-time validation versus a run-time trap is that it allows eliminating a run-time check. However, in this case, there is little impact on the run-time cost of the operation, and there is an impact on either the spec (for a declaration section) or implementations (for deferring validation) to validate at load-time. That seems like a reasonable trade-off to me. To summarize the options discussed:
1a is the current state of the proposal. #30 is 1b. I think I'd rank my preference as 3, then either 1b or 2a, then the rest.
This is a great rundown of the space of options! 3 seems to me to go too much against the conventions already established by other instructions. {get,set}_local, {get,set}_global, and call all have the same pattern of an immediate indexing into a field of the current instance/frame, and all of these are bounds-checked during validation. This is also one of the few bounds-checks that we can always guarantee ahead-of-time no matter how wacky our concurrency gets, so I think it's important to be consistent in preserving it. We have more freedom to decide whether active/passive is a validation or runtime error, because we've not yet fixed the extent to which these designations act like types or purely like a runtime syntactic sugar. Some of the decisions we've already made push "active" towards being a sugar; either 2a or 3 would be the apotheosis of this, in the sense that there would be no difference during validation between active/passive segments. I currently prefer 2a. I'd be interested to know if 1b offers practical advantages over 2a to implementations. I'm not so keen on the others.
Agreed but, even without the active/passive distinction, the number of data segments could still be considered the static type of the data section and so it'd be beneficial and symmetric to declare it up-front. @AndrewScheidecker Yes, that's a great classification of the options; so I think that means I prefer 2b (although taking into account @conrad-watt's pov, the "active/passive" distinction goes away at runtime; there is simply "has the segment already been dropped", with active segments having been desugared into an init followed by a drop).
To be clear, I was not thinking of option 3 as changing the segment indices from immediates to general operands, just deferring the validation until they are executed. That should still allow renumbering segments.
Ah, I agree with this. I didn't consider 2b fairly. So if we go with 2b, the validation of mem/table.init presumably fails if the corresponding data/elem segment bound has not been forward-declared, and then the actual data/elem segments must be validated as having a total number equal to this forward declaration. And, for backwards-compatibility, it's fine not to declare a bound so long as there are no associated init instructions. EDIT: misclick, sorry
@AndrewScheidecker d'oh, sorry for misunderstanding. You're right, that would allow merging, but I suppose it is still rather asymmetric with the rest of static index validation in wasm. @conrad-watt Oh right, I forgot about the backwards-compatibility exception! That is rather unfortunate since it means we'd have to bubble this binary-format detail (whether data segments were declared) up into the module AST as a flag to be checked during validation (unlike the function section). Ultimately, these are all rather minor tradeoffs which makes it hard to have a strong preference, but I guess the above makes me lean back toward the status quo of 2a.
@titzer, I find it hard to read it that way, but if that's the way it's intended then I don't have any particular problems with it.
Beyond just the size, if we ever did have passive element segments with a non-function type, that type should be properly declared before the code section - so my suggestion to use the flag field to signal the existence of the element type isn't sufficient. If we really don't believe that passive element segments will ever have a non-function type, then this proposal seems to collapse to 1b/2b.
@lars-t-hansen I wasn't sure how exactly we'd use a non-function element segment, but it wasn't too hard to imagine that, in some future version of the GC proposal, we'd be able to define new values statically and would want to initialize a table with those values. @titzer Right, I moved the section above to address the issue of not having information about the data segments when parsing the code section. But because this makes streaming compilation worse, you could split it into a separate section like @AndrewScheidecker was suggesting above. Though that means we'd have 3 new sections. @conrad-watt The element section is already before the code section, so we could use the flag field for the element type. And you're right, this ends up being similar to 1b/2b, though it's closer to 1b because you statically know that an index is passive. Given all this, I'm back to thinking that we should stick with |
Yes, I think passive element segments should have an element type and each entry should be a constant expression. With the reference type proposal, you'd then use a ref.null or ref.func expression to denote a null or function reference, and you could use other (constant) ref expressions we add in the future. You could also use an imported global, which may be handy. The only downside to this is that each entry will take up 2 additional bytes in the binary format.
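The two extra bytes per entry come from wrapping each bare function index in a constant expression (one opcode byte plus one end byte). A sketch using the opcode values 0xD2 for ref.func and 0x0B for end — an assumption here, since these encodings were not fixed at the time of this discussion:

```python
def leb128(n):
    """Unsigned LEB128 encoding, as used throughout the wasm binary
    format for indices."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)
        else:
            out.append(b)
            return bytes(out)

def encode_funcidx_entry(idx):
    # bare-index form: just the LEB128 of the function index
    return leb128(idx)

def encode_expr_entry(idx):
    # expression form: ref.func <idx> end
    # 0xD2 (ref.func) and 0x0B (end) are assumed opcode values
    return bytes([0xD2]) + leb128(idx) + bytes([0x0B])
```

For any index, the expression form is exactly two bytes longer than the bare-index form, matching the overhead noted above.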
@rossberg So if I understand you correctly, we'd parse an element segment in the following ways:
@binji, that seems rather elaborate. I was thinking of having just one version for each:
I see, but we don't currently have |
Could the byte representation of |
Ah, right, that makes sense. Though it seems like it may be somewhat clumsy to spec. So we'd have an additional constraint that an active segment that references a table (0 or otherwise) that is not an anyfunc table is invalid. So the format would be:
And for the bulk memory proposal, And as @rossberg said, we'd then allow the expr
There are several places now where we have observed a dependency from bulk table stuff on the reference types proposal. Fortunately, it seems like the latter is already further along than this one, so I have some hope that the dependency becomes a non-issue in the end.
I don't think that is quite correct. reftypes has ref.null but not ref.func, but generalized bulk table operations really want ref.func - unless we specialize for anyfunc, which I think we should, and I've noted this on the latest PR. It is true that the bulk table operations are more elegant if they are made to depend on the reftypes proposal and some subsequent proposal that provides ref.func, but now we're no longer in the situation where the bulk table proposal is trailing proposals it's depending on.
Good point about ref.func. But if we decided to keep funcref in the reftypes proposal then it would be straightforward to include ref.func with it in a forward-compatible manner: for now, it would simply return type funcref, later we can refine that to a subtype.
Indeed, and I'd not want to have funcref in reftypes without ref.func, even if I can emulate it with table.get if I am desperate. I might also not want to have funcref without ref.call, even if I can emulate it with a table and call_indirect, for that matter. (BTW I'm not agreeing to adding ref.func to reftypes - my position is still that funcref should be removed from reftypes. But that's a discussion for elsewhere.)
Our original motivation here is passive data segments in the presence of the somewhat peculiar Web Worker embedding of wasm threads, which have a single shared memory. Returning to @binji's comments above about the root comment of this issue: if we only have to declare the set of passive data segments in a new section above the code section, what's the downside? This seems pretty simple to specify (we're basically binary-encoding a single set of indices).
@lukewagner one reason against it was a general preference against adding new sections. I've also been advocating for the interpretation of active segments as a syntactic sugar of passive segments + init/drop. One consequence of this is that there should be no (stack) validation that relies on the active/passive distinction anyway, obviating the need for forward declaration of active/passive. My reasoning for this is that we've come to dislike active segments for various reasons, and this approach comes as close to removing them from the spec as we can go while remaining backwards compatible. We've consistently made design decisions in this direction whenever a space of options has come up (e.g. relaxing instantiation bounds checks).
I think those are good reasons, all else being equal. However, given the practical downsides and the fact that streaming-, parallel-friendly compilation is a very clear high-level design goal which we've made numerous design choices in favor of (including splitting a single logical section into two binary sections, for the same basic reason), it seems to me that all else is not equal and the practical reasons outweigh the theoretical benefits of removing the "is passive" spec-flag on segments. I'll also note that, as best as I can predict, even once passive segments are supported everywhere, active segments would likely be preferable to emit from optimizing toolchains when possible because:
Moreover, the only current hard dependency on passive segments derives from the highly-goofy worker=thread arrangement on the Web. However, the goal is to move past this by adding "pure wasm threads" which would necessarily share a single wasm instance between multiple threads, avoiding the need to instantiate a single module multiple times on the same memory, allowing active data segments. Passive segments of course unlock all sorts of new use cases; my only point here is that active segments likely aren't simply a wart that will go away once the more general feature is available.
The current approach (afaik) doesn't prevent streaming compilation/parallel validation or the use of active segments - it merely relaxes a potential validation error that is only relevant for active segments, making it a runtime trap instead. Adding an additional section declaring "is passive" ahead of time, instead of in the segment itself, would recover this (imo very marginal) validation error, but would come at the cost of the spec complexity/abstractions I motivated above. I agree that active segments will still be useful as an optimised representation. I'm mainly interested in keeping them out of the abstract semantics as a distinguished thing (as far as possible). I argued against an earlier suggestion to disallow them entirely for shared memories, for this reason.
I guess I should have said more precisely: without any up-front declaration, bulk memory ops introduce a dependency from function body validation on bits later in the binary, which is something that, up until now, we have carefully avoided. But I apparently keep forgetting option 2b (sorry!) and conflating "up front declaration" with "active vs. passive". So yes, all we really need here to preserve this validation property I mentioned is an up-front declaration of "how many data segments" which, for backcompat, is only required to be present when the bulk-memory operations are actually being used. So bulk indices are statically validated in-bounds, but dynamically fail for already-dropped segments (which active segments always are). And elem segments don't have this problem since they're all up-front anyway. That seems to check all the boxes, at the modest cost of introducing a (trivial) new section. Does that sound good to everyone?
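A sketch of how the checks split under this scheme (option 2b; function names are hypothetical): the segment index is validated statically against the declared count, while dropped-ness — which covers all active segments after instantiation — is only checked at execution:

```python
def validate_segment_index(index, declared_count):
    """Static check, during function-body validation. declared_count
    comes from the up-front declaration; None means it was absent,
    which is only allowed if no bulk memory ops appear."""
    if declared_count is None:
        raise ValueError("bulk memory op used without a data count declaration")
    if index >= declared_count:
        raise ValueError("segment index out of range")

def execute_mem_init(index, dropped_segments):
    """Dynamic check, at execution time. Active segments are in
    dropped_segments from instantiation onward, so init on them
    traps just like on an explicitly dropped passive segment."""
    if index in dropped_segments:
        raise RuntimeError("trap: data segment has been dropped")
    # ...perform the copy...
```

This preserves single-pass, streaming validation: the count arrives before the code section, and nothing in validation depends on the data section itself.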
I don't mind 2b, although we had this exchange earlier (#27 (comment)) where the fact that the count declaration must bubble up to the validation rules was taken as a negative. I agree that, in the case of parallel validation (for 2a), it seems annoying to work out the "max segment index used" across the parallel workers.
Right, and sorry for waffling here; it's always a hard question whether an impl wart should just be accepted or avoided in the spec, and @binji having the same reaction I did made me re-question and realize that we have actually intentionally maintained this invariant that we'd be breaking here. And actually, thinking more about the spec: this "if any bulk memory ops are used, the count declaration is required" rule could perhaps live purely in the binary format.
Right, this could be a binary format detail. I'm still mostly against adding new sections where not required, but this does seem like a place where it wouldn't be so bad to do so.
Agreed with the above. In fact, I think it may be worth making an explicit invariant that all information needed to validate function bodies in the future must be declared before the code section.
OK, so it sounds like we're moving toward 2b. Just so we don't have to keep scrolling up, that was defined as: an up-front declaration of the number of data segments (for backwards compatibility, required only when the bulk memory operations are actually used), with segment indices statically validated against that count, and uses of dropped segments (which active segments always are, after instantiation) failing at runtime.
Is everyone OK with this? If so, I'll start updating the document.
I assume the silence is tacit agreement :-) PTAL at the PR here: #42 |
Closing this issue since that PR landed. |
…nit` be passive

The validation rules do not require it:

* https://github.com/WebAssembly/bulk-memory-operations/blob/eca021a3a9aaa3cca336ecdd5379e570023bbe3a/document/core/valid/instructions.rst#memoryinitx
* https://github.com/WebAssembly/bulk-memory-operations/blob/eca021a3a9aaa3cca336ecdd5379e570023bbe3a/document/core/valid/instructions.rst#datadropx

And the spec tests check that you can do this (and potentially get runtime errors for already implicitly dropped active segments):

* https://github.com/WebAssembly/bulk-memory-operations/blob/07c57f16ff67b29e30719e1a692aad486f582021/test/core/bulk.wast#L148-L168

See also these discussions:

* WebAssembly/bulk-memory-operations#107
* WebAssembly/bulk-memory-operations#27 (comment)
Per my understanding, the new insns mem.init and mem.drop have an immediate (literal) operand, which is an index indicating the data segment from which the initialising data is to be copied. I ran into a problem implementing validation for them, in that it's not possible to check that the literal value is in range at the time when validation passes over the code, because at that point the number of data segments isn't known -- the data segments appear in the binary format only after the code.
But I might misunderstand. Can someone please clarify?
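For illustration, a toy streaming validator (a sketch, not any real implementation) shows the ordering problem described above: the code section arrives before the data section, so segment-index checks can only run once the stream ends:

```python
# In the wasm binary format the code section (id 10) precedes the
# data section (id 11), so while processing function bodies a
# streaming validator does not yet know the data segment count.
CODE_SECTION_ID = 10
DATA_SECTION_ID = 11

def stream_validate(sections):
    """sections: iterable of (section_id, payload) in stream order.
    For the code section, the payload is the list of segment indices
    used by mem.init / mem.drop (heavily simplified)."""
    num_data_segments = 0
    pending = []
    for sec_id, payload in sections:
        if sec_id == CODE_SECTION_ID:
            # Can't bounds-check yet: the data section hasn't arrived.
            pending.extend(payload)
        elif sec_id == DATA_SECTION_ID:
            num_data_segments = len(payload)
    # Only now can the deferred check run.
    return all(i < num_data_segments for i in pending)
```

Every resolution discussed in this thread amounts to either accepting this deferred check, moving the needed count before the code section, or demoting the check to a runtime trap.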