Unwinding as a modifier to branching #124
Comments
And just in case I wasn't clear in #123, in my proposed version of |
My primary intent for this post was to help develop conceptual understanding of unwinding, which I thought might be helpful for ongoing discussions. The instructions I use here are primarily pedagogical devices used to illustrate how the interwoven concepts in exception handling can be broken down. Sorry, I should have made my intentions clearer. |
I see, but that leaves my other questions. Unless they are answered, it's hard to continue discussions, or understand what the takeaway or suggestion from this post is. What I understand is |
This post seems to be mostly about factoring our understanding of programmer (or language designer) intent with regard to various exception/unwinding mechanisms into separate notions of control flow transfer and resource cleanup. @RossTate, since this post does not have a concrete call to action, I am unsure of what your goal in posting it is. Are there specific other discussions you think these ideas should influence? |
It seemed pertinent to the discussion in #123, giving a motivation for why to change the design for unwinding. (@aheejin asked me to not give this motivation in that issue.) It also seemed pertinent to the discussion in WebAssembly/design#1356, illustrating how stack inspection can be used to implement two-phase exception handling provided a suitable unwinding design is in place. |
It sounds like this was meant as background reading for discussions in several other issues. In future, @RossTate, I'd recommend not opening new issues in such cases, and instead creating a gist or similar external resource and linking it from the relevant places. Particularly on proposal repos, an issue generally implies that there's something about the relevant repo which the issue creator thinks ought to be changed. |
I don't remember asking you something that specific. I might have said in our Zoom meeting that I wished to keep the issue post on topic? It's been a while, so that's my best guess, but I don't think I asked you not to post about something specific. I agree that sometimes it is better to post something separately if it gets too long. In that case, I think it will help people figure out the context if you say that the post is a spin-off from comments on another issue and provides additional info for that part of the discussion. A gist, as @ajklein suggested, is a good tool too. |
Ah, okay, thanks for the suggestions. I'll incorporate them in the future to avoid causing so much confusion! |
Here's a bit more information from the point of view of a Common Lisp programmer - posting it here in order not to clutter #87.
This is also the case in Common Lisp. Copying the basic information from the former thread: Common Lisp has three operators for performing non-local exits, which may cause stack unwinding:
This means that we have two classes of situations with regard to knowing the destination:
That's all. All unwinds in Common Lisp happen because of these three primitive operators. However, one other important thing in Common Lisp is that non-local

(defun foo (function)
  (funcall function))

(defun bar ()
  (block nil
    (let ((function (lambda () (return-from nil 42))))
      (foo function)
      24)))

(defun baz ()
  (let ((result 42))
    (tagbody
       (let ((function (lambda () (go :end))))
         (foo function))
       (setf result 24)
     :end)
    result))

Slightly simplifying, we have defined three functions:
In both situations, we can infer that, if there is no non-local exit, each function is supposed to return However, by calling In both cases, we can see that the anonymous functions that perform the non-local exits are created in the lexical scope of As mentioned in #87, The aforementioned non-local goto behavior chains with What would be required to make such unwind-preserving non-local goto possible in WASM? (Perhaps it is already possible; as a newcomer, I apologize if I am trying to lockpick an already open door.) |
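As a rough illustration of the behavior asked about above, here is a minimal Python sketch that emulates a lexically scoped `block` / `return-from` pair with an exception carrying a unique tag. It is only a model, not how any Common Lisp implementation or WebAssembly engine works; the names `block`, `return_from`, and `_ReturnFrom` are hypothetical helpers introduced for this sketch. The `finally` clause stands in for application-level cleanup that must still run while the non-local exit passes through.

```python
class _ReturnFrom(Exception):
    """Hypothetical carrier for a non-local exit: which block, and what value."""
    def __init__(self, tag, value):
        super().__init__("non-local exit")
        self.tag = tag
        self.value = value


def block(body):
    """Run body(return_from); calling return_from(value) exits this block."""
    tag = object()  # unique identity for this particular activation of the block

    def return_from(value):
        raise _ReturnFrom(tag, value)

    try:
        return body(return_from)
    except _ReturnFrom as exit_:
        if exit_.tag is tag:
            return exit_.value  # the exit was aimed at this block
        raise  # aimed at some outer block: keep unwinding


def foo(function):
    # Knows nothing about the block; it just calls what it is given.
    return function()


log = []


def bar():
    def body(return_from):
        try:
            foo(lambda: return_from(42))  # the closure captures the exit lexically
        finally:
            log.append("cleanup")  # runs even though control leaves non-locally
        return 24  # skipped by the non-local exit

    return block(body)


result = bar()
```

In this model the exit travels through `foo` without `foo` cooperating, and the cleanup still runs on the way out. Calling a captured `return_from` after its block has already exited raises an exception that no block catches, which loosely mirrors the "exit point no longer exists" case discussed in the following comments.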
If I understand correctly, The general scheme could use an event like this: |
I wonder if it is possible to bend this construct to make it possible to signal errors when the tagbody in question is not found. If we consider the following code:

(let ((function (block nil (lambda () (return-from nil 42)))))
  (funcall function))

Then the extent of If I understand the above correctly, then this means that we should first search for the matching event on the stack. If it is found, we can perform a jump; if it is not found, then we should signal an error via normal CL means. |
Thanks for the great question, @phoe. I'll note that there seems to be a little bit of flexibility in how to solve the problem due to the fact that the language spec gives some flexibility when Let's try to go for the When control enters a When control enters a This implementation works in a Common Lisp world. But in a wasm world there are two things that can go wrong:
As for the longer term, we could more directly support these lexically-scoped constructs (rather than emulate them through dynamically-scoped constructs) with invalidatable heap references to wasm labels. The reference would be invalidated whenever the referenced label is popped off its stack, and when you try to dereference the (valid) reference, the engine would do a (quick) check that the current control point is on the same stack (or continuation) as the referenced label. Then you could do an |
Thanks for the elaborate explanation.
Multiple values can be returned from a block, as CL supports multiple values. Is that possible?
This flexibility is because of the CL definition of undefined behavior. Generally, in case of UB, the implementation is allowed to do whatever it wants, including crashing. But, if the implementation decides to handle this aforementioned situation gracefully in some way (which is done by all CL implementations I know), then it must do it by signaling a
What do you mean by it being untyped? A tagbody has two traits: that its tags can be either integers or symbols (which can be reduced to more integers during compilation stage), and that it always returns |
Yes, with the recently standardized multivalue proposal, WebAssembly can generally send multiple values wherever it can send one value. Note that all the control flow is statically typed, though, so a single WebAssembly block or event cannot carry one value on some code paths and multiple values on other code paths. I don't know if that would be a problem for CL, but if it is, it could be solved by boxing the potentially-multiple values. |
In Common Lisp, if a given block of code returns multiple values (e.g. by calling the For a short example, Is such a kind of primary value extraction and |
Not directly, but it shouldn't be too hard to insert some WebAssembly to bridge the difference between the externally expected types and the internally provided types given that both are statically known. (Or are they not both statically known?) |
Common Lisp is a strongly dynamically typed language and the number and types of the values may be known at compilation time, but need not be known in the general case. If we have a function named
A sufficiently smart compiler™ might be able to infer some or all of this information statically at compilation time, but e.g. in case of |
Yep. And there are a lot of implementation strategies for this. WebAssembly's goal is to support at least one of these strategies reasonably efficiently. It's up to the Common-Lisp-to-WebAssembly compiler to figure out which strategy works well. And if there's an extension to WebAssembly that can be made (and is more broadly useful), then one can develop a proposal to add such functionality to WebAssembly. My suspicion is that there's already a reasonable way to support this particular dynamism of Common Lisp in WebAssembly, but I'd be interested to learn if I'm mistaken or if there's a substantially better technique that we should be made aware of. |
I think there's a way to view unwinding as a modifier to branching. But before I get into that, I first want to make sure I'm on the same page as everyone with respect to branching.
Branching
Contrary to what their names suggest, branching instructions like `br` do not correspond at the assembly level to jumping instructions. Rather, that's only half of the story. The other half has to do with the stack.
Suppose an engine has some type that is managed using reference counting; call it `refcounted`. Then consider the branch in the following code that calls some function `$foo : [] -> [refcounted]`:
This branch does not compile to simply a jump to the assembly-code location corresponding to `$target`. It also cleans up the portion of the stack that is not relevant to `$target`. In this case, that involves decrementing the reference count of the `refcounted` value returned by `$foo` that is sitting on the stack. In a single-threaded engine, that might in turn result in freeing the corresponding memory space and consequently require recursively decrementing the counts of any values contained therein. Depending on how large this structure is and how much of it is no longer necessary, this might take a while. So a simple instruction like `br` can actually execute a fair amount behind the scenes depending on how the engine manages resources and the stack.
All this is to illustrate that a label like `$target` more accurately corresponds to an assembly-code location and a location within the stack, and likewise an instruction like `br $target` more accurately corresponds to "clean up the stack up to the stack location corresponding to `$target` and then jump to the assembly-code location corresponding to `$target`".
Note, though, that "clean up" here only corresponds to the engine's resources on the stack. But what about the application's resources? That's where unwinding comes into play.
Unwinding
For the sake of this discussion, I am going to say that unwinders are specified using `try instr1* unwind instr2* end`, which executes `instr1*` but indicates that `instr2* : [] -> []` should be used to "unwind the stack", i.e. to perform application-level clean up. In a second, I'll get to how one causes the stack to be "unwound" rather than just "cleaned up".
Now consider some surface-level code using the common `finally` construct:
Normally a `break` in a `while` loop would translate to a `br` in WebAssembly, but the `finally` clause in this snippet requires that its body be executed no matter how control leaves the `try` body. We could consider inlining the body of the `finally` at the `break`, but that results in code duplication, plus it would result in incorrectly catching `SomeException` if one gets thrown by `close(file)` (nor does it work as well in other examples where there are other `try`/`catch` clauses surrounding the `break`).
Really what we want to do is to extend the semantics of "clean up the stack" that is already part of branching to incorporate "and execute unwinders". That is, we want to modify the branch instruction so that it also unwinds the stack.
One way we could enable this is to introduce an `unwinding` instruction that must precede a branching instruction, and its semantics is to modify that branching instruction to execute the unwinders in `unwind` clauses as it cleans up the stack. With this, the `break` instruction in the example above would translate to the instruction sequence `unwinding (br $loop_exit)`.
Exception Handling
So far I haven't talked about exception handling, just unwinding. This illustrates that unwinding the stack, like cleaning up the stack, is its own concept. And although unwinding is an integral part of exception handling, bundling it with exception handling as the current proposal does is a misunderstanding of the concept.
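The idea that `unwinding` merely modifies a branch, with no exceptions involved, can be sketched in a few lines of Python. This is a toy model under assumptions of my own: the stack is a list, `Unwinder` is a hypothetical stack entry standing in for an `unwind` clause, and `branch` takes a target stack height instead of a real label.

```python
class Unwinder:
    """Stack entry standing in for the unwind clause of a try ... unwind block."""
    def __init__(self, action):
        self.action = action


def branch(stack, target_height, unwinding=False):
    """Pop the stack down to target_height; run unwinders only if asked to."""
    while len(stack) > target_height:
        entry = stack.pop()
        if unwinding and isinstance(entry, Unwinder):
            entry.action()  # application-level cleanup


# An `unwinding (br $loop_exit)`-style branch: the unwinder runs.
log = []
stack = ["loop frame"]
loop_exit_height = len(stack)
stack.append(Unwinder(lambda: log.append("close(file)")))
stack.append("temporary")
branch(stack, loop_exit_height, unwinding=True)

# A plain `br`-style branch over the same shape of stack: the unwinder is skipped.
plain_log = []
plain_stack = ["loop frame", Unwinder(lambda: plain_log.append("close(file)"))]
branch(plain_stack, 1)
```

Both calls leave the stack at the same height and would transfer control to the same place; the only difference is whether the application-level cleanup runs on the way.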
But if unwinding is a separable component of exception handling, what is the other component? The answer to that depends on whether you're talking about single-phase exception handling or two-phase exception handling.
Single-Phase
Again, for the sake of this discussion, I am going to say that single-phase exception handling is done using `try instr* catch $event $label`, which indicates that any `$event` exceptions thrown from `instr*` should be caught and handled by `$label` (where the types of `$event` and `$label` match).
Now, consider the following WebAssembly program:
We can reduce this program to the following:
When we can see the contents of the stack, we can replace a `throw $event` with an `unwinding (br $label)` to whatever label the event is currently bound to in the stack. That is, events are dynamically scoped variables that get bound to labels, and `throw` means "branch-and-unwind to whatever label the event is bound to in the current stack". (Of course, an important optimization is to unwind the stack as you search for these dynamically-scoped bindings.)
This suggests that we can break `throw` up into two parts: `unwinding` and `br_stack $event`. The latter is an instruction that just transfers control to, and does the necessary cleanup up to, some label determined by the current stack. This instruction on its own could even have utility, say for more severe exceptions that want to bypass unwinders or guarantee transfer.
Two-Phase
In two-phase exception handling, you use some form of stack inspection to determine the target label before you execute the `unwinding` branch to that label.
For the sake of this discussion, I'll say that an inspection is begun by using the instruction `call_stack $call_tag`, which looks up the stack for contexts of the form `answer $call_tag instr1* within instr2* end` (where execution is currently within `instr2*`), in which case the instructions `instr1*` are executed as the body of a dynamically-scoped function (see WebAssembly/design#1356 for more info).
As an example of unwinding in two-phase exception handling, consider the following C# code:
This would be compiled to the following WebAssembly code (assuming C# `throw` compiles to `call_stack $csharp_throw`):
Notice that the `answer csharp_throw` has a bunch of `unwinding` branches. Which of these gets executed depends on the state of the `flag` variable at the time the exception reaches the `try` in the C# source code. (Note that there are neither `try` blocks nor events in the compiled WebAssembly.) Depending on that `flag`, we'll either have an `unwinding` branch to `$first` or an `unwinding` branch to `$second`. In either case, the semantics is "clean up and unwind the stack up to the stack location corresponding to the chosen label and then jump to the assembly-code location corresponding to the chosen label". The difference between here and the original examples using `finally` is that the portion of the stack that needs to be cleaned up and unwound is not known statically. That is important for implementation (e.g. because it requires stack walking), but semantically speaking it is straightforward and aligns well with the existing abstractions in WebAssembly.
Summary
Regardless of whether we want to actually make an `unwinding` instruction, the important thing to note here is that unwinding is always done with a destination. How that destination is determined varies, but in most of the examples above the destination is known before unwinding begins.
The current proposal is about single-phase exception handling. But as I've tried to illustrate here, single-phase exception handling is really two concepts combined: dynamically-scoped branching and unwinding. So for the proposal to be extensible, it is important that its design for unwinding is compatible with other notions of destination, even if this proposal on its own solely enables dynamically-scoped destination labels (i.e. events). Ideas like those in #123 would help achieve this goal.
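The decomposition of single-phase `throw` into "find the destination through a dynamically scoped binding" plus "branch there while unwinding" can be modeled as follows. This Python sketch is illustrative only: `Catch`, `Unwinder`, `lookup_binding`, `unwinding_branch`, and `throw` are hypothetical names, the stack is a list, and the destination is resolved before any unwinder runs (a real engine could interleave the search with unwinding, as noted above).

```python
class Unwinder:
    """Stack entry standing in for an unwind clause."""
    def __init__(self, action):
        self.action = action


class Catch:
    """Stack entry binding an event to a label (models try ... catch $event $label)."""
    def __init__(self, event, label):
        self.event = event
        self.label = label


def lookup_binding(stack, event):
    """Dynamic scoping: find the innermost binding of `event` on the stack."""
    for height in range(len(stack) - 1, -1, -1):
        entry = stack[height]
        if isinstance(entry, Catch) and entry.event == event:
            return height, entry.label
    raise LookupError(f"no label bound to {event}")


def unwinding_branch(stack, height):
    """Pop the stack down to `height`, running unwinders along the way."""
    while len(stack) > height:
        entry = stack.pop()
        if isinstance(entry, Unwinder):
            entry.action()


def throw(stack, event):
    """Single-phase throw: resolve the destination, then branch-and-unwind to it."""
    height, label = lookup_binding(stack, event)
    unwinding_branch(stack, height)
    return label  # stands in for transferring control to the label


log = []
stack = [
    Catch("$other", "$outer_handler"),
    Catch("$event", "$handler"),
    Unwinder(lambda: log.append("unwind A")),
    Catch("$other", "$inner_handler"),
    Unwinder(lambda: log.append("unwind B")),
]
label = throw(stack, "$event")
```

In this model, swapping `lookup_binding` for some other way of choosing `height` (for example, a stack inspection that consults handlers first) would leave `unwinding_branch` unchanged, which is the extensibility property the summary argues for.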