Inefficiency Due to Eager Stack Tracing #102
For some background, there seem to be two predominant alternatives to eager stack tracing.
Speaking just about the JS API: my working assumption has been that an uncaught wasm exception would not turn into a NativeError and would thus not implicitly have the Is there anywhere else where eager stack tracing is implied by the current design? |
Hmm, maybe there's a miscommunication somewhere then. Over e-mail, @aheejin has been helping fill me on the details of this proposal. I had noticed that
I interpreted this to mean that the expectation is that
Unsure. But the current design also makes as-you-go stack tracing (i.e. option (2) above) impossible. So people compiling to WebAssembly will have to decide between stack traces and efficient exceptions, which is not exactly a great choice to be forced to make. |
I don't think the spec forces eager stack tracing, and the embedder may or may not embed a stack trace based on its preference. But I acknowledge that if the embedder chooses to embed a stack trace, it should do that eagerly, because the spec says
If it were not for this, stack traces could be appended as stack unwinding proceeds. I agree it would be better to allow stack traces to be generated incrementally if possible, but I'm not sure we should loosen the immutability requirement. I should note that the proposal was mainly designed to ensure zero cost for the code path in which exceptions don't occur, and optimizing the exception handling capability itself was not its main goal. It would of course be better if we could optimize that part too. |
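To make the trade-off concrete, here is a minimal Python sketch of a trace that is appended as unwinding proceeds. It is a toy model, not the proposal's semantics or any engine's implementation; `ToyException`, `unwind`, and the frame names are all hypothetical. It assumes the exception carries a mutable trace, which is exactly what the immutability requirement rules out.

```python
from dataclasses import dataclass, field


@dataclass
class ToyException:
    tag: str
    trace: list = field(default_factory=list)  # grows during unwinding


def unwind(stack, exc, handlers):
    """Walk frames from the top of `stack` until one handles exc.tag.

    Each visited frame is appended to the trace, so the work done is
    proportional to the frames actually unwound, not to the full depth.
    Returns the handling frame's name, or None if nothing catches it.
    """
    while stack:
        frame = stack[-1]
        exc.trace.append(frame)
        if exc.tag in handlers.get(frame, ()):
            return frame  # the handling frame stays on the stack
        stack.pop()
    return None


stack = ["main", "parse", "read_token", "fail"]
exc = ToyException("io_error")
handler = unwind(stack, exc, {"parse": {"io_error"}})
```

In this model the trace ends up holding only the three frames between the throw site and the handler; `main` is never visited.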
This is not true. Because the That is, the problem stems from the proposal having too much functionality to be efficiently implementable (assuming the engine/language wants stack traces). And that means the problem cannot be solved by adding more functionality. Plus, we know from .NET that the functionality this proposal provides really is more than is necessary to support its intended language features, like C++ destructors and exception rethrowing. |
Mixup of stack traces is possible in this case as you said, but this is not a recommended use case and stack trace itself is not a mandatory part of the spec anyway. Stack traces are after all some auxiliary info mainly useful for debugging that can be embedded into exnref if the embedder chooses to do that, and I don't think the spec has to guard against some bizarre use case like this. If a toolchain (= wasm generator/compiler) wants to support reliable debugging experience, it simply should not generate such code.
Not sure what you mean here. Do you mean there is a functionality we can get rid of while we have stack traces?
Again, I'm not sure what you'd like to imply here. What functionality do you want to get rid of from the current spec? What do you suggest as an alternative? Also I don't know much about .NET's primitive instructions on exception handling, so I'm not sure if what they're doing can be applied to us. I'd appreciate if you give some introduction on what their primitives are like and how their environment compares with us. |
Sure. .NET has a As for throwing exceptions,
This gets to another problem. The current proposal does not seem to support good debugging. Many debuggers want the stack to remain intact when an uncaught exception is thrown. That means the process of determining whether the exception is or is not caught should preserve the stack. The reason why .NET has P.S. In case it helps, this is the reference I am using: http://www.ecma-international.org/publications/standards/Ecma-335.htm |
I think here you are again suggesting that we go back to the first proposal? I'm not sure we should repeat all the discussions that happened in #101 and over our emails (which I'd prefer we have here in the repo next time, if we really need to). And your reason for that is a possible mixup of stack traces, which I still think is a very contrived use case. As I said, if a toolchain wants to provide reliable stack traces, it should not generate such code.
I'm a little confused by what you'd like to suggest in this paragraph (and the next). Are you suggesting that we should have two-phase stack unwinding, in which the first phase only determines whether the current exception will be caught and the second phase actually unwinds the stack? I think this is a good thing to have in general, and C++ EH on x86/ARM has it, but it can have it because the stack is actually unwound within the libunwind library, which communicates with the personality function in libc++abi. Can we make our language-neutral proposal have that kind of functionality, which would probably involve a callback-like function that the VM calls to check every stack frame? Can a wasm VM call a callback?
I'm not sure how the separate stack unwinding you mentioned works, but it will certainly require very special VM support. (Can it be done with coroutines?) Are you suggesting that we require all VMs that support the EH proposal to have that kind of capability? If so, I think we'd need a better reason than "There might be a toolchain that tries to mix up stack traces". |
No. The first and second proposals are not the only two options. There are many options. I am pointing out that .NET, a system that supports C++ and many other languages, provides another option, and I am illustrating the rationale behind its design, something that does not appear to have happened in any prior discussion.
I am pointing out that this is important for good debugging support, and I am illustrating how the current proposal does not support this. I am also not making it necessary to support; I am trying to be compatible with multiple implementation strategies.
I mentioned .NET's
No. I am pointing out the implementation complexity such a design seems to require so that we can decide if it is appropriate. That said, I also believe there is a clean way to make "filterable" exceptions an additional feature, rather than part of core exception handling, so that not every engine has to support it. In fact, it might even be that there's a way to make this support both double-walking and single-walking strategies, with the former providing better debugging support, so that there's no implementation constraint at all. |
I briefly checked how .NET supports general C++ About your suggestions on possible two-phase stack unwinding or exception filtering, please note that we have two levels of testing or filtering. The first level, what |
Nice find! And thanks for the link to EHScheme.md; so far it's been easy to follow (though I can't say I've managed to read it all yet).
If we provide the restricted
Right. Any approach that effectively bundles "filtering" and "unwinding" together makes two-phase unwinding impossible. If one separates these two, then the big question tends to be how to do "filtering". If the filtering process is "nearly pure" (put aside details for now), then the semantics supports both two-phase and single-phase unwinding. If the filtering process is built-in, then it's faster to implement in general and easier to implement while still supporting two-phase (and generally trivial to guarantee is nearly pure). The challenge is that usually the built-in process is specialized for a given language, and WebAssembly needs a language-agnostic filter process, which might force our hand into customizable filters. |
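The point about separating filtering from unwinding can be illustrated with a small Python sketch. Everything in it (`Frame`, `find_handler`, `two_phase`, `single_phase`) is a hypothetical toy, not .NET's mechanism or anything in the proposal. It assumes filters are pure predicates over the exception; under that assumption both strategies pick the same handler, and they differ only in what is left of the stack when nothing catches.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Frame:
    name: str
    # Hypothetical filter: may inspect the exception but must not mutate state.
    filter: Optional[Callable[[str], bool]] = None


def find_handler(stack, exc):
    """Phase 1: walk from the top without popping; return an index or None."""
    for i in range(len(stack) - 1, -1, -1):
        f = stack[i].filter
        if f is not None and f(exc):
            return i
    return None


def two_phase(stack, exc):
    """Search first, unwind second: an uncaught exception leaves all frames."""
    i = find_handler(stack, exc)
    if i is None:
        return None
    del stack[i + 1:]  # phase 2: drop the frames above the handler
    return stack[i].name


def single_phase(stack, exc):
    """Filter and unwind together: frames are gone before we know the outcome."""
    while stack:
        f = stack[-1].filter
        if f is not None and f(exc):
            return stack[-1].name
        stack.pop()
    return None


def make_stack():
    return [Frame("main"), Frame("run", lambda e: e == "timeout"),
            Frame("step"), Frame("leaf")]
```

For a caught exception both functions return the same handler and leave the same stack. For an uncaught one, `two_phase` leaves all four frames in place for a debugger to inspect, while `single_phase` has already discarded them.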
I haven't managed to read all of your last comment yet, and I have to go now for today (I'm in a different timezone now), but I forgot to mention that the EHScheme doc has some out-of-date parts. It was first written in 2017, when we tried to use Itanium IR, so I think this part does not hold anymore, and I didn't mean to ask you to read all of that anyway. Mostly, what I wanted to refer to in that doc is why we need to insert calls to the personality function in the user code, which is because we don't have the two-phase stack unwinding. |
Ah, thanks for the clarification. I do worry that such an involved scheme won't scale well to other languages or to stacks crossing between multiple modules. |
I don't understand what you mean by "the restricted
Yes, the very nature of stack unwinding done by VM in wasm makes two-phase unwinding very hard. The idea of a VM calling into a filter function, which depends on the data structures within the application, is hard. Making it language-agnostic is harder, if not infeasible.
This infeasibility of the VM calling into filter functions is why the scheme was designed that way. I don't necessarily agree that the scheme is 'complex'; the doc is long, but the part I'm referring to can be summarized in a sentence: "Because the unwinder cannot call a filter function, compiler inserts filter function calls to landingpads." This is at least language agnostic. I'm not sure what you suggest as a better language-agnostic alternative. Also please note that the EH scheme doc I linked above is not a part of the spec; it is an illustrative example of how a C++ toolchain can work with the spec. Other C++ toolchains and toolchains for other languages can choose other schemes as long as they provide semantically correct functionality. Overall, I have little idea what you are suggesting as a whole. I'm not even sure all the things you mentioned pertain to preventing eager stack tracing, let alone whether that is a serious enough problem to warrant a whole redesign. |
Here is one small concrete suggestion to support at least first-level filtering: add an optional tag list to a catch clause. The clause would only be invoked for a thrown exception if its tag is in the list. That would make the difference between catch-all clauses and more specific handlers explicit. While that can't express filtering on more than the outermost tag (such as needed for C++ exceptions, or ML-style exception pattern matches), it still allows more efficient handler dispatch in many cases, depending on how user code makes use of the tags. C++ likely would benefit least, because of its ubiquitous destructors mapping to catch-all handlers dominating most unwinding, but other languages might be much better off. |
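As a rough illustration of the tag-list suggestion, the following Python sketch models catch clauses that carry an optional tag list. It is not proposed syntax or engine behaviour; `CatchClause`, `dispatch`, and the tag names are invented for the example. A clause with no list stands for a catch-all.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CatchClause:
    label: str
    tags: Optional[FrozenSet[str]] = None  # None models a catch-all clause

    def accepts(self, tag):
        return self.tags is None or tag in self.tags


def dispatch(clauses_innermost_first, tag):
    """Return (label of the chosen clause, clauses skipped without entering)."""
    skipped = 0
    for clause in clauses_innermost_first:
        if clause.accepts(tag):
            return clause.label, skipped
        skipped += 1
    return None, skipped


# Handlers with explicit tag lists can be skipped without running any code.
ml_style = [
    CatchClause("not_found_handler", frozenset({"not_found"})),
    CatchClause("div_handler", frozenset({"div_by_zero"})),
    CatchClause("cleanup"),
]

# A destructor-style catch-all at the innermost position is always entered.
cpp_style = [CatchClause("destructor_cleanup")] + ml_style
```

With `ml_style`, a `div_by_zero` exception skips the first clause and lands directly in `div_handler`. With `cpp_style`, the innermost catch-all is entered first regardless of the tag, which is the case where the tag list buys nothing.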
This won't help. catch-all clauses are more frequent than filtered-catch clauses in the current design because they are used to implement destructors.
I am suggesting that this is a bad tradeoff [revised below]: Pros for this design over something closer to .NET:
Cons for this design over something closer to .NET:
And those are only the cons it sounds like we all agree on. And the only pro can likely be addressed by adding more features later, making it only a short-term advantage. And it's not even clear to me that that's properly handled by the current design, since its implementation assumes all C++ programs on the stack are using the same |
In C++, yes, as mentioned in the 2nd paragraph. However, other langs would benefit.
This isn't specific to C++. Other language implementations also want to preserve stack traces when rethrowing an exception value previously caught, even if they don't use explicit syntax for it. JS and OCaml are examples I know off the top of my head. |
Oops, yeah, I got mixed up with the other complication that C++'s
Other languages still have Revised trade-off summary: Pros for this design over something closer to .NET:
Cons for this design over something closer to .NET:
I don't think we have any clear idea on what you mean by "something closer to .NET" in the first place. As I said, .NET has a kind of weird model, with two kinds of rethrows, for a reason. |
Here are the three most important steps:
For custom filters, I realized how .NET does this, and it's much simpler than I had thought. (One of those "duh" moments.) You just run the filter on the tail of the stack being walked while giving it the pointer to its relevant stack frame, similar to how generators are implemented. |
What's the benefit of this? As long as
I don't understand what this part means. What is 'the tail of the stack'? Are you talking about the first-level check (C++ or not) or the second-level check (Is it At this point I'm not even sure the series of things you are suggesting is relevant to preventing eager stack tracing anymore, which I'm still not convinced as a critical problem that warrants a whole redesign and reimplementation in the first place, also which does not improve anything (and rather make things hacky) for C++, currently the target language we have most client requests for. |
The stack trace is no longer implicitly associated with the
Then add a
No. If you're still in the
So C++ stands to benefit as well from more direct debugging support, from faster exceptions due to less frequent stack tracing, and from smaller binaries. |
Can you share this as a gist? |
Sure. Here's the diff: https://linediff.com/?id=5e741e53687f4bf8498b4567 Changes:
Oh, forgot to talk about C++'s rethrow more. Suppose code in module A catches some C++ exception. While handling that exception, it calls some function imported from module B. That imported function then executes the compilation of With the current compilation strategy, I suspect the answer will be the wrong exception. Module A and Module B, both being compiled from C++, will each be maintaining their own stacks and, in particular, their own stacks of exceptions currently being handled. The compilation of Because of this, I suspect the ideal long-term solution to We then compile As for compiling a C++ catch, we translate it to the following pattern:
This pattern will work even across C++ module boundaries. It also makes it unnecessary to maintain exception stacks. If you want to maintain stack traces, then we just tweak the strategy above. We add This pattern lets us stack trace lazily in the common case, even for C++, only using eager stack tracing when a compilation of |
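The cross-module rethrow concern raised in this comment can be modelled with a few lines of Python. This is only a toy rendering of the scenario as described; `ToyModule` and its methods are hypothetical, and whether real toolchain output keeps this state per module is an assumption of the sketch, not something established in the thread.

```python
class ToyModule:
    """Models one C++-compiled module with its own 'currently handled' stack."""

    def __init__(self, name):
        self.name = name
        self.handled = []  # assumed private to this module in the model

    def begin_catch(self, exc):
        self.handled.append(exc)

    def end_catch(self):
        self.handled.pop()

    def rethrow_target(self):
        """What a rethrow compiled into this module would find to rethrow."""
        return self.handled[-1] if self.handled else None


a, b = ToyModule("A"), ToyModule("B")
a.begin_catch("exception caught in A")
seen_by_b = b.rethrow_target()  # B is called while A is still handling
seen_by_a = a.rethrow_target()
a.end_catch()
```

Under the per-module assumption, module B finds nothing to rethrow even though an exception is being handled further down the same stack, while module A sees the expected exception.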
Isn't
OK now it is not even about eager tracing anymore.
That's... such a bold claim without much justification.
This condition is equivalent to when there's not a single function call within a whole catch body, which I think is basically none. (There are at least some library function calls.)
We discussed the infeasibility of making the VM call some random application code. You just assume it's solved automagically. You also removed not only a personality function call but also all the necessary code after that. What the personality function returns is the exception pointer and a selector. Given the selector, we need to run a series of instructions to find the right code (= C++ catch clause) to run, which compares the selector to other values. After we arrive at the right code to run, we call library functions like
You cannot replace a call to
I think I addressed why introducing
As I said, |
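The landing-pad work described in this comment, where the personality function returns an exception pointer and a selector and the generated code then compares the selector to find the right catch clause, might look roughly like the Python sketch below. The names `toy_personality` and `landing_pad`, the selector numbering, and the exact-match type test are all invented for illustration; a real C++ personality routine uses type info and handles base classes.

```python
def toy_personality(exc_ptr, thrown_type, catch_types):
    """Hypothetical stand-in: return (exception pointer, selector).

    The selector is the 1-based index of the first matching catch type,
    or 0 when nothing in this landing pad matches.
    """
    for index, caught in enumerate(catch_types, start=1):
        if thrown_type == caught:
            return exc_ptr, index
    return exc_ptr, 0


def landing_pad(exc_ptr, thrown_type):
    """Compiler-inserted code: ask the personality, then branch on the selector."""
    catch_types = ["int", "std::runtime_error"]
    ptr, selector = toy_personality(exc_ptr, thrown_type, catch_types)
    if selector == 1:
        return f"catch (int) with {ptr:#x}"
    if selector == 2:
        return f"catch (std::runtime_error&) with {ptr:#x}"
    return "no match: run cleanups and rethrow"
```

The sketch only shows the selector comparison chain; the library calls that follow in real generated code are left out.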
And while these discussion and brainstorming can continue for a long time, I'd like to paste what I wrote to you in one of the replies to your email:
In most cases, the
No it isn't. There's a
No, I mentioned that it's solvable using a standard technique, say by adapting the technique on slide 8 of this lecture from our undergrad compilers class.
Right. That's why I changed the payload to be an rtti and a value (i.e. the exception pointer). The first thing my code does is check the rtti of the exception with the rtti of
Right. These are used to keep track of the current "catch" stack so that we know which exception to rethrow when a nested
40% of the code you are generating is just dealing with the fact that WebAssembly has poor control-flow/stack-walking primitives. Isn't that reason enough to improve WebAssembly? Not to mention that your code assumes that every C++ wasm module on the stack shares the same "catch" stack.
I've already read that code. I know what it does. It isn't necessary for this example, or more generally for whenever you can guarantee So, we know my code is different, but the real question here is, is there a bug in my code? (Supposing we put in the correct magic constant for the rtti for |
I’d like to repeat the objection that it’s not fair to compare the current output of the compiler (which is currently in the state of “let’s get it working first, for the general case, using something close to the existing C++ ABI”) against hand-written assembly, written without any regard for the existing ABI and optimized for this specific simple case.
Before we start accusing each other of falling asleep in undergrad compilers class.... Just because something can be implemented using a "standard" technique doesn't mean it can be done anywhere. Web VM folks have historically been very resistant to this kind of tight mixing of trusted and untrusted code (a concrete example that comes to mind: issues with calling untrusted code during GC and exposing GC details mean JS has never had anything like the kind of finalizers that Java and .NET have). WRT dynamic exception filters, running them on a separate stack or stack segment or imposing some other kind of restriction might address some of those concerns (I don't recall the details of the discussion we had on that before; I'd be interested if VM folks have thoughts on that). But let's step back to the original issue. If we keep the requirement that we should at least allow for disassociated rethrow to work (which the .NET scheme does not) then we need to allow In the current scheme, can this be accomplished by only adding frames to the trace on rethrow (or on pass-through, for frames that do not have any catches)? Ideally we would also allow user-controlled 2-phase unwinding. But your assertion here seems to be that the current scheme forever precludes it. I don't think I agree with that; e.g. I can imagine making that work well with resumable exceptions. |
Oops, I totally see how that sentence reads like that. Just meant to illustrate that I wasn't relying on something magical, and that was the first reference that came to mind. I am sorry for the harmfully sloppy writing.
I get this principle, but I don't see how it applies here. The stack is the program's stack, and the code being executed is the program's code. There aren't different levels of trust being mixed together here. If you can explain, I'd greatly appreciate it.
Okay. I have some updates on this.
You made a bunch of different suggestions within the last several weeks.
... maybe more. I don't remember. At this point I'm honestly not sure if there is any specific part of the proposal you want to change. It looks like you want to remake everything from scratch. As I said, we are not in the initial design and brainstorming stage anymore, and while some of what you suggest might be good to have, I don't think that necessarily means we should overhaul the proposal from scratch. Some of them might not be trivial to implement in Web VM settings. Some of them might not be critical enough to nullify all the engineering work done so far. Some of them can be implemented as a future complementary proposal, like a resumable exception. I'm not sure a "My proposal is the best and the simplest so you should have it right now in this proposal" kind of attitude really helps. And I suggest, if you really want to discuss all of those N things, it'd be better to make a single issue per suggestion. In this issue alone you made N different suggestions which are not about eager tracing anymore, and it's really hard to track them. I can't say we are able to take all those suggestions, but some of them might prompt good discussions, so we can either make some small adjustments or carry them over to the next proposal so they're not lost, if discussions are done in a more organized and trackable way. A few comments on your last comment:
Thanks for letting me know that And
I think this paragraph basically means "This proposal, without two-phase stack unwinding, does not support resumable exceptions", right? I don't know why this proposal has to support resumable exceptions in the first place. It will be a separate proposal, and it may have two-phase stack unwinding with filters if it needs to. It will be likely to have a separate set of primitives anyway. |
No, this paragraph explains why you can't add resumable exceptions. Per your request, I filed a separate issue going into this topic in more depth (see #104). I don't want to make a bunch of simultaneous interrelated issues because then we get cross-referencing conversations that are hard to review or jump into. So for possible future reference I'll just make notes on other unresolved issues that came up in this thread (but not expected to be resolved in this thread, except possibly the first):
Preface: Although stack tracing is not explicitly part of the semantics of the proposal, it certainly is part of the rationale of the design and there are many implicit expectations about how the proposal supports stack tracing, so here I'm discussing the consequences of implicit expectations.
When I sought advice on designing for exceptions, the first and strongly stated piece of advice I received was to not trace the stack eagerly. I was given three main reasons for this: 1) tracing the stack is slow; 2) tracing the stack is proportionate to the size of the stack, so eagerly tracing it makes the performance of throw-catch depend on the stack size; and 3) in the common case the exception is caught and handled, making this expensive effort wasted. At a higher level, the advice was that good exception design makes the performance of throw-catch proportionate to just the number of stack frames that were unwound, i.e. the distance the exception travelled. (In case it turns out to be relevant, I was also reminded that some languages distinguish between exceptions and errors and ensure that exceptions get caught, making stack traces useless for exceptions.)
Unfortunately, the current design seems to necessitate eager stack tracing. Even just to run destructors and `finally`s, the exception is reified into an `exnref`, which can be stored or otherwise escape arbitrarily, and which can be rethrown from anywhere with the expectation that its stack trace is preserved. This problem can be mitigated through escape analysis, but my understanding is that WebAssembly is intended to not rely on things like escape analysis for good performance.
While I know that it's important to support this stack-trace functionality, exception handling is a cross-language feature, and it seems problematic to bake inefficient stack-trace functionality into the exception-handling primitives of WebAssembly. If it helps to have a comparison point, .NET is able to support this functionality without requiring eager stack collection in the common case. Its exception-handling primitives support (at least as best as I can tell) throw-catch performance proportionate to the distance travelled, as suggested by the advice I was given.
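The cost argument above can be written down as a back-of-the-envelope model in Python. The two functions are hypothetical and count only frames visited; they ignore per-frame constants and are not measurements of any engine.

```python
def eager_cost(stack_depth, frames_unwound):
    """Frames visited when the whole stack is traced at the throw site
    and the frames up to the handler are then unwound."""
    return stack_depth + frames_unwound


def lazy_cost(stack_depth, frames_unwound):
    """Frames visited when only unwound frames are recorded as they go.

    `stack_depth` is accepted but unused: the cost does not depend on it.
    """
    return frames_unwound


# An exception that travels 3 frames, thrown at increasing stack depths.
costs = {depth: (eager_cost(depth, 3), lazy_cost(depth, 3))
         for depth in (10, 1_000, 100_000)}
```

In this model the eager cost grows with the stack depth while the lazy cost stays at the distance travelled, which is the property the advice asks for.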