Add exception specifier to function signature #68

Open
PoignardAzur opened this issue Nov 13, 2018 · 31 comments

@PoignardAzur

PoignardAzur commented Nov 13, 2018

Exception specifiers are a common-enough feature in strongly-typed languages.

These specifiers have a few advantages that warrant integrating them into WebAssembly:

  • If a language uses a monadic or state-machine error model (e.g. Rust's Result<T, E> type), exception specifiers would allow it to interface with functions that may throw exceptions, by automatically transforming int fooBar() @mayThrow into Result<int, GenericException> fooBar().
  • If an interpreter decides to implement a "branch at every call site" strategy for functions that will frequently throw (see also #19: Should producers/consumers assume throwing is "rare" and, if so, can the spec note this?), it's very important to be able to tell the interpreter which functions won't ever throw, to avoid unnecessary overhead.
  • Especially in C++, noexcept specifiers can enable both compiler optimizations (better control flow analysis) and user optimizations (e.g. STL move optimizations).

More generally, there's an argument to be made that whether or not a function can interrupt the control flow of your program should be a part of its API, and therefore its signature.
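
As a rough illustration of the first bullet above, the glue a toolchain could generate from such a specifier might look like the following. This is only a sketch, with Python standing in for generated wrapper code; Ok, Err, lift_to_result, and foo_bar are hypothetical names and not part of any proposal.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Ok:
    value: Any


@dataclass(frozen=True)
class Err:
    error: Exception


def lift_to_result(fn: Callable[..., Any], may_throw: bool) -> Callable[..., Any]:
    """Adapt fn for a caller that uses a Result-style error model (hypothetical helper)."""
    if not may_throw:
        # The specifier promises no exception, so no handler is emitted at all.
        return lambda *args: Ok(fn(*args))

    def wrapper(*args: Any) -> Any:
        try:
            return Ok(fn(*args))
        except Exception as exc:  # plays the role of GenericException
            return Err(exc)

    return wrapper


def foo_bar(x: int) -> int:
    # Hypothetical function declared as "may throw".
    if x < 0:
        raise ValueError("negative input")
    return x * 2


safe_foo_bar = lift_to_result(foo_bar, may_throw=True)
```

Calling safe_foo_bar(2) yields an Ok value and safe_foo_bar(-1) yields an Err value, so the caller never sees an exception. Without the specifier, the wrapper would have to be emitted around every call.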

@tlively
Member

tlively commented Aug 21, 2019

I think that this is useful for some languages where exceptions are declared or otherwise part of the type of a function, but would be a showstopper for other languages where it is not possible to determine in general the complete set of exceptions a function may throw after dynamic linking is taken into account. I agree that it would be very useful to be able to reason about thrown exceptions and automatically convert them to other formats when necessary, so this may be a good thing to raise in the interface types proposal.

@rossberg
Member

Also keep in mind that Wasm isn't a user-facing language and the role of the Wasm type system is not to guide programmers. Its only purpose is to help engines ensure memory safety while enabling efficient compilation. It's not clear what exception annotations would add to that.

AFAICS, the same applies, more or less, to interface types.

@aheejin
Member

aheejin commented Aug 22, 2019

I don't think we can practically analyze the set of exception types a specific function can throw unless every function signature embeds the thrown-exception signature within it. And changing the function signature format altogether would not be something we want, would it?

Having said that, adding some specifier like noexcept/nothrow could be doable and even compatible with the MVP, because such specifiers are conservative, so we don't need exact analysis and can take hints from the language itself (such as C++'s noexcept). And it's fine that none of the MVP functions have it; it's a conservative hint anyway. Maybe we can add it if it is shown to be useful for optimizations in the VM. (As @rossberg said, this hint wouldn't be very useful for users.)

@aheejin
Member

aheejin commented Aug 22, 2019

I think that this is useful for some languages where exceptions are declared or otherwise part of the type of a function, but would be a showstopper for other languages where it is not possible to determine in general the complete set of exceptions a function may throw after dynamic linking is taken into account.

I'm not sure if it would be doable even with compiled languages with static linking. Note that wasm exception types do not correspond with a language's internal types, such as SomeClass* (they are all gonna be i32 in the end for C++, for example).

I agree that it would be very useful to be able to reason about thrown exceptions and automatically convert them to other formats when necessary, so this may be a good thing to raise in the interface types proposal.

What would it be useful for? Could you provide some examples?

@lukewagner
Member

lukewagner commented Aug 22, 2019

I agree that when we take the perspective of "we're just an ISA for a source language" that exception specifications don't add anything that the source-language compiler couldn't do itself. But I think there's a bit more to the story.

First, two observations:

  • The implicit assumption with "trap" today is that, even though it's technically possible to execute code in an instance after a call to the instance's export results in a trap, you shouldn't. Thus, it makes sense that traps don't need unwinding code because (1) they don't have to (the instance is dead) and (2) they shouldn't (the instance is corrupt, executing further code, even destructor code, in the instance is a bad idea). This even includes JS exceptions that unwind into wasm frames; those are turned into traps by the JS API in step 3 of create a host function. I think this should stay true when exceptions are added and traps shouldn't be catchable in core wasm.
  • In contrast, exceptions are explicitly intended for resumption after catching, even if an exception unwinds out of a call to an instance's export (because the caller can still catch it and Do The Right Thing).

Now let's imagine a future, one we're slowly moving toward, where a single app can contain wasm code from multiple packages, each compiled by a possibly different toolchain. Suppose module A was compiled with -fno-cxx-exceptions (or was compiled before exceptions existed), and A calls an export of module B, which was compiled with -fcxx-exceptions, and B throws an exception: what happens?

I expect the proposal today says that the exception unwinds through A as an exception and, since it's just an exception, some other module (JS or wasm) that called A could catch the exception and expect to call an export of A in the future. A, not supporting exceptions, could be in a corrupt or leaking state, however, so we can say this is simply a bug, an invalid combination of A and B, and the bug will manifest when A crashes because its state is corrupt or it leaks to death. But this seems unfortunate and it would be nice for the ecosystem as a whole if this bug could be caught earlier.

So what if we specify that:

  • Function types are extended with an optional "throws" effect that simply says "this function may throw". The default (inherited by all MVP wasm code) is that the effect is not present.
  • If an exception tries to unwind into a caller which is called through a function signature that does not have the "throws" effect, the exception is converted into a trap.

This wouldn't include any static validation rules that non-"throws" functions must wrap calls to "throws" functions in a try block; the enforcement would be dynamic (and I think also "free" during non-exceptional execution, because this can just be a bit on the unwind metadata).
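
To make the two rules above concrete, here is a small model of the dynamic check. It is a sketch only: Python exceptions stand in for wasm exceptions and traps, and FuncType, call_from, a_export, b_export, and run are hypothetical names invented for illustration, not anything from the proposal.

```python
from dataclasses import dataclass
from typing import Any, Callable


class WasmException(Exception):
    """Stand-in for a wasm exception, which is meant to be caught and resumed from."""


class Trap(BaseException):
    """Stand-in for a trap; derives from BaseException so `except Exception` misses it."""


@dataclass(frozen=True)
class FuncType:
    throws: bool = False  # default inherited by MVP code: effect not present


def call_from(caller_type: FuncType, callee: Callable[..., Any], *args: Any) -> Any:
    """Make a call on behalf of a frame whose function has type caller_type."""
    try:
        return callee(*args)
    except WasmException as exc:
        if caller_type.throws:
            raise  # the caller opted in, so unwinding continues normally
        # Dynamic enforcement: the exception may not unwind into this frame.
        raise Trap("exception unwound into a non-throws frame") from exc


def b_export() -> int:
    # Module B was built with exception support and throws.
    raise WasmException("error from B")


def a_export(a_type: FuncType) -> int:
    # Module A calls B's export; a_type is the type A was compiled with.
    return call_from(a_type, b_export)


def run(a_type: FuncType) -> str:
    # The host (or another module) calling A and observing the outcome.
    try:
        a_export(a_type)
    except WasmException:
        return "exception"
    except Trap:
        return "trap"
    return "ok"
```

In this model the check sits only on the exceptional path, which mirrors the claim that enforcement costs nothing during non-exceptional execution. With the default FuncType() the host observes a trap; with FuncType(throws=True) it observes an ordinary exception.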

We could also say that "throws" is ignored by type equality/subtyping, so that this "throws" effect doesn't create widespread annoyance (like the need to wrap "throws" functions so that they were importable/call_indirect-able with non-"throws" types). (I'm not positive this is a good idea.)

Without fancy inter-procedural analysis, -fcxx-exceptions would set the "throws" effect for all functions in the module and, because the effect is absent by default, pre-exception-handling wasm and -fno-cxx-exceptions wasm would continue to not set the throws flag and get the early-error behavior if an exception was thrown. Thus one could imagine replacing the "throws" effect on functions with a "supports exceptions" flag on the module as a whole... but thus far we've avoided module flags like this and technically I can imagine use cases where you want the flag per-import.

@tlively
Member

tlively commented Aug 22, 2019

I agree that it would be very useful to be able to reason about thrown exceptions and automatically convert them to other formats when necessary, so this may be a good thing to raise in the interface types proposal.

What would it be useful for? Could you provide some examples?

For example, perhaps the binding layer could transform a C++ exception thrown from one module into a Rust Result return type in another module. Or more simply, transform a C++ exception into a C error code return. Or more generally transform a language A exception into a language B exception. Of course the bindings layer would have to be very expressive and have its own abstract "Exception" type(s) that real exceptions could be lifted to and lowered from, so this may be a rather complex feature of the interface types proposal. But it would definitely be useful!


I expect the proposal today says that the exception unwinds through A as an exception and, since it's just an exception, some other module (JS or wasm) that called A could catch the exception and expect to call an export of A in the future. A, not supporting exceptions, could be in a corrupt or leaking state, however, so we can say this is simply a bug, an invalid combination of A and B, and the bug will manifest when A crashes because its state is corrupt or it leaks to death. But this seems unfortunate and it would be nice for the ecosystem as a whole if this bug could be caught earlier.

I agree that if an exception silently bubbles up through A and is caught higher in the call stack, this could cause problems if A required destructors to be run. However, that just means that the toolchain for A should make sure to catch all exceptions, even those it does not understand, and run destructors before rethrowing. This makes modules compiled with and without the exceptions feature incompatible, but that's the kind of problem we already solve in the toolchain, for example by making it a link error to try to unsafely link objects compiled with and without the atomics feature. Since these are tool problems, I'm not sure it's worth the extra spec complexity to do runtime checks in engines as well.
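
The "catch everything, run destructors, rethrow" pattern described above could look roughly like this. It is a sketch of what a toolchain might emit around a call, written in Python for brevity; Buffer, a_export, and throwing_import are hypothetical names.

```python
from typing import Any, Callable, List


class Buffer:
    """Hypothetical resource owned by module A that needs a destructor."""

    def __init__(self, log: List[str]) -> None:
        self.log = log
        self.log.append("acquire")

    def release(self) -> None:
        self.log.append("release")


def a_export(imported: Callable[[], Any], log: List[str]) -> Any:
    buf = Buffer(log)
    try:
        result = imported()
    except BaseException:
        # Catch-all: A does not need to understand the exception, only to
        # run its cleanup and let the exception keep unwinding.
        buf.release()
        raise
    buf.release()
    return result


def throwing_import() -> None:
    # Stand-in for an import that throws something A knows nothing about.
    raise RuntimeError("foreign exception")
```

Whether the import returns normally or throws, the log ends up as acquire followed by release, so A stays in a consistent state and can be called again after the exception is caught further up the stack.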

@lukewagner
Member

@tlively For that to work, I think everyone would have to use the same toolchain and agree on a meta-convention for identifying incompatibilities and raising early errors on them. Without the aid of JS, I'm not even sure what a pure-wasm meta-convention would be.

Since these are tool problems, I'm not sure it's worth the extra spec complexity to do runtime checks in engines as well.

Agreed there is some additional complexity, but I think it could have minimal practical implementation complexity if designed as I proposed. Also, as I said before, I don't think there would be any extra dynamic checks required on non-exceptional control flow, and the cost of the extra check during unwinding would be negligible, I expect.

@tlively
Member

tlively commented Aug 22, 2019

We have the target features section specified in the tool-conventions repo, which already functions as such a meta-convention. It's a feature of the WebAssembly object file format, but it's not tied to relocations or anything so it could also be adopted by any sort of WebAssembly loader, even if it doesn't use object files directly. Basically toolchains have to solve this problem anyway whether or not engines are specified to do any checking, so the additional benefit of having engines trap on errors seems minimal.

@rossberg
Member

@lukewagner:

We could also say that "throws" is ignored by type equality/subtyping, so that this "throws" effect doesn't create widespread annoyance

In that case it would be completely meaningless to put it on function types. Instead, it would more adequately be an annotation on function definitions that is simply a shorthand for a catch-all-and-trap around the function's body (in a special case that does not break tail calls).

@lukewagner
Member

@tlively I think you're thinking in terms of dynamic linking, where toolchains have to collaborate tightly. When combining wasm modules created by different toolchains in the more loosely-coupled context of a package manager (esp. using a more-declarative loader like ESM), I don't think there is a single loader that is in a position to check for mismatches. (How would it be implemented? And outside a JS environment? A conventional custom section could be elevated to the role of a standard that is checked by the engine, but then we're standardizing, so it's a question of what's the best thing to standardize.)

@rossberg What about imports? What matters is the caller's expectation, not the definition of the callee.

@tlively
Member

tlively commented Aug 23, 2019

@lukewagner If a toolchain wants to defend against untrusted imports throwing exceptions, it can already do that by catching them then trapping or cleaning itself up and rethrowing. No further collaboration is necessary. Your proposal goes further by making the trapping behavior default for MVP modules, and perhaps that's helpful in the short to medium term, but in the long run it won't be necessary. I also worry that the core spec is the wrong layer of abstraction for this problem. Since this is an issue of communication between modules, wouldn't interface adapters be a better place to solve it?

@lukewagner
Member

It seems like this would cause, in practice, for defensive purposes, every wasm module built with -fno-cxx-exceptions to emit try/catch around every import call. I suppose that's possible, but it seems a bit unfortunate. Is that what Emscripten would do by default?

@rossberg
Member

@lukewagner:

What about imports? What matters is the caller's expectation, not the definition of the callee.

Subtyping also applies to imports, so if it can ignore throw annotations, then their presence or absence on imports likewise provides zero information. Also, why would the expectations for calling an import be any more relevant than for calling a funcref?

@mstarzinger

@lukewagner:

This wouldn't include any static validation rules that non-"throws" functions must wrap calls to "throws" functions in a try block; the enforcement would be dynamic (and I think also "free" during non-exceptional execution, because this can just be a bit on the unwind metadata).

I agree that this model sounds like it won't introduce any runtime overhead for non-exceptional execution, at least for engines that use stack unwinding without explicit checks at call sites.

@rossberg:

What about imports? What matters is the caller's expectation, not the definition of the callee.

Subtyping also applies to imports, so if it can ignore throw annotations, then their presence or absence on imports likewise provides zero information. Also, why would the expectations for calling an import be any more relevant than for calling a funcref?

Wouldn't that imply that it is essentially impossible to catch an embedder exception (e.g. thrown from JavaScript by an imported JavaScript function) in wasm? Since imported functions cannot be marked as "throws", all exceptions they throw will be converted to traps. I am not arguing for/against this, just want to double-check that I am understanding the implications correctly.

@rossberg
Member

@mstarzinger, I think whether JS exceptions are mapped to Wasm exns or to traps is a separate question.

When done correctly, effect annotations like "throws" ought to be purely a type-checking mechanism, and as such, should not affect runtime behaviour, only restrict what's a valid program (though Luke seems to suggest some sort of coercive behaviour).

So I'm not implying that exceptions should be converted to traps. I'm merely saying that the type system would not assert anything about their presence.

Because, in fact, if it did, then we would have to require JS functions to only match imports with throws-annotation, since there is no way to validate that they do not throw. That would be a backwards-incompatible change, however.

@lukewagner
Member

@rossberg My point was that what matters is the caller's expectation and putting a flag on a function definition that was a shorthand for "catch-all-and-trap around the function body" seemed to describe the callee more than the caller. But on second thought, I suppose such a flag describes the caller's expectation as well; the only difference is whether an exception is converted to a trap when unwinding into a frame vs. unwinding out of a frame; and if there isn't a try inside the function body, there's not really a difference. And I do see the point that, if there's not any static validation rules, it's not really part of the "type".

Another (inverted) framing of the flag could be "this function is exception-safe".

@aheejin
Member

aheejin commented Oct 21, 2019

@PoignardAzur

  • If a language uses a monadic or state-machine error model (eg Rust's Result<T, E> type), exceptions specifiers would allow them to interface with functions that may throw exceptions, by automatically transforming int fooBar() @mayThrow into Result<int, GenericException> fooBar().

Is this compatible with our exception proposal too? Our EH proposal's try-catch-based exception model does not return to the same place when an error occurs. The control flow is transferred to a catch clause, which may not even be in the same function.

Could you elaborate on how we can map languages that use try-catch-based exceptions (e.g. C++) to this "branch at every call site" strategy, and why it has less overhead than the stack-unwinding-based scheme in case we frequently throw? (I don't know much about this strategy or existing implementations of it, so I'm asking.)

  • Especially in C++, noexcept specifiers can enable both compiler optimizations (better control flow analysis) and user optimizations (eg STL move optimizations).

I don't think adding specifiers to wasm function signatures would benefit language optimizations, such as C++'s noexcept-related optimizations. These happen before we generate wasm instructions and do not rely on the wasm signature. This kind of optimization happens in the frontend.

@aheejin
Member

aheejin commented Oct 22, 2019

@lukewagner

Now let's imagine, in the future we're slowly moving toward where a single app can contain wasm code from multiple packages, each compiled by possibly-different toolchains, that we have a module A, which was compiled with -fno-cxx-exceptions (or was compiled before exceptions existed), and A calls an export of module B, which was compiled with -fcxx-exceptions, and B throws an exception: what happens?

I expect the proposal today says that the exception unwinds through A as an exception and, since it's just an exception, some other module (JS or wasm) that called A could catch the exception and expect to call an export of A in the future. A, not supporting exceptions, could be in a corrupt or leaking state, however, so we can say this is simply a bug, an invalid combination of A and B, and the bug will manifest when A crashes because its state is corrupt or it leaks to death. But this seems unfortunate and it would be nice for the ecosystem as a whole if this bug could be caught earlier.

Can't this happen in MVP already? For example, when the call stack is like A (JS) -> B (wasm) -> C (JS) in MVP, when an exception is thrown in C and caught in A, B is in a corrupted state.

So what if we specify that:

  • Function types are extended with an optional "throws" effect that simply says "this function may throw". The default (inherited by all MVP wasm code) is that the effect is not present.

This feels more like a 'supports EH' flag than 'throws', then. I think there are also cases in which linking modules with different feature flags doesn't make sense, so we error out in the linker. How would this case be different from those other cases? Maybe cc @tlively

It seems like this would cause, in practice, for defensive purposes, every wasm module built with -fno-cxx-exceptions to emit try/catch around every import call. I suppose that's possible, but it seems a bit unfortunate. Is that what Emscripten would do by default?

Compiling with -fno-exceptions does not generate try/catch around every call. Code simply does not know about exceptions in that case, like C. Emscripten currently supports EH by basically wrapping every invoke in a wrapper function that calls out to JS code, which throws a JS exception if necessary; this is very slow. This EH feature is disabled by default because it's slow. But with it disabled it behaves the same as the host toolchain: it does not know anything about exceptions. No try-catch.

@lukewagner
Member

Can't this happen in MVP already?

Technically, if a JS exception unwinds into wasm today, the core wasm host function call rules say that that is a trap, and, as a general rule, after a trap, a wasm instance should be considered to be in a corrupt state and not reentered, thus, this is not actually an allowed thing today. (Yes, I know that the current impl of EH in Emscripten uses JS exceptions to unwind wasm in exactly this manner, but the JS code in that case is intimately coupled to the wasm code, so it's allowed to play with fire (traps).)

This all changes with the exception-handling proposal, though: presumably JS and wasm exceptions both turn into exceptions, not traps, when they unwind from a cross-instance call. Thus, allowing an exception to unwind from module A into module B is, in general, a valid (non-trapping) thing to do.

That all being said, after some discussion with @fgmccabe, it does seem like this is strictly a concern at (shared-nothing) module interface boundaries, not something one would want to use in a fine-grained manner within a core wasm module, and thus probably the "right" solution is to not have this in core wasm but instead to put some form of exception specification into the module interface type, with the net effect being a default convention that, when you don't support exceptions but call a function that declares it might throw, the call gets wrapped with a try block where the catch traps.

So I'm happy to close this issue; thanks for the discussion.

@aheejin
Member

aheejin commented Oct 22, 2019

Can't this happen in MVP already?

Technically, if a JS exception unwinds into wasm today, the core wasm host function call rules say that that is a trap, and, as a general rule, after a trap, a wasm instance should be considered to be in a corrupt state and not reentered, thus, this is not actually an allowed thing today. (Yes, I know that the current impl of EH in Emscripten uses JS exceptions to unwind wasm in exactly this manner, but the JS code in that case is intimately coupled to the wasm code, so it's allowed to play with fire (traps).)

This all changes with the exception-handling proposal, though: presumably JS and wasm exceptions both turn into exceptions, not traps, when they unwind from a cross-instance call. Thus, allowing an exception to unwind from module A into module B is, in general, a valid (non-trapping) thing to do.

You're right, and we should change this JS API part too. But I'm thinking that we might not catch RuntimeError, which includes traps, and maybe catch other 'normal' thrown foreign exceptions.

That all being said, after some discussion with @fgmccabe, it does seem like this is strictly a concern at (shared-nothing) module interface boundaries, not something one would want to use in a fine-grained manner within a core wasm module, and thus probably the "right" solution is to not have this in core wasm but instead to put some form of exception specification into the module interface type, with the net effect being a default convention that, when you don't support exceptions but call a function that declares it might throw, the call gets wrapped with a try block where the catch traps.

I'm not very sure what this means. Do they want the throws specifier not on function signatures but on module interfaces instead? How is it computed? Is it something like 'supports EH', so that all modules compiled with wasm EH will have it? And what is it gonna be used for? And I'm not sure where we should put those try-catch blocks at the module boundary.

@lukewagner
Member

Sorry for not being more clear: I mean throws specifiers on the functions in a module interface type (which, importantly, are a superset of core wasm function types, and thus can be enriched with a throws specification). It's up to the toolchain how to emit these throws specifications, but I would imagine that, at a minimum, -fno-cxx-exceptions would not declare anything thrown (guaranteeing dynamically that no exceptions were thrown by calling the export), and -fcxx-exceptions would add a throws(...) implying anything could be thrown.

There is actually value in saying something more precise than throws(...) at a module interface boundary: let's say I throw a std::string in module A and wish to catch that string in module B and I'm using shared-nothing-linking. Then it's necessary at the unwind boundary between module A and B to copy the std::string from A's linear memory into B's linear memory. This can be accomplished by declaring throws(runtime_error(string)), allowing the same lifting/lowering of the string exception payload as done for normal params/results. How this is surfaced to the source C++ program is a separate question, of course, especially now that dynamic throws specifications are deprecated in C++.
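
The copy at the unwind boundary can be modeled roughly as follows. This is a sketch under loud assumptions: two bytearrays stand in for the two linear memories, and Module, StringException, and adapt_throws_string are hypothetical names; nothing here reflects an actual interface-types API.

```python
from typing import Any, Callable, Tuple


class Module:
    """A module with its own private linear memory and a bump allocator."""

    def __init__(self, size: int = 64) -> None:
        self.memory = bytearray(size)
        self._next = 0

    def alloc(self, n: int) -> int:
        if self._next + n > len(self.memory):
            raise MemoryError("linear memory exhausted")
        ptr = self._next
        self._next += n
        return ptr

    def store_string(self, text: str) -> Tuple[int, int]:
        data = text.encode("utf-8")
        ptr = self.alloc(len(data))
        self.memory[ptr:ptr + len(data)] = data
        return ptr, len(data)

    def load_string(self, ptr: int, length: int) -> str:
        return self.memory[ptr:ptr + length].decode("utf-8")


class StringException(Exception):
    """Exception whose payload is a (ptr, length) pair into one module's memory."""

    def __init__(self, ptr: int, length: int) -> None:
        super().__init__(ptr, length)
        self.ptr = ptr
        self.length = length


def adapt_throws_string(src: Module, dst: Module,
                        export: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap an export of src so a string exception arrives in dst's memory."""

    def adapted(*args: Any) -> Any:
        try:
            return export(*args)
        except StringException as exc:
            text = src.load_string(exc.ptr, exc.length)  # lift out of src
            ptr, length = dst.store_string(text)         # lower into dst
            raise StringException(ptr, length) from None

    return adapted


module_a, module_b = Module(), Module()
module_b.alloc(8)  # make B's next free offset differ from A's


def a_export() -> None:
    ptr, length = module_a.store_string("boom")
    raise StringException(ptr, length)


try:
    adapt_throws_string(module_a, module_b, a_export)()
except StringException as exc:
    caught = module_b.load_string(exc.ptr, exc.length)
    caught_ptr = exc.ptr
```

The handler in B reads the payload from B's own memory at an offset that means nothing in A, which is the point of declaring the payload type at the boundary: a raw pointer alone could not be rethrown across it.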

I think the high-order bit is that, particularly with shared-nothing-linking, exceptions are a very meaningful part of a module's interface.

@rossberg
Member

The fundamental problem with throws-annotations -- which essentially are a form of effect type system -- is that they are largely impractical in the presence of anything higher-order, like objects or function references, unless you also introduce (first-class) effect polymorphism. That's the problem languages like C++ and Java kept bumping into and the reason why they introduced ad-hoc escape hatches that ultimately made the whole thing more or less pointless and unloved. I would question that it's worth going there, even for interface types.

@fgmccabe

fgmccabe commented Oct 23, 2019 via email

@rossberg
Member

Well, so far polymorphism has not been on the table for Wasm (though I think GC types will eventually necessitate it). Let alone first-class polymorphism, which is what you would need for funcrefs.

Furthermore, how would you enforce these annotations in the funcref case? You'd potentially need to wrap every funcref at the interface boundaries into a function inserting the appropriate try handler.

@lukewagner
Member

The intention is already that funcrefs are wrapped at interface boundaries, producing a semantically distinct funcref value on the other side. This is essential because the adaptations performed on params and results are highly effectful/visible (not just enforcing type contracts like the gradually typed coercion calculi), so the adapted function is fundamentally a different function. Given that, it's easy to add in the dynamic throws-specification checking.

@rossberg
Member

rossberg commented Oct 23, 2019

Okay, that's interesting. :)

How will that work if two modules share a mutable funcref global or a table? A funcref can tunnel through those without giving the interface layer any chance of wrapping it. Or is the intention that interface types do not support stateful im/exports?

Similarly, one can tunnel a function through type anyref and then (e.g. with the GC proposal) downcast on the other end. How would that be handled or prevented?

Edit: Answering the second question myself, that case may not be a problem, at least none specific to functions. I suppose interface types simply do not say, promise, or prevent anything regarding the ability to downcast.

@fgmccabe

fgmccabe commented Oct 23, 2019 via email

@rossberg
Member

But the question is: how would they prevent it? The only way I can see is by not allowing higher-order state in an interface, i.e., no mut globals or tables of function type.

@lukewagner
Member

lukewagner commented Oct 23, 2019

As currently described, specifying an interface adapter does not force you to ensure any sort of impermeable membrane, so if you want to export a (global (mut funcref)) directly from the adapted module, go nuts, there is no adaptation provided or assumed; if a funcref ultimately gets passed taking i32 memory offsets to the "wrong" memory, that's on you. But, the convention established by toolchain defaults should be that you're not exporting shared-mutable anything (memories, tables, globals), which is, after all, implied by the name "shared-nothing linking" (which is I think a good basis for an interoperable multi-language/toolchain package ecosystem).

@aheejin
Member

aheejin commented Oct 25, 2019

  • Where would interface types take that throws info from then, if not from function types? As I said earlier, it is not practical to transitively scan the whole call graph to compute that throws signature for every function. If it can be conservative, we can attach throws to every interface function in a module compiled with the EH feature, though.

  • About specifying exact types to throws in the interface types, like throws(runtime_error(std::string)): many languages don't throw the raw data itself. For example, C++ throws an i32 pointer, which is __cxa_exception* or something, and the __cxa_exception data structure has a pointer to the buffer that contains the payload. I think other languages have their own exception class for that. Are we gonna make adapter functions for these language-specific exception types? And can we do that for pointers to those types too? If we're doing that, the adapter should include translation rules for the buffer contents, which I'm not sure is possible, because it might contain basically anything. Anyway, the only type any C++ function can throw is currently i32.

@fgmccabe

fgmccabe commented Oct 25, 2019 via email
