WIT Syntax: Structured Annotations #58
Comments
I think you're right we need to add some form of annotations to wit. Incidentally, there's also a custom annotations proposal in core wasm which has a vaguely similar motivation. Just to see if we're in the same part of the design space here: do you imagine these annotations existing solely as inputs to wasm code generators, thereby influencing codegen but not being directly interpreted by wasm engines as part of the runtime semantics of a component?
As for how much or little structure to put into the annotation syntax: that's a great question and I don't myself have a great intuition about what's the right answer here. On the one hand, I guess there's plenty of precedent in C#, C++, Rust, etc. where they just define an expression syntax but not much beyond that in terms of scoping or validation (iiuc, or maybe they do?). If we went this route, I guess we could do likewise, defining syntax for literal values of all the interface value types (which could be reused for default values -- or maybe default values are just specified via attributes?).
Going beyond that as, e.g., Cap'n Proto has done with explicit declarations and validation looks pretty neat; I guess I could see this becoming useful at a certain scale of wit and attribute usage. Does anyone have any direct experience with this or a related more-structured/typed annotation system? |
You can see the syntax definition for Rust attributes here. Basically these are the valid syntaxes for attributes: #[foo]
#[foo = expr]
#[foo(tokens)] In this case, the brackets inside the tokens must match, so these are invalid: #[foo([)]
#[foo(])]
#[foo([})] But as long as the brackets match, any token is allowed: #[foo(some = { "yes" => 5 + 10 }, [$x; T::bar])] It doesn't even need to be valid expressions, because it's using syntax tokens. This is absurdly flexible, it means that every attribute gets to define its own syntax, essentially creating a sub-language. That flexibility is probably overkill for Wasm. Also note that it's possible to have multiple different attributes on the same item: #[foo]
#[bar = 5]
#[qux(some ? fancy => syntax)]
struct SomeStruct {
...
} The attributes are parsed one at a time, top-to-bottom. |
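The bracket-matching rule described above is the only structural check an untyped, token-based annotation syntax would need. A minimal sketch of that check, assuming a hypothetical helper `delimiters_balanced` that receives the text between the outer parentheses (it ignores brackets inside string literals, which a real lexer would have to handle):

```rust
/// Returns true if every bracket in `tokens` is closed by a matching
/// bracket in the right order.
/// Simplification (assumption): brackets inside string or char literals
/// are not treated specially, unlike a real Rust lexer.
fn delimiters_balanced(tokens: &str) -> bool {
    let mut stack: Vec<char> = Vec::new();
    for c in tokens.chars() {
        match c {
            // Remember every opening bracket.
            '(' | '[' | '{' => stack.push(c),
            // A closing bracket must match the most recent opener.
            ')' | ']' | '}' => {
                let expected = match c {
                    ')' => '(',
                    ']' => '[',
                    _ => '{',
                };
                if stack.pop() != Some(expected) {
                    return false;
                }
            }
            // Any other character is accepted as-is.
            _ => {}
        }
    }
    // Leftover openers mean something was never closed.
    stack.is_empty()
}
```

With this sketch, the body of #[foo([)] (a lone `[`) is rejected, while the body of the longer example with `{ ... }` and `[ ... ]` is accepted.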
Tossing another example into the mix: Smithy was designed to solve many of the same problems. AWS uses it to define interfaces for 250+ web services, and code generators produce SDKs for all the languages and platforms that they support (a large cross product). The annotation feature in Smithy is called "Traits" https://awslabs.github.io/smithy/1.0/spec/core/model.html#traits. Some annotations can be used at runtime, for example Most of the tooling developed by AWS for Smithy is written in Java, however there's a pure Rust library https://github.com/johnstonskj/rust-atelier that has a full parser, AST, and other tools. I am not affiliated with AWS, but I have used the rust-atelier crates to build code generators for WebAssembly SDKs. |
So, what's the best way to move this forward? A concrete proposal? |
Yes, a PR to this repo making a specific proposal we can discuss would be welcome. I'm imagining a PR would add the proposed syntax to |
Another use for annotations could be to instruct code generators to emit specialized types, rather than their low-level representation: #[type-hint("wasi:snapshot1/timestamp")]
type timestamp = u64;
#[type-hint("wasi:snapshot1/local-date-time")]
type local-date-time = u64;
#[type-hint("wasi:snapshot1/duration")]
type duration = u64;
#[type-hint("wasi:snapshot1/time-zone")]
type iana-time-zone = string;
export localize-date: func(utc: timestamp, tz: iana-time-zone) -> local-date-time; As far as the component-model is concerned, the |
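To make the type-hint idea concrete, here is a rough sketch of the lookup a Rust code generator might perform. The function name `rust_type_for` and the choice of target Rust types are assumptions made for illustration; only the hint strings come from the example above.

```rust
/// Picks the Rust type a generator would emit for a WIT type alias.
/// `hint` is the string from a `type-hint` annotation, if present;
/// `low_level` is the plain representation (for example "u64").
/// The mapping below is hypothetical, not part of any real generator.
fn rust_type_for(hint: Option<&str>, low_level: &str) -> String {
    match hint {
        Some("wasi:snapshot1/timestamp") => "std::time::SystemTime".to_string(),
        Some("wasi:snapshot1/duration") => "std::time::Duration".to_string(),
        // Unknown or missing hints fall back to the low-level type,
        // so a generator that ignores hints still produces valid code.
        _ => low_level.to_string(),
    }
}
```

The fallback arm is the important design point: a generator that does not know a hint can ignore it and emit the underlying type unchanged.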
My previous post relates to user defined types, but I guess the same can be done for the built-in value type specializations. Suppose the flags my-flags {
lego,
marvel-superhero,
supervillan,
} could be lowered into the ComponentModel1.0-compatible form: #[flags]
record my-flags {
lego: bool,
marvel-superhero: bool,
supervillan: bool,
} At first glance, it looks like this can be done for every specialized value type currently defined (tuple, flags, enum, union, option, result, string). However, whereas the user-defined type hints of my previous post are only used at code generation time, for this latter use case the wasm runtime would need special knowledge of these annotations if, for example, custom subtyping rules are desired. |
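As a rough illustration of the lowering above, a generator that sees the annotated record could still expose a packed bit representation. The struct and the `pack`/`unpack` helpers below are hypothetical, and the bit layout (declaration order, lowest bit first) is an assumption of this sketch rather than a statement about the actual ABI.

```rust
/// Record-of-bools form of `my-flags`, as in the lowered WIT above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct MyFlagsRecord {
    lego: bool,
    marvel_superhero: bool,
    supervillan: bool,
}

/// Packs the record into bits, one bit per field in declaration order.
fn pack(r: MyFlagsRecord) -> u8 {
    (r.lego as u8) | ((r.marvel_superhero as u8) << 1) | ((r.supervillan as u8) << 2)
}

/// Inverse of `pack`; bits above the third are ignored.
fn unpack(bits: u8) -> MyFlagsRecord {
    MyFlagsRecord {
        lego: bits & 0b001 != 0,
        marvel_superhero: bits & 0b010 != 0,
        supervillan: bits & 0b100 != 0,
    }
}
```

Both forms carry the same information, which is why the annotation can act purely as a codegen hint unless the runtime needs to apply flags-specific rules.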
In the context of WASI, annotations could also be used to hint to the host how to resolve imports and to specify which permissions each import requires, potentially replacing external manifests. @env("HOME") // Hints to the host to resolve this import with the value of the HOME environment variable.
import home-dir: string;
// Signals to the host that the component intends to use this socket factory to set up UDP connections to Cloudflare's DNS service.
@firewall(outbound = "udp:1.1.1.1:53", reason = "To resolve domain names.")
import sockets: "https://github.com/WebAssembly/wasi-sockets/spec.wit#socket-factory";
// Instruct the host to provide a filesystem with two directories mounted inside it:
@mount("/tmp", kind = FsMount::Temp)
@mount("/app-user-data", access = FsAccess::ReadWrite, reason = "To store your precious photos.")
import fs: "https://github.com/WebAssembly/wasi-filesystem/spec.wit#fs";
import fs: "https://github.com/WebAssembly/wasi-filesystem/spec.wit#fs";
export main: func(@from-command-line args: list<string>) -> unit; // Note the `@from-command-line` |
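One way a host could use annotations like these in place of an external manifest is to turn them into a permission summary before instantiating the component. The sketch below is only an illustration: the `Annotation` enum, its fields, and `permission_summary` are hypothetical names, and treating a mount without an `access` argument as read-only is an assumption.

```rust
/// Hypothetical in-memory form of the import annotations shown above.
#[derive(Debug, Clone, PartialEq)]
enum Annotation {
    Env { var: String },
    Firewall { outbound: String, reason: String },
    Mount { path: String, read_write: bool },
}

/// Builds one human-readable line per requested capability, which a
/// host could show to the user or check against its own policy.
fn permission_summary(annotations: &[Annotation]) -> Vec<String> {
    annotations
        .iter()
        .map(|a| match a {
            Annotation::Env { var } => format!("read environment variable {var}"),
            Annotation::Firewall { outbound, reason } => {
                format!("connect to {outbound} ({reason})")
            }
            Annotation::Mount { path, read_write } => {
                let mode = if *read_write { "read-write" } else { "read-only" };
                format!("mount {path} ({mode})")
            }
        })
        .collect()
}
```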
Unfortunately, if we make Structured Annotations encode as Custom Sections, then
In a way, what we want from Structured Annotations seems to be similar to what is being discussed for the "URL" that is attached to interfaces/functions/etc. in Components to identify e.g. that an interface is referring to a given WASI capability. It's information that we're, in many cases, going to want to supply to the runtime so that it can provide us the correct implementation of our imports or correctly interpret our exports. This raises an interesting question: what if Structured Annotations are just syntactic sugar for encoding data in the "URL" (which might become more of a structured text field than specifically a URL) on each import? We would have to come up with a way to encode annotations in this field, but it would have the benefits that
|
Good points! Agreed on the problems with custom sections. I like the observation that arbitrary annotations can already be stuffed into the URL field of an In general, we can observe an emerging pattern in the component model where we've been taking semantic data that could have been stuffed into an import/export |
That sounds good to me, how structured do you think this subfield would need to be? |
Good question. One requirement we seem to be converging on is that Wit should be co-expressive with component types so that we can render an arbitrary component's type as Wit and also do a rough roundtrip. So if the Wit syntax has complex structured annotations, then probably so should what goes in a component. The harder question then is whether to encode the structured annotation via a mini binary-format grammar or via some grammar layered on top of a string, as we've done with But another route might be to pare down structured annotations to the bare minimum that meets our requirements so that we don't even have to ask how to encode complex tree structures in
Scenario 1 is more open-ended and makes me vaguely worried about composability. As long as we say "you can always strip these sorts of annotations", then I suppose that makes them fine. But then this makes me think that this use of annotations really does belong in a new kind of "annotations" custom section; they're not really part of the "name". In Scenario 2, since the interpretation of the annotation is implied by the URL, a single raw string subfield of |
Note that support for custom derives has landed in Not quite full annotation support, but at the very least covers the custom |
Many tools might want to add additional metadata to WIT declarations that modify code generation behaviour.
A concrete example would be the async option of wit-bindgen-wasmtime, which marks functions as async and currently has to be specified in a macro or on the command line.
I realize that that is only temporary until the component model gains async support, but there are many other use cases.
Examples:
Clone in Rust), or make functions consume a value
(like Eq, Hash, serde::Serialize, ...)
Some languages like Rust make this easy with wrapper modules that selectively re-export, but that's not the case for others like Python, JavaScript, etc. where you might want to keep the generated types private (or protected in e.g. Java) and provide custom wrappers.
If there is no standard way to declare these, generators will always need some custom metadata layer for customization, which seems suboptimal.
Prior Art
Protobuf
Protobuf has custom options, which is a particularly fancy system that allows defining well-typed options that are even restricted to specific scopes.
Cap'n Proto
Cap'n Proto also has a well-typed annotation system.
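To illustrate what "well-typed" and "restricted to specific scopes" could mean for WIT, here is a minimal sketch in the spirit of the systems above. It is not how Protobuf or Cap'n Proto implement this; every name (`AnnotationDecl`, `Target`, `ValueKind`, `Value`, `check`) is hypothetical.

```rust
/// Kinds of AST items an annotation may be attached to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Target {
    Type,
    Field,
    Function,
}

/// The value type an annotation is declared with.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ValueKind {
    Bool,
    Text,
}

/// A concrete value supplied at a use site.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Text(String),
}

/// An explicit declaration: name, allowed targets, and value type.
struct AnnotationDecl {
    name: &'static str,
    targets: &'static [Target],
    kind: ValueKind,
}

/// Validates one use of an annotation against its declaration.
fn check(decl: &AnnotationDecl, target: Target, value: &Value) -> Result<(), String> {
    if !decl.targets.contains(&target) {
        return Err(format!("`{}` is not allowed on {:?}", decl.name, target));
    }
    let type_ok = matches!(
        (decl.kind, value),
        (ValueKind::Bool, Value::Bool(_)) | (ValueKind::Text, Value::Text(_))
    );
    if type_ok {
        Ok(())
    } else {
        Err(format!("`{}` expects a {:?} value", decl.name, decl.kind))
    }
}
```

For instance, an `async` annotation declared for functions with a boolean value would pass on a function and be rejected on a type or with a string argument.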
Proposal
I'd personally love a well-typed annotation system inspired by the above, but I also understand if that is currently not appreciated / too complex.
I'd be happy to come up with a concrete proposal and implement it in wit-bindgen, but I wanted to get some opinions first.
An alternative would be untyped annotations that can be attached to a set of AST items (type declarations, fields, variants, functions, ...) and allow all valid tokens within delimiters.
For example: