Streaming vs Buffering interfaces #20

calebschoepp · 2024-11-25T23:23:57Z

calebschoepp
Nov 25, 2024
Maintainer

As we implement WASI Observe, one of the fundamental decisions we'll need to make is whether the interface is designed for buffering or streaming telemetry data¹. In this discussion I will explain what I mean by these two terms, cover the pros and cons of each approach, and suggest which approach I think we should take. Ultimately the goal of this discussion is twofold: to build a shared vocabulary for us to use, and to solicit feedback on which approach we should use.

I've already spilt a decent amount of ink on this topic in an earlier PR, where I was doing exploratory work on what a tracing interface could look like. This discussion can be viewed as a refinement of that work, using new terminology that I think is clearer. Reading the original PR isn't necessary to understand this discussion, but it may be helpful.

Most of my examples will be focused on the tracing use case, because that is what I've done the most work on. However, I think everything here should extrapolate to metrics and logging.

Buffering

In a buffering interface it is the responsibility of the guest to collect and buffer telemetry data before sending the completed data over to the host. There are two obvious ways to build a buffering interface. We'll explore them one at a time.

In my earlier PR both the exporter and processor interfaces were buffered.

Buffering over WASI HTTP

One option is having the developer send telemetry data over HTTP like they would with a normal non-WebAssembly application. In this case the developer would instrument their application as they normally would and they would use a standard HTTP exporter pointing at their collector. They would need to make sure the host gives the guest access to the collector HTTP endpoint.
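To make the buffering model concrete, here is a minimal Python sketch of a guest-side exporter that holds finished spans in memory and ships them in one batch. The `BufferingExporter` class, the `post` callable, and the batch size are illustrative assumptions, not part of any real library. In practice a developer would use their telemetry library's standard HTTP exporter, and the request would go out through WASI HTTP.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


@dataclass
class FinishedSpan:
    """A completed span, held entirely in guest memory until export."""
    name: str
    start_ns: int
    end_ns: int
    attributes: Dict[str, str] = field(default_factory=dict)


class BufferingExporter:
    """Hypothetical exporter: buffers finished spans, then POSTs them as a batch."""

    def __init__(self, url: str, post: Callable[[str, bytes], int], max_batch: int = 2):
        self.url = url              # collector endpoint; the host must allow access to it
        self.post = post            # stand-in for an HTTP client, returns a status code
        self.max_batch = max_batch
        self.buffer: List[FinishedSpan] = []

    def on_end(self, span: FinishedSpan) -> None:
        # Nothing leaves the guest until the batch is full.
        self.buffer.append(span)
        if len(self.buffer) >= self.max_batch:
            self.flush()

    def flush(self) -> None:
        if not self.buffer:
            return
        body = json.dumps([asdict(s) for s in self.buffer]).encode()
        status = self.post(self.url, body)
        if 200 <= status < 300:
            self.buffer.clear()     # on failure the spans stay buffered for a retry
```

The point of the sketch is that the host sees only an opaque outbound HTTP request. It has no view of the individual spans, which is what drives the cons listed below.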

Pros

This works out of the box today, more or less. In practice you have to do some finagling with libraries to get it working in some languages.
Developers get to use familiar exporters that they're already using in their applications today.

Cons

Spans emitted this way cannot be tied into existing auto-instrumentation traces. (Technically, in some cases you could do inbound trace context propagation via something like HTTP headers, but not in all cases, and regardless you can't propagate across composition boundaries.)
Requires always granting the guest access to the collector HTTP endpoint, which adds noise to capability-based security around HTTP endpoints.
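The parenthetical about inbound propagation can be sketched as follows, assuming the caller sends a W3C Trace Context `traceparent` header. The function name and the shape of the returned dictionary are illustrative assumptions. The sketch accepts only the four-field form of the header, and it does nothing for the composition-boundary case.

```python
import re
from typing import Dict, Optional

# W3C Trace Context layout: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(headers: Dict[str, str]) -> Optional[Dict[str, object]]:
    """Extract the inbound trace context from request headers, or return None."""
    # Header names are case-insensitive, so compare them lowercased.
    value = next((v for k, v in headers.items() if k.lower() == "traceparent"), None)
    if value is None:
        return None
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version "ff" and all-zero ids are invalid according to the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```

A guest could use the result to parent its own spans, but only when the trigger is an HTTP request that carries the header. Other triggers and component-to-component calls have no such channel.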

Buffering over a custom interface

Another option is sending the telemetry data over a custom WIT interface rather than WASI HTTP. The developer would still buffer telemetry data in the application like above, but would use a custom "WASI exporter" in their application. This "WASI exporter" would be backed by a custom WIT interface that might look something like this:

interface observe {
  // Emit a given completed read-only-span to the o11y host.
  emit-span: func(span: read-only-span) -> result<_, string>;
}
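A guest-side "WASI exporter" built on an interface like this might behave as in the Python sketch below. The `GuestSpan` and `ReadOnlySpan` classes are hypothetical, and the `emit` callable stands in for the binding that would be generated from `emit-span`. It returns `None` on success or an error string on failure, loosely mirroring `result<_, string>`.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Callable, Dict, Mapping, Optional


@dataclass(frozen=True)
class ReadOnlySpan:
    """Immutable snapshot of a completed span, as handed to the host."""
    name: str
    attributes: Mapping[str, str]


class GuestSpan:
    """Mutable span that lives only in the guest until it ends."""

    def __init__(self, name: str, emit: Callable[[ReadOnlySpan], Optional[str]]):
        self.name = name
        self._attrs: Dict[str, str] = {}
        self._emit = emit           # stand-in for the generated `emit-span` binding
        self._ended = False

    def set_attribute(self, key: str, value: str) -> None:
        # Mutations are local; the host does not see them yet.
        if not self._ended:
            self._attrs[key] = value

    def end(self) -> Optional[str]:
        """Freeze the span and hand it to the host exactly once."""
        if self._ended:
            return None
        self._ended = True
        snapshot = ReadOnlySpan(self.name, MappingProxyType(dict(self._attrs)))
        return self._emit(snapshot)
```

The host learns about a span only when `end` is called, so it cannot relate the span to anything it was doing while the span was open.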

Pros

Simple interface and a simple host implementation.

Cons

Spans emitted this way cannot be tied into existing auto-instrumentation traces. (Technically you could add something to the WIT interface to support inbound trace context propagation, but there is no good way to support propagating across composition boundaries.)
Hard to consume this interface directly if you want to hand-write a component that does tracing without something like OpenTelemetry libraries.

Streaming

Rather than having the guest buffer telemetry data before sending it to the host, we can have it immediately stream the data to the host with a WIT interface that might look something like:

interface tracer {
    use wasi:clocks/wall-clock@0.2.1.{datetime};

    /// Starts a new `span` with the given name and options.
    start: func(name: string, options: option<start-options>) -> span;

    /// Represents a unit of work or operation.
    resource span {
        /// Get the `span-context` for this `span`.
        span-context: func() -> span-context;

        /// Set attributes of this span.
        ///
        /// If a key already exists for an attribute of the Span it will be overwritten with the corresponding new value.
        set-attributes: func(attributes: list<key-value>);

        /// Adds an event with the provided name at the current timestamp.
        ///
        /// Optionally an alternative timestamp may be provided. You may also provide attributes of this event.
        add-event: func(name: string, timestamp: option<datetime>, attributes: option<list<key-value>>);

        /// Signals that the operation described by this span has now ended.
        ///
        /// If a timestamp is not provided then it is treated as equivalent to passing the current time. Dropping this resource is equivalent to calling `end(none)`.
        end: func(timestamp: option<datetime>);
    }
}

A developer would instrument their application with the standard libraries, but they would not specify an exporter. Rather, the library itself would be backed by this WIT interface and would thus always stream telemetry state to the host.
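The Python sketch below shows what "always streaming" means from the guest's point of view. `FakeHost` is a hypothetical stand-in for the host side of the `tracer` interface that simply records each call, and `Span` is a guest-side handle that keeps no telemetry state of its own. Timestamps and options are left out for brevity, and leaving the `with` block models dropping the resource.

```python
from typing import Dict, List, Tuple


class FakeHost:
    """Stand-in for the host side of the interface; records every call it receives."""

    def __init__(self) -> None:
        self.calls: List[Tuple] = []
        self._next_handle = 0

    def start(self, name: str) -> int:
        self._next_handle += 1
        self.calls.append(("start", self._next_handle, name))
        return self._next_handle

    def set_attributes(self, handle: int, attributes: Dict[str, str]) -> None:
        self.calls.append(("set-attributes", handle, dict(attributes)))

    def add_event(self, handle: int, name: str) -> None:
        self.calls.append(("add-event", handle, name))

    def end(self, handle: int) -> None:
        self.calls.append(("end", handle))


class Span:
    """Guest-side handle: every operation is forwarded to the host immediately."""

    def __init__(self, host: FakeHost, name: str) -> None:
        self._host = host
        self._handle = host.start(name)   # the host knows about the span from the start
        self._ended = False

    def set_attributes(self, attributes: Dict[str, str]) -> None:
        self._host.set_attributes(self._handle, attributes)

    def add_event(self, name: str) -> None:
        self._host.add_event(self._handle, name)

    def end(self) -> None:
        if not self._ended:
            self._ended = True
            self._host.end(self._handle)

    def __enter__(self) -> "Span":
        return self

    def __exit__(self, *exc_info) -> None:
        self.end()                        # models dropping the resource
```

Compared with the buffering exporters, the guest holds only a handle, and the host observes each operation in the order it happens.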

In my earlier PR the span interface was streaming.

Pros

Easy to consume this interface directly if you want to hand-write a component that does tracing without something like OpenTelemetry libraries.
Spans can seamlessly be tied into existing auto-instrumentation traces.

Cons

Pushes more complexity into the host implementation.
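The second pro and the con are two sides of the same design: the host can parent guest spans correctly only because it tracks them itself. The Python sketch below assumes a host that keeps a simple stack of active spans. `HostTracer` and `HostSpan` are hypothetical names, and a real host would need per-component and async-aware context tracking rather than a single stack.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HostSpan:
    name: str
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    ended: bool = False


class HostTracer:
    """Hypothetical host state: all known spans plus a stack of active ones."""

    def __init__(self) -> None:
        self.spans: Dict[str, HostSpan] = {}
        self._active: List[HostSpan] = []

    def start(self, name: str) -> HostSpan:
        # Whatever span is active on the host becomes the parent, whether it
        # came from host auto-instrumentation or from an earlier guest call.
        parent = self._active[-1] if self._active else None
        span = HostSpan(
            name=name,
            trace_id=parent.trace_id if parent else secrets.token_hex(16),
            span_id=secrets.token_hex(8),
            parent_span_id=parent.span_id if parent else None,
        )
        self.spans[span.span_id] = span
        self._active.append(span)
        return span

    def end(self, span: HostSpan) -> None:
        span.ended = True
        if span in self._active:
            self._active.remove(span)
```

If the host's auto-instrumentation starts a request span and the guest then calls `start`, the guest span lands in the same trace with the request span as its parent, and no propagation code is needed in the guest. The cost is the bookkeeping shown here, which lives in the host.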

Conclusion

What follows is my opinion...

TL;DR: Buffering over WASI HTTP is a good stopgap until we can land streaming support.

I think we should standardize around the streaming interface, primarily because it will play nicely with existing auto-instrumentation traces and secondarily because it keeps needless complexity out of the guest.

Given that buffering over WASI HTTP mostly works today, I think it can act as a stopgap until we can land streaming support. In the long term some developers may continue using this approach if they don't want to adopt a new WIT interface, but I would not recommend it given its downsides.

¹ It turns out this is probably an important question for lots of WASI interfaces, but for obvious reasons I'm going to keep this scoped to just WASI Observe.
