Review ABI vNEXT. #1
Conversation
Signed-off-by: Piotr Sikora <[email protected]>
Thanks for this. Very well done and I appreciate the work you have put into the docs. I left some comments below. Happy to hop on a quick call about some of the more contentious points. I think my high-level comments are:
- We should document the key enum values in this doc with full API docs.
- I still find the overloaded context concept to be incredibly confusing (vs. having discrete context types). Assuming we stick with that (which I assume we will), can we have a section in the doc that talks about context types and especially how it all flows together with `proxy_on_context_create`?

Thank you, very excited to see this land upstream!
cc @htuch to take a look at this also.
One other meta-point from discussions with @htuch -- this ABI has the potential to be impactful beyond WASM. At some point (this may take a while) it would be great to have dynamically loadable Envoy filters (see Apache HTTPD) and for that we need a stable ABI, possibly accessed via a C API. No one wants to tackle that right now as Envoy is moving too fast. But it's useful to think about how to make it possible in the future.
Perhaps this puts too much of a burden on this as a pure WASM solution, and that should be tackled in a layer between WASM and native Envoy. But I wanted to put that on the radar, in addition to asking what the future of this ABI would be in a non-Envoy proxy (which appears to be a goal).
@PiotrSikora can you give an update?
With no action in 4 months, this PR is abandoned. Closing it.
Signed-off-by: Piotr Sikora <[email protected]>
I believe most (all?) comments were addressed, but some things might have been lost in the rewrite, so please take a look. Thanks!
Signed-off-by: Piotr Sikora <[email protected]>
Signed-off-by: Piotr Sikora <[email protected]>
@mattklein123 @htuch @yskopets @yuval-k @gbrail @jplevyak @kyessenov could you please review the latest iteration of the ABI spec? Thanks!
Returns `InvalidMemoryAccess` when `message_data` and/or `message_size` point to
invalid memory address.

### Callbacks optionally exposed by the Wasm module

I think it's worth clarifying to what extent `proxy_on_memory_allocate` is optional. Without this function, many of the essential host APIs won't work, e.g. `proxy_get_buffer`, `proxy_get_map_values`, etc.

Fair enough, this one isn't really optional.

Yeah, I would have a required section and be clear that a module will fail to load unless it implements all required exports.
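A minimal sketch of what that required export might look like on the module side, assuming a C toolchain targeting wasm32; the name and signature follow the draft under review, and the `malloc`-based implementation is just one possible choice:

```c
#include <stdint.h>
#include <stdlib.h>

// Sketch of the module-side allocator export discussed above. The host calls
// it whenever it needs to copy data (configuration, headers, buffers) into the
// module's linear memory.
__attribute__((visibility("default")))
uint32_t proxy_on_memory_allocate(uint32_t memory_size) {
  void *buffer = malloc(memory_size);
  // Returning 0 signals allocation failure to the host environment.
  return buffer ? (uint32_t)(uintptr_t)buffer : 0;
}
```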
that will be used by the host environment in subsequent calls. Returning `0`
retains the original context identifier.

#### `proxy_on_context_finalize`

Is this function a replacement for the former `proxy_on_log`? At what moment will `proxy_on_context_finalize` be called in the context of `Http Filter`, `Network Filter`, `Access Logger`? (e.g., before / after access loggers) With the `Access Logger` extension in mind, does this function imply there must be a separate "Wasm context" for every HTTP request/TCP connection to be able to log them?

Right, where is `on_log`?

There was strong pushback against having a generic `proxy_on_log` in the previous round of reviews, so it was merged together with `proxy_on_done` and `proxy_on_delete` into `proxy_on_context_finalize`. One alternative would be to switch to specialized context functions (see: #9), and we could add `proxy_on_http_log` then... but there is no reason not to log in the "finalize" function.

I understand the motivation to exclude `proxy_on_log` from the lifecycle of `Http Filter` / `Network Filter`. But why not have this callback implemented by `Access Logger` extensions?
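For what it's worth, a minimal sketch of doing access-log-style work in the finalize callback could look as follows; the callback signature, the `proxy_log` import, the log-level constant and the meaning of the return value are assumptions based on this draft, not normative:

```c
#include <stdint.h>
#include <stdio.h>

// Host import described elsewhere in this draft; signature assumed here.
extern uint32_t proxy_log(uint32_t log_level, const char *message_data,
                          uint32_t message_size);

// With proxy_on_log folded into proxy_on_context_finalize, per-stream logging
// happens here, right before the context goes away. Returning a non-zero value
// tells the host that the module is done with this context.
__attribute__((visibility("default")))
uint32_t proxy_on_context_finalize(uint32_t context_id) {
  char line[64];
  int n = snprintf(line, sizeof(line), "finished stream context %u", context_id);
  proxy_log(/*log_level=*/2, line, (uint32_t)n);  // 2 = "info" (placeholder)
  return 1;  // done; no pending asynchronous work for this context
}
```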
#### `proxy_on_vm_start`

* params:
  - `i32 (uint32_t) vm_id`

How is the value of `vm_id` supposed to be used? Or is it the `context_id` of a VM?

Yes, it's the context identifier, just named more accurately to make it clear what it represents, instead of having a generic `context_id` in each function.
#### `proxy_on_plugin_start`

* params:
  - `i32 (uint32_t) plugin_id`

How is the value of `plugin_id` supposed to be used? Or is it the `context_id` of a Plugin?

Yes, it's the context identifier, just named more accurately to make it clear what it represents, instead of having a generic `context_id` in each function.
should be forwarded upstream and that this extension shouldn't be called again
for this HTTP request.
- `Pause` instructs the host environment to pause processing until
`proxy_resume` is called.

There is no `proxy_resume` function in this spec.

This should be `proxy_http_resume`.
* returns:
  - `i32 (proxy_result_t) call_result`

Removes item from the tail of the queue (`queue_id`). Returned value is written

What will be returned if the queue is empty?

Oops, it looks like I accidentally removed it during the rewrite. It should return `Empty`.
- `i32 (uint32_t) new_context_id`

Called when the host environment creates a new context (`context_id`) of type
`context_type` (e.g. `VmContext`, `PluginContext`, `StreamContext`,

Is this spec supposed to cover `Access Logger` extensions? Do `Access Logger` extensions have a dedicated context type?

No, since logging can happen in the "finalize" calls.

Handling access logs typically involves buffering, timeouts, and HTTP/gRPC calls. All of these actions make less sense in the context of an individual HTTP request / TCP connection than in the context of a dedicated `Access Logger` extension.
### Callbacks optionally exposed by the Wasm module

#### `proxy_on_context_create`

Is it possible to use the `proxy_on_context_create` callback to achieve a behaviour where an extension can have multiple instances within a VM, each with a different configuration? E.g.,
- 1 Envoy `Listener` with `HttpConnectionManager` and 2 instances of the same `Http Filter` extension but different configs
- 2 different Envoy `Listener`s, each with `HttpConnectionManager` and 1 instance of the same `Http Filter` extension but different configs

Yes.
connection is closed.

## Custom extension points

Would it be possible to describe an example use case?

+1, not totally clear how this would be used. An example would be great.
### Callbacks optionally exposed by the Wasm module

#### `proxy_on_custom_callback`

Is it possible to call back into the Plugin context / HTTP stream context / TCP stream context? How can an extension understand which context this callback is intended for?

It is, but the list of parameters to custom callbacks is not part of this spec.
### Callbacks optionally exposed by the Wasm module

#### `proxy_on_new_connection`

Can you specify the state machine of the callbacks? E.g.
new_connection (downstream_data | upstream_data)* (downstream_close upstream_data* upstream_close | upstream_close downstream_data* downstream_close)
Also, specify the interleavings with configuration callbacks.

I thought I just had a few comments, but I was able to remember a ton of things that confused me initially.
Thanks -- there are a bunch of good clarifications in this version and some simplifications of formerly-confusing things like the return codes and the three different ways for a context to be "done".
I'm a bit worried that the arrays of iovecs in some places are going to make the raw ABI harder to use for the sake of efficiency in some cases, but if others think they're important then people will figure it out.
Adding new enum values requires increasing the patch version.

Wasm modules should export `proxy_abi_version_<major>_<minor>_<patch>` function, indicating which

I think this should read "Wasm modules must export..." — otherwise they will not work, correct?

Correct.
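A minimal sketch of that required version export, assuming a C toolchain; the version numbers in the symbol name are placeholders, and only the export's name matters to the host:

```c
// The exported symbol name encodes the ABI version the module was compiled
// against; the host environment looks it up to select a compatible ABI
// implementation. The body is intentionally empty.
__attribute__((visibility("default")))
void proxy_abi_version_1_0_0(void) {}
```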
Wasm modules should export `proxy_abi_version_<major>_<minor>_<patch>` function, indicating which
version of the ABI they were compiled against.

Host environments should be able to support multiple versions of the ABI.

If this is "should," then what should a host do if it does not? "Must" it return a readable error and refuse to start the module?

Correct.
* returns:
  - `i32 (proxy_result_t) call_result`

Logs message (`message_data`, `message_size`) at the log level (`log_level`).

Any guidance on the data? "Should" it be UTF-8?

It should, but should we be enforcing this in the spec? Do we need to define this for all "string" types, then?
module, which is expected to free this memory when it no longer needs it.

SDKs implementing this ABI are expected to automatically handle freeing of the memory-managed data
for the developers.

Do you want to say anything about the data that the Wasm module sends TO the host? For instance, when calling `proxy_log`, can the module assume that the host takes ownership of the data passed and that the caller is free to free it when the call returns?

But the host doesn't take ownership of the data referenced in the hostcalls.
Yes, the caller is free to free all parameters as soon as the function returns. This is the standard behavior for virtually (literally?) any programming language, so I'm not sure that's something worth pointing out... it might result in more confusion than anything else, IMHO.
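A minimal sketch of that ownership rule from the module side, assuming the `proxy_log` signature referenced elsewhere in this draft and a placeholder log-level value:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Host import as described in the excerpt; the signature and log-level value
// are assumptions for illustration.
extern uint32_t proxy_log(uint32_t log_level, const char *message_data,
                          uint32_t message_size);

// The host does not take ownership of hostcall parameters, so the module may
// free them as soon as the call returns.
void log_and_free(void) {
  const char *text = "hello from the module";
  uint32_t len = (uint32_t)strlen(text);
  char *message = malloc(len);
  if (message == NULL) return;
  memcpy(message, text, len);
  proxy_log(/*log_level=*/2, message, len);  // 2 = "info" (placeholder)
  free(message);  // safe: the host copied whatever it needed before returning
}
```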
implicitly created in `proxy_on_vm_start`, `proxy_on_plugin_start`,
`proxy_on_new_connection` and `proxy_on_http_request_headers` callbacks.

Return value (`new_context_id`) can be used to assign new context identifier

Dumb question, but are all 32 bits of `new_context_id` always required to be uninterpreted by the host? Like, could a filter return a pointer to memory in its own "address space"?

Yes, the custom context identifier is opaque to the host, and using it to point to the context object inside the WasmVm is the intended use case for it (this was requested by @yuval-k)... But it feels a bit unsafe to me, to be honest.
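A sketch of that "pointer as identifier" idea, assuming the `proxy_on_context_create` parameters discussed in this draft (context id plus context type, returning a replacement identifier); the struct and its fields are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

// Hypothetical per-context state kept by the module.
typedef struct {
  uint32_t context_type;
  uint32_t request_count;
} filter_context_t;

// Returning the context object's address lets later callbacks recover the
// state with a single cast; on wasm32 a pointer fits in the 32-bit identifier.
__attribute__((visibility("default")))
uint32_t proxy_on_context_create(uint32_t context_id, uint32_t context_type) {
  (void)context_id;
  filter_context_t *ctx = calloc(1, sizeof(*ctx));
  if (ctx == NULL) {
    return 0;  // keep the host-assigned identifier on allocation failure
  }
  ctx->context_type = context_type;
  return (uint32_t)(uintptr_t)ctx;
}
```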
Returns `NotAllowed` when requests to given upstream are not allowed.

Returns `InvalidMemoryAccess` when `upstream_name_data`, `upstream_name_size`,

FWIW I learned that Envoy will return an error in this call if the ":path", ":method", and "host" headers aren't set. In general we should say something about what the various header fields mean to the host. Do we want to document that?

Yes, thanks.
#### `proxy_dispatch_grpc_call`

* params:
  - `i32 (const char*) upstream_name_data`

So in this version can this just be the name of the upstream? As discussed elsewhere, this is currently a gnarly protobuf.

Correct.
Returns `Ok` on success.

Returns `NotFound` for unknown service (`service_name_data`,
`service_name_size`).

Doesn't that error come back from the upstream gRPC server? In that case, would the error really be there or in the response? OTOH, I imagine that an invalid upstream name would immediately return an error.

Right, this should be for unknown upstream.
Dispatch a call to a gRPC service (`service_name_data`, `service_name_size`)
using upstream (`upstream_name_data`, `upstream_name_size`).

Once the response is returned to the host, `proxy_on_grpc_call_response` and/or

Do we mean `proxy_on_grpc_call_response` (which I can't find) or something else?

I meant `proxy_on_grpc_call_response_message`, although maybe we should have an all-in-one `proxy_on_grpc_call_response`, similar to `proxy_on_http_call_response`, and leave `proxy_on_grpc_call_response_{header_metadata,message,trailer_metadata}` only for the gRPC stream? What do you think?
* returns:
  - `i32 (proxy_result_t) call_result`

Returns map containing key-values for keys from the provided list of vectors

If this map is going to be used for HTTP headers, then do we want to standardize what names are there? Envoy has a particular set of system-defined headers like :path, :method, etc., and the rest are just lower-cased versions of the HTTP header. Should the spec discuss this so that people can write modules that can theoretically support multiple hosts?

`:path`, `:method` and other pseudo-headers come from the HTTP/2 transport, not Envoy. But yes, we should clarify that, since exposing those together with other headers is unique to Envoy.
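To make that convention concrete, a request-headers map exposed by an HTTP/2-speaking host might look like the illustrative data below (pseudo-headers alongside regular headers, names lower-cased); the map serialization itself is not shown here:

```c
// Example data only; values are made up and the on-the-wire map encoding is
// out of scope for this illustration.
static const char *const example_request_headers[][2] = {
    {":method", "GET"},
    {":scheme", "https"},
    {":path", "/index.html"},
    {":authority", "example.com"},
    {"user-agent", "example-client/1.0"},
    {"accept-encoding", "gzip"},
};
```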
Thanks for the feedback. I responded in comments for now, but I will add the necessary clarifications and changes to the document tomorrow.
Note: `downstream` means the connection between client and proxy, and `upstream`
means the connection between proxy and backend (aka “next hop”).

Note: The same unique context identifier (`stream_id`) is shared by a pair of

It's the context identifier, just named more accurately to make it clear what it represents, instead of having a generic `context_id` in each function.
Return value (`next_action`) instructs the host environment what action to take:
- `Continue` means that this HTTP request should be accepted and that HTTP
headers should be forwarded upstream.
- `EndStream` means that this HTTP request should be accepted, that HTTP headers

Yes, "afterwards" means "after the headers are sent".
It's used to set `end_stream=true` in the HTTP/2 `HEADERS` frame, since we don't have any other mechanism for doing that. It can be used e.g. when you're converting a `GET` request to a `HEAD` request and want to forward the stream without forwarding the original request body.
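As a rough illustration of that `GET`-to-`HEAD` conversion: a request-headers callback could rewrite the method and return `EndStream` so the host ends the request stream after the headers. Everything below is a sketch; the callback signature, the `proxy_set_map_value` import, and the constant values are assumptions, not part of the excerpt above:

```c
#include <stdint.h>

// Hypothetical host import for rewriting a header value; shown only to make
// the flow concrete.
extern uint32_t proxy_set_map_value(uint32_t map_type, const char *key_data,
                                    uint32_t key_size, const char *value_data,
                                    uint32_t value_size);

// Placeholder constants; the real enum values are defined by the spec.
enum { MAP_TYPE_HTTP_REQUEST_HEADERS = 0 };
enum { ACTION_CONTINUE = 0, ACTION_END_STREAM = 1 };

// Request-headers callback (signature simplified): convert GET to HEAD and
// drop the body by returning EndStream, which sets end_stream=true on the
// forwarded HEADERS frame.
__attribute__((visibility("default")))
uint32_t proxy_on_http_request_headers(uint32_t stream_id) {
  (void)stream_id;
  proxy_set_map_value(MAP_TYPE_HTTP_REQUEST_HEADERS, ":method", 7, "HEAD", 4);
  return ACTION_END_STREAM;
}
```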
extension shouldn't be called again for the current stream.
- `Pause` means that further processing should be paused until
`proxy_resume_stream` is called.
- `WaitForMoreData` means that further processing should be paused and that host

We don't have that right now. I can add `proxy_send_bytes_downstream` if you want.
### Callbacks optionally exposed by the Wasm module

#### `proxy_on_queue_ready`

It's not correlated with any `context_id`, since the queue is a standalone object. `proxy_open_shared_queue` returns a unique `queue_id` handle, which is used in `proxy_on_queue_ready`.
Always notifying all WasmVMs seems wasteful, so we should do some kind of load-balancing (probably with ACK/NACK), but I guess we could declare each queue as either unicast or multicast upon creation, if you have a use case for it?
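A minimal sketch of how a module might wire this up. Only `proxy_open_shared_queue` and `proxy_on_queue_ready` are named in this thread; the dequeue import, all signatures, and the result values are assumptions for illustration:

```c
#include <stdint.h>

// Assumed host imports; names and signatures are illustrative only.
extern uint32_t proxy_open_shared_queue(const char *name_data, uint32_t name_size,
                                        uint32_t *return_queue_id);
extern uint32_t proxy_dequeue_shared_queue_item(uint32_t queue_id,
                                                const char **return_data,
                                                uint32_t *return_size);

enum { RESULT_OK = 0, RESULT_EMPTY = 1 };  // placeholder values

static uint32_t g_queue_id;

void open_queue(void) {
  // The returned queue_id handle identifies the queue in proxy_on_queue_ready;
  // it is not tied to any context_id.
  proxy_open_shared_queue("my-queue", 8, &g_queue_id);
}

// Called by the host when an item is available on the queue.
__attribute__((visibility("default")))
void proxy_on_queue_ready(uint32_t queue_id) {
  const char *data;
  uint32_t size;
  // Drain until the host reports Empty (per the earlier thread, dequeuing from
  // an empty queue should return Empty).
  while (proxy_dequeue_shared_queue_item(queue_id, &data, &size) == RESULT_OK) {
    /* process data[0..size) */
  }
}
```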
Returns `InvalidOperation` when `metric_type` doesn't support incrementing
and/or decrementing values.

#### `proxy_delete_metric`

In theory, it's here to remove metrics from the host and avoid resource leaks, but I'm not sure how useful this will be in practice. It probably should match the number of `proxy_create_metric` calls, but I could be convinced otherwise.
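A sketch of keeping create/delete calls balanced over a metric's lifetime. `proxy_create_metric` and `proxy_delete_metric` appear in this thread; `proxy_increment_metric`, the signatures, and the constant values are assumptions for illustration:

```c
#include <stdint.h>

// Assumed host imports; signatures are illustrative only.
extern uint32_t proxy_create_metric(uint32_t metric_type, const char *name_data,
                                    uint32_t name_size, uint32_t *return_metric_id);
extern uint32_t proxy_increment_metric(uint32_t metric_id, int64_t offset);
extern uint32_t proxy_delete_metric(uint32_t metric_id);

enum { METRIC_TYPE_COUNTER = 0 };  // placeholder value

static uint32_t g_requests_total;

// Create once (e.g. at plugin start), bump per request, and delete in the
// finalize path so create and delete calls stay matched.
void metrics_init(void)       { proxy_create_metric(METRIC_TYPE_COUNTER, "requests_total", 14, &g_requests_total); }
void metrics_on_request(void) { proxy_increment_metric(g_requests_total, 1); }
void metrics_shutdown(void)   { proxy_delete_metric(g_requests_total); }
```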
I don't have any further comments.
connection should be closed afterwards.
- `Done` means that data chunk should be forwarded upstream and that this
extension shouldn't be called again for the current stream.
- `Pause` means that further processing should be paused until

What is the difference between `Pause` and `WaitForMoreData`? What should happen to data that arrives while the filter is in the `Pause` state?

Also, it would be good to comment on watermarking/buffering as part of this.
- `Done` means that this HTTP request should be accepted, that HTTP headers
should be forwarded upstream and that this extension shouldn't be called again
for this HTTP request.
- `Pause` instructs the host environment to pause processing until

What should happen to data that arrives while the filter is in the `Pause` state?
Return value (`next_action`) instructs the host environment what action to take:
- `Continue` means that this HTTP request body chunk should be forwarded
upstream.
- `EndStream` means that this HTTP request body chunk should be forwarded

What is the use case for this? It invalidates the `Content-Length` header that has already been forwarded upstream.
upstream and that the underlying stream should be closed afterwards.
- `Done` means that this HTTP request body should be forwarded upstream and that
this extension shouldn't be called again for this HTTP request.
- `Pause` means that further processing should be paused until `proxy_resume` is

What is the difference between `Pause` and `WaitForMoreData`? What should happen to data that arrives while the filter is in the `Pause` state?

`WaitForMoreData` would call this callback when more data arrives, whereas `Pause` would swallow that event until the wasm filter resumed.
I assume this maps to StopAllIterationAndBuffer, rather than StopIteration (which is kind of the Envoy default) or StopAllIterationAndWatermark. Might be worth clarifying?

In general it seems like we are missing the options around watermarking vs. buffering. I think that is needed. I commented on this elsewhere.
this extension shouldn't be called again for this HTTP request.
- `Pause` means that further processing should be paused until `proxy_resume` is
called.
- `WaitForMoreData` means that further processing should be paused and host

Would it be possible to document the complete workflow with regards to buffered data? E.g., if a filter returns `WaitForMoreData`, what exactly will be available in the context of the next call to `proxy_on_http_request_body`?
- Only new data since the last call?
- A single buffer with both old and new data together?
- Only old data in the buffer? (surprisingly, the current behaviour of `envoy-wasm`)

Context: I've noticed that the current implementation (`envoy-wasm`) implements a different workflow with regards to buffered data:
- native `Envoy` extensions have access to both the new chunk of data and the data buffered previously; `Proxy Wasm` extensions can see only one of those at a time
- when a native `Envoy` extension returns `StopIterationAndBuffer`, next time it will be called with the new chunk of data; `Proxy Wasm` extensions get called next time with the data in the buffer (which doesn't include the new chunk yet)
- native `Envoy` extensions observe similar behaviour for `StopIterationAndBuffer` and `StopIterationAndWatermark`; `Proxy Wasm` extensions observe different behaviour

If `Proxy Wasm` needs a different model for buffered data than `Envoy`, it should be documented.
* returns:
  - `i32 (proxy_result_t) call_result`

Resume processing of paused downstream or upstream (`stream_type`) connection

Would it be possible to document what exactly should happen on a call to `proxy_resume_stream`? E.g.,
- in what states is it valid to call this function? `Pause` / `WaitForMoreData` / `WaitForEndOrFull`?
- will `proxy_on_upstream_data` / `proxy_on_downstream_data` be called again with all the data in the buffer? will the behaviour be different for `Pause` / `WaitForMoreData` / `WaitForEndOrFull`?
* returns:
  - `i32 (proxy_result_t) call_result`

Resume processing of paused HTTP request or response (`stream_type`) in current

Would it be possible to document what exactly should happen on a call to `proxy_resume_stream`? E.g.,
- in what states is it valid to call this function? `Pause` / `WaitForMoreData` / `WaitForEndOrFull`?
- will `proxy_on_request_body` / `proxy_on_response_body` be called again with all the data in the buffer? will the behaviour be different for `Pause` / `WaitForMoreData` / `WaitForEndOrFull`?
- `i32 (bool) is_done`

Called when the host environment is done processing context (`context_id`).
Any finalization (e.g. logging) should be done within this callback.

`get_property` was also used by SDKs to get hold of the Root ID and implement support for multiple extensions packed inside a single Wasm module.
Looks great, this is a very readable spec. Since I haven't spent any time writing Wasm yet, my comments reflect those of a new reader and someone coming from the Envoy world; clarifying these would help make the ABI clearer to folks with this perspective.

The ABI is in the form of: `<major>.<minor>.<patch>`

Breaking changes require increasing the major version.

You might want to clarify somewhere what is meant by a breaking change. One experience we've had from xDS is that distinguishing between build and runtime breakage is important (I imagine it's both in this case). Also, bug fixes can be pretty ambiguous, e.g. a fix that introduces the "correct" behavior as per the specification, but where the existing implementation(s) all followed the "incorrect" behavior can be a challenge to handle.
## Naming conventions

All functions are using `snake_case`, and they start with the `proxy_` prefix. Implementation
specific extensions should use different prefixes (e.g. `envoy_`, `nginx_`, `ats_`, etc.) in order

@markdroth will we ever have Wasm gRPC interceptors? Have you folks been tracking these ABI specs? This is another purportedly multi-client ABI/API that might benefit from gRPC input if you plan on supporting it at some point in the future.
## Memory ownership

No memory-managed data is passed to Wasm module in any of the event callbacks, and Wasm module must

I think the term "memory-managed data" needs to be spelled out in a specification like this. It means different things in different situations, e.g. depending on who is managing the data, whether you have GC, etc. I think the point you want to make is that ownership of all memory that participates in ABI calls belongs to the Wasm code, it's allocated/freed by it. So, you might want to say that "No host managed memory is referencable by a Wasm module" or something like that as well. This is all semantics, but I think it makes the spec more readable.
Allocates continuous memory buffer of size `memory_size` using the in-VM memory
allocator and returns address of the memory buffer to the host environment using
`memory_data`. Returning `0` indicates failure.

Are you using `0` or `null` in this spec for pointers?
shouldn't be called again for the current stream.
- `Pause` means that further processing should be paused until
`proxy_resume_stream` is called.
- `Close` means that this connection should be rejected.

Out of curiosity, in Envoy we handle these via a callback model, which is strictly more general, as you can also do things like create local replies. What is the rationale for moving to return values?

Actually, I see why this applies for `proxy_on_new_connection`, but not for `proxy_on_downstream_data`.
- `Continue` means that this connection should be accepted.
- `Done` means that this connection should be accepted and that this extension
shouldn't be called again for the current stream.
- `Pause` means that further processing should be paused until

Have you thought about how watermarking, flow control and/or other buffer management is going to work in Wasm? The Envoy experience is that this is hard to add after the fact and would best be designed from the get-go. @jplevyak @mattklein123 @alyssawilk.

Last we talked about this, the API was going to replicate all of the filter return codes we already have, so watermarking should be included, though I haven't read that far yet.
### Functions exposed by the host environment

#### `proxy_dispatch_grpc_call`

Why is there a difference in the ABI between a call and a stream? Why can't we just treat everything like a stream and let higher level code deal with the distinction?

The only difference I think is in terms of ideas like timeout, which the SDK can potentially handle itself.
### Functions exposed by the host environment

#### `proxy_open_shared_kvstore`

Why can't the kvstore be used to model headers? It seems a bit awkward to have two different ABI mechanisms for key-values.
* returns:
  - `i32 (proxy_action_t) next_action`

Called when HTTP response headers for stream `stream_id` are received from

Are you planning on explicitly modeling 1xx in the ABI?

Very cool to see this moving forward! Got through up to the L7 section - some thoughts from my end.
portability.

Function names are in the form of: `proxy_<verb>_<namespace>_<field>_<field>` (e.g.
`proxy_on_http_request_headers`, `proxy_get_map_value`, etc).

Super nitty, but "on" is a preposition or adverb, not a verb?
when its internal buffers are full.
- `Close` means that this connection should be closed immediately.

#### `proxy_on_downstream_close`

I assume this is full close, and half close is indicated by end stream?
Return value (`next_action`) instructs the host environment what action to take:
- `Continue` means that data should be forwarded downstream.
- `EndStream` means that data should be forwarded downstream and that this
connection should be closed afterwards.

Envoy has 3 close types: NoFlush (which seems to correspond to Close); FlushWrite, which flushes the data and then closes the socket once it's all written; and FlushWriteAndDelay, which handles the case where there's bidirectional data and we want to leave time for the peer to receive the FIN, since the peer might be sending data and if you immediately close the connection it could get its connection RST and not actually receive the data.
Do we want to allow all 3 from wasm? Otherwise I'm not sure if Done maps to FlushWrite or FlushWriteAndDelay.

I commented on this above. It's not super clear to me that closing should be a return code from the filter vs. a callback API, but I agree we need to be able to do all of these types.
* returns:
  - `i32 (proxy_result_t) call_result`

Close downstream or upstream (`stream_type`) connections in current context.

See comments above about Envoy close types.

And if we have this, why do we need the return codes for end stream, etc.? Can we just have a stop iteration return code?
* returns:
  - `i32 (proxy_action_t) next_action`

Called for each chunk of HTTP request body for stream `stream_id` received from

"chunk" in the context of HTTP could be confused with HTTP/1.1 chunks. Rephrase or clarify?
* params:
  - `i32 (uint32_t) stream_id`
  - `i32 (size_t) body_size`

body size or size of this chunk?
- `EndStream` means that HTTP trailers should be forwarded downstream and that
the underlying stream should be closed afterwards.
- `Done` means that HTTP trailers should be forwarded downstream and that this
extension shouldn't be called again for this HTTP response.

Are EndStream and Done redundant for trailers calls?
- `EndStream` means that this HTTP request body chunk should be forwarded
upstream and that the underlying stream should be closed afterwards.
- `Done` means that this HTTP request body should be forwarded upstream and that
this extension shouldn't be called again for this HTTP request.

I think for HTTP bodies specifically it would be useful to have DoneWithData.
I could easily imagine filters which don't want to process body data (or incur the context switch to even be informed about it) but do want to interact with trailers (and maybe metadata).
Returns `InvalidMemoryAccess` when `response_code_details_data`,
`response_code_details_size`, `response_body_data`, `response_body_size`,
`additional_headers_map_data` and/or `additional_headers_size` point to invalid
memory address.

I don't think we've added the API yet, but it's on my todo list to allow trySendLocalReply on the encode path (see comments in sendLocalReply for the 3 best-effort modes).
Given wasm APIs are supposed to be stable, it's worth considering starting with this from the get-go.

Thanks, very excited to see this happening. Flushing out some comments from a first pass.
- `i32 (proxy_result_t) call_result`

Indicates to the host environment that the Wasm module finished processing
current context. It should be used after previously returning `false` in

nit: you might want to mention `proxy_set_effective_context`, which I assume might need to be called if the contexts have switched before finalizing.
Called when the host environment starts WebAssembly Virtual Machine (WasmVm).

Its configuration (of size `vm_configuration_size`) can be retrieved using
`proxy_get_buffer`.

FWIW I find this proxy_get_buffer dance to be a bit magical from an API design perspective. What if in the future we have two buffers that we want to get? IMO for this type of thing it would be much cleaner and more future-proof to pass an opaque_handle that could then be passed back to proxy_get_buffer, or a size and handle if that makes more sense. This would make it very explicit what the user is fetching if we ever need to add the ability to fetch multiple things in the context of an API call.

+1. I was going to comment on this, but I realized that for the practical use cases in the existing ABI the key/value method seemed to work fine. But I suspect that the proposal here will future-proof and make the ABI a little more regular.
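A minimal sketch of that proxy_get_buffer dance on the module side. The buffer-type constant, the simplified `proxy_get_buffer` signature (the draft returns data via a list of vectors, reduced here to a single pointer/size pair), and the callback's return value are assumptions for illustration:

```c
#include <stdint.h>

// Assumed, simplified host import; the real signature in the draft returns the
// data through a list of vectors.
extern uint32_t proxy_get_buffer(uint32_t buffer_type, uint32_t offset,
                                 uint32_t max_size, const char **return_data,
                                 uint32_t *return_size);

enum { BUFFER_TYPE_VM_CONFIGURATION = 0 };  // placeholder value

__attribute__((visibility("default")))
uint32_t proxy_on_vm_start(uint32_t vm_id, uint32_t vm_configuration_size) {
  (void)vm_id;
  const char *config_data = 0;
  uint32_t config_size = 0;
  // The callback only carries the configuration's size; the bytes themselves
  // are pulled in with a follow-up hostcall.
  proxy_get_buffer(BUFFER_TYPE_VM_CONFIGURATION, 0, vm_configuration_size,
                   &config_data, &config_size);
  /* parse config_data[0..config_size) */
  return 1;  // placeholder for "success"
}
```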
- `EndStream` means that data chunk should be forwarded upstream and that this
connection should be closed afterwards.

What does this mean in terms of half duplex? It's a bit odd to me that this is a return code. Would it be better to have this happen via a callback on the connection?
Returns `InvalidOperation` when stream wasn't paused.

#### `proxy_close_http_stream`

Similar comment: if we have this, why do we need the special return codes from the filter callbacks? It seems like it would be simpler to reduce the return codes there and just have people call this before they stop iteration or whatever, though I don't feel strongly about it.
slices pointed by a list of vectors (`return_buffer_data`,
`return_buffer_size`).

Can you clarify this? The memory is allocated by the callout via the allocate API, right? I'm not completely clear on what the in/out params are here and how this works. I think more text would help.
Sets content of the buffer `buffer_type` to the values from memory slices
pointed by the provided list of vectors (`buffer_data`, `buffer_size`),
replacing `size` bytes in the existing buffer, starting at the `offset`. Host

Can you add this type of thing into the spec? It would help, similar to my comments above.
`proxy_create_timer`.

## Stats/metrics

Where did we land on the differences between counters/gauges/histograms? Do we need a histogram API? It's not clear to me from reading the spec how a generalized API makes sense; it seems like it should be more clear about what APIs are supported on what types and otherwise will return an error.
Hi @PiotrSikora, is there something I can do to help you with these changes?
Signed-off-by: Piotr Sikora [email protected]