Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Review of the ContextBase include file. #2

Merged
merged 19 commits into from
Mar 30, 2020

Conversation

jplevyak
Copy link
Contributor

@jplevyak jplevyak commented Mar 9, 2020

Tests will follow after the Wasm header when the .cc files are sent for review.

Signed-off-by: John Plevyak [email protected]

the Wasm header when the .cc files are sent for review.

Signed-off-by: John Plevyak <[email protected]>
Copy link
Collaborator

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat. I have some high level questions. Thank you!

include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
Copy link
Collaborator

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this is super awesome. Sorry for the review delay. Flushing out some API comments. @htuch I would recommend reviewing this PR.

context_test.cc Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Show resolved Hide resolved
include/proxy-wasm/context.h Show resolved Hide resolved
/**
* Enables a periodic timer with the given period or sets the period of an existing timer. Note:
* the timer is associated with the Root Context of whatever Context this call was made on and
* there is only one timer available per Root Context.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a single timer seems like a pretty big limitation. Why not have a timer id and an ability to allocate a timer? I'm pretty sure this is going to come up very quickly if it hasn't already?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SIGALRM is a pretty clear counter-example. However, we can do that. Could you make the comment on the spec and get it changed there? Then we can update this file to match the spec.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I make a comment on the spec?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let let me talk to @PiotrSikora. We need to create a reviewed branch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any timer functions in the ABI doc here? https://github.com/proxy-wasm/spec/pull/1/files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

`proxy_set_tick_period_milliseconds is used. We could add a token_ptr and if the target token was zero we could use that to indicate that the system should allocate a new timer, but if it was non-zero then it would be the timer to set a new period for. WDYT? I'll add this comment on the spec as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal feeling is that the ABI should directly support timer allocation, since many/most host systems would support that. If a host only supports SIGALRM for example, it would be responsible for doing it's own scheduling which seems reasonable to me given such an uncommon case. So my advice would be to have the ABI directly support timer allocation in addition to a tick mechanism. If for whatever reason we don't want to do that I think what you propose is fine, though it would require implementing our own scheduling logic which to me seems not worth it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_tick_period allocates a timer if the period is non-zero as per the documentation. It deallocates the timer if the period is zero. What I am proposing is that we have a way for the system to return a token representing the allocated timer and for the SDK to provide a token. The on_tick call would also be augmented with a uint32_t timer_token

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure sgtm

include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
Signed-off-by: John Plevyak <[email protected]>
@jplevyak jplevyak requested a review from mattklein123 March 17, 2020 00:48
};

/**
* ContextBase is the interface between the VM host and the VM. It has several uses:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately doing as you suggest for the onXXX functios would require replicating the full definitions in the inheriting class with an override since they are actually implemented there. This (at least in part) defeats the purpose of clarity.

Sorry I don't follow. If you ignore my point about pure virtual, and you are OK with multiple inheritance, can't you just have a bunch of base classes that are conceptually related that are split out from below and then derive from all of them in ContextBase? I think this would be way, way easier to understand for SDK writers. This class is going to keep growing and it will get even harder to understand. I'm confused why we would want this structure if the goal is to create an ecosystem of SDKs where we try to make it as easy as possible to write SDKs.

/**
* Enables a periodic timer with the given period or sets the period of an existing timer. Note:
* the timer is associated with the Root Context of whatever Context this call was made on and
* there is only one timer available per Root Context.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I make a comment on the spec?

include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
include/proxy-wasm/context.h Outdated Show resolved Hide resolved
* to make a unique identifier for the queue.
* @param token_ptr a location to store a token corresponding to the queue.
*/
virtual WasmResult resolveSharedQueue(string_view /* vm_id */, string_view /* queue_name */,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When does someone use resolve() vs. register()? Couldn't this be combined into a single function with a parameter that says whether to "create if not exist" or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Register is used to create a queue for reading from.
Resolve is used to get a reference to a previously created queue for the purposes of writing to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are called by different VMs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if they are called on different VMs I'm not sure why we need 2 different functions here that basically do the same thing? If you want to keep them separate I might consider switching "resolve" to "lookup" to be more clear, but up to you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The do not do the same thing. One creates a queue for reading. The other gets a reference to an existing queue for writing. That is like saying that accept() and connect() do the same thing. Yes they both return a socket, but that is where the similarity ends.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take this so the spec then. If they do different things the naming should be more clear such as registerForRead() or registerForWrite. Even in that case, they can still be a single function with a param that specifies the read/write direction.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on preferring lookup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, lookup is reasonable. I am kind of surprised by the desire to combine these while splitting apart the metric calls. To me these are completely different whereas the metrics calls are very similar.

Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
virtual ProxyAction onResponseMetadata() = 0;

/**
* Call when a stream should log the final status of the stream. Occurs after onDone().

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is onLog unique to HTTP streams and not supported by TCP streams ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it is only support by HTTP streams in Envoy AFAICT. @mattklein123 ?

* @param token_ptr contains a pointer to a location to store the token which will be used with
* the corresponding onHttpCallResponse.
*/
virtual WasmResult httpCall(string_view /* target */, const Pairs & /*request_headers */,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, httpCall() method should go into a dedicated HttpClient interface, group of grpc*() methods - into GrpcClient interface, group of *Metric() methods - into Metrics interface, group of *SharedData() - into SharedData interface, group of *SharedQueue() methods - into SharedQueues interface, etc.

Envoy will always be providing more services for extensions to use than a standard that aims to suit all proxies. That is no different from browsers.

As an Envoy contributor, I wouldn't want to start development of every new Wasm feature by making changes to the standard.

So, API should be optimised for a scenario where Envoy is ahead of the standard.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 this is what I had in mind also. It will vastly help with readability even if we continue to use the "single context rules them all" approach.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, but to @yskopets point, I think we need to balance API stability against Envoy developer productivity. We need some way to keep growing/pushing the API forward, without breaking backward compatibility. I almost wish we had done these API definitions in proto and used something like protolock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use a syscall approach: one call to rule them all. I wouldn't use protos though as protos are very slow, particularly in Wasm for small values (3x+ slower than native code). We could use something like flatbuffers. but we need to consider different language support. The current ABI format is built in part to make it easier to add additional languages where the lowest common denominator is often a "C" call with primitive values.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak for @htuch but I think he was expressing a desire to describe the API in a structured way whether that is swagger/proto/etc. IMO that's aspirational. :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Aspirational at least for the MVP)

/**
* Call on a stream context to indicate that the request headers have arrived. Calls onCreate().
*/
virtual ProxyAction onRequestHeaders() = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't onRequestHeaders() also have an end_of_stream argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. It doesn't need that because that information is cached in the Context object. We don't need that information or the body_buffer_length either in onRequestBody() because the buffer is stored in the Context object as well so that it can handle calls out of the VM to retrieve bytes.

IMO we should remove both of these from onRequestBody etc.

This would simplify the Context API. The calls into the VM still contain that information, but they get the values form those cached in the Context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onRequestBody can be implemented from the getBuffer() and end_of_stream() calls on Context without those arguements in a generic way.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, it looks fine to remove all these args.

/**
* Call on a stream context to indicate that the request trailers have arrived.
*/
virtual ProxyAction onResponseHeaders() = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

/**
* Call when a stream should log the final status of the stream. Occurs after onDone().
*/
virtual void onLog() = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit strange naming/description wise. The other callbacks in this file describe stream events based on stream lifecycle. The log event is confusing imperative action with description. I think you want to call this onStreamFinalization() or something, when you might log something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onFinalize sounds reasonable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have onDone, so I guess what you're saying is that we should combine onLog with onDone?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like that @PiotrSikora. I think we need to have a very crisp description of the request lifecycle and it should be somewhat abstracted from whether a proxy decides to log or not, etc.

* HTTP
* Note: headers and trailers are obtained by the code in the VM via the XXXHeaderMapXXX set of
* functions. Body data is obtained by the code in the VM via calls implemented by the
* proxy-independent host code using the getBuffer() call.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a pointer for these to help review? It would be good to understand the data model.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec covers the data model for buffers and data. There is a buffer interface at the top of the file as well.


// TODO: use the SDK version when it is updated.
enum class ProxyAction : uint32_t {
Illegal = 0,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this expressive enough to capture the range of buffering behaviors that existing HTTP filters can express when returning from something like decodeHeaders() or decodeBody()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not, there is a comment from me on this elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are going to be pulling this from the SDK repo, but that has not been updated yet. This is a placeholder till the spec review completes and then the SDK is updated. Added comments.

* allocated memory block which contains the copied bytes (i.e. length).
* @return a WasmResult with any error or WasmResult::Ok.
*/
virtual WasmResult copyTo(WasmBase *wasm, size_t start, size_t length, uint64_t ptr_ptr,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will future WASM VMs support read-only zero copy or move operations for memory transfers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely. However we need to work through the SharedBuffer support issues. We can update this at that time.

* @return a WasmResult with any error or WasmResult::Ok.
*/
virtual WasmResult copyTo(WasmBase *wasm, size_t start, size_t length, uint64_t ptr_ptr,
uint64_t size_ptr) const = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of this is probably due to my confusion around what is/isn't allowed, but this seems a strange interface. This is C++ but we're returning data via integer addresses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Wasm uses integer offsets into it's memory as "pointers". When communicating into the VM we allocate memory in the VM obtaining an integer offset then write that offset into an offset provided by the call out of the VM.

};

/**
* ContextBase is the interface between the VM host and the VM. It has several uses:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, as someone fresh to this code, starting with the interfaces was super helpful in gaining an intuition about how things work at high level. I would strongly be in favor of this pattern. The comments in the interface file are super helpful.

* Called on a Root Context to indicate that an Inter-VM shared queue message has arrived.
* @token is the token returned by registerSharedQueue().
*/
virtual void onQueueReady(uint32_t token);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there only ever one kind of token? or are there different tokens for shared messages, HTTP calls etc? Should these have wrapping Struct types? Should they be uint64_t if it's a global namespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are different types of tokens. Sadly C++ doesn't have a native way to form by-name primitive types. Google has wrappers for that but I don't know of an OSS or STD library. Rather than take a dependency we are doing it with raw uints. Perhaps naming and documentation or should we add a struct wrappers. Personally I feel like the latter is overkill, but I can be convinced if others disagree.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine either way. You might consider some type of using type alias just for readability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm happy with using just for the readability win.

* @param body_size is the size in bytes of the response body.
* @param trailers is the number of trailers in the response.
*/
virtual void onHttpCallResponse(uint32_t token, uint32_t headers, uint32_t body_size,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the callback get at the headers/trailers/body? I would be in favor of these docs expecting a fairly inexperienced developers and needing prompt on things like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the API for people who are integrating this library into a proxy. I would very much hope and expect that they were far from inexperienced. This is not the SDK where I would expect more inexperienced programmers.


/**
* Get proxy-wide key-value data shared between VMs.
* @param key is a proxy-wide key mapping to the shared data value.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do keys come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key is provided by the plugin. We might consider namespacing these, perhaps in the configuration. However that is something which is above this level. WDYT? We can add a "shared_data_namespace" field to the VmConfig struct or even the PluginConfig struct.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think providing guidance on naming is great here from xDS API experience, e.g. suggesting reverse DNS or making use of structural namespaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added guidance.

/**
* Get proxy-wide key-value data shared between VMs.
* @param key is a proxy-wide key mapping to the shared data value.
* @param cas is a number which will be incremented when a data value has been changed.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have complex low-level CAS semantics? Can't we just have the return value from the operation say "yes/no"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what you are proposing. This is the standard CAS system AFAIK. Do you have an example of an alternative API you would prefer?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it would be good to add a section on concurrency model to the ABI spec, I don't recall seeing one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add one. The model is that each VM is single threaded, so inter-VM communication is (in general) cross-thread.

* to make a unique identifier for the queue.
* @param token_ptr a location to store a token corresponding to the queue.
*/
virtual WasmResult resolveSharedQueue(string_view /* vm_id */, string_view /* queue_name */,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on preferring lookup.

// Header/Trailer/Metadata Maps

/**
* Add a key-value pair to a header map.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have to be super-precise here with semantics on header map, as they keep changing in internally in Envoy and put the WASM API stability at risk. For example, what happens when you add multiple values for the same key? Is there concatenation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. That should be laid out in the spec https://github.com/proxy-wasm/spec see pull/1 to comment. Then we can copy that information here.

Personally I would like to see the ABI handling multiple values rather than punting to the plugin writer.

virtual WasmResult getHeaderMapSize(WasmHeaderMapType /* type */, uint32_t * /* result */) {
unimplemented();
return WasmResult::Unimplemented;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other alternative to ^^ is an iterator interface for header maps (like Envoy internal), did you consider that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We chose this instead because the cost of moving data into the VM is smaller than the cost of traversing the VM boundary frequently. Consequently for large scale header searching and manipulation with bounded size headers it is likely faster to just move the data into the VM and do the work there. It is also more flexible because we can update the SDK quickly and in a language specific way without having to change the ABI version.

Signed-off-by: John Plevyak <[email protected]>
/**
* Call on a stream context to indicate that the request headers have arrived. Calls onCreate().
*/
virtual ProxyAction onRequestHeaders() = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch commented on this elsewhere and sorry I missed this, but the header and body functions need end_stream params right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that information is cached in the Context object because it needs it to respond to the proxy_end_of_stream() and proxy_get_buffer_bytes() ABI calls, so that information does not to be passed in the call.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So how does a plugin know that it's end of stream? Can you clarify that in the docs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments.

Copy link
Collaborator

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jplevyak I actually think we are quite close here and in the spec. I still think in this PR it would be useful to move more of the interfaces out for things like queues, timers, etc. just to help with readability and per your point make the contextbase class easier to scan w/ less comments. I'm confident we can resolve the remaining issues in the meeting tomorrow. Thank you!

Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
@jplevyak
Copy link
Contributor Author

this has been extensively updated WRT the comments and reorganized into separate interfaces. PTAL

Signed-off-by: John Plevyak <[email protected]>
@jplevyak jplevyak requested a review from mattklein123 March 24, 2020 23:42
Copy link
Collaborator

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the iteration here. In general this looks great and is vastly easier (for me) to understand from readability perspective. Flushing out a bunch of random small comments but basically LGTM to ship and iterate.

Comment on lines +77 to +78
virtual WasmResult copyTo(WasmBase *wasm, size_t start, size_t length, uint64_t ptr_ptr,
uint64_t size_ptr) const = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this method always allocate memory in the VM? Does it support copying into an existing allocated chunk at a certain offset? I assume not? Is that something we might want in the future? Just trying to understand the overall interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allocates copies in and sets the location and size to the provided pointers. This is the get interface, but we also need support for modifying the buffer which will require another API.

It is an interesting question as to whether we would like to allow loading of the data into a pre-allocated region. We could do that without changing the ABI by saying that if *ptr_ptr != nullptr then we use the given address.

@PiotrSikora

That is something we could add to the spec. WDYT?

Comment on lines +73 to +74
* @param size_ptr is the location in the VM address space to place the size of the newly
* allocated memory block which contains the copied bytes (i.e. length).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I asked this previously, but will this always end up equal to length? Or might it be less than length if the buffer was not length size for example? It would be good to clarify this a bit. At face value it's not clear to me why the length gets copied back out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just convenience. It is copied out because otherwise we would need an additional call to copy it to size_ptr, and so there would always be a pair of calls. This is more concise and safer because there is no way to forget to set the size or to get the size wrong.

include/proxy-wasm/context_interface.h Show resolved Hide resolved
include/proxy-wasm/context_interface.h Outdated Show resolved Hide resolved
include/proxy-wasm/context_interface.h Outdated Show resolved Hide resolved
include/proxy-wasm/context_interface.h Outdated Show resolved Hide resolved
include/proxy-wasm/context_interface.h Show resolved Hide resolved
include/proxy-wasm/context_interface.h Outdated Show resolved Hide resolved
include/proxy-wasm/context_interface.h Outdated Show resolved Hide resolved
include/proxy-wasm/context_interface.h Outdated Show resolved Hide resolved
Signed-off-by: John Plevyak <[email protected]>
Copy link
Collaborator

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for iterating on this. I'm fine to ship and iterate but I would maybe poke @htuch @jmarantz @yskopets etc. for a quick review tomorrow?

@jplevyak
Copy link
Contributor Author

SGTM I'll email. Happy to iterate.

uint32_t * /* timer_token_ptr */) override {
return unimplemented();
}
uint64_t getCurrentTimeNanoseconds() override {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use TimeSystem::systemTime() for this. Then you can drop the include of time.h. In any case I think this is not a good candidate for inlining.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the envoy-specific implementation. See the companion PR envoyproxy/envoy#10262 for details. We use a TimeSource for mock-ability.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we relegate this to a cc file? Seems like we don't need to expose this as a header; might also not be windows-compatible.

void onDownstreamConnectionClose(CloseType close_type) override;
void onUpstreamConnectionClose(CloseType close_type) override;

// General

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest outlining all these less-than-trivial functions into a .cc file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These have full implementations. See the master branch. This is just a review of the header. We have changed our review plan and will be reviewing on the envoy-wasm repo after this PR.

}
WasmResult log(uint64_t level, string_view message) override {
if (level >= static_cast<uint32_t>(LogLevel::error)) {
std::cerr << log_prefix() << message << "\n";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like we are re-implementing the Envoy logging here.

Thinking more broadly, it looks like this implements an interface that I'm guessing you are trying to make independent of Envoy. OK fine.

But then I think you should have an Envoy-specific implementation of this class, which uses the Envoy infrastructure for logging, time, and probably lots of other things, rather than re-inventing them. WDYT?

I'm OK with a TODO here but I think we should get directionally aligned.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log gets called in the worker thread, right? output to cerr might impact performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full Envoy version is already implemented in the envoy-wasm repo and in the upstreaming PR: envoyproxy/envoy#10262

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is overridden both by Envoy for production work and in tests for mocking purposes. This is just a default implementation which integrators can use during the integration process.

bool onStart(std::shared_ptr<PluginBase> plugin) override;
bool onConfigure(std::shared_ptr<PluginBase> plugin) override;
void onHttpCallResponse(HttpCallToken token) override;
void onQueueReady(uint32_t SharedQueueDequeueToken) override;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SharedQueueDequeueToken is used as a variable name rather than type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanx! Fixed.

* Called on a Root Context when an Inter-VM shared queue message has arrived.
* @token is the token returned by registerSharedQueue().
*/
virtual void onQueueReady(uint32_t SharedQueueDequeueToken) = 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SharedQueueDequeueToken is used as a variable name rather than type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* @param grpc_status is an optional gRPC status if the connection is a gRPC connection.
* @param details are details of any (gRPC) error.
*/
virtual WasmResult sendLocalResponse(uint64_t response_code, string_view body,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason to use uint64_t to represent HTTP/gRPC status ? E.g., even token types are uint32_t

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. This was leftover from when I was working on the NullVM. Thanx, fixed here and elsewhere.


// Call when the stream status has finalized, e.g. for logging. See RootInterface.
virtual void onFinalized() = 0;
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike HttpInterface, this interface doesn't have onDelete() method.

What is the reason for such asymmetry ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Yes, it should have one. Fixed.

virtual WasmResult unimplemented() = 0;

// Log a message.
virtual WasmResult log(uint64_t level, string_view message) = 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason to use uint64_t to represent log level ? E.g., even token types are uint32_t

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

* allocated memory block which contains the copied bytes (i.e. length).
* @return a WasmResult with any error or WasmResult::Ok.
*/
virtual WasmResult copyTo(WasmBase *wasm, size_t start, size_t length, uint64_t ptr_ptr,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method uses uint64_t to represent an address inside a VM.

But all token types on host side are uint32_t.

What is the reason for such asymmetry ? Is it a reasonable combination 32bit host and 64bit VM ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in order to support 64-bit VMs and to support the NullVm we need to have 64-bit resolution on addresses. Comments added.

* @param parent_context_id is the parent Context id for the context being created. For a
* stream Context this will be a Root Context id (or sub-Context thereof).
*/
virtual void onCreate(uint32_t parent_context_id);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this method is not pure virtual like the rest ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

* VM to be deleted.
* Called by the host code.
*/
virtual void onDelete();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this method is not pure virtual like the rest ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

SharedQueueDequeueToken *token_ptr) override {
return unimplemented();
}
WasmResult resolveSharedQueue(string_view /* vm_id */, string_view /* queue_name */,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this method hasn't been renamed to lookupSharedQueue ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
Signed-off-by: John Plevyak <[email protected]>
@jplevyak jplevyak merged commit 112772f into proxy-wasm:reviewed Mar 30, 2020
Copy link

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting out the interface, much easier to review. I realize this is already merged, but I have left some feedback. I +1ed a bunch of @mattklein123 comments which seem unresolved where I shared the same concern.

Request,
Response,
Downstream,
Upstream,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like apples/oranges without further comment, i.e. how is request an alternative to downstream?

/**
* RootGrpcInterface is the interface for gRPC events arriving at RootContext(s).
*/
struct RootGrpcInterface {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can we make this GrpcInterface inside the RootContext namespace?

*/
struct RootGrpcInterface {
/**
* Called on Root Context to with the initial metadata of a grpcStream.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "to with"

struct RootGrpcInterface {
/**
* Called on Root Context to with the initial metadata of a grpcStream.
* @token is the token returned by grpcStream.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a grpcStream here? How come we are using this camel-case for some noun?

* @token is the token returned by grpcCall or grpcStream.
* For Call this implies OK close. For Stream may be called repeatedly.
* The data can be retrieved via the getBuffer function with
* WasmBufferType::GrpcReceiveBuffer as the type.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little hazy on how this works..

* Provides a BufferInterface to be used to return buffered data to the VM.
* @param type is the type of buffer to provide.
*/
virtual const BufferInterface *getBuffer(WasmBufferType type) = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does buffer type combine with something like GrpcToken?

// Header/Trailer/Metadata Maps
struct HeaderInterface {
/**
* Add a key-value pair to a header map.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

string_view value) = 0;

/**
* Get a value from to a header map.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the expected performance? O(1) or O(n)? It's important to have callers aware of this IMHO.

*/
virtual WasmResult grpcCall(string_view /* grpc_service */, string_view /* service_name */,
string_view /* method_name */, string_view /* request */,
const Pairs & /* initial_metadata */,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there also be trailing metadata?


struct GrpcStreamInterface {
/**
* Open a gRPC stream.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is initial/trailing metadata handled for streams?

PiotrSikora added a commit to PiotrSikora/proxy-wasm-cpp-host that referenced this pull request Jan 5, 2022
Signed-off-by: Piotr Sikora <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants