-
Notifications
You must be signed in to change notification settings - Fork 494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for wasi-http to wasm middleware #2990
Conversation
Signed-off-by: Brendan Burns <[email protected]>
Signed-off-by: Brendan Burns <[email protected]>
Thanks @brendandburns for this contribution! Although this looks good to me, I'd like to ask for a review from @Taction and @codefromthecrypt too. I have 2 questions related to security:
PS: @brendandburns need to ask you to please sign-off your commits with DCO! |
Oh one more thing. Can you please add the same options to WASM binding too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the spike. TL;DR; needs a user and working example code. I'll help on the integration side, to avoid adding custom properties.
I think it is better to detect the imports like we do with wasi, to avoid adding configuration.
For example, http-wasm doesn't require wasi_snapshot_01
in any way. The middleware detects imports and only adds only if needed. We don't have a "EnableWASI" field as it isn't needed for this reason.
OTOH I can see how plugging in something besides normal wasi would be difficult right now. Let me make a change to http-wasm's middleware to make this more clean.
Meanwhile, all of our features have end user requests behind them. Are you proxying an end user, or do you know someone who will use this? Having an end user helps because it makes the example a bit better.
Regardless, this needs to be integrated in practice. For example, we need an example (even if only internally tested) to show that this works. Specifically, guest code that imports the tinygo guest and also whatever it is for the http client outbound. The test of that example will highlight other configuration needed, for example if there are any rules needed to say what's allowed outbound.
For example, right now, we have strict sandbox, so you can't even access files. It seems for client to work, it minimally needs sockets or at least some sort of way to decide what the outbound client will be.
Once the example is in, it should be benchmarked, like the others.
Meanwhile, I'll try to get something in the main middleware to make adding custom imports easier, though I'm not sure if this is "import only" or not. I think proably it will just give you the wazero CompiledModule
which you can check to see if it has what is needed or not.
More product note for the maintainers and also Brendan as I think this is dipping toe in the water, but certainly not the eventual outcome of wasi-http. I think once wasi-http finishes, people who want that model will want to use the client along with the server, as both share async buffers and whatnot. http-wasm was designed differently, consciously for mostly synchronous middleware. Its client once finished will follow that model. Mixing two very different designs can cause confusion. This is a special case that client finished first (and before http-wasm's client finished). If it had finished the server first, this would be a 100pct be a separate component. Restated: As a product it is probably worthwhile thinking through that end, because they are different designs. While possible to get a quick start putting wasi-http's client into http-wasm's middleware. It may be more confusing than waiting for wasi-http to have middleware and then putting it there instead, especially as http-wasm will have a client binding and that binding will follow design principles of the middleware which are unlike wasi-http which is async in nature. Thoughts welcome. |
Thanks for the spike. Agree with needs a user and working example code. And then we know well what is needed and what is not. For example, if http client support is enough, do we need to limit the domains that can be accessed, etc... I think a configuration is needed for security purposes, if disabled, the IMHO, wasi has some ongoing proposals, so users may ask for more features in the future. Suggest using the following configuration, "strictMode" disables all unsecured behaviors(force "enableWasiHttp" to be false). Add a Here is a possible example of CRD. apiVersion: dapr.io/v1alpha1
kind: Configuration
metadata:
name: wasmStrictSandbox
spec:
wasm:
strictSandbox: true
capabilities:
wasiHttp:
enabled: true
# other configurations specific to wasiHttp, just for example.
allowedDomains: ["127.0.0.1","localhost"]
# other capabilities if have
fs:
enabled: true
mountedPath: "/tmp/dapr" |
@Taction one thing to bear in mind is that this is an addition to what's built-into wazero. In other words, it is extending the sandbox config (defined in ModuleConfig) to allow either sockets or some other mechanism. It is fine to satisfy imports even when in strictmode, just that we need to make sure everything that is sandbox related is controllable by that. concretely. most often the ENOSYS error is returned if something is not configured (more portably handled by compilers than the extension ENOTCAPABLE defined in wasi). Sometimes you want imports to work because the compiler adds them for features not actually used. So, I would separate the aspects of configuring sandbox policy from whether or not imports are detected (this is how it works today where sandbox policy is ModuleConfig and internal to the http-handler host, imports are detected regardless of sandbox policy) |
concretely, you can consider python. The interpreter will import a lot of things not ever used. We can still have a strict sandbox which disallows access to resources. If we required knowing which imports can use what, basically you could never load python regardless of if you ever access something. |
@codefromthecrypt Thanks for clarifying.
For Also, there are two types of users, one is that who writes wasm, compiles it, and uses it. And if I understand correctly, in this process, may introduce extra function imports in wasm binary. consider python, users' code may not use some function, but the interpreter may introduce that. |
@Taction aside, but as you are a contributor you probably know that due to lack of end user interest in client, yet, there wasn't much movement on http-wasm's client. There was always scope for it, but we strictly went a quality and end user first direction with the project. The two areas for confusion are:
does it make sense visa versa? Say we later introduce a wasi-http component, should it be bind http-wasm's client? Would that be less or more confusing? Also, http-wasm is not using component model or codegen to produce its ABI. Personally, I think it is very strange to mix things like this in the same go package, feels like putting jboss inside spring. |
I think since all folks here are developers, it is worthwhile actually trying to code the thing, and also review the codebase of the dependencies. You can then come to your own judgement about how the guest works (basically generated tinygo+cgo), how the server works (how and if anything is sandboxed). Basically, you shouldn't assume there's any similarity between this and http-wasm just because it is wasm. All new things need work, wasi-http needs a lot of work, and so will the http-wasm client (we need it anyway for those porting off of proxy-wasm). However, talk is cheap. Please try and see what you are buying. Having tried these things, I highly recommend a different component. It really doesn't make sense to mix them. However, despite raising PRs and maintaining for a long time, I'm not a maintainer. I don't really have a say either way. Just I implore you to look closely as there are long term maintenance aspects to these sort of decisions. Coupling these together puts me in the middle of that, but also you. |
Thanks for the discussion here. I think that there are several different threads of discussion, so I will try to address them all. As a background, here is the WasiHTTP spec: https://github.com/WebAssembly/wasi-http sandbox/security modelI think that it makes sense for http-wasm vs wasi-httpI agree that http-wasm and wasi-http are likely to be duplicates of each other for a period. We're in a time period where there are going to be multiple implementations and different users are likely to make different decisions about which ones to support. I agree that mixing them together in the same WebAssembly module is a little weird, so I would be happy to make it exclusive, e.g. if you enable I don't think there is currently enough consensus in the community currently to choose one or the other exclusively, so I think it makes sense for Dapr to support both until there is convergence. use-case/userI think the general use case is that you want to interact with Dapr or other services from the middleware, e.g. you might want to make a call out to a security scanning API, or an audit trail, or any other sort of external service. Additionally though this capability is being added into the middleware currently, I also think it is very useful to add it as a way of defining Dapr enabled services so that you don't need to have a sidecar. we can work on finding a user for this functionality, but of course until the functionality is available, it's pretty hard to get someone to use it. |
To address @codefromthecrypt concerns, I'm happy to break apart the code into the parts that are wasm middleware generic, and the parts that are specific to http-wasm or wasi-http Also, there was a request for working example code, the working example code was referenced at the beginning of the PR, but for clarity, you can find it here. |
So, there are currently users of wasm, so best case is someone already using it, who wants this feature. This is how the other features were added (for example, when we developed http-wasm it was in response to a dapr user requesting access to more properties). Doing things without end users can take attention off other things, as we all have a lot of different projects. Waiting until at least someone who is an actual wasm dapr user asks, someone who would try it and use it themselves is a nice switch. Because no user asked is the only reason http-wasm doesn't have this. Some maintainers have thought maybe.. but yeah no end user so far. Mainly, this unveils the use case, as without the concrete use case, we end up adding something maybe a bit random or tactical. For example, it is indeed the case http clients can be used for a tactical implementation of dapr sdk until a wasm one is made. If it is about that, the use case simplifies sandbox (e.g. to only localhost) Without a specific real user/ use case, the time spent can be oversolving. Also, a user may find the approach not compatible with them.
The example is not normal in go projects, if we are honest. We spent a long time on the guest for tinygo to make it normal. The current example for wasi-http involves externally generated codegen. That generator isn't like buf.build easily integrated in go, it is opaque and there isn't a Makefile or go:generate or instructions on how to generate the guest side. Moreover the guest side isn't normal code, it is CGO with tinygo. I think an end user would likely ask this to be normal, and not accept it how it is. IOTW, the investment to make this Go friendly hasn't happened yet, and I think it should. It should be a normal go import, not requiring object files checked into source code etc. SummaryI think more investment needs to be made, as this seems very proof of concept, and I haven't seen any evidence of that changing. In general, not just in dapr, the guest needs to become idiomatic. Otherwise, we are creating a poor experience in a place which was easy, unit tested and benchmarked. Without end users asking for this yet, I don't feel like steps must be skipped. We spent a lot of rigor on the current, and it seems odd that rigor isn't affordable on wasi-http. |
For the use case, please remember that we also have a new output binding that allows executing arbitrary WASM code. I think the ability to make network calls from WASM from the binding can be very helpful. |
@ItalyPaleAle I get you for sure. Can you please raise an issue about outbound HTTP because I think we are discussing this backwards. We can then ping the other people who have asked about http middleware in the past, and collect what they need and how/if they care about sandboxing, if a localhost-only sandbox is ok, etc. Meanwhile, @brendandburns maybe can work with wasi folks to see about investing on dev/x so that the impl is less polarizing. |
The reason I ask this is that all the work I do is with end users. For example, they may think they can make an SDK with this, only to find out that protobuf doesn't work or something else. Like I don't think hello world is a good user, we can converse and find all the blockers, not just raw HTTP. This will get whether they expect sequential calls or to use goroutines (even if the runtime will act as GOMAXPROC=1). I've been through this a lot and we have to be ready for the real problem, not the easy one. POCs are easy, it is solving it through why I spent literally person months on this with you all so far. |
Until we have a proper issue on outbound HTTP, there was a discussion about bindings. Main difference is that http-wasm is FFI and output binding is subprocess model. The model changes things, even if you can use FFI inside a subprocess model.. just it reduces the value a little as you aren't leveraging the compiler, rather application layer integrations. For output bindings, the wasi model is subprocess (driven by os args/env), like containers. In the python use case, it is unlikley to change that to use wasi-http IIUC, they already are doing raw outbound sockets. For example, python wasi builds include two modes.. normal (inbound socket only) and wasmedge (more featured sockets). It is possible they are using wasmedge ABI and not outbound sockets, so that needs to be fact checked. https://github.com/vmware-labs/webassembly-language-runtimes/releases Anyway, those building their own wasi commands in go can do it in the "wasmedge' way (an extension of wasip1 like wasix), and use |
FWIW one day maybe the end of the year WASI "preview2" will happen, and it is likely that will include sockets like wasix does today. In WASI component model terms, this is wasi-cli world, which would be an alternative compilation target to the wasip1 used today (which wasmedge and wasix extend) The thing I like about things binding at syscall layer is that you don't have to design a new ABI for every kind of outbound concern. You also don't need users to have to override their round trippers etc. You just need to filter on something like It isn't either or, just when compilers and libc already support sockets, and you already have to deal with sockets because of this.. it is less additional work to support this approach. Ack that cloud hosting services will prefer an http outbound ABI, and maybe disable network services made by syscalls even if they are compiled in (by returning ENOSYS etc). However, I think similar to normal FaaS and containers today, not everyone will reject calls like this. crun wouldn't for example. Regardless, these are both ways to do outbound: syscall and application layer. Both can coexist, too, but we should know both will exist when thinking about design. (end soapbox and happy to write a doc on all of this) |
If there is a preference for putting this in bindings first, I can send a PR to do that. I don't want to argue about the devx of the I will argue strongly for an http call at the abi layer, though you can clearly do clients in WASM once you have sockets, you lose the ability to do fine-grained sandbox control at the abi layer. For example, a user of the http-client ABI can restrict the methods ( But let's focus on what is useful for Dapr, not argue about the intricacies of the web assembly ecosystem. In the context of Dapr, having an HTTP client enables you to talk to all of the other pieces of Dapr (e.g. state stores, other services, key systems, etc) I think unlocking that functionality within the Dapr process without requiring a side-car is a strong value proposition for the average Dapr user. |
Additionally, as @codefromthecrypt mentions: "For example, it is indeed the case http clients can be used for a tactical implementation of dapr sdk until a wasm one is made. If it is about that, the use case simplifies sandbox (e.g. to only localhost)" This is the main value proposition for me, and so if it simplifies the discussion around merging this initial implementation to restrict the client to localhost by default, that is totally fine with me. |
With the premise that I am not a WASM expert at all, and I do not fully understand the implications of the various things discussed above... As a maintainer of this repo I would prefer for whatever solution is implemented to be available in both the binding and the middleware. This is to avoid confusing our users with nuances in what is supported. Unless, of course, you experts believe for technical reasons that's not possible. |
I think that I work with most Go wasm end users at the moment. What I don't want to do is have to divert a large amount of time to an ABI that is not stable and has no near term plan to be. This diverts attention from outstanding issues in Dapr, for example, finishing the known parts of bindings etc. I can't want to warrantee work that is not stable and not developed in a way that is of similar quality to the things that are already difficult to maintain. It seems very important to give Brendan a playground here. What I would suggest is to complete the abstraction of sandbox config, and allow the part that generically handles runtime configuration to add his project into it. Even if you do that, please do not document this at the http-wasm level, as it is not in any way like the existing code. That it happens to work, experimentally, is fine. Would that work? |
Here are some points that have been refered above:
Furthermore, here are some personal opinions: Since http-wasm will support HTTP clients, introducing wasi-http into http middleware is not necessary and adds extra complexity(also wasi-http is still in an early Phase 1). Of course, if brendandburns is interested, I think it could be implemented as a completely different component. For wasm binding, I believe introducing wasi-http could be useful because it enriches functionality. Some end users may try more radical technical solutions. http wasm middleware also started with a basic solution and then evolved based on user usage and feedback. Adding wasi-http would have no impact on the existing work. If we add a CRD configuration for wasi-http, this will be completed in a subsequent PR. |
It sounds like there is rough consensus around two things:
Does that make sense to everyone? If so I will refactor this PR to achieve those to goals. |
I think the cleanest way out is to make a different package as it would be a separate component. "middleware/http/wasm" was less redundant than "middleware/http/http_wasm", but if it helps to clarify things, we can rename the existing package to this and make "middleware/http/wasi_http". I really think that making wasi-http a virulent top-level config is a compromise but not a good way out for anyone except those who want to demo a partially functional wasi-http. Particularly, it can cause version locks with wazero, wasi-go itself is experimental, as well wasi-http within it. Even wazero I was once told "don't patch this anymore" and it really seems strange people are ok now with layers of experimental stuff as virulent config. I don't really want to spend much more energy on this topic as it displaces the backlog and we have people waiting for other things for several months. ITOW, you (project team) decide what to do |
I think the following are certain for adding wasi-http in middleware. There is no obvious benefit to adding wasi-http, even added it may be a little difficult to attract real end users for now. Since we have some in-progress PRs, supporting this also brings extra review and maintenance work. My suggestion is to temporarily abandon the use of wasi-http in middleware and reconsider this in the future when it is production-ready or has actual users. I think it may be easier to reach a consensus by first considering support in the binding component. Also, considering implementing a pluggable component may be a better start, as it will not affect the dapr codebase, so we don't have to worry about version locks with wazero. Then, with users' usage and the improvement of the component, it is entirely possible to migrate to build-in in the future (Just an idea, I am open to this since the combination of support wasm binding in pluggable also seems weird). One more thing, a good demo and document may better attract potential users. |
cc @daixiang0 |
I am happy to see wasi-http support in wasm middleware when it becomes stable, no longer experimental. For now, I do not see any real benefit to use wasi-http like performance improvement, or any end user want to use it (if I miss, please point out). @Taction when will http-wasm support HTTP client? That would be cool and support more cases. @brendandburns thanks for your contribution, could you create an issue for this to track? |
@daixiang0 so no one has formally asked in any issue in any repo for http client in http-wasm, yet. @yaron2 mentioned incidentally to me once that if http-wasm had the client dapr would probably use it. Also, some folks have mentioned in abstract that for envoy proxy-wasm ports they might also. However, as it is a high effort thing to do well, I was waiting for someone to care enough to have something concrete in mind and at least raise an issue to request it. For example, when we developed the host side, we had concrete use cases and they are in the test suite. TL;DR; if you want the client, ask and say what use cases need to be supported is a good first step.. the most specific the better as they end up as tests and benchmarks. The issue can be in the host repo, the site/abi repo, any of the repos really. Of course if you raise it and it goes nowhere, or not fast enough.. there's your justification to split things. However, because this org was created for dapr (owners are also dapr members), seems worthwhile to at least raise an issue. |
Ok, got it. |
My understanding of wasm from the development ecosystem maturity perspective is basic, I am happy to see long time Dapr+Wasm contributors giving their opinions here. I will say that the ability to call back into the Dapr sidecar via HTTP from a middleware executed Wasm module is valuable, as it gives developers the ability to inspect the incoming request empowered by knowledge of external state and the ability to publish events when a policy in the middleware needs to notify other actors of the request evaluation, as some very concrete examples. While this PR adds a valuable capability for developers writing middleware, I (and I believe other maintainers on this repo) might prefer to adhere to guidance provided by long standing Wasm contributors to the project. |
Thanks @yaron2 for any and all approaches, examples are good.
So which normal go code does this now (in dapr) because how it works changes things (is event small ok to buffer fully, proto, json, etc)? these things come out in the examples and help us work backwards for the http layer, but not skip steps for the full picture. For example, some things plainly don't compile or run regardless of http. The examples I find easiest are pointers to an existing code already in go that if converted says "yes" |
I agree with @yaron2 here. I am happy to see this contribution and I agree that, even in absence of an explicit user request, this is a valuable feature, and something that ideally we want to see in Dapr. At the same time, in light of what I'm reading in this thread, I understand from the expert contributors that it may be too early for this feature. I hope that we can still see support for making HTTP calls from WASM in Dapr when the ecosystem has matured enough. Hopefully @brendandburns or someone else will be able to help contributing that. |
Two things to add:
The basic code looks like: res, err := client.Get("http://127.0.0.1:3500/v1.0/state/inmemory/count")
if err != nil {
panic(err.Error())
}
defer res.Body.Close()
count := 0
if res.StatusCode == http.StatusOK {
data, err := ioutil.ReadAll(res.Body)
if err != nil {
panic(err.Error())
}
count, err = strconv.Atoi(string(data))
if err != nil {
panic(err.Error())
}
}
count = count + 1
res, err = client.Post("http://127.0.0.1:3500/v1.0/state/inmemory", "application/json", makeBuffer(count))
if err != nil {
fmt.Println(err.Error())
} That is idiomatic Go code for performing basic HTTP, but it's using the wasi-http transport. There's more that we can do to clean that up including sending in some environment variables for the Dapr host and port, but I think that it is a reasonable place to start. Given that the complete Dapr API is available over raw JSON + HTTP, I think that this is a viable path forward for now to enable basic connectivity to Dapr from WASM middleware, I think that is a valuable use case. Additionally, this client will support a wide variety of languages, I'm happy to provide examples in Typescript, Go, C, .NET or Rust, all of which are supported for |
TLDR; Suggest putting this in bindings first, adding a well documention. Collect user's feedback and fix the issues of upstream sdk that may arise in this process. Then it would be a good time to bring it into middleware also. Not only the ABI standard but whether the Go host SDK is production-ready and is actively maintained are important factors too. I've had a cursory look at the abi and upstream repositories, it is definitely fantastic, but I think that dapr is probably an early adopter, and there may have potential issues with the upstream sdk, so I incline not to introduce it to all wasm components at once. Therefore, I hope we can first introduce it into the wasm binding, where we can collect user feedback and improve the upstream sdk. Then integrating it into middleware will be a relatively simple task. For maintainers who may concern, I think it is acceptable for this temporary inconsistency between middleware and binding. End users may not be familiar with wasm, but they may be willing to try new features. When a user discovers this feature supported in binding but also wants to use it in middleware, then we will have end users with use cases. The end users' use case is also important, for example, the demo code is definitely a good example, but should never be used in production. In practical use, wasm often has multiple instances, so the example given may have concurrency issues in a real environment. As an example it is good, but use cases and good documentation are also important. Additionally, I think I have misunderstood that http-wasm is going to support http client, apologize for any misinformation it may cause. I created an issue upstream to discuss the possibility of http-wasm supporting an http client and the definition of its ABI. Proxy-wasm also has similar capabilities, which I think would be an enhancement for http-wasm. |
fwiw, in case it wasn't clear, I'm the one who wrote the Sounds like output binding is the best place to start, so I will send a different PR to add that capability. |
@brendandburns Thanks for the commitment and bringing wasi-http to Dapr, also all the awesome work you have done! This is really useful in Dapr. I am working on the global CRD config of Wasm in Dapr, and feel free to ping me if there is anything I can help. |
#3007 adds this to the wasm binding. |
@brendandburns since the other PR was merged, do you think this is still relevant or should we proceed to close it? |
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Description
This adds the optional ability to make HTTP calls outbound from WebAssembly middleware.
There is an example of such a middleware here:
https://github.com/dev-wasm/dev-wasm-go/blob/codespace-brendandburns-orange-sniffle-4r79rxw4x5h76xr/dapr/main.go
Which uses an
inmemory
Dapr state store to count the number of requests.Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list: