Capabilities granularity too low #1
Comments
@sunfishcode hm, you're totally right, Turing-complete stuff in security is a nightmare. In my opinion it is nonetheless possible to engineer it (the syntax, the bytecode, etc. - see Ethereum IR) in a way that makes it easy to mathematically prove its correctness. I have to admit I find the current proposal still too coarse (but sane on the other hand 😉). Maybe we should put most of the effort into making capabilities very easily extensible in the future (both on the WASI specification level and especially on the WASI implementation level, reflecting fine-grained needs - I can imagine e.g. some minimal WASI implementation and then some security-oriented WASI implementation which provides this additional fine-grained capability specification with a "standard library" of fine-grained capability building blocks). There are also more things to consider (this relates to the API as a whole, not just capabilities):
|
@sunfishcode Your explanation covers why you don't want to use iptables. It doesn't explain why capabilities are an API-level concern. |
File descriptors are just indices into a table of currently open resources, and are not inherently tree-oriented. The path APIs (currently named From feedback here and offline, it seems to make sense to at least move the path APIs into a separate module. If someone is interested in designing a database-oriented API, that would also be an interesting module (or modules) to consider. I don't have experience with multipath TCP myself, but from the blog post there, it looks like the main things needed are something like |
The answer depends directly on the "granularity depth" of all rights/capabilities. The issue with multipath TCP is that the "view" depends on the needs of the programmer/user at the moment. I.e. whether she wants to see So I believe we need to offer both in the API (I don't dare to propose what such an API should look like, since with virtual networking we could in theory build even deeper trees of TCP (sub)flows...). |
In that case a Also Therefore a more general term like |
POSIX does use the term "file descriptor" for sockets, shared memory, and other resources. It's well-established traditional usage, though I can also see how it can cause confusion. In some UNIX circles the terms "object descriptor" or just "descriptor" appear, although they're sometimes used interchangeably with "file descriptor" even in the same document. I'd be ok switching to these other terms if there's general support for it; my only concern would be that they may not be as widely recognized. My guess is that |
The low granularity that exists is really nice. The parametricity / separation logic arguments one can make are both stronger and simpler, and, coupled with https://webassembly.org/docs/future-features/#multiple-tables-and-memories, will support a rich distributive lattice of "processes" and capabilities, rather than Unix's meager discrete processes. @npmccallum If you want more legacy support, that could be some standardized extension. Proposing that certain functions always return Speaking of layering, has there been any effort going back to CloudABI and changing things to match on their end? Until/unless WASI starts leveraging the specifics of WASM's memory model, it would be nice to have a defined interface orthogonal to WASM itself. Even once WASI does employ that stuff, it would be interesting to keep around the "old WASI" as a layer above or below (depending on how things shake out). It is my experience from both Rust and https://github.com/NixOS/nixpkgs/blob/master/lib/systems/parse.nix that combinatorially exploding interfaces are good for portability, as they encourage a fine-grained capability-like view of portability (depend on exactly the functionality you need, not on some arbitrary Boolean expression of larger interfaces that happen to provide it). |
Do I understand it correctly, that the lower granularity of capabilities will then need to be managed on a lower layer in which the whole WASM runtime will run? This basically says, that to achieve higher capability granularity we'll need to deconstruct WASI apps to "services" (or alike) even smaller than the current usual "microservices" and undergo the burden of specifying in a non-portable non-standard way (each combination of a specific operating system and a specific WASM runtime implementation will have their own way) additional capabilities for the separate "microservices" (corresponding to separate memories/tables) in WASM runtime. I hope I completely misunderstood as this would basically make the current situation even worse (because of the pressure to decompose everything into even many more "microservices" than is usual).
👍 |
@dumblob I loathe "microservices" so I hope not :). I could rant for reams on why, but I think the short answer here is that normal processes/containers are bad because while the granularity of encapsulation is small, the granularity of degrees of isolation is huge: mmap and whatnot is too hard to use on its own, so the app devolves to share everything or share nothing. Share nothing means business logic gets balkanized amid endless marshalling nonsense, and all hell ensues. The dream with capabilities is basically |
Sounds good to me, but I still can't figure out how could I accomplish that with the current "low granularity" WASI API 😢. |
Can a WASI process (are they called "processes"?) implement a file descriptor that another process calls? Normally in capability systems (and, indeed, object-oriented programming in general), the way we theoretically support arbitrary granularity is to allow anyone to implement their own classes representing whatever granularity they need, possibly layered on top of a coarser-grained underlying API. In a capability-based operating system API, this means one process should be able to create a file descriptor such that when another process performs operations on that file descriptor, they end up calling back into the first process. Of course, this context switching may be a performance problem for many use cases. So we then add optimizations for common use cases. The platform/runtime has built-in support for these common cases, and can add support for new ones that prove useful over time. There's no need to predict everything in advance -- developers can get started implementing types "in userspace" and then push for runtime support later as an optimization.
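To make the "implement your own descriptor" idea concrete, here is a minimal Rust sketch. Everything in it is hypothetical and not part of WASI: the `Descriptor` trait, the `Bytes` and `Limited` types, and `DescriptorTable` are invented for illustration. It assumes a runtime that dispatches reads through a table of trait objects, so that a finer-grained descriptor (`Limited`, which exposes only a prefix of the data) can be layered on a coarser one without the runtime knowing about it in advance.

```rust
use std::collections::HashMap;

/// Hypothetical interface that whoever implements a descriptor provides.
pub trait Descriptor {
    /// Copy up to `buf.len()` bytes into `buf`; return how many were copied.
    fn read(&mut self, buf: &mut [u8]) -> usize;
}

/// Stand-in for a coarse, runtime-provided resource: an in-memory byte source.
pub struct Bytes {
    data: Vec<u8>,
    pos: usize,
}

impl Bytes {
    pub fn new(data: Vec<u8>) -> Self {
        Self { data, pos: 0 }
    }
}

impl Descriptor for Bytes {
    fn read(&mut self, buf: &mut [u8]) -> usize {
        let n = buf.len().min(self.data.len() - self.pos);
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        n
    }
}

/// A "userspace" descriptor layered on another one: it lets the holder
/// read at most `remaining` bytes in total from the inner descriptor.
pub struct Limited {
    inner: Box<dyn Descriptor>,
    remaining: usize,
}

impl Limited {
    pub fn new(inner: Box<dyn Descriptor>, limit: usize) -> Self {
        Self { inner, remaining: limit }
    }
}

impl Descriptor for Limited {
    fn read(&mut self, buf: &mut [u8]) -> usize {
        let want = buf.len().min(self.remaining);
        let n = self.inner.read(&mut buf[..want]);
        self.remaining -= n;
        n
    }
}

/// Hypothetical descriptor table: integer descriptors map to implementations.
pub struct DescriptorTable {
    next: u32,
    entries: HashMap<u32, Box<dyn Descriptor>>,
}

impl DescriptorTable {
    pub fn new() -> Self {
        Self { next: 0, entries: HashMap::new() }
    }

    /// Register an implementation and hand back its integer descriptor.
    pub fn insert(&mut self, d: Box<dyn Descriptor>) -> u32 {
        let fd = self.next;
        self.next += 1;
        self.entries.insert(fd, d);
        fd
    }

    /// Dispatch a read to whatever implements `fd`; `None` if `fd` is unknown.
    pub fn read(&mut self, fd: u32, buf: &mut [u8]) -> Option<usize> {
        self.entries.get_mut(&fd).map(|d| d.read(buf))
    }
}
```

The caller of `DescriptorTable::read` cannot tell whether a descriptor is backed by `Bytes` directly or by a `Limited` wrapper, which is the property the comment above asks for. In this sketch the dispatch is an in-process call; a cross-process version would add the context-switching cost mentioned above.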
Terminology nitpick: In the capability-based security theory sense, we would say that file descriptors are in fact unforgeable, in that you cannot reference some other process's file descriptors by simply using the same numbers. Compare this to capability systems based on secret strings (e.g. API keys), where anyone in the world who knows the secret string can access the capability. We say that secret strings are "forgeable" although still "unguessable". |
Two wasm instances can share a WASI file descriptor index space. And right now, if you want to pass an open file from one instance to another, it's required that they do. So in the current system, instances can indeed forge each others' file descriptors. To the extent that instances are being used like processes, that's not ideal. (This is one of the reasons I've been saying that WASI isn't Unix. WebAssembly doesn't have a Unix-style process concept. There are fundamental differences, many of which derive from the ES6 module system that wasm inherits from JS.) In the future, reference types are coming to wasm. These are unforgeable values, in every sense of the word. One function can't even forge a reference held by another function in the same instance. This is the fine-grained capability model that some people are really excited about. And, instances can pass references to each other directly. This will likely become the core capability value primitive of WASI, with integer file descriptors being interpreted as indices into a table of references (though actual implementations may do other things internally). |
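The "integer descriptors as indices into a table of references" model can be sketched roughly as follows. This is an illustration only, written in Rust: `Resource` and `Instance` are hypothetical names, and `Rc` merely stands in for a wasm reference value. The assumption is that each instance has its own table, so an integer only has meaning relative to the table it indexes, and a resource moves between instances by passing the reference itself rather than the number.

```rust
use std::rc::Rc;

/// Hypothetical resource that a reference points at.
#[derive(Debug)]
pub struct Resource {
    pub name: String,
}

/// Hypothetical instance with a private table of references.
/// Integer descriptors are just indices into this table.
#[derive(Default)]
pub struct Instance {
    table: Vec<Rc<Resource>>,
}

impl Instance {
    pub fn new() -> Self {
        Self::default()
    }

    /// Store a reference this instance was given; return its local index.
    pub fn install(&mut self, r: Rc<Resource>) -> usize {
        self.table.push(r);
        self.table.len() - 1
    }

    /// Look up a local index; `None` if this instance holds nothing there.
    pub fn get(&self, fd: usize) -> Option<Rc<Resource>> {
        self.table.get(fd).cloned()
    }
}
```

With per-instance tables, an instance that guesses another instance's index gets nothing back; it only gains access once the other side hands over the reference and it installs that reference in its own table. With a single shared index space, as described above for the current system, the same guess would succeed.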
Whoa, that's exciting! |
This sounds quite solid and finally like a "full-featured" capability system (with more fine-grained possibilities), thanks. |
Just for the record, Wasm modules were not derived from ES6 modules, although there are natural similarities. There are substantial differences, too. Particularly relevant to this discussion is that Wasm modularity is much stronger, by the virtue of every module being completely closed by construction, i.e., not having any reference to an ambient outer scope or library; there are only imports, which are easily controlled. |
@rossberg Call it "designed from the outset to be compatible with" instead of "derived from" then :-). The original questions here seem answered. Certainly there's more to talk about on the topics of capabilities and networking, but these can proceed in #20 and elsewhere. Feel free to file new issues to raise new topics and questions! |
@npmccallum @dumblob
Continuing the discussion from bytecodealliance/wasmtime#90, now that we have a dedicated repository (GitHub doesn't permit issues to be transferred between repositories).
"Put the runtime in a cgroup" is a privileged operation on typical Linux systems. And, it requires a dedicated process, which many WebAssembly runtimes won't otherwise need -- WebAssembly's ability to be easily sandboxed without a process boundary is one of the key things many people are interested in it for (caveat: Spectre is a complex topic).
Lin Clark's blog post about WASI has an explanation of the capability model and why we're pursuing it for WASI. In many use cases, it's possible to set up cgroups or AppArmor or SELinux or other things; however, these systems all have enough obstacles (configuration complexity, and the need for privileged operations) that in practice they aren't always used. And, they're all process-oriented, while the capability model allows WASI to provide more fine-grained protections. And they're all tied to Linux, while we expect WASI will be desirable in a very diverse set of environments.
(File descriptors are integers, and are therefore forgeable, but (a) even so they still provide some protections, and (b) in the future WASI will be able to represent capabilities as references which aren't forgeable.)
Assuming all of this is sensible, the next step is to apply these concepts to networking. We could create the analog of a directory for TCP-like sockets, which might look like a set of address/port/address/port/protocol tuples, possibly involving wildcards and/or netmasks or so, bundled up into a "capability", and presented to application code in the form of a file descriptor. That would then be the basis for something like a bindat system call, which would be to bind as openat is to open.
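As a rough illustration of what such a socket "directory" could look like, here is a small Rust sketch. None of it is an existing WASI API: `BindRule`, `NetCapability`, `BindError`, and the `bindat` method are hypothetical names, and the sketch simplifies the tuple idea to local IPv4 address, netmask, and port range only. It checks a requested address against the rules and leaves the actual socket binding out.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};
use std::ops::RangeInclusive;

/// Hypothetical rule: an IPv4 network (address plus prefix length) and a
/// port range. A prefix length of 0 acts as a wildcard for any address.
#[derive(Debug, Clone)]
pub struct BindRule {
    pub network: Ipv4Addr,
    pub prefix_len: u8,
    pub ports: RangeInclusive<u16>,
}

impl BindRule {
    fn permits(&self, addr: &SocketAddr) -> bool {
        let ip = match addr.ip() {
            IpAddr::V4(ip) => ip,
            IpAddr::V6(_) => return false, // IPv6 is left out of this sketch
        };
        let bits = u32::from(self.prefix_len.min(32));
        let mask = if bits == 0 { 0 } else { u32::MAX << (32 - bits) };
        let same_network = (u32::from(ip) & mask) == (u32::from(self.network) & mask);
        same_network && self.ports.contains(&addr.port())
    }
}

#[derive(Debug, PartialEq)]
pub enum BindError {
    NotPermitted,
}

/// Hypothetical capability: the set of addresses the holder may bind to.
pub struct NetCapability {
    rules: Vec<BindRule>,
}

impl NetCapability {
    pub fn new(rules: Vec<BindRule>) -> Self {
        Self { rules }
    }

    /// Hypothetical analog of `openat` for binding: succeeds only for
    /// addresses covered by this capability. A real implementation would
    /// bind a socket here; the sketch just returns the validated address.
    pub fn bindat(&self, addr: SocketAddr) -> Result<SocketAddr, BindError> {
        if self.rules.iter().any(|rule| rule.permits(&addr)) {
            Ok(addr)
        } else {
            Err(BindError::NotPermitted)
        }
    }
}
```

For example, a capability built from a single rule for 127.0.0.0/8 and ports 8000 through 8999 would accept 127.0.0.1:8080 and reject both 127.0.0.1:80 and 10.0.0.1:8080, much as `openat` rejects paths outside the directory it was given.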