use method schemas to limit inbound argument shapes #5955
Comments
idea: liveslots gives the schema checker the serialized arguments and a function named …

One downside is a reduction in generality of our platform: promises-in-arguments becomes the exception, rather than the rule, and userspace must do something explicit to accept them. Objects without a schema would receive lame Promises, which would never fire, and that would be hard to discover. OTOH, if the schema is expressed as part of the `Far()` declaration, we could say that the default schema just accepts everything and subscribes to all promises it gets, maintaining the status quo (and local method-bearing objects cannot be serialized at all unless they're explicitly marked as Far).

We'd be increasing the coupling of liveslots and userspace a bit: liveslots currently accepts anything that …

Ideally all of this would be driven by userspace calling …
absent mailbox access, the scope goes down
From Wallet meeting today:

Done: #6098

@erights Is your side of this done?

In review at #6057
What is the Problem Being Solved?
#5954 seeks to reduce memory-consumption attacks by unilaterally rejecting old unresolved inbound argument promises. It assumes that most inbound argument promises are short-lived, so any that survive long enough are presumed to be part of an attack. If userspace never saw the doomed promise, it will not notice the (surprising, not kernel-sanctioned) rejection.
If an attacker sends:
(or, to avoid a per-message size limit constraint):
and the victim implements:
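A minimal sketch of that attack shape, in plain JavaScript (the `victim` object, `setOffer`, and the 10,000 count are invented for illustration; the real messages travel through liveslots, not a direct call):

```javascript
// Hypothetical victim object: its method declares (and uses) only one
// parameter, so JavaScript silently ignores any extra arguments.
const victim = {
  setOffer(offer) {
    return `accepted ${offer}`;
  },
};

// The attacker appends thousands of never-resolving promises. The method's
// result is unaffected, but the deserializer (liveslots) must still track
// every one of those promises unless it can learn they went unused.
const extras = Array.from({ length: 10000 }, () => new Promise(() => {}));
const result = victim.setOffer('offer-1', ...extras);
// result === 'accepted offer-1'; the 10,000 extras were never observed
```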
then those extra promises will be ignored, and we use the #5954 technique to eventually reject them and shed the memory obligations. This wouldn't kick in until some number (perhaps 10,000) of promises had accumulated that way, to reduce (but not eliminate) the risk of rejecting an innocent promise that just took too long to resolve.
But it would be great if we could learn that they were unused by userspace earlier, so we could remove them from `importedVPIDs` right away, during the `dispatch.deliver` that unserialized them a moment before.

@erights is working on a scheme to allow Far objects to declare a schema. This would name all of their exposed methods, and for each one it would define a pattern that limits the arguments it accepts. On one level, this acts as a standardized declaration of argument preconditions: the kind of thing we usually implement with a batch of `assert`s at the top of the function. The primary goal is safety and brevity: methods could declare their schema instead of writing a set of asserts (which might be incomplete). It might be possible to extract TypeScript data about the method and semi-automatically construct the schema record.

But if this structure is visible to liveslots (or other holders of the Far object), then it could also be used to test recently-unserialized arguments before invoking the target. If liveslots knew that the schema was not met, it would also know that the userspace function was never invoked. Therefore any new promises in those arguments could not possibly have had their `.then` invoked, removing liveslots' obligation to track and resolve them. The liveslots `dispatch.deliver` code would look something like:

[code sketch elided; the surviving fragment mentions obtaining `makePipelinedPromise` somehow]

This could either complement the #5954 work, or serve as an alternative. If attackers cannot submit extraneous arguments or properties to userspace, they can't trick it into retaining extra data (depending upon how tight the schema limits are), and if liveslots can learn that it doesn't need to track the promise, they can't force liveslots to retain it either. If both were implemented, we might hope that innocent promises would never get rejected.
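A toy model of the check-before-invoke flow described above (every name here, `deliver`, `getSchema`, the `check` method, the vpid strings, is invented for illustration; this is not the real liveslots code):

```javascript
// Record which argument-promise vpids liveslots has promised the kernel
// it will track; subscribing is the obligation we want to avoid.
const subscribed = new Set();
const syscall = { subscribe: vpid => subscribed.add(vpid) };

function deliver(target, method, args, argVPIDs, getSchema) {
  const schema = getSchema(target, method); // per-method argument pattern
  if (schema && !schema.check(args)) {
    // schema not met: userspace is never invoked, so no .then() can ever
    // run; drop the argument promises instead of subscribing to them
    return { status: 'rejected', reason: 'arguments do not match schema' };
  }
  // only the surviving promises impose a tracking obligation
  argVPIDs.forEach(vpid => syscall.subscribe(vpid));
  return { status: 'ok', result: target[method](...args) };
}

// a trivial stand-in schema: exactly one string argument
const onlyOneString = {
  check: args => args.length === 1 && typeof args[0] === 'string',
};
const target = { greet: name => `hello ${name}` };
const getSchema = () => onlyOneString;

const ok = deliver(target, 'greet', ['alice'], ['vp-1'], getSchema);
const bad = deliver(target, 'greet', ['bob', 'extra'], ['vp-2', 'vp-3'], getSchema);
// ok.result === 'hello alice'; bad.status === 'rejected'; only vp-1 subscribed
```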
One concern is how practical it is for a vat to fully-schema-limit every externally-visible object. These are all defined by Far, but since schemas aren't even implemented yet, it may be a while until all vats (and contracts, and supporting libraries) have added schemas for all their Far objects. Until that point, any such unadorned object would be a vector for spurious arguments, and thus excess memory consumption.
Foolscap "Constraints"
Incidentally, this overlaps precisely with my own Foolscap serialization work, some 15-ish years ago. I designed the constraints system to work closely with the deserializer, so the inbound wire parser (tokenizing length-prefixed atoms) could reject a token that was too large to meet the constraints of the next node of the expected object graph. This allowed the deserializer to compute the largest memory burden an attacker could impose before either 1: the token was accepted as meeting the schema and added to the growing argument object graph, or 2: the connection was dropped.
In practice, I found it difficult and annoying to impose meaningful constraints on enough exposed objects to keep this largest-burden limit below infinity. And tight constraints were more likely to accidentally fire on moderate-sized lists than to catch actual attackers. I was advised to replace the one-token-at-a-time defense with a per-user memory-limited proxy process: basically, pre-deserialize the arguments for any particular method in process A (which operated under a `ulimit` memory constraint); then, only if that didn't kill the process, the original serialized data would be sent to the real process B (where now we know an upper bound on the memory required to deserialize).

I never got around to implementing that scheme (nor ripping out the earlier one entirely), partially because it required coordination beyond a mere communications library (spawning processes, setting OS-specific `ulimit`s), and partially because it exposed a difficult question of who the guilty principal should be. Which messages should share a process A, such that one oversized message would prevent the others from being delivered?

In Foolscap the easy answer was the connection: Tub X sends messages to Tub Y, and Y would react to an oversized message by severing the connection from X. And that's how the Constraint/per-token scheme did it. But that gets us into the same sort of provoked-attack-hurts-innocent-victim issue that #5953 is concerned about. If X is relaying (or incorporating data from) messages from W, and W is the one who decides to attack, then X may lose the connection even though W is to blame. We could say that X should look at its messages more carefully, but then we're also obligated to provide tools to help it do so.
In Foolscap, this was improved somewhat because each link between Tubs applied the constraints on both ends: during serialization on the sender, as well as during deserialization on the receiver. A well-behaving client would get a local error if it violated the constraints, which would reject the message but not drop the connection. (The object graph was serialized in a streaming fashion, so the good tokens were already sent over the wire, but the protocol included a cancellation/pop-the-stack atom, so the recipient could be told to discard the partially-received-but-not-yet-schema-violating object graph before the memory limits were exceeded: this salvaged the connection for later messages). Of course, the real protection is applied when the message is checked again, by the remote side during receipt, and the consequences for violation at that point are more severe/traumatic. In that scheme an unwitting relay could still be saved against connection drops by their own outgoing checks.
Description of the Design
liveslots (later)

I don't know what @erights's schema is going to look like, but what I heard sounded like it would overlap with the Patterns we currently use on Stores (either to limit what data can be put in them, or to write queries for `.keys()` and `.values()` that only return matching items). And I think they would be added as part of an augmented `Far()` function: either as an extra argument, or the `iface` argument might grow from just a string to some larger string+schema object. And there would be some API for taking a Remotable and submitting arguments for consideration.

Liveslots would need to know how much to trust this schema-checking facility. Is it under userspace control? Could it call back into liveslots (reentrantly)? Will it hold onto the arguments we submit for conformance? Is it guaranteed to be synchronous?
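One possible shape for that augmented `Far()` (this is not the real `@endo` API; `FarWithSchema`, `methodSchemas`, and `argumentsMatch` are invented for illustration, and plain predicate functions stand in for real Patterns):

```javascript
// Map each remotable to its declared per-method argument schemas.
const schemas = new WeakMap();

// iface may stay a plain string (today's behavior) or grow into a
// string+schema record; either way the remotable gets registered here.
function FarWithSchema(iface, methods) {
  const { methodSchemas = {} } = typeof iface === 'string' ? {} : iface;
  const remotable = Object.freeze({ ...methods });
  schemas.set(remotable, methodSchemas);
  return remotable;
}

// The API liveslots might use to test recently-unserialized arguments
// before invoking the target.
function argumentsMatch(remotable, method, args) {
  const pattern = (schemas.get(remotable) || {})[method];
  return pattern ? pattern(args) : true; // default schema accepts everything
}

const counter = FarWithSchema(
  {
    name: 'counter',
    methodSchemas: {
      add: args => args.length === 1 && typeof args[0] === 'number',
    },
  },
  { add: n => n + 1 },
);
// argumentsMatch(counter, 'add', [3]) === true
// argumentsMatch(counter, 'add', ['x']) === false
```

An unadorned method falls through to the accept-everything default, preserving the status quo for objects that have not yet declared a schema.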
This will influence how (and when) liveslots will call it. Obviously we must wait until after we've identified the target (which might be a promise) and deserialized the arguments. If we aren't worried about reentrancy or async, then we'd probably wait until after the schema check to `subscribe` to the surviving imported promises. If we can't do that, then we'll need some sort of `syscall.unsubscribe` to unwind the kernel-facing obligations that a rejected message should not have incurred.

Security Considerations
On the whole this should improve security, both by making it easier (and more legible/auditable) to define and enforce a method's expectations, and to give code earlier in the delivery process (e.g. liveslots) an opportunity to drop or reject things that won't end up being used. In the long run, we might expose these schemas to the kernel (they're a great candidate for #2069 auxdata, assuming we could avoid tracking a separate copy for each instance), from which we could push them upstream into other vats (as part of their Presence), which could do a schema check before even sending the message. That would result in faster (local) errors, better-correlated with the sender (still not in the same turn, but at least on the same vat).
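A speculative sketch of that sender-side check (`presenceSchemas` and `send` are invented; the real mechanism would live inside liveslots, with schemas delivered as auxdata alongside the Presence):

```javascript
// Imagine each imported Presence arrived with a copy of its methods'
// argument schemas, pushed upstream by the kernel.
const presenceSchemas = new Map([
  ['presence-1', {
    deposit: args => args.length === 1 && typeof args[0] === 'number',
  }],
]);

function send(presenceId, method, args) {
  const methodSchemas = presenceSchemas.get(presenceId) || {};
  const pattern = methodSchemas[method];
  if (pattern && !pattern(args)) {
    // rejected locally: no syscall, no network traffic, and the error
    // surfaces on the sender's own vat, better correlated with the caller
    return { ok: false, error: `${method}: arguments do not match schema` };
  }
  return { ok: true }; // stand-in for the real message-send syscall
}

const good = send('presence-1', 'deposit', [100]);
const bad = send('presence-1', 'deposit', ['not-a-number']);
// good.ok === true; bad.ok === false
```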
This is unlikely to help with return values, or with the resolution values of inbound Promises. Or, if the method schema does include things like "this argument must be a Promise that resolves to an Amount record", enforcement of the schema will cause a delay and/or ordering difference: we can't deliver that message until after the input promise has resolved. So, do we allow messages to be delivered out-of-order, to reduce the buffering we do? Or do we delay all subsequent messages, so they'll all be delivered in order? How many messages are we willing to buffer? Does that open up a memory attack against the schema-checking code itself?
Performance Considerations
I am worried about the performance cost of doing this schema enforcement. I think we have a history of adding nominally-invisible mechanisms like this, assuming they'll be cheap enough (based on our experience with "big" engines like V8/Node.js), and then not measuring the actual performance impact on a "small" engine like XS (which, in the end, is the one we actually care about). So I'd want us to build a benchmark and look at the numbers before enabling this in production.
The Constraint checking I added to Foolscap had a pretty significant performance impact (admittedly, some of that was overengineering in the serialization code).
Test Plan
Lots of unit tests, first for the underlying mechanism, and then extensive tests of the liveslots integration.