API to fork a vat (create "zygote vat") #2268
Comments
@katelynsills points out that currently Zoe's "install" merely stores the contract code in a table, while the "instantiate" step does all of (create a new vat, load ZCF into it, send the contract code to ZCF, ZCF evaluates the contract code, ZCF instantiates the contract). We'd need to split up that process just after the agoric-sdk/packages/zoe/src/contractFacet/contractFacet.js Lines 460 to 476 in 60a4adf
|
My instinct would be to follow the Unix |
Yes |
One open question: how should Promises be handled, if at all? Surely we cannot allow both parent and child vats to have Decider authority over the same promise. We might need to declare that any unresolved promises owned by the parent vat will be rejected during the fork. Maybe we can have the forker list those promises in

@FUDCo hm, the advantage of forking from the outside is that we know the vat is idle. If we let vats fork themselves from the inside, we'd need to.. hm, replicate the crank that did the fork? And have it return different values in the replica? So the syscalls could diverge starting from that point (we'd replicate the delivery, but not expect the transcript to match). Interesting. I'll have to noodle on that. |
@warner You make a good point about wanting to snapshot between cranks. Perhaps (now getting really speculative here) |
I think the promise issue is a symptom of a deeper issue that disqualifies a unix-like fork model. I think Moddable's build-time vs run-time split is the better fit. This is not part of xsnap btw, but is a good model we can follow with snapshots using xsnap. The deeper issue is that the execution prior to snapshot cannot have general entanglement with the outside world. As bad as the decider problem is, the problem of incoming references to objects hosted by the vat is at least as bad. Which of the two descendants of the fork is designated by that exported capability? During XS build-time, the capabilities to I/O devices are in scope but inactive. They don't lead to anything. After build-time is when they cut a ROM. Then each device gets a separate copy of the ROM. Only in the device do these I/O caps control actual devices. These devices didn't even exist in the build-time environment. Translating this into the running of zcf initialization up through |
@erights, does this imply that the vat needs to be marked as snapshottable when it is started, so the kernel knows not to give it access to capabilities that should be reserved for the children? |
Yes, I think so. Slogan: "Zygotes are not yet in the world" |
@erights My concept would be that when the vat launched from a snapshot starts up, there are no references anywhere else that point into it (with the possible exception of a root reference that is a product of the launch-from-snapshot operation itself), nor is it at that point the decider for any promise. It could, however, hold references in its snapshotted state to outside objects to which it could send messages to establish whatever connectivity it needed to do its job. I'm not sure how that compares to Moddable's model since I'm not yet familiar with that. |
In the API I wrote up above, the original exports remain firmly attached to the original vat. The act of calling

I concur that the amount of connectivity before snapshot should be minimal. Certainly whatever the vat-to-be-forked has access to needs to be non-specific to the variable descendants. If each contract instance has a distinct ZCF agent, and Zoe offers a different internal facet to each such ZCF agent, then the instantiation process needs to give Zoe a moment where it can create a new such facet. That might suggest changing the zoe/zcf relationship to avoid early backpointers.

My concern with the Moddable ROM model is it might limit us to "linear lineages" of vats, in which you can't re-use snapshots for multiple children: you could evaluate the code early, perhaps while you're idle, but you couldn't amortize the evaluation/startup costs among multiple vats which share an early upbringing. I think you need something like

The "IO devices which are new on each copy" are the moral equivalent of a vat import which needs to be replaced by some instance-specific version (the comity question). Rather than answering that question, I'd make the imports shared, and duplicate/replace the exports, which does require some careful planning to make sure the pre-fork vat doesn't have anything you want to withhold from the post-fork vat. During the fork, whoever drives the fork should create the new authorities that should only be held by the child, and deliver them after fork(). The handle with which you deliver them (to the child, and not to the zygote parent) is the |
This is all plausible if necessary. This is a large design space all by itself. I suggest that we first figure out how little connectivity we can get away with specifically for Zoe installations. That will let us know which subproblem we need to cover first. I am all in favor of more generality, but this will give us a more concrete notion of what we're generalizing from. I am hopeful that the answer is very little connectivity but we don't know yet. We also don't yet know what the shape of the very-little-connectivity is. I think we can figure this out quickly. |
snapshot after SES shim + lockdown(): some performance numbers

I made a few performance measurements: snapshots should save ~4x on SES startup time:
When considering how often to snapshot vs. how much of a transcript to replay, the time cost of writing a snapshot is relevant:
TODO: measure duration of some cranks. Code to do the measuring is on a branch: https://github.com/Agoric/agoric-sdk/tree/2268-xs-ses-perf 9075673 It writes something like...
I made a spreadsheet with a goofy timeline chart. See also #1318 (comment) for some notes on timing diagrams from @warner . One possible approach to drawing these diagrams: Slope Chart / D3 / Observable |
I think I see a way to approach this at the vat manager level: split It seemed like we should be able to load and evaluate all the code as pure modules and take a snapshot and then thread the parts together after resuming. But that's awkward: the communications port gets curried into a The code in The C code can replace
And Once |
note to self: this will require removing We use |
What is the Problem Being Solved?
Once we have XS snapshots working (#511, #2138), we should be able to reload a saved vat from snapshot faster than replaying the vat's entire transcript.
Most of the vats on our platform will be built in the same way:
With snapshots, we ought to be able to interrupt that sequence at any point, save the resulting XS state, and re-use it multiple times. This won't save any memory (each vat still gets its own XS engine, and its own heap), but it should save a lot of startup time.
Description of the Design
We'll need an API for this. What I'm thinking is that the vat-admin facet for a dynamic vat should acquire a `fork()` method. If you hold this facet, you can call `[newrefs] = await admin~.fork([oldrefs])`. For each `oldref` that was a Presence whose Remotable lived on the old vat, you'll get a new Presence that points to the corresponding object in the new vat. Any other exports will be unreachable (unless the new vat chooses to re-export them at some point). Any import that the old vat had access to will also be available to the new vat (for example, it might hold a reference to a timer service).

This method is named after the Unix `fork()` syscall, which treats file descriptors in a similar way. When a process forks, the FD table is duplicated, so both parent and child have access to the same descriptors at the same offsets. The Unix `fork()` is executed from the inside, however, whereas the swingset one would be applied from the outside.

We might also name this `diverge()` after the E method on Arrays and Maps(?) which makes a copy of the data into a new, separately-mutable object.

Zoe would create one new dynamic vat with the ZCF bundle, giving it access to the root object, but would refrain from ever sending any messages to it, preserving its independence from any specific contract. Then, when a new contract is installed (`zoe~.install(contractBundle)`), Zoe would `fork()` that undifferentiated "ZCF zygote" vat, creating a vat for a specific contract. Then Zoe would send a command to the root object to evaluate the contract bundle, returning an object that represents the per-contract but not per-instance state (which would include the ZCF root object).

Zoe would keep that "contract zygote" vat pristine too, keeping a table that maps from contract identity to a tuple of (admin facet, per-contract object). Then, when Zoe is asked to finally instantiate that contract (`zoe~.startInstance(installation, ...)`), Zoe would do something like:

and then figure out the per-instance objects by asking `perContractObject` to do per-instance things, finally getting objects that are indistinguishable from what it would have gotten if it had built everything from scratch each time.

It might be useful to couple this with some sort of explicit "offline vat" control. When Zoe creates the ZCF zygote vat, it's not going to interact with it, so it might tell the admin facet to save RAM by taking a snapshot and unloading the vat from memory, leaving only disk space in use (perhaps `admin~.takeOffline()`). Or, the worker scheduler might do this automatically when it sees the `fork()` command, or when it notices the zygote vat hasn't had any messages sent to it for a while.

Internally, the kernel needs to populate the new vat's c-list from the old one. Each import of the old vat is replicated (same kref, same vref, to match the state of the liveslots table that lives in the snapshot's heap state). The `oldrefs` are mapped to krefs, and then each export is compared against the set of krefs: for each match, a new c-list entry is created with the old vref and a newly-allocated kref. Then the krefs are put into resolution data for the `fork` result promise, where they'll be translated into new import vrefs for the vat doing the fork.

All per-vat data secondary storage needs to be duplicated (i.e. virtual object tables). Ideally this storage will be pretty empty, because the zygote vat should not have had any interaction with the outside world yet.
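To make the c-list step concrete, here is a rough sketch of how the new vat's c-list might be derived from the old one. It is an in-memory model, not kernel code: `forkCList`, `allocateKref`, the `Map`-based c-list, and the starting kref counter are all hypothetical, and it assumes imports can be recognized by an `o-` vref prefix and exports by `o+`. What should happen to an `oldref` that is not an export of the old vat is not specified above, so the sketch simply leaves that slot `undefined`.

```javascript
// Hypothetical model of kernel c-lists; none of these names are real
// SwingSet APIs. A c-list here is a Map from kref to vref for one vat.
let nextKref = 100; // assumption: high enough to avoid the sample krefs below
const allocateKref = () => `ko${nextKref++}`;

// Build the forked vat's c-list from the old vat's c-list.
// oldrefKrefs: the krefs that the caller's `oldrefs` were mapped to.
function forkCList(oldCList, oldrefKrefs) {
  const wanted = new Set(oldrefKrefs);
  const newCList = new Map();
  const newKrefByOldKref = new Map();
  for (const [kref, vref] of oldCList) {
    if (vref.startsWith('o-')) {
      // Import: replicate with the same kref and the same vref, so it
      // matches the liveslots state captured in the snapshot.
      newCList.set(kref, vref);
    } else if (wanted.has(kref)) {
      // Requested export: keep the old vref, allocate a fresh kref.
      const newKref = allocateKref();
      newCList.set(newKref, vref);
      newKrefByOldKref.set(kref, newKref);
    }
    // Any other export gets no entry, so it is unreachable in the new vat.
  }
  // Resolution data for the fork result promise, in the caller's order.
  const resolution = oldrefKrefs.map(kref => newKrefByOldKref.get(kref));
  return { newCList, resolution };
}

// Usage: a zygote with one import and two exports; only the root is requested.
const zygoteCList = new Map([
  ['ko10', 'o-1'], // import, e.g. a timer service
  ['ko11', 'o+0'], // export: root object
  ['ko12', 'o+1'], // export that the forker did not ask for
]);
const { newCList, resolution } = forkCList(zygoteCList, ['ko11']);
console.log(resolution, [...newCList]);
```

The old c-list is left untouched, which matches the idea that the original exports stay attached to the original vat while the child gets its own kernel identities for the same vrefs.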
Security Considerations
Assuming the implementation is correct, I don't think the new vat will have any more authority than the one it was copied from, nor should the vat directing the fork be able to amplify its authority by forking the vat it created. There is a question of metering and resource allocation, of course, to avoid allowing a DoS attack by virtue of a forkbomb or something similar.
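One possible shape for the resource-allocation side, offered only as an illustration since nothing above settles on a mechanism: a budget that is shared by a zygote and all of its descendants, so forks of forks draw from the same pool. `makeForkBudget` and its methods are hypothetical names, not part of any existing API.

```javascript
// Hypothetical fork accounting. The budget object would be shared by a
// whole lineage of vats, so a forkbomb is bounded by the initial allowance.
function makeForkBudget(maxForks) {
  let remaining = maxForks;
  return {
    // A kernel would call this before honoring a fork request.
    charge() {
      if (remaining <= 0) {
        throw new Error('fork budget exhausted');
      }
      remaining -= 1;
    },
    remaining: () => remaining,
  };
}

// Usage: two forks are allowed, the third is refused.
const budget = makeForkBudget(2);
budget.charge();
budget.charge();
let refused = false;
try {
  budget.charge();
} catch (err) {
  refused = true;
}
console.log(refused, budget.remaining());
```

A real design would also have to account for the memory and disk used by each child, which this sketch ignores.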
cc @erights @dtribble @FUDCo @Chris-Hibbert @katelynsills for your consideration