-
Notifications
You must be signed in to change notification settings - Fork 215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
failed vat upgrades should be rewound #5344
Labels
Milestone
Comments
warner
added a commit
that referenced
this issue
Jul 7, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 7, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 7, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 7, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 7, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 9, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 12, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 12, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 12, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 13, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
warner
added a commit
that referenced
this issue
Jul 14, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
turadg
pushed a commit
that referenced
this issue
Jul 14, 2022
This commit modifies vat-upgrade to respond correctly to failures during the upgrade process, by rewinding the vat back to it's pre-upgrade state. If a failure occurs either during the `stopVat` sent to the old version, or during the `startVat` sent to the new version, the consequences of both deliveries will be unwound, the 'upgrade-vat' event popped from the delivery queue, and the worker abandoned. The next delivery to this vat will cause a new worker to be created from the previous state. The only state change that survives is a message to vat-admin, instructing it to reject the `E(adminFacet).upgradeVat()` result promise. From the outside world, it will appear as if upgrade was never requested, except that `upgradeVat()` rejects. To implement this, I rewrote the way delivery results are handled in kernel.js . `deliverAndLogToVat()` still returns `DeliveryStatus`, but each `processXYZ` function (`processSend`, `processVatUpgrade`, etc) is now responsible for transforming that `DeliveryStatus` into a new `CrankResult`. This `CrankResult` tells `processDeliveryMessage` what to do in a uniform way: abort the crankbuffer changes, terminate a vat, deduct a meter, etc. Most processor functions use a shared `deliveryCrankResults` helper function to build their results, which takes flags to indicate whether the delivery should deduct a meter, should decrement the BOYD/reapCount counter, etc. `processCreateVat` does its conversion in a different way, to merge the results of the two deliveries (`stopVat`, `startVat`), and to abandon the process in the middle if `stopVat` failed. This will also make it trivial to begin charging a meter for the work done during `startVat`, and to report this work to the runPolicy. However it does not do so yet. closes #5344
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
What is the Problem Being Solved?
Currently, a successful vat upgrade works pretty well.
Upgrades fail if the new version's
buildRootObject
throws an exception (or returns a Promise that rejects or fails to resolve by the end of the crank), or it fails todefineDurableKind()
for all Kinds inherited from its predecessor.Currently, a failed upgrade is not handled correctly. What's supposed to happen is that the upgrade gets rewound, and we're left with the original version humming along, all its data still in place, ready to receive more messages. The caller of
adminFacet~.upgrade()
gets a rejected result promise, but all other changes are unwound.Instead, I think we currently leave the vat in a halfway state, and we don't notify the caller correctly.
Description of the Design
The kernel's
processUpgradeVat()
function performs two deliveries:stopVat
andstartVat
. We need to doabortCrank
if either one fails. And we need to submit the failure message to vat-admin even thoughabortCrank
has rewound the state changes. This should look a lot like when a vat is terminated for a bad syscall or meter underrun. We use a little side table to remember the state changes that need to be made afterabortCrank
unwinds everything.Security Considerations
I don't know how confused the kernel can currently get in the face of a failed upgrade, but clearly we must make it impossible to corrupt anything kernel-side by intentionally upgrading to a broken version.
Test Plan
Unit tests.
The text was updated successfully, but these errors were encountered: