add controller.reapAllVats() #8626

warner · 2023-12-06T19:13:52Z

What is the Problem Being Solved?

We might be able to survive remediating #8400 and #8401 in "one fell swoop", where we do a chain-halting upgrade, which upgrades the vats involved, to code whose buildRootObject deletes or clears the durable Sets and Maps that are holding the 100k+ objects alive. That will free a lot of objects, all of which will get GCed at the next BOYD event.

To ensure that this BOYD happens during the upgrade interval, and not some random number of blocks afterwards, we should have a way for the upgade code to request that the kernel forces a dispatch.bringOutYourDead to all vats. We can then follow this up with a controller.run() and flush out all the garbage. This may take a considerable amount of time (I'm roughly estimating ten minutes), but as long as this completes within the upgrade downtime for all validators, the process will be survivable (merely annoying). If for some reason, the large BOYD didn't happen during the downtime, then it would happen some unknown number of blocks later (depending upon how much activity those vats experienced), causing a surprising multi-minute chain stall. Worse, becomes some validator computers are faster than others, this stall would probably knock the slower ones out of the validator set.

Description of the Design

Add a controller.reapAllVats() method. This will enumerate all vats (static and dynamic) and perform a dispatch.bringOutYourDead() on each of them, or inject something into the run-queue to perform the same.

The BOYDs do not necessarily need to complete by the time reapAllVats() returns, but all the work should complete during the next full controller.run() call.

Security Considerations

This represents a performance hit, but it is only reachable by the host application (via controller), not anything inside a vat.

Scaling Considerations

This might take a long time, but better to stall at a known time than at a surprising one.

Test Plan

The unit test should create two or three vats, send them different numbers of messages (to get their deliveries-until-BOYD counters to different states), then perform the controller.reapAllVats() and a controller.run(). The test should somehow assert that the BOYDs happened (either snooping on the slogSender for type=='deliver' && kd[0]=='bringOutYourDead', or examining the swing-store transcript entries).

Upgrade Considerations

none

The text was updated successfully, but these errors were encountered:

Closes #8626

warner · 2023-12-08T20:01:25Z

Note for ourselves later: the kernel services the reapQueue before it looks at the run-queue. So our upgrade-time reaping process needs to do:

await controller.run();
controller.reapAllVats();
await controller.run();

to ensure that no GC-work-adding deliveries occur after the BOYDs are executed. The first controller.run() must be complete: it should not be limited by a runPolicy, otherwise there might still be runQueue items left over that won't happen until after the BOYDs.

mhofman · 2023-12-08T20:41:36Z

Run after reap will likely free more objects. Should we execute reap a few times until nothing is found anymore?

Edit: in that case, we only need to run reap in the vats that have seen deliveries. In some of my early experiment, I plumbed an option for the last know position for each vat, so that reap "all" could reap only vats that had new deliveries.

Closes #8626

Closes Agoric#8626

warner added enhancement New feature or request SwingSet package: SwingSet labels Dec 6, 2023

warner assigned FUDCo Dec 6, 2023

FUDCo added a commit that referenced this issue Dec 7, 2023

feat: add a reapAllVats call to the controller

cceee44

Closes #8626

FUDCo mentioned this issue Dec 7, 2023

Add a reapAllVats method to the controller #8634

Merged

FUDCo added a commit that referenced this issue Dec 8, 2023

feat: add a reapAllVats call to the controller

0356e4b

Closes #8626

FUDCo added a commit that referenced this issue Dec 11, 2023

feat: add a reapAllVats call to the controller

884bd01

Closes #8626

FUDCo added a commit that referenced this issue Dec 13, 2023

feat: add a reapAllVats call to the controller

6ba0af9

Closes #8626

mergify bot closed this as completed in #8634 Dec 13, 2023

mhofman mentioned this issue Dec 22, 2023

controller.reapAllVats() could reset reapCountdown #8665

Open

anilhelvaci pushed a commit to Jorge-Lopes/agoric-sdk that referenced this issue Feb 16, 2024

feat: add a reapAllVats call to the controller

11319d9

Closes Agoric#8626

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add controller.reapAllVats() #8626

add controller.reapAllVats() #8626

warner commented Dec 6, 2023

warner commented Dec 8, 2023

mhofman commented Dec 8, 2023 •

edited

Loading

add controller.reapAllVats() #8626

add controller.reapAllVats() #8626

Comments

warner commented Dec 6, 2023

What is the Problem Being Solved?

Description of the Design

Security Considerations

Scaling Considerations

Test Plan

Upgrade Considerations

warner commented Dec 8, 2023

mhofman commented Dec 8, 2023 • edited Loading

mhofman commented Dec 8, 2023 •

edited

Loading