
snapshot / BOYD interval based on computrons #6786

Open
mhofman opened this issue Jan 13, 2023 · 8 comments
Labels
enhancement (New feature or request) · SwingSet (package: SwingSet)

Comments

mhofman (Member) commented Jan 13, 2023

What is the Problem Being Solved?

The amount of time it takes to replay from a snapshot varies widely based on the kind of work performed since the last snapshot. Similarly, the number of collectible objects depends on the amount of garbage created/released by the vat. These vary between vats, and can also vary over time for the same vat.

Currently our snapshot and BringOutYourDead (BOYD) intervals are expressed in terms of the number of deliveries to the vat. A closer approximation would be to use the number of computrons spent by the vat, which is already tangentially in consensus through the host's run policy, and is proposed to be more firmly in consensus through #6770.

We should also consider tying these operations together, i.e. performing a snapshot only immediately after bringOutYourDead, so that we don't capture garbage in snapshots.

Description of the Design

An in-consensus parameter would define the snapshot/bringOutYourDead interval in terms of computrons instead of deliveries.

Updating this parameter probably has interactions with parallel execution (#6447).
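A minimal sketch of the shape this could take, assuming hypothetical names (`snapshotComputrons` and `computronsSinceSnapshot` are illustrative, not existing SwingSet options):

```js
// Hypothetical in-consensus scheduling parameter, denominated in
// computrons instead of deliveries. All names are illustrative.
const schedulingParams = {
  snapshotComputrons: 100_000_000n, // snapshot + BOYD after ~100M computrons
};

// Per-delivery accounting: accumulate metered computron usage and report
// when the threshold is crossed, so the kernel can schedule a combined
// bringOutYourDead + snapshot.
function noteDelivery(vatState, computrons) {
  vatState.computronsSinceSnapshot += computrons;
  if (vatState.computronsSinceSnapshot >= schedulingParams.snapshotComputrons) {
    vatState.computronsSinceSnapshot = 0n;
    return true; // time for dispatch.bringOutYourDead + writeHeapSnapshot
  }
  return false;
}
```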

Security Considerations

None I can think of

Test Plan

Yes

mhofman added the enhancement and SwingSet labels Jan 13, 2023
mhofman (Member, Author) commented Jan 13, 2023

As requested by @dckc, here is a graph and histogram of the execution time between snapshots as captured by the replay tool for all vats on mainnet.

[graph: execution time between snapshots]
[histogram: snapshot spans by execution time]

warner (Member) commented Jan 14, 2023

In conjunction with this, we should implement @mhofman's earlier suggestion that we use the transcript to record the act of writing a snapshot, since snapshot writes force a GC pass, which changes the state of the worker. If we tied BOYD and snapshotting together (always doing them at the same time), then the transcript would look like:

  • deliveryNum 4: dispatch.deliver ..
  • deliveryNum 5: dispatch.notify ..
  • deliveryNum 6: dispatch.bringOutYourDead
  • deliveryNum 7: writeHeapSnapshot
  • deliveryNum 8: dispatch.deliver ..

The writeHeapSnapshot command would be sent over the same command pipe as the normal dispatch.* deliveries; however, it would be sent by a different `packages/xsnap` library API function (on the kernel side), and it would not be delivered to liveslots (on the worker side). These deliveries would also not appear in the kernel run-queue: they would be injected into the delivery stream by the vat-warehouse or the individual vat's manager, just after a delivery whose results push the cumulative computron count above the trigger threshold.

Thinking about the improved scheduler and a per-vat input queue, what we'd really do is push these two deliveries onto the front of the vat-input queue, and then let them be delivered in their own time (so not necessarily right away). They'd happen before anything else on that vat's input queue, but we could allow other vats to get time before returning attention to the one that needs a snapshot written. If the snapshot is called for because this particular vat just did a whole lot of work, then maybe we don't want to spend even more time on this vat right away.

One tricky bit is to make sure we don't accidentally re-schedule a BOYD or snapshot because of the computrons used by the BOYD or snapshot. We need some per-vat state to remember that we've already pushed the pair onto the input queue. We need some durable per-vat state anyway, to remember the cumulative computron count, since the kernel might be rebooted, e.g. just before the BOYD/snapshot was to have been scheduled. Those counters cannot live only in RAM.
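A rough sketch of that bookkeeping, assuming a kvStore-style durable store; the key names and queue-item shapes are hypothetical:

```js
// Hypothetical kernel-side bookkeeping; kvStore keys and queue-item
// shapes are illustrative. The durable counter survives kernel reboots,
// and the 'pending' flag keeps the BOYD/snapshot pair's own computrons
// from re-triggering the schedule.
function noteDeliveryResult(kvStore, vatID, computrons, threshold, vatInputQueue) {
  const countKey = `${vatID}.computronsSinceSnapshot`;
  const pendingKey = `${vatID}.boydSnapshotPending`;
  const count = BigInt(kvStore.get(countKey) || '0') + computrons;
  kvStore.set(countKey, `${count}`);
  if (count >= threshold && kvStore.get(pendingKey) !== 'true') {
    kvStore.set(pendingKey, 'true');
    // Push onto the *front* of the vat's input queue: BOYD first, then
    // the snapshot write. Executing the pair would clear both keys.
    vatInputQueue.unshift({ type: 'writeHeapSnapshot' });
    vatInputQueue.unshift({ type: 'bringOutYourDead' });
  }
}
```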

The resulting transcript "spans" (#6702/#6775) will always end with a BOYD and a writeHeapSnapshot (e.g. deliveryNum 7, above). They will always start with the delivery just after the writeHeapSnapshot (deliveryNum 8).

If we incorporate vat-upgrade into this transcript-tracked scheme, then the last span of an incarnation will end with a BOYD and an upgradeVat pseudo-delivery. The upgradeVat is not a real delivery (the worker does not receive it). At present, the worker receives a stopVat, then the worker is replaced with a new one (using the new vat bundle), then it receives a startVat with the new vatParameters. So the last span of the previous incarnation could conceivably end with BOYD/stopVat, and the first span of the new incarnation could start with startVat. However, I'd like to get rid of stopVat (#6650), in which case the old span would end with BOYD, and the new span would start with startVat. In either case, there is no writeHeapSnapshot in between: we don't use heap snapshots during vat upgrade.

mhofman (Member, Author) commented Jan 15, 2023

With the future improved vat scheduler / parallelization, I think makeSnapshot / BOYD should be driven entirely (but still deterministically) by the vat CDP itself. A logical way of thinking about the kernel -> vat input is that it focuses on direct actions such as message send, promise resolve, retire export, upgrade/exit. The vat -> kernel output is a result of execution derived from these inputs, such as message send, promise resolve and subscribe, but also drop import, etc.

The fact that snapshots are made by the CDP is an implementation detail that shouldn't cross this boundary, aside from the abstract artifacts that the CDP produces for state-sync.

Similarly, I consider BOYD an internal concern of the CDP in how it finds its garbage. It is of course observable in some output of the CDP, which is why it needs to be deterministic. That raises the question of how the results of BOYD should be included in the vat output queue. One way to approach it would be to simply roll the results of BOYD into the output of the delivery that triggered it. The main concern is the apparent disconnection of this output from the input.

The parametrization is interesting. This is IMO something we want to control via a governance parameter, so once the kernel is updated it needs to inform the vat CDP of the new parameter. That new parameter should be processed in the same input queue, since the vat may or may not have processed earlier units of work. That means vat parameter changes are like upgrade/exit commands.

Similarly, the computron storage should be kept logically inside the vat CDP.
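A minimal sketch of that ordering, assuming hypothetical item shapes for the vat input queue (`setParameters` and the field names are illustrative, not an existing SwingSet API):

```js
// Illustrative vat-input-queue contents: a parameter change travels in
// the same ordered queue as deliveries, so it takes effect only after
// earlier units of work, just like upgrade/exit commands. All item
// shapes here are hypothetical.
const vatInputQueue = [
  { type: 'deliver', target: 'o+12', method: 'wake', args: [] },
  { type: 'setParameters', snapshotComputrons: 200_000_000n }, // governance update
  { type: 'deliver', target: 'o+12', method: 'poll', args: [] }, // uses the new threshold
];
```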

mhofman (Member, Author) commented Jan 15, 2023

> One tricky bit is to make sure we don't accidentally re-schedule a BOYD or snapshot because of the computrons used by the BOYD or snapshot.

Thinking more about this, I hope we do not count BOYD meter usage against the vat? BOYD is intrinsically a GC-related operation which should be exempt from metering. We can "charge" the vat through other means, like the amount of "time" each object is retained, which would be higher if a vat produces a lot of garbage and thus a good proxy for the expense of running BOYD.

warner (Member) commented Jan 24, 2023

This would be nice to have, but it is not strictly needed for the vaults/bulldozer release, and we think we can add it later without causing too many problems. We should definitely get computrons into the consensus state first, since this change will make us even more sensitive to metering differences.

warner (Member) commented Mar 14, 2023

We should also consider telling the runPolicy about the "time" spent writing a snapshot, so it could end the block earlier. The size of the snapshot is a reasonable proxy for the amount of time we spend writing it, and is in-consensus.
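A sketch of how a host's run policy might account for this; the `snapshotWritten()` hook and the bytes-to-computron ratio are hypothetical extensions, not part of the existing runPolicy interface:

```js
// Illustrative run policy: charges snapshot writes against the block's
// computron budget, using the in-consensus snapshot size as a proxy for
// write time. snapshotWritten() is a hypothetical extension.
function makeRunPolicy(blockBudget) {
  let remaining = blockBudget; // BigInt computron budget for the block
  return {
    vatCreated: () => remaining > 0n,
    crankComplete: ({ computrons = 0n } = {}) => {
      remaining -= computrons;
      return remaining > 0n; // false ends the block early
    },
    crankFailed: () => remaining > 0n,
    emptyCrank: () => remaining > 0n,
    snapshotWritten: ({ size }) => {
      remaining -= BigInt(size) * 10n; // assumed ratio of computrons per byte
      return remaining > 0n;
    },
  };
}
```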

mhofman (Member, Author) commented Apr 7, 2023

A further tweak on this would be to let the host schedule reaping of eligible vats.

Performing a snapshot in the middle of a block is potentially disruptive to the block's execution. It also raises the possibility of a snapshot being taken twice for the same vat in a block, which is useless.

BOYD in the middle of an execution is also not ideally timed; it'd be better to perform reaping when the vat is quiescent. If snapshots are performed right after reaping, they would also be smaller.

The host could schedule a reaping (which also triggers a snapshot) after the kernel has run to completion on a block's inbound queue items. To deal with maxed-out blocks, the host could gain knowledge of which vats need reaping, and pre-allocate "compute time" in the run policy to perform them at the end.

Edit: the layering is the tricky bit here: vat GC is a swingset concern that may not make sense to expose to the host, even though I'd argue it should be accounted for in the run policy, and we cannot rely on execution computrons to measure its cost (similar to how the cost of taking heap snapshots should be accounted for). On the other side, block boundaries and the passage of time are a concern of the host, which may not make sense to expose to swingset directly.

warner (Member) commented Sep 13, 2023

We might consider an additional limit based on the concatenated size of the transcript span, to avoid problems like #8325 where the swingstore export artifact (the transcript span, expressed as newline-concatenated transcript entries) was larger than the cosmos-sdk state-sync import size limit. That limit was originally 64MB, but #8325 raises it to a new value. If we want to avoid the possibility of running up against the new value in the future, we could have swingset force a new span if the total span size was getting close.
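A sketch of such a size-based trigger; the names and threshold value are hypothetical:

```js
// Hypothetical size-based span rollover, to complement the computron-based
// trigger: spans are exported as newline-concatenated transcript entries,
// so we track the concatenated size and force a new span well before the
// state-sync import limit.
const maxSpanBytes = 32 * 1024 * 1024; // illustrative safety margin

function noteTranscriptEntry(vatState, serializedEntry) {
  vatState.spanBytes += serializedEntry.length + 1; // +1 for the newline
  if (vatState.spanBytes >= maxSpanBytes) {
    vatState.spanBytes = 0;
    return true; // caller should force BOYD + snapshot, starting a new span
  }
  return false;
}
```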

@mhofman observed only a single large span, v1-bootstrap, and it was only about 65MB, just barely over the limit. It was that big because of the large amount of data we provided in the bootstrap() "argv" argument, with all the account balances/state from the pismo-era IAVL tree. That second (startVat) delivery's transcript item was 65,580,676 bytes long. We're not likely to do that again, so in the future the relationship between span size and total computrons per span will probably(?) be closer. The next-largest transcript item on mainnet is 1,985,575 bytes. I don't know what the range of full span sizes is (I need more tools to compute them).

warner changed the title from "snapshot / BYOD interval based on computrons" to "snapshot / BOYD interval based on computrons" Feb 23, 2024
warner added a commit that referenced this issue Mar 29, 2024
`dispatch.bringOutYourDead()`, aka "reap", triggers garbage collection
inside a vat, and gives it a chance to drop imported c-list vrefs that
are no longer referenced by anything inside the vat.

Previously, each vat had a configurable parameter named
`reapInterval`, which defaults to a kernel-wide
`defaultReapInterval` (but can be set separately for each vat). This
defaults to 1, mainly for unit testing, but real applications set it
to something like 200.

This caused BOYD to happen once every 200 deliveries, plus an extra
BOYD just before we save an XS heap-state snapshot.

This commit switches to a "dirt"-based BOYD scheduler, wherein we
consider the vat to get more and more dirty as it does work, and
eventually it reaches a `reapDirtThreshold` that triggers the
BOYD (which resets the dirt counter).

We continue to track `dirt.deliveries` as before, with the same
defaults. But we add a new `dirt.gcKrefs` counter, which is
incremented by the krefs we submit to the vat in GC deliveries. For
example, calling `dispatch.dropImports([kref1, kref2])` would increase
`dirt.gcKrefs` by two.

The `reapDirtThreshold.gcKrefs` limit defaults to 20. For normal use
patterns, this will trigger a BOYD after ten krefs have been dropped
and retired. We choose this value to allow the #8928 slow vat
termination process to trigger BOYD frequently enough to keep the BOYD
cranks small: since these will be happening constantly (in the
"background"), we don't want them to take more than 500ms or so. Given
the current size of the large vats that #8928 seeks to terminate, 10
krefs seems like a reasonable limit. And of course we don't want to
perform too many BOYDs, so `gcKrefs: 20` is about the smallest
threshold we'd want to use.

External APIs continue to accept `reapInterval`, and now also accept
`reapGCKrefs`.

* kernel config record
  * takes `config.defaultReapInterval` and `defaultReapGCKrefs`
  * takes `vat.NAME.creationOptions.reapInterval` and `.reapGCKrefs`
* `controller.changeKernelOptions()` still takes `defaultReapInterval`
   but now also accepts `defaultReapGCKrefs`

The APIs available to userspace code (through `vatAdminSvc`) are
unchanged (partially due to upgrade/backwards-compatibility
limitations), and continue to only support setting `reapInterval`.
Internally, this just modifies `reapDirtThreshold.deliveries`.

* `E(vatAdminSvc).createVat(bcap, { reapInterval })`
* `E(adminNode).upgrade(bcap, { reapInterval })`
* `E(adminNode).changeOptions({ reapInterval })`

Internally, the kernel-wide state records `defaultReapDirtThreshold`
instead of `defaultReapInterval`, and each vat records
`.reapDirtThreshold` in their `vNN.options` key instead of
`vNN.reapInterval`. The current dirt level is recorded in
`vNN.reapDirt`.

The kernel will automatically upgrade both the kernel-wide and the
per-vat state upon the first reboot with the new kernel code. The old
`reapCountdown` value is used to initialize the vat's
`reapDirt.deliveries` counter, so the upgrade shouldn't disrupt the
existing schedule. Vats which used `reapInterval = 'never'` (e.g. comms)
will get a `reapDirtThreshold` of all 'never' values, so they continue
to inhibit BOYD. Otherwise, all vats get a `threshold.gcKrefs` of 20.

We do not track dirt when the corresponding threshold is 'never', to
avoid incrementing the comms dirt counters forever.

This design leaves room for adding `.computrons` to the dirt record,
as well as tracking a separate `snapshotDirt` counter (to trigger XS
heap snapshots, à la #6786). We add `reapDirtThreshold.computrons`, but
do not yet expose an API to set it.

Future work includes:
* upgrade vat-vat-admin to let userspace set `reapDirtThreshold`

New tests were added to exercise the upgrade process, and other tests
were updated to match the new internal initialization pattern.

We now reset the dirt counter upon any BOYD, so this also happens to
help with #8665 (doing a `reapAllVats()` resets the delivery counters,
so future BOYDs will be delayed, which is what we want). But we should
still change `controller.reapAllVats()` to avoid BOYDs on vats which
haven't received any deliveries.

closes #8980
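A condensed model of the dirt mechanism this commit describes (simplified; the real implementation stores `reapDirt` and `reapDirtThreshold` in the swingstore):

```js
// Simplified model of the dirt-based BOYD scheduler described above; the
// real kernel persists reapDirt and reapDirtThreshold per vat.
function addDirt(vat, { deliveries = 0, gcKrefs = 0 }) {
  const { reapDirt: dirt, reapDirtThreshold: threshold } = vat;
  // Dirt is not tracked when the corresponding threshold is 'never'
  // (e.g. comms), so those counters don't grow forever.
  if (threshold.deliveries !== 'never') dirt.deliveries += deliveries;
  if (threshold.gcKrefs !== 'never') dirt.gcKrefs += gcKrefs;
  const due =
    (threshold.deliveries !== 'never' && dirt.deliveries >= threshold.deliveries) ||
    (threshold.gcKrefs !== 'never' && dirt.gcKrefs >= threshold.gcKrefs);
  if (due) {
    // BOYD resets all dirt counters.
    vat.reapDirt = { deliveries: 0, gcKrefs: 0 };
  }
  return due; // caller schedules dispatch.bringOutYourDead()
}

// e.g. dispatch.dropImports([kref1, kref2]) would call:
// addDirt(vat, { gcKrefs: 2 });
```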
warner added a commit that referenced this issue Apr 9, 2024 (same commit message as the Mar 29 commit above)
warner added a commit that referenced this issue Aug 12, 2024
NOTE: deployed kernels require a new `upgradeSwingset()` call upon
(at least) first boot after upgrading to this version of the kernel
code. See below for details.

`dispatch.bringOutYourDead()`, aka "reap", triggers garbage collection
inside a vat, and gives it a chance to drop imported c-list vrefs that
are no longer referenced by anything inside the vat.

Previously, each vat had a configurable parameter named
`reapInterval`, which defaults to a kernel-wide
`defaultReapInterval` (but can be set separately for each vat). This
defaults to 1, mainly for unit testing, but real applications set it
to something like 1000.

This caused BOYD to happen once every 1000 deliveries, plus an extra
BOYD just before we save an XS heap-state snapshot.

This commit switches to a "dirt"-based BOYD scheduler, wherein we
consider the vat to get more and more dirty as it does work, and
eventually it reaches a `reapDirtThreshold` that triggers the
BOYD (which resets the dirt counter).

We continue to track `dirt.deliveries` as before, with the same
defaults. But we add a new `dirt.gcKrefs` counter, which is
incremented by the krefs we submit to the vat in GC deliveries. For
example, calling `dispatch.dropImports([kref1, kref2])` would increase
`dirt.gcKrefs` by two.

The `reapDirtThreshold.gcKrefs` limit defaults to 20. For normal use
patterns, this will trigger a BOYD after ten krefs have been dropped
and retired. We choose this value to allow the #8928 slow vat
termination process to trigger BOYD frequently enough to keep the BOYD
cranks small: since these will be happening constantly (in the
"background"), we don't want them to take more than 500ms or so. Given
the current size of the large vats that #8928 seeks to terminate, 10
krefs seems like a reasonable limit. And of course we don't want to
perform too many BOYDs, so `gcKrefs: 20` is about the smallest
threshold we'd want to use.

External APIs continue to accept `reapInterval`, and now also accept
`reapGCKrefs`, and `neverReap` (a boolean which inhibits all BOYD,
even new forms of dirt added in the future).

* kernel config record
  * takes `config.defaultReapInterval` and `defaultReapGCKrefs`
  * takes `vat.NAME.creationOptions.reapInterval` and `.reapGCKrefs`
    and `.neverReap`
* `controller.changeKernelOptions()` still takes `defaultReapInterval`
   but now also accepts `defaultReapGCKrefs`

The APIs available to userspace code (through `vatAdminSvc`) are
unchanged (partially due to upgrade/backwards-compatibility
limitations), and continue to only support setting `reapInterval`.
Internally, this just modifies `reapDirtThreshold.deliveries`.

* `E(vatAdminSvc).createVat(bcap, { reapInterval })`
* `E(adminNode).upgrade(bcap, { reapInterval })`
* `E(adminNode).changeOptions({ reapInterval })`

Internally, the kernel-wide state records `defaultReapDirtThreshold`
instead of `defaultReapInterval`, and each vat records
`.reapDirtThreshold` in their `vNN.options` key instead of
`vNN.reapInterval`. The vat-level records override the kernel-wide
values. The current dirt level is recorded in `vNN.reapDirt`.

NOTE: deployed kernels require explicit state upgrade, with:

```js
import { upgradeSwingset } from '@agoric/swingset-vat';
..
upgradeSwingset(kernelStorage);
```

This must be called after upgrading to the new kernel code/release,
and before calling `buildVatController()`. It is safe to call on every
reboot (it will only modify the swingstore when the kernel version has
changed). If changes are made, the host application is responsible for
committing them, as well as recording any export-data updates (if the
host configured the swingstore with an export-data callback).

During this upgrade, the old `reapCountdown` value is used to
initialize the vat's `reapDirt.deliveries` counter, so the upgrade
shouldn't disrupt the existing schedule. Vats which used `reapInterval
= 'never'` (e.g. comms) will get a `reapDirtThreshold.never = true`, so
they continue to inhibit BOYD. Any per-vat settings that match the
kernel-wide settings are removed, allowing the kernel values to take
precedence (as well as changes to the kernel-wide values; i.e. the
per-vat settings are not sticky).

We do not track dirt when the corresponding threshold is 'never', or
if `neverReap` is true, to avoid incrementing the comms dirt counters
forever.

This design leaves room for adding `.computrons` to the dirt record,
as well as tracking a separate `snapshotDirt` counter (to trigger XS
heap snapshots, à la #6786). We add `reapDirtThreshold.computrons`, but
do not yet expose an API to set it.

Future work includes:
* upgrade vat-vat-admin to let userspace set `reapDirtThreshold`

New tests were added to exercise the upgrade process, and other tests
were updated to match the new internal initialization pattern.

We now reset the dirt counter upon any BOYD, so this also happens to
help with #8665 (doing a `reapAllVats()` resets the delivery counters,
so future BOYDs will be delayed, which is what we want). But we should
still change `controller.reapAllVats()` to avoid BOYDs on vats which
haven't received any deliveries.

closes #8980
mergify bot added a commit that referenced this issue Aug 12, 2024
fix(swingset): use "dirt" to schedule vat reap/bringOutYourDead
(same commit message as the Aug 12 commit above)
kriskowal pushed a commit that referenced this issue Aug 27, 2024 (same commit message as the Aug 12 commit above)