Skip to content

Commit

Permalink
fix(swingset): better snapshot scheduling, do BOYD before each
Browse files Browse the repository at this point in the history
This changes the snapshot scheduling logic to be more consistent. We
still use `snapshotInitial` to trigger a snapshot shortly after worker
initialization, and `snapshotInterval` to trigger periodic ones after
that.

However the previous code compared `snapshotInitial` to the absolute
deliveryNum, which meant it only applied to the first incarnation, and
would not attempt to take a snapshot shortly after upgrade, leaving
the kernel vulnerable to replaying the long `startVat` delivery for a
larger window than we intended. And `snapshotInterval` was compared
against the difference between the latest transcript and the latest
snapshot, which changed with the addition of the load-worker
pseudo-entry.

The new code uses `snapshotInitial` whenever there is not an existing
snapshot (so the first span of *all* incarnations), and compares it
against the length of the current span (so it includes all the
pseudo-events). `snapshotInterval` is also compared against the length
of the current span.

The result is simpler and more predictable set of rules:

* in the first span of each incarnation, trigger a snapshot once we
  have at least `snapshotInterval` entries
* in all other spans, trigger once we have at least `snapshotInterval`

In addition, when triggering a snapshot, we perform a BringOutYourDead
delivery before asking the worker to save a snapshot. This gives us
one last chance to shake out any garbage (making the snapshot as small
as possible), and reduces the variation we might see forced GC that
happens during snapshot write (any FinalizationRegistry callbacks
should get run during the BOYD, not the save-snapshot).

closes #7553
closes #7504
  • Loading branch information
warner committed Apr 29, 2023
1 parent c51df0b commit 96c70a5
Show file tree
Hide file tree
Showing 3 changed files with 179 additions and 115 deletions.
17 changes: 11 additions & 6 deletions packages/SwingSet/src/kernel/state/vatKeeper.js
Original file line number Diff line number Diff line change
Expand Up @@ -507,11 +507,15 @@ export function makeVatKeeper(
return snapStore?.getSnapshotInfo(vatID);
}

function transcriptSnapshotStats() {
const totalEntries = getTranscriptEndPosition();
const snapshotInfo = getSnapshotInfo();
const snapshottedEntries = snapshotInfo ? snapshotInfo.snapPos : 0;
return { totalEntries, snapshottedEntries };
/**
* Returns count of deliveries made since initialization or
* load-snapshot
*
* @returns {number}
*/
function transcriptSpanEntries() {
const { startPos, endPos } = transcriptStore.getCurrentSpanBounds(vatID);
return endPos - startPos;
}

/**
Expand Down Expand Up @@ -662,7 +666,8 @@ export function makeVatKeeper(
deleteCListEntriesForKernelSlots,
transcriptSize,
getTranscript,
transcriptSnapshotStats,
getTranscriptEndPosition,
transcriptSpanEntries,
addToTranscript,
vatStats,
dumpState,
Expand Down
31 changes: 24 additions & 7 deletions packages/SwingSet/src/kernel/vat-warehouse.js
Original file line number Diff line number Diff line change
Expand Up @@ -567,21 +567,38 @@ export function makeVatWarehouse({

const vatKeeper = kernelKeeper.provideVatKeeper(vatID);
let reason;
const { totalEntries, snapshottedEntries } =
vatKeeper.transcriptSnapshotStats();
if (snapshotInitial === totalEntries) {

const hasSnapshot = !!vatKeeper.getSnapshotInfo();
const deliveriesInSpan = vatKeeper.transcriptSpanEntries();

if (!hasSnapshot && deliveriesInSpan >= snapshotInitial) {
// begin snapshot after 'snapshotInitial' deliveries in an incarnation
reason = { snapshotInitial };
} else if (
totalEntries > snapshotInitial &&
totalEntries - snapshottedEntries >= snapshotInterval
) {
} else if (deliveriesInSpan >= snapshotInterval) {
// begin snapshot after 'snapshotInterval' deliveries in a span
reason = { snapshotInterval };
}
// console.log('maybeSaveSnapshot: reason:', reason);
if (!reason) {
return false; // not time to make a snapshot
}

// always do a bringOutYourDead just before a snapshot, to shake
// loose as much garbage as we can, and to minimize the GC
// sensitivity effects of the forced GC that snapshots perform
// TODO:
// * computrons consumed are not reported to the runPolicy
// * computrons consumed are not deducted from the meter
// (not sure we'd want that anyways)
// * vat-fatal errors or self-termination are not processed
//
/** @type { KernelDeliveryObject } */
const kd = harden(['bringOutYourDead']);
// eslint-disable-next-line no-use-before-define
const vd = kernelDeliveryToVatDelivery(vatID, kd);
const vs = kernelSlog.provideVatSlogger(vatID).vatSlog;
await deliverToVat(vatID, kd, vd, vs);

// in addition to saving the actual snapshot,
// vatKeeper.saveSnapshot() pushes a save-snapshot transcript
// entry, then starts a new transcript span, then pushes a
Expand Down
Loading

0 comments on commit 96c70a5

Please sign in to comment.