schedule snapshot snapshotInitial deliveries after each upgrade #7553

Closed
warner opened this issue Apr 29, 2023 · 1 comment · Fixed by #7558
Labels: enhancement (New feature or request), SwingSet (package: SwingSet)

Comments


warner commented Apr 29, 2023

What is the Problem Being Solved?

Each time we start a new vat incarnation (either when the vat is first created, or just after each upgrade), we perform several expensive steps. The first is initialize-worker, which creates the worker, performs SES lockdown, and installs the supervisor. The second is startVat, which creates and initializes liveslots, evaluates the entire userspace source bundle, and runs buildRootObject. Then, for contract vats, the very next delivery is startZcf, which retrieves and evaluates the entire contract bundle and initializes (or re-initializes) the contract state.

From the start of the incarnation until the first heap-snapshot point, any worker restart (due to host reboot, or vat eviction and subsequent reload) will have to repeat all of these expensive steps. In practice, this can take a full second, mostly dominated by the startVat time. So we have the vat-warehouse create a heap snapshot fairly early after startup, after a small number of deliveries, controlled by a parameter named snapshotInitial. #7549 sets this value to 3, enough to get both startVat and startZcf included in the snapshot.

After the initial snapshot, snapshotInterval configures how frequently we make subsequent snapshots. Whenever the size of the current transcript span (everything since the last initialize-worker or load-snapshot) grows above snapshotInterval, we perform another snapshot.

However, the counters we use don't know about upgrades or incarnations. So we get an early snapshot in the first incarnation (because the absolute number of deliveries has exceeded snapshotInitial), but we don't get early snapshots in the second and later incarnations.
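Roughly, the mismatch looks like this (an illustrative sketch only, not the actual vat-warehouse code; the variable names are made up for clarity):

```js
// Illustrative sketch only; not the actual vat-warehouse code.
// deliveryNum: absolute delivery count since the vat was first created
// spanLength:  deliveries since the last initialize-worker or load-snapshot
function shouldSnapshot(deliveryNum, spanLength, snapshotInitial, snapshotInterval) {
  // The early-snapshot check looks at the absolute deliveryNum, so after an
  // upgrade (when deliveryNum is already large) it can never fire "early"
  // for the new incarnation.
  const early = deliveryNum === snapshotInitial;
  // The periodic check resets with each span, so it does cover later
  // incarnations, but only after a full snapshotInterval of deliveries.
  const periodic = spanLength >= snapshotInterval;
  return early || periodic;
}
```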

Description of the Design

The fix will be to add a transcriptStore API that reveals information about the length of the current incarnation, maybe called transcriptStore.countIncarnationItems(vatID). I'm not sure about the best implementation, but it's nominally:

```
sqlite> SELECT COUNT(*) FROM transcriptItems
          WHERE vatID="v49"
            AND incarnation=(SELECT incarnation FROM transcriptSpans
                             WHERE vatID="v49" AND isCurrent=1);
```
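For concreteness, here is a sketch of how such a helper might look inside swing-store, assuming a better-sqlite3-style db handle; the name countIncarnationItems is the one proposed above, not an existing API:

```js
// Hypothetical sketch of the proposed transcriptStore.countIncarnationItems,
// following the query above. Assumes a better-sqlite3-style `db` handle and
// the transcriptItems/transcriptSpans tables named in this issue.
function makeCountIncarnationItems(db) {
  const stmt = db.prepare(`
    SELECT COUNT(*) AS n
      FROM transcriptItems
     WHERE vatID = ?
       AND incarnation = (
         SELECT incarnation FROM transcriptSpans
          WHERE vatID = ? AND isCurrent = 1
       )
  `);
  return vatID => stmt.get(vatID, vatID).n;
}
```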

Then vat-warehouse should be configured to trigger a snapshot whenever countIncarnationItems > snapshotInitial, or when the length of the current span (i.e. everything that isn't in a snapshot) is greater than snapshotInterval.

Except of course the first condition should not keep making snapshots over and over again. So we should also check whether we have an active snapshot for our vatID at all: snapStore.getSnapshotInfo(vatID).
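Putting the two conditions together, the proposed trigger might look roughly like this (a sketch only: countIncarnationItems is the API proposed above, currentSpanLength is a hypothetical stand-in for "entries since the last load-snapshot", and snapStore/transcriptStore are assumed to be in scope):

```js
// Sketch of the proposed scheduling rule; not actual vat-warehouse code.
function shouldTakeSnapshot(vatID, { snapshotInitial, snapshotInterval }) {
  if (!snapStore.getSnapshotInfo(vatID)) {
    // No active snapshot for this vatID yet (true at the start of each
    // incarnation): snapshot early, once the incarnation has accumulated a
    // few deliveries, so startVat (and startZcf) land in the snapshot.
    return transcriptStore.countIncarnationItems(vatID) > snapshotInitial;
  }
  // Otherwise fall back to the periodic rule: snapshot when the current
  // span (everything not yet covered by a snapshot) gets long enough.
  return currentSpanLength(vatID) > snapshotInterval;
}
```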

Security Considerations

none

Scaling Considerations

This should improve worker reload time in the window after each upgrade, before snapshotInterval would otherwise trigger a snapshot.

Test Plan

unit tests in test-transcript-entries.js

warner added a commit that referenced this issue Apr 29, 2023
This changes the snapshot scheduling logic to be more consistent. We
still use `snapshotInitial` to trigger a snapshot shortly after worker
initialization, and `snapshotInterval` to trigger periodic ones after
that.

However the previous code compared `snapshotInitial` to the absolute
deliveryNum, which meant it only applied to the first incarnation, and
would not attempt to take a snapshot shortly after upgrade, leaving
the kernel vulnerable to replaying the long `startVat` delivery for a
larger window than we intended. And `snapshotInterval` was compared
against the difference between the latest transcript and the latest
snapshot, which changed with the addition of the load-worker
pseudo-entry.

The new code uses `snapshotInitial` whenever there is not an existing
snapshot (so the first span of *all* incarnations), and compares it
against the length of the current span (so it includes all the
pseudo-events). `snapshotInterval` is also compared against the length
of the current span.

The result is a simpler and more predictable set of rules:

* in the first span of each incarnation, trigger a snapshot once we
  have at least `snapshotInitial` entries
* in all other spans, trigger once we have at least `snapshotInterval`
  entries

In addition, when triggering a snapshot, we perform a BringOutYourDead
delivery before asking the worker to save a snapshot. This gives us
one last chance to shake out any garbage (making the snapshot as small
as possible), and reduces the variation we might see from forced GC
during the snapshot write (any FinalizationRegistry callbacks should
get run during the BOYD, not during the save-snapshot).

closes #7553
closes #7504

warner commented Apr 29, 2023

This turned out to be easy to do in the process of fixing #7504, and didn't need an extra swingStore API. Instead it just asks "is there a current snapshot" and "how many entries are in the current transcript span". The first question tells us whether to apply snapshotInitial or snapshotInterval, against the count from the second.
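In other words, the rule that landed can be summarized roughly as follows (a paraphrase of the description above, assuming snapStore.getSnapshotInfo and a span-length count are available; not the literal code from #7558):

```js
// Paraphrase of the rule described above; not the literal code from #7558.
function shouldTriggerSnapshot(vatID, spanLength, { snapshotInitial, snapshotInterval }) {
  // "is there a current snapshot?"
  const hasSnapshot = !!snapStore.getSnapshotInfo(vatID);
  // "how many entries are in the current transcript span?" -> spanLength
  const threshold = hasSnapshot ? snapshotInterval : snapshotInitial;
  return spanLength >= threshold;
}
```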

mergify bot closed this as completed in #7558 on Apr 30, 2023