swing-store should keep historical artifacts outside of SQLite DB #9389
Comments
We had a discussion today about "keep" vs "archive", which might make us want to split this ticket into two separate ones, or maybe change its scope a little bit.
"archive" goal
My personal goal here is to reduce the disk space usage on the follower I run, without losing the historical spans entirely (I want to have a personal archive, for analysis purposes). What I want is an
Then I can write a script that uploads the contents of that secondary directory to S3 or an external bulk drive, and then deletes the files that were copied. Something like
For my goal, I don't need the swingstore to retain access to these archive files. The swingstore would be configured to delete the old spans (
"keep" goal
From our conversation today (and @mhofman please correct me if I get this wrong), @mhofman's intention was to have the swingstore retain the old spans (assuming
The hope would be to improve performance of the SQLite DB by reducing its size, but not to reduce the size of the swingstore overall: we'd just be moving some items out of the DB and into (retained, protected, flushed) individual files.
Another potential benefit of this storage mode would be to make state-sync import/export faster. Export might be faster if we could hand off (perhaps hardlinking??) the backing files as state-sync artifacts directly, instead of needing to read from the DB and then write to the disk, or writing to a pipe. Import might be faster if "import" just meant dropping the file in the secondary directory.
My counter-arguments are:
next steps
One outcome might be to change the stated intention of this ticket, maybe by renaming it to something like "add a mode to archive deleted historical spans in separate files". Another might be to make a new ticket with that title, and have @gibson042 look into implementing that sooner, and defer implementing this one ("keep") until/unless we have some performance numbers to show it would be better, and worth the added complexity.
I'm happy either way; I don't want to hijack @mhofman's ticket, but the "keep" behavior is not what I had in mind, and we had some confusion as to what the short-term development goal was. We should nail that down so @gibson042 can work on the right thing (and mark it as closing the right ticket: if this ticket retains @mhofman's intent, and @gibson042 implements archive files, then his work should not close this ticket).
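As a rough illustration of the "archive" workflow described in this comment (copy files out of the secondary directory, then delete the ones that were copied), here is a minimal Node.js sketch. It is not the script from the discussion: `archiveAndDelete` is a hypothetical helper, and a local destination directory stands in for S3 or an external bulk drive. Each source file is deleted only after its copy has been verified by hash.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const crypto = require('node:crypto');

// Hash a whole file; adequate for a sketch, streaming would suit large files better.
const sha256 = file =>
  crypto.createHash('sha256').update(fs.readFileSync(file)).digest('hex');

// Hypothetical helper: copy every regular file from srcDir to destDir,
// verify the copy, and only then delete the source file.
function archiveAndDelete(srcDir, destDir) {
  fs.mkdirSync(destDir, { recursive: true });
  const moved = [];
  for (const name of fs.readdirSync(srcDir).sort()) {
    const src = path.join(srcDir, name);
    if (!fs.statSync(src).isFile()) continue; // skip subdirectories
    const dest = path.join(destDir, name);
    fs.copyFileSync(src, dest);
    if (sha256(src) !== sha256(dest)) {
      throw Error(`copy mismatch for ${name}; source left in place`);
    }
    fs.unlinkSync(src);
    moved.push(name);
  }
  return moved;
}
```

Because the swingstore would never read these files back in "archive" mode, a script like this can run at any time without coordinating with the node.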
Not strictly needed. If it has the same start/end pos it'll just overwrite the different file. If it has different pos, it'll be a different file.
My use case is not so much performance of state-sync import, but the ability to easily reconstitute a "replay" or "archive" mode swing-store from an "operational" one by simply copying some static files. In particular, I hope the replay tool for XS acceptance testing (#6929) could avoid using a really large SQLite file.
Did we? I thought transcripts were always in the DB.
I agree it's vastly simpler for the current span transcript entries. I'm not convinced storing heap snapshots in the DB made things that much simpler, which is roughly the equivalent here.
Maybe this is something we should talk about with archive node operators, and understand the direction of the cosmos ecosystem here. We currently don't have a runtime need to read this data. It really is fully at rest for current use cases. I expect that archival operations could be optimized by keeping this historical data in separate non-mutating files.
Yeah... ok, I see what you mean. The
Gotcha. Composition of primary and archival data with
This was very early, LMDB days (May-2021), where we stored the transcript in a flat file, and the offset in LMDB. It used a thing named "StreamStore".
Hm, maybe. I know managing refcounts on heap snapshots is a lot more robust now: previously we had to be careful about not deleting the old snapshot too early, because we might crash before the DB commit and the restart needs the previous snapshot to start the replay from. Now the lifetime of the snapshot is exactly equal to the lifetime of the span that needs it.
Fair point. Ok, I'll make a separate ticket for the non-DB write-only thing, in case that's easy to implement in the short term, and then let's continue the conversation about what archive node operators could use best. #10036
That's independent, and because we don't attempt to deduplicate by snapshot content anymore ;) We only manage snapshots based on vatId and position, regardless of their hash.
But, just within a single vat, if we're not retaining all snapshots, we still have to manage the transition from
When we finish with span2 and call Before we moved
Did some digging,
toDelete Set that remembers things we might be able to delete.. the fact that it's in RAM means we might forget it if we crashed.
Right, with deletion we'd have to be careful to delete after commit to handle abort.
Yeah there's that too. Thankfully we'd be append-only in this case. I don't think we'd want a mode where historical entries ever need to be deleted.
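To make the "delete after commit" concern concrete, here is a minimal sketch of deferring file deletions until the DB commit succeeds. `makeDeferredDeleter` and its method names are hypothetical, not swing-store API. As noted above for the in-RAM `toDelete` Set, the pending set is lost on a crash, so a real implementation would also need a startup sweep for orphaned files.

```javascript
// Hypothetical helper: queue deletions during a block, apply them only
// after the DB commit, and forget them if the transaction aborts.
function makeDeferredDeleter(deleteFn) {
  const pending = new Set(); // in RAM only: lost if the process crashes
  return {
    schedule: name => {
      pending.add(name);
    },
    // Call only after the SQLite commit has succeeded.
    onCommit: () => {
      for (const name of pending) deleteFn(name);
      pending.clear();
    },
    // On abort the files are still referenced by the DB, so keep them.
    onAbort: () => {
      pending.clear();
    },
    size: () => pending.size,
  };
}
```

In the append-only mode discussed here nothing is ever scheduled for deletion, which is why that mode avoids the problem entirely.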
Ref #9174
Fixes #9387
Fixes #9386

TODO:
- [ ] #9389

## Description
Adds consensus-independent `vat-snapshot-retention` ("debug" vs. "operational") and `vat-transcript-retention` ("archival" vs. "operational" vs. "default") cosmos-sdk swingset configuration (values chosen to correspond with [`artifactMode`](https://github.com/Agoric/agoric-sdk/blob/master/packages/swing-store/docs/data-export.md#optional--historical-data)) for propagation in AG_COSMOS_INIT. The former defaults to "operational" and the latter defaults to "default", which infers a value from cosmos-sdk `pruning` to allow simple configuration of archiving nodes.

It also updates the semantics of TranscriptStore `keepTranscripts: false` configuration to remove items from only the previously-current span rather than from all previous spans when rolling over (to avoid expensive database churn). Removal of older items can be accomplished by reloading from an export that does not include them.

### Security Considerations
I don't think this changes any relevant security posture.

### Scaling Considerations
This will reduce the SQLite disk usage for any node that is not explicitly configured to retain snapshots and/or transcripts. The latter in particular is expected to have significant benefits for mainnet (as noted in #9174, about 116 GB ÷ 147 GB ≈ 79% of the database on 2024-03-29 was vat transcript items).

### Documentation Considerations
The new fields are documented in our default TOML template, and captured in a JSDoc type on the JavaScript side.

### Testing Considerations
This PR extends TranscriptStore test coverage to include `keepTranscripts` true vs. false, but I don't see a good way to cover Go→JS propagation other than manually (which I have done). It should be possible to add testing for the use and validation of `resolvedConfig` in AG_COSMOS_INIT handling, but IMO that is best saved for after completion of split-brain (to avoid issues with same-process Go–JS entanglement).

### Upgrade Considerations
This is all kernel code that can be used at any node restart (i.e., because the configuration is consensus-independent, it doesn't even need to wait for a chain software upgrade). But we should mention the new cosmos-sdk configuration in release notes, because it won't be added to existing app.toml files already in use.
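The "default" value of `vat-transcript-retention` is described above as inferring a setting from cosmos-sdk `pruning`. The PR text does not spell out the mapping, so the sketch below assumes one: `pruning = "nothing"` (an archiving node) resolves to "archival", and anything else resolves to "operational". `resolveTranscriptRetention` is a hypothetical name for illustration, not the PR's function.

```javascript
// Hypothetical resolver. The pruning-to-retention mapping is an assumption:
// "nothing" => "archival", any other pruning value => "operational".
function resolveTranscriptRetention(configured = 'default', pruning = 'default') {
  const allowed = ['archival', 'operational', 'default'];
  if (!allowed.includes(configured)) {
    throw Error(`invalid vat-transcript-retention: ${configured}`);
  }
  // An explicit setting always wins over inference.
  if (configured !== 'default') return configured;
  return pruning === 'nothing' ? 'archival' : 'operational';
}
```

Under this assumption, an operator who already runs with `pruning = "nothing"` would keep historical transcripts without touching the new setting, while an explicit "operational" would still override it.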
What is the Problem Being Solved?
For #9174 we're making retention of historical transcripts configurable. While most nodes would no longer keep historical transcripts by default, we'd want archive nodes to keep them for potential future uses (like replay-based upgrades, #7855). However, these are conceptually static artifacts that do not strictly need to be stored in the SQLite DB. Furthermore, sharing them with validators that have pruned them would be easier if they were simply stored as files on disk.
Description of the Design
Based on #8318 and the related work in #8693, when being rolled, a transcript span would be compressed and stored as a file on disk alongside the SQLite DB. Similarly, if historical heap snapshots are kept by config (#9386), they would be stored as compressed files instead. The files can be flushed to disk immediately, and before the swingstore itself is committed.
The files can be named based on the "export artifact" naming scheme used for state-sync.
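A minimal sketch of the design above, assuming Node.js and gzip compression: a rolled span is compressed, written to a file next to the DB, and flushed before the swingstore commit would happen. The helper names are hypothetical, and the `transcript.${vatID}.${startPos}.${endPos}` file name is an assumption about the export artifact naming scheme, not something confirmed in this issue.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const zlib = require('node:zlib');

// Assumed artifact name shape; check the swing-store export docs before relying on it.
function spanArtifactName(vatID, startPos, endPos) {
  return `transcript.${vatID}.${startPos}.${endPos}`;
}

// Hypothetical helper: compress the span's items (one per line), write them
// to a temporary file, fsync, then rename into place. A real implementation
// might also fsync the containing directory to make the rename durable.
function writeSpanArtifact(dir, vatID, startPos, endPos, items) {
  const finalPath = path.join(dir, `${spanArtifactName(vatID, startPos, endPos)}.gz`);
  const tmpPath = `${finalPath}.tmp`;
  const body = zlib.gzipSync(Buffer.from(items.map(item => `${item}\n`).join('')));
  const fd = fs.openSync(tmpPath, 'w');
  try {
    fs.writeFileSync(fd, body);
    fs.fsyncSync(fd); // flushed before the swingstore itself is committed
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmpPath, finalPath); // replaces any file with the same start/end pos
  return finalPath;
}
```

Writing the same span twice simply replaces the earlier file, matching the observation in the comments that a file with the same start/end position is overwritten and one with a different position is a different file.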
Security Considerations
None
Scaling Considerations
Reduce growth of SQLite DB size
Test Plan
TBD
Upgrade Considerations
Host side of the swingset kernel stack, not upgrade sensitive, but deployed as part of a chain software upgrade.