-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Add design for a persistent Account storage #2769
Add design for a persistent Account storage #2769
Conversation
@sambley can you review, make sure we are on the same page about the design |
7d95418
to
eb283ce
Compare
@sakridge any objections? can you give this a review? |
@aeyakovenko it's a good start. |
@rob-solana i added a section on dealing with root forks. |
Currently the clean up happens when merge_parents() is called. Once the accounts are merged into root, all entries from the ancestor forks are removed and should get cleaned up. |
I don't see a test for that. I see the opposite in test_accountsdb_squash(), which verifies that fork=0 stuff is still present after a squash(1)... |
Yes, the cleanups are not happening right now. I was incorrectly assuming that after squash the parents could be cleaned up which was not the case. On what conditions, could we trigger the cleanup of a fork? |
cleanup of a fork can happen when there are no children of the fork, i.e. once all children have called squash(). I don't think this approach is strictly necessary, though: another way to do squash and cleanup in this new structure is to clean up anything unreachable, and let squash() just be a hint for reach-ability. once a fork has been "squash()ed" we're saying "we don't care about parents of this fork when viewed via this fork." at that rate, you can defer cleanup until those accounts are no longer visible, i.e. have been superceded above a squash() point |
@rob-solana @sakridge @sambley What if we don't clean up. Just keep a single HashSet of root forks. If one is created every second, it will take 11 days before that set has 1m entries. We can cleanup with an explicit flag to compact the db on boot. |
Would it not cause the persistent store to eat up disk space unnecessarily that could have been freed and reused otherwise? |
@rob-solana @sakridge @sambley the more I think about it, I think using an appendvec or set of appendvecs per fork(striped across drives) makes more sense 1. it allows cheap deletion of a fork. 2. it allows compact snapshots to be generated cheaply. |
/// * forks - A vector of forks indicating a priority list of forks. The earlier | ||
/// forks are loaded first if available. | ||
/// * pubkey - The Account's public key. | ||
pub fn load_slow(&self, forks: &[u64], pubkey: &Pubkey) -> Option<&Account> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is out of date since we started new_from_parent(), wherein parenting is now inside accounts_db
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you push an update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to give us permission, or I can push a separate PR...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem
Accounts are state that is replicated by every fullnode. Right now that state is stored in RAM, and therefore is expensive. This design enables using NVMes to store the Account state, as well as remove the need for explicit checkpointing.
This design covers the work in #2279
Summary of Changes
Design for a high performance persistent store for Accounts.
Fixes # #2499
tag: @sakridge @garious