Create Live Snapshot without shutting down node #2816
@@ -641,6 +643,11 @@ func mainImpl() int {
		deferFuncs = []func(){func() { currentNode.StopAndWait() }}
	}

	// Live db snapshot creation is only supported on archive nodes
	if nodeConfig.Execution.Caching.Archive {
		go liveDBSnapshotter(ctx, chainDb, arbDb, execNode.ExecEngine.CreateBlocksMutex(), func() string { return liveNodeConfig.Get().SnapshotDir })
I think that:
- we should use the stopwaiter pattern for liveDBSnapshotter
- it might be nice to have a config option to disable (not start) the snapshotter, e.g. if we are running a sequencer, to be extra safe
- we should also be able to support full (non-archive) nodes; I describe this in more detail in the comment on liveDBSnapshotter
Want a config option to enable the snapshotter, off by default.
			continue
		}

		createBlocksMutex.Lock()
I think that if we rearrange the order of things a bit and patch geth a bit, we could also support non-archive nodes (which I believe would be the main use case for db snapshotting).
- Instead of triggering the snapshot here, we could schedule a snapshot to be taken after the next block is created. We could call e.g. execNode.ScheduleDBSnapshot, so we wouldn't need access to createBlocksMutex or any internals of ExecutionNode (that will probably be especially important for the execution split).
- In ExecutionEngine.appendBlock, if a snapshot was scheduled, we could trigger the snapshot after s.bc.WriteBlockAndSetHeadWithTime.
To support full (non-archive) nodes we need to make sure that the state for the block written with WriteBlockAndSetHeadWithTime is committed to disk. To do that we need to force-commit the state. It could be done e.g. with a ForceTriedbCommitHook hook that I added in the snap sync draft: https://github.com/OffchainLabs/go-ethereum/pull/280/files#diff-53d5f4b8a536ec2a8c8c92bf70b8268f1d77ad77e9f316e6f68a2bcae5303215
The hook would be set to a function created in gethexec scope that has access to ExecutionEngine, something like:
hook := func() bool {
	return execEngine.shouldForceCommitState()
}
func (e *ExecutionEngine) shouldForceCommitState() bool {
	return e.forceCommitState
}
func (e *ExecutionEngine) ScheduleDBSnapshot() {
	e.dbSnapshotScheduled.Store(true)
}
func (e *ExecutionEngine) appendBlock() error {
	...
	snapshotScheduled := e.dbSnapshotScheduled.Load()
	if snapshotScheduled {
		// make the hook force a state commit for the next written block
		e.forceCommitState = true
	}
	status, err := e.bc.WriteBlockAndSetHeadWithTime(...)
	if err != nil {
		return err
	}
	...
	if snapshotScheduled {
		e.forceCommitState = false
		// clear the request so that only one snapshot is taken
		e.dbSnapshotScheduled.Store(false)
		chainDb.CreateDBSnapshot(snapshotDir)
	}
	...
}
The hook can be set in a similar way to the SnapHelper PR draft: https://github.com/OffchainLabs/nitro/pull/2122/files#diff-19d6494fe5ff01c95bfdd1e4af6d31d75207d21743af80f57f0cf93848a32e3e
Having written that, I am no longer sure it's as straightforward as I thought when starting this comment 😓 but it should be doable :)
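To make the schedule-then-snapshot flow concrete, here is a small self-contained simulation. It is only a sketch under assumptions: `engine`, `writeBlock`, and the `snapshots` / `forcedCommits` slices are hypothetical stand-ins for ExecutionEngine, WriteBlockAndSetHeadWithTime, and the real database operations, and the hook is passed as a plain function rather than wired into geth.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// engine is a hypothetical stand-in for ExecutionEngine; it models only the flags.
type engine struct {
	snapshotScheduled atomic.Bool
	forceCommit       atomic.Bool
	head              uint64
	snapshots         []uint64 // block numbers at which a snapshot was taken
	forcedCommits     []uint64 // block numbers whose state was force-committed
}

// ScheduleDBSnapshot asks for a snapshot after the next block is written.
func (e *engine) ScheduleDBSnapshot() { e.snapshotScheduled.Store(true) }

// shouldForceCommitState plays the role of the commit hook.
func (e *engine) shouldForceCommitState() bool { return e.forceCommit.Load() }

// writeBlock stands in for the block write and consults the hook once.
func (e *engine) writeBlock(hook func() bool) error {
	e.head++
	if hook() {
		e.forcedCommits = append(e.forcedCommits, e.head)
	}
	return nil
}

func (e *engine) appendBlock() error {
	scheduled := e.snapshotScheduled.Load()
	if scheduled {
		e.forceCommit.Store(true)
		defer e.forceCommit.Store(false) // reset on both success and error
	}
	if err := e.writeBlock(e.shouldForceCommitState); err != nil {
		return err // the request stays scheduled and is retried on the next block
	}
	if scheduled {
		e.snapshotScheduled.Store(false) // consume the request
		e.snapshots = append(e.snapshots, e.head)
	}
	return nil
}

func main() {
	e := &engine{}
	_ = e.appendBlock() // block 1: nothing scheduled
	e.ScheduleDBSnapshot()
	_ = e.appendBlock() // block 2: forced commit, then snapshot
	_ = e.appendBlock() // block 3: request already consumed
	fmt.Println(e.snapshots, e.forcedCommits)
}
```

The caller of `ScheduleDBSnapshot` never touches the block-creation mutex; the snapshot simply runs at the next point where the engine knows the state is on disk.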
If we want to go that way, I can split out a simplified ForceTriedbCommitHook from my draft PRs, so it can be merged earlier and used here.
This PR enables creation of a live snapshot of the databases (arbitrumdata, l2chaindata, ancient and wasm) without having to shut down the node.
Supply the destination directory for the databases using the reloadable config option --snapshot-dir=<pathToDestDir>; snapshot generation can then be triggered by invoking the arbdebug RPC command arbdebug_createDBSnapshot.
Note: This feature is only available for archive nodes running on pebble databases.
Sample usage-
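A minimal sketch of the request body such a call might use, assuming the method is exposed over standard JSON-RPC 2.0 and takes no parameters. Both are assumptions; the `rpcRequest` type and `snapshotRequestBody` helper are hypothetical, and the endpoint and transport are deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is a generic JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string        `json:"jsonrpc"`
	ID      int           `json:"id"`
	Method  string        `json:"method"`
	Params  []interface{} `json:"params"`
}

// snapshotRequestBody builds the body for arbdebug_createDBSnapshot.
// An empty parameter list is an assumption, not confirmed by the PR.
func snapshotRequestBody(id int) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "arbdebug_createDBSnapshot",
		Params:  []interface{}{},
	})
}

func main() {
	body, err := snapshotRequestBody(1)
	if err != nil {
		panic(err)
	}
	// POST this body to the node's RPC endpoint with the arbdebug namespace enabled.
	fmt.Println(string(body))
}
```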
Testing
Triggered snapshot creation and successfully reused the snapshot to run a new node in multiple scenarios: small and large db sizes for arb1 and arb-sepolia nodes.
Pulls in geth PR- OffchainLabs/go-ethereum#380
Resolves NIT-2658