feat(en): Unify snapshot recovery and recovery from L1 (#2256)
## What ❔

- Replaces `storage_logs.{address, key}` fields in the snapshot with
`storage_logs.hashed_key`.
- Makes `storage_logs.{address, key}` fields in Postgres optional and
adapts the codebase for it.

## Why ❔

- This makes snapshot data equivalent to data obtainable from L1,
and thus allows unifying snapshot recovery and recovery from L1
data.
- Decreases the snapshot size somewhat (keys occupy 32 bytes instead of 52).

## Checklist

- [x] PR title corresponds to the body of PR (we generate changelog
entries from PRs).
- [x] Tests for the changes have been added / updated.
- [x] Documentation comments have been added / updated.
- [x] Code has been formatted via `zk fmt` and `zk lint`.
- [x] Spellcheck has been run via `zk spellcheck`.
slowli authored Jun 27, 2024
1 parent 287958d commit e03a929
Showing 68 changed files with 1,390 additions and 584 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/ci-core-reusable.yml
@@ -230,9 +230,10 @@ jobs:
fi
ENABLE_CONSENSUS=${{ matrix.consensus }} \
DEPLOYMENT_MODE=${{ matrix.deployment_mode }} \
SNAPSHOTS_CREATOR_VERSION=${{ matrix.deployment_mode == 'Validium' && '0' || '1' }} \
DISABLE_TREE_DURING_PRUNING=${{ matrix.base_token == 'Eth' }} \
ETH_CLIENT_WEB3_URL="http://reth:8545" \
PASSED_ENV_VARS="ENABLE_CONSENSUS,DEPLOYMENT_MODE,DISABLE_TREE_DURING_PRUNING,ETH_CLIENT_WEB3_URL" \
PASSED_ENV_VARS="ENABLE_CONSENSUS,DEPLOYMENT_MODE,DISABLE_TREE_DURING_PRUNING,SNAPSHOTS_CREATOR_VERSION,ETH_CLIENT_WEB3_URL" \
ci_run yarn recovery-test snapshot-recovery-test
- name: Genesis recovery test
16 changes: 12 additions & 4 deletions core/bin/external_node/src/config/mod.rs
@@ -422,7 +422,7 @@ pub(crate) struct OptionalENConfig {
pub snapshots_recovery_postgres_max_concurrency: NonZeroUsize,

#[serde(default)]
pub snapshot_recover_object_store: Option<ObjectStoreConfig>,
pub snapshots_recovery_object_store: Option<ObjectStoreConfig>,

/// Enables pruning of the historical node state (Postgres and Merkle tree). The node will retain
/// recent state and will continuously remove (prune) old enough parts of the state in the background.
@@ -622,7 +622,7 @@ impl OptionalENConfig {
.as_ref()
.map(|a| a.enabled)
.unwrap_or_default(),
snapshot_recover_object_store: load_config!(
snapshots_recovery_object_store: load_config!(
general_config.snapshot_recovery,
object_store
),
@@ -808,7 +808,7 @@ impl OptionalENConfig {
let mut result: OptionalENConfig = envy::prefixed("EN_")
.from_env()
.context("could not load external node config")?;
result.snapshot_recover_object_store = snapshot_recovery_object_store_config().ok();
result.snapshots_recovery_object_store = snapshot_recovery_object_store_config().ok();
Ok(result)
}

@@ -1041,6 +1041,10 @@ pub(crate) struct ExperimentalENConfig {
// Snapshot recovery
/// L1 batch number of the snapshot to use during recovery. Specifying this parameter is mostly useful for testing.
pub snapshots_recovery_l1_batch: Option<L1BatchNumber>,
/// Enables dropping storage key preimages when recovering storage logs from a snapshot with version 0.
/// This is a temporary flag that will eventually be removed together with version 0 snapshot support.
#[serde(default)]
pub snapshots_recovery_drop_storage_key_preimages: bool,
/// Approximate chunk size (measured in the number of entries) to recover in a single iteration.
/// Reasonable values are order of 100,000 (meaning an iteration takes several seconds).
///
@@ -1077,6 +1081,7 @@ impl ExperimentalENConfig {
Self::default_state_keeper_db_block_cache_capacity_mb(),
state_keeper_db_max_open_files: None,
snapshots_recovery_l1_batch: None,
snapshots_recovery_drop_storage_key_preimages: false,
snapshots_recovery_tree_chunk_size: Self::default_snapshots_recovery_tree_chunk_size(),
snapshots_recovery_tree_parallel_persistence_buffer: None,
commitment_generator_max_parallelism: None,
@@ -1095,7 +1100,6 @@ impl ExperimentalENConfig {
experimental.state_keeper_db_block_cache_capacity_mb,
default_state_keeper_db_block_cache_capacity_mb
),

state_keeper_db_max_open_files: load_config!(
general_config.db_config,
experimental.state_keeper_db_max_open_files
@@ -1110,6 +1114,10 @@ impl ExperimentalENConfig {
general_config.snapshot_recovery,
tree.parallel_persistence_buffer
),
snapshots_recovery_drop_storage_key_preimages: general_config
.snapshot_recovery
.as_ref()
.map_or(false, |config| config.drop_storage_key_preimages),
commitment_generator_max_parallelism: general_config
.commitment_generator
.as_ref()
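The fallback used for `snapshots_recovery_drop_storage_key_preimages` above can be sketched in isolation. This is a minimal illustration, assuming a trimmed-down, hypothetical `SnapshotRecoverySection` struct in place of the real config type; only the `map_or(false, ...)` pattern is taken from the diff.

```rust
// Hypothetical, trimmed-down stand-in for the `snapshot_recovery` config section.
// The real type has more fields; only the flag discussed here is modeled.
#[derive(Debug, Default)]
struct SnapshotRecoverySection {
    drop_storage_key_preimages: bool,
}

/// Resolves the flag the same way the diff does: a missing section means `false`.
fn resolve_drop_preimages(section: Option<&SnapshotRecoverySection>) -> bool {
    section.map_or(false, |config| config.drop_storage_key_preimages)
}

fn main() {
    let enabled = SnapshotRecoverySection {
        drop_storage_key_preimages: true,
    };
    println!("no section: {}", resolve_drop_preimages(None));
    println!("flag set:   {}", resolve_drop_preimages(Some(&enabled)));
}
```

With this shape, omitting the whole section and setting the flag to `false` explicitly behave identically, which matches the `#[serde(default)]` behavior on the env-based path.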
5 changes: 5 additions & 0 deletions core/bin/external_node/src/init.rs
@@ -17,6 +17,7 @@ use zksync_web3_decl::client::{DynClient, L2};
pub(crate) struct SnapshotRecoveryConfig {
/// If not specified, the latest snapshot will be used.
pub snapshot_l1_batch_override: Option<L1BatchNumber>,
pub drop_storage_key_preimages: bool,
pub object_store_config: Option<ObjectStoreConfig>,
}

@@ -111,6 +112,10 @@ pub(crate) async fn ensure_storage_initialized(
);
snapshots_applier_task.set_snapshot_l1_batch(snapshot_l1_batch);
}
if recovery_config.drop_storage_key_preimages {
tracing::info!("Dropping storage key preimages for snapshot storage logs");
snapshots_applier_task.drop_storage_key_preimages();
}
app_health.insert_component(snapshots_applier_task.health_check())?;

let recovery_started_at = Instant::now();
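The way the recovery config is applied to the applier task in `ensure_storage_initialized` can be sketched as follows. All types here are hypothetical stand-ins (the real task lives in the snapshots applier crate and its setters may differ in signature); only the method names `set_snapshot_l1_batch` and `drop_storage_key_preimages` come from the diff.

```rust
// Hypothetical stand-in for the snapshots applier task. Only the two setters
// that appear in the diff are modeled, and their bodies are assumptions.
#[derive(Debug, Default, PartialEq)]
struct ApplierTask {
    snapshot_l1_batch: Option<u32>,
    drops_preimages: bool,
}

impl ApplierTask {
    fn set_snapshot_l1_batch(&mut self, number: u32) {
        self.snapshot_l1_batch = Some(number);
    }

    fn drop_storage_key_preimages(&mut self) {
        self.drops_preimages = true;
    }
}

// Hypothetical, simplified version of `SnapshotRecoveryConfig` (no object store).
struct RecoveryConfig {
    snapshot_l1_batch_override: Option<u32>,
    drop_storage_key_preimages: bool,
}

/// Applies optional overrides only when they are set, leaving task defaults otherwise.
fn configure(config: &RecoveryConfig) -> ApplierTask {
    let mut task = ApplierTask::default();
    if let Some(batch) = config.snapshot_l1_batch_override {
        task.set_snapshot_l1_batch(batch);
    }
    if config.drop_storage_key_preimages {
        task.drop_storage_key_preimages();
    }
    task
}

fn main() {
    let config = RecoveryConfig {
        snapshot_l1_batch_override: Some(42),
        drop_storage_key_preimages: true,
    };
    println!("{:?}", configure(&config));
}
```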
5 changes: 4 additions & 1 deletion core/bin/external_node/src/main.rs
@@ -971,7 +971,10 @@ async fn run_node(
.snapshots_recovery_enabled
.then_some(SnapshotRecoveryConfig {
snapshot_l1_batch_override: config.experimental.snapshots_recovery_l1_batch,
object_store_config: config.optional.snapshot_recover_object_store.clone(),
drop_storage_key_preimages: config
.experimental
.snapshots_recovery_drop_storage_key_preimages,
object_store_config: config.optional.snapshots_recovery_object_store.clone(),
});
ensure_storage_initialized(
connection_pool.clone(),
16 changes: 13 additions & 3 deletions core/bin/snapshots_creator/README.md
@@ -51,9 +51,9 @@ Creating a snapshot is a part of the [snapshot recovery integration test]. You c

Each snapshot consists of three types of data (see [`snapshots.rs`] for exact definitions):

- **Header:** Includes basic information, such as the miniblock / L1 batch of the snapshot, miniblock / L1 batch
timestamps, miniblock hash and L1 batch root hash. Returned by the methods in the `snapshots` namespace of the
JSON-RPC API of the main node.
- **Header:** Includes basic information, such as the L2 block / L1 batch of the snapshot, L2 block / L1 batch
timestamps, L2 block hash and L1 batch root hash. Returned by the methods in the `snapshots` namespace of the JSON-RPC
API of the main node.
- **Storage log chunks:** Latest values for all VM storage slots ever written to at the time the snapshot is made.
Besides key–value pairs, each storage log record also contains the L1 batch number of its initial write and its
enumeration index; both are used to restore the contents of the `initial_writes` table. Chunking storage logs is
@@ -64,6 +64,16 @@ Each snapshot consists of three types of data (see [`snapshots.rs`] for exact de
- **Factory dependencies:** All bytecodes deployed on L2 at the time the snapshot is made. Stored as a single gzipped
Protobuf message in an object store.

### Versioning

There are currently 2 versions of the snapshot format which differ in how keys are mentioned in storage logs.

- Version 0 includes key preimages (EVM-compatible keys), i.e. address / contract slot tuples.
- Version 1 includes only hashed keys as used in Era ZKP circuits and in the Merkle tree. Besides reducing the snapshot
size (with the change, keys occupy 32 bytes instead of 52), this allows to unify snapshot recovery with recovery from
L1 data. Having only hashed keys for snapshot storage logs is safe; key preimages are only required for a couple of
components to sort keys in a batch, but these cases only require preimages for L1 batches locally executed on a node.

[`snapshots.rs`]: ../../lib/types/src/snapshots.rs
[object store]: ../../lib/object_store
[snapshot recovery integration test]: ../../tests/recovery-test/tests/snapshot-recovery.test.ts
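The key-size figures in the Versioning section (52 bytes versus 32) follow from a 20-byte address plus a 32-byte slot in version 0, and a single 32-byte hash in version 1. A minimal sketch of that arithmetic, using hypothetical types rather than the definitions in `snapshots.rs`, and counting raw key bytes only (Protobuf framing and gzip compression are ignored):

```rust
// Hypothetical types for illustration only; not the definitions from `snapshots.rs`.

/// Version 0 key: the preimage, i.e. a 20-byte address plus a 32-byte contract slot.
#[allow(dead_code)]
struct KeyPreimage {
    address: [u8; 20],
    slot: [u8; 32],
}

/// Version 1 key: only the 32-byte hashed key.
#[allow(dead_code)]
struct HashedKey([u8; 32]);

/// Raw bytes spent on keys for `log_count` storage logs in the given snapshot version.
/// Returns `None` for versions this sketch does not know about.
fn key_bytes(version: u16, log_count: u64) -> Option<u64> {
    let per_key = match version {
        0 => std::mem::size_of::<KeyPreimage>() as u64,
        1 => std::mem::size_of::<HashedKey>() as u64,
        _ => return None,
    };
    Some(per_key * log_count)
}

fn main() {
    let logs = 1_000_000;
    println!("version 0: {:?} bytes of keys", key_bytes(0, logs));
    println!("version 1: {:?} bytes of keys", key_bytes(1, logs));
}
```

The sketch does not compute the actual hashed key; it only compares how much space each representation needs per storage log.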