improve sync header map #3391

zhangsoledad · 2022-04-28T11:42:04Z

What problem does this PR solve?

Optimize the header map implementation, raising header synchronization speed from 11924/s to 14377/s, an improvement of about 20%.
Make the header map memory limit configuration human-friendly.

What is changed and how it works?

Write to disk asynchronously in batches when the memory limit is exceeded.
Replace the header map backend implementation. RocksDB has drawbacks in this scenario: a write does not report whether the key was overwritten, and a delete does not distinguish between a successful removal and a missing key. Retrieving a value just to check whether it exists is inefficient and clumsy.

RocksDBBackend::contains_key duration 4041 ns
RocksDBBackend::remove duration 7000 ns
RocksDBBackend::insert duration 10750 ns

Sometimes it degrades to the following, because of how awkward the implementation is:
RocksDBBackend::contains_key duration 10750 ns
RocksDBBackend::remove duration 13000 ns
RocksDBBackend::insert duration 21583 ns

-----------------------------
SledBackend::contains_key duration 2666 ns
SledBackend::remove duration 2583 ns
SledBackend::insert_batch duration 35503541 ns / 10028 inserts (about 3540 ns each)
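
A minimal sketch of the "asynchronous batch write when the memory limit is exceeded" idea, not the code in this PR. Every name here (`SpillingMap`, `Batch`, `Disk`, `finish`) is hypothetical, the limit is counted in entries rather than bytes, and a mutex-guarded `HashMap` stands in for the on-disk backend. The sketch also ignores lookups for entries that are still in flight to the writer, which a real implementation would have to handle.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

// Hypothetical stand-ins: u64 keys, byte-vector values, and a mutex-guarded
// map playing the role of the on-disk backend.
type Batch = Vec<(u64, Vec<u8>)>;
type Disk = Arc<Mutex<HashMap<u64, Vec<u8>>>>;

struct SpillingMap {
    memory: HashMap<u64, Vec<u8>>,
    limit: usize, // max in-memory entries (assumption; a real limit could be in bytes)
    tx: Option<Sender<Batch>>,
    writer: Option<JoinHandle<()>>,
    disk: Disk,
}

impl SpillingMap {
    fn new(limit: usize) -> Self {
        let disk: Disk = Arc::new(Mutex::new(HashMap::new()));
        let (tx, rx) = channel::<Batch>();
        let sink = Arc::clone(&disk);
        // Background writer: applies each batch to the "disk" as it arrives.
        let writer = thread::spawn(move || {
            for batch in rx {
                sink.lock().unwrap().extend(batch);
            }
        });
        SpillingMap {
            memory: HashMap::new(),
            limit,
            tx: Some(tx),
            writer: Some(writer),
            disk,
        }
    }

    fn insert(&mut self, key: u64, value: Vec<u8>) {
        self.memory.insert(key, value);
        if self.memory.len() > self.limit {
            // Hand the whole in-memory set to the writer as one batch and
            // return immediately; the caller does not wait for the write.
            let batch: Batch = self.memory.drain().collect();
            if let Some(tx) = &self.tx {
                tx.send(batch).expect("writer thread is alive");
            }
        }
    }

    /// Stops the writer and returns (entries in memory, entries on "disk").
    fn finish(mut self) -> (usize, usize) {
        drop(self.tx.take()); // closing the channel ends the writer loop
        if let Some(writer) = self.writer.take() {
            writer.join().expect("writer thread panicked");
        }
        let on_disk = self.disk.lock().unwrap().len();
        (self.memory.len(), on_disk)
    }
}
```

With a limit of 2, inserting five distinct keys spills the first three as one batch and leaves the last two in memory, so `finish()` returns `(2, 3)`.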

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)
No code ci-runs-only: [ quick_checks,linters ]

Release note

Title Only: Include only the PR title in the release note.

zhangsoledad · 2022-04-29T03:37:36Z

https://gist.github.com/zhangsoledad/0605a6c98e58437d1796b7fe12638286

sync/src/types/mod.rs

sync/src/types/header_map/backend.rs

sync/src/types/header_map/backend_sled.rs

doitian · 2022-05-20T01:21:39Z

sync/src/types/header_map/backend_sled.rs

+use tempfile::TempDir;
+
+pub(crate) struct SledBackend {
+    count: AtomicUsize,


Can we remove len and is_empty from KeyValueBackend?

zhangsoledad · 2022-05-20

No, this is used for short-circuiting: when the count is 0 we can skip the DB operation, and the difference shows up in profiling.
If you mean using sled's `Tree::len` or `Tree::is_empty` instead: I haven't actually tried it, but it shouldn't work either. According to the docs, `len` performs a full O(n) scan under the hood:

    pub fn len(&self) -> usize {
        self.iter().count()
    }

    /// Returns `true` if the `Tree` contains no elements.
    pub fn is_empty(&self) -> bool {
        self.iter().next().is_none()
    }

    pub(crate) fn next_inner(&mut self) -> Option<<Self as Iterator>::Item> {
        let guard = pin();
        let (mut pid, mut node) = if let (true, Some((pid, node))) =
            (self.going_forward, self.cached_node.take())
        {
            (pid, node)
        } else {
            let view = iter_try!(self.tree.view_for_key(self.low_key(), &guard));
            (view.pid, view.deref().clone())
        };

        for _ in 0..MAX_LOOPS {
            if self.bounds_collapsed() {
                return None;
            }

            if !node.contains_upper_bound(&self.lo) {
                ...........
            } else if !node.contains_lower_bound(&self.lo, true) {
                ........
            } else {
                ........
            }
        }
    }
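
The short-circuit described above can be sketched as follows. This is an illustration under assumptions, not the PR's `SledBackend`: `CountedBackend`, its `store_lookups` counter and the `u64` key type are hypothetical, and a `HashMap` stands in for the database. The point is only that a separately maintained `AtomicUsize` makes `len`/`is_empty` O(1) and lets `contains_key` skip the store entirely when nothing has been written.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

struct CountedBackend {
    count: AtomicUsize,           // number of live keys, maintained on every write
    store: HashMap<u64, Vec<u8>>, // hypothetical stand-in for the real database
    store_lookups: AtomicUsize,   // how many times the store was actually queried
}

impl CountedBackend {
    fn new() -> Self {
        CountedBackend {
            count: AtomicUsize::new(0),
            store: HashMap::new(),
            store_lookups: AtomicUsize::new(0),
        }
    }

    /// O(1): reads the counter instead of scanning the store.
    fn len(&self) -> usize {
        self.count.load(Ordering::Acquire)
    }

    fn is_empty(&self) -> bool {
        self.len() == 0
    }

    fn lookups(&self) -> usize {
        self.store_lookups.load(Ordering::Relaxed)
    }

    fn contains_key(&self, key: &u64) -> bool {
        // Short-circuit: an empty backend cannot contain the key.
        if self.is_empty() {
            return false;
        }
        self.store_lookups.fetch_add(1, Ordering::Relaxed);
        self.store.contains_key(key)
    }

    /// Returns the previous value; the counter only grows for new keys.
    fn insert(&mut self, key: u64, value: Vec<u8>) -> Option<Vec<u8>> {
        let previous = self.store.insert(key, value);
        if previous.is_none() {
            self.count.fetch_add(1, Ordering::Release);
        }
        previous
    }

    /// Returns the removed value; the counter only shrinks if a key was removed.
    fn remove(&mut self, key: &u64) -> Option<Vec<u8>> {
        let removed = self.store.remove(key);
        if removed.is_some() {
            self.count.fetch_sub(1, Ordering::Release);
        }
        removed
    }
}
```

Keeping the counter correct depends on `insert` and `remove` reporting what they replaced or deleted, which is the feedback the PR description says RocksDB does not give.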

doitian · 2022-05-20T01:26:46Z

sync/src/types/header_map/backend_sled.rs

+        let mut count = 0;
+        for value in values {
+            let key = value.hash();
+            let last_value = self


Have you compared the performance difference between consecutive insertions with apply_batch?

zhangsoledad · 2022-05-20

apply_batch has the same problem: it can't report whether a key was overwritten or newly written. I also compared against transactions; they perform worse because of the extra overhead required to guarantee ACID.
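
To illustrate why write feedback matters for counting, here is a sketch under assumptions. `Header`, `insert_batch_with_feedback` and `insert_batch_blind` are hypothetical names, and a `HashMap` stands in for both kinds of store. The first function relies on the write returning the previous value; the second models a store whose write reports nothing, so it needs an extra existence check per key.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a header: only the fields the sketch needs.
struct Header {
    hash: u64,
    number: u64,
}

/// Consecutive inserts where each write reports the previous value.
/// Returns (newly added keys, store operations performed).
fn insert_batch_with_feedback(
    store: &mut HashMap<u64, u64>,
    values: &[Header],
) -> (usize, usize) {
    let mut added = 0;
    let mut ops = 0;
    for value in values {
        ops += 1; // one write, which also tells us whether the key existed
        if store.insert(value.hash, value.number).is_none() {
            added += 1;
        }
    }
    (added, ops)
}

/// Writes that give no feedback: the key must be looked up first.
/// Returns (newly added keys, store operations performed).
fn insert_batch_blind(
    store: &mut HashMap<u64, u64>,
    values: &[Header],
) -> (usize, usize) {
    let mut added = 0;
    let mut ops = 0;
    for value in values {
        ops += 1; // extra read just to learn whether the key exists
        let existed = store.contains_key(&value.hash);
        ops += 1; // the write itself; its result is deliberately ignored
        let _ = store.insert(value.hash, value.number);
        if !existed {
            added += 1;
        }
    }
    (added, ops)
}
```

Both functions leave the store in the same state and agree on the number of new keys, but the blind variant performs twice as many store operations.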

doitian · 2022-05-31T02:49:46Z

Sled has the same memory usage after exiting sync, but has higher peak memory usage during sync.

CC @quake

driftluo · 2022-06-24T03:47:25Z

bors r+

bors · 2022-06-24T03:54:18Z

Build succeeded:

3470: fix(tests): get_ancestor_use_skip_list r=zhangsoledad a=zhangsoledad

### What problem does this PR solve?

test `get_ancestor_use_skip_list` was broken since #3391

### What is changed and how it works?

Make sure genesis skip_hash is none.

### Check List

Tests

- Unit test
- Integration test
- Manual test (add detailed scripts or steps below)
- No code ci-runs-only: [ quick_checks,linters ]

### Release note

```release-note
None: Exclude this PR from the release note.
```

Co-authored-by: zhangsoledad <[email protected]>

zhangsoledad force-pushed the zhangsoledad/header-map branch 2 times, most recently from 36fe267 to c899c52 Compare April 28, 2022 12:07

zhangsoledad marked this pull request as ready for review April 29, 2022 03:37

zhangsoledad requested a review from a team as a code owner April 29, 2022 03:37

zhangsoledad requested review from doitian and removed request for a team April 29, 2022 03:37

driftluo reviewed Apr 29, 2022

sync/src/types/mod.rs (resolved)

driftluo reviewed May 5, 2022

sync/src/types/mod.rs (outdated, resolved)

zhangsoledad force-pushed the zhangsoledad/header-map branch from c899c52 to 94a1915 Compare May 5, 2022 07:22

driftluo approved these changes May 5, 2022

zhangsoledad force-pushed the zhangsoledad/header-map branch 2 times, most recently from e6b4ef7 to 783f58c Compare May 7, 2022 14:14

quake reviewed May 8, 2022

sync/src/types/header_map/backend.rs (outdated, resolved)

sync/src/types/header_map/backend_sled.rs (outdated, resolved)

sync/src/types/header_map/backend_sled.rs (outdated, resolved)

zhangsoledad force-pushed the zhangsoledad/header-map branch 2 times, most recently from 86f05f1 to 035f625 Compare May 8, 2022 06:05

zhangsoledad force-pushed the zhangsoledad/header-map branch from 035f625 to 76be468 Compare May 18, 2022 01:46

zhangsoledad added 4 commits May 19, 2022 15:15

perf: sled backend

1e560e4

fix: header map stats

dc25c58

feat: human-friendly header-map memory limit

0e4fec9

chore: log header_map memory_limit

d5ada78

zhangsoledad force-pushed the zhangsoledad/header-map branch from 76be468 to d5ada78 Compare May 19, 2022 07:15

doitian reviewed May 20, 2022

quake approved these changes May 31, 2022

bors bot merged commit bb7f720 into nervosnetwork:develop Jun 24, 2022

zhangsoledad deleted the zhangsoledad/header-map branch June 28, 2022 06:45

zhangsoledad mentioned this pull request Jun 28, 2022

fix(tests): get_ancestor_use_skip_list #3470

Merged

chenyukang mentioned this pull request Apr 17, 2023

Memory tuning on header synchronization #3940

Closed
