Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve sync header map #3391

Merged
merged 4 commits into from
Jun 24, 2022

Conversation

zhangsoledad
Copy link
Member

@zhangsoledad zhangsoledad commented Apr 28, 2022

What problem does this PR solve?

  • Optimize the implementation of header map to improve the speed of header synchronization from 11924/s to 14377/s, about 20% improvement.
  • Human-friendly configuration

What is changed and how it works?

  • Asynchronous batch write to hard disk when memory limit is exceeded.
  • Replace the implementation of header map bankend. Rocksdb has some drawbacks in this scenario, Rocksdb cannot give feedback on whether the key is overwritten when writing, nor can it distinguish between successful or not found when deleting, it's inefficient and looks silly to retrieve a value just to check if it exists or not.
RocksDBBackend::contains_key duration 4041 ns
RocksDBBackend::remove duration 7000 ns
RocksDBBackend::insert duration 10750 ns

Sometimes it deteriorates into the following, because of the inconvenience of implementation:
RocksDBBackend::contains_key duration 10750 ns
RocksDBBackend::remove duration 13000 ns
RocksDBBackend::insert duration 21583 ns

-----------------------------
SledBackend::contains_key duration 2666 ns
SledBackend::remove duration 2583 ns
SledBackend::insert_batch 35503541/10028 ns (3540 ns)

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code ci-runs-only: [ quick_checks,linters ]

Release note

Title Only: Include only the PR title in the release note.

@zhangsoledad zhangsoledad force-pushed the zhangsoledad/header-map branch 2 times, most recently from 36fe267 to c899c52 Compare April 28, 2022 12:07
@zhangsoledad zhangsoledad marked this pull request as ready for review April 29, 2022 03:37
@zhangsoledad zhangsoledad requested a review from a team as a code owner April 29, 2022 03:37
@zhangsoledad zhangsoledad requested review from doitian and removed request for a team April 29, 2022 03:37
@zhangsoledad
Copy link
Member Author

sync/src/types/mod.rs Outdated Show resolved Hide resolved
@zhangsoledad zhangsoledad force-pushed the zhangsoledad/header-map branch from c899c52 to 94a1915 Compare May 5, 2022 07:22
@zhangsoledad zhangsoledad force-pushed the zhangsoledad/header-map branch 2 times, most recently from e6b4ef7 to 783f58c Compare May 7, 2022 14:14
sync/src/types/header_map/backend.rs Outdated Show resolved Hide resolved
sync/src/types/header_map/backend_sled.rs Outdated Show resolved Hide resolved
sync/src/types/header_map/backend_sled.rs Outdated Show resolved Hide resolved
@zhangsoledad zhangsoledad force-pushed the zhangsoledad/header-map branch 2 times, most recently from 86f05f1 to 035f625 Compare May 8, 2022 06:05
@zhangsoledad zhangsoledad force-pushed the zhangsoledad/header-map branch from 035f625 to 76be468 Compare May 18, 2022 01:46
@zhangsoledad zhangsoledad force-pushed the zhangsoledad/header-map branch from 76be468 to d5ada78 Compare May 19, 2022 07:15
use tempfile::TempDir;

pub(crate) struct SledBackend {
count: AtomicUsize,
Copy link
Member

@doitian doitian May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove len and is_empty from KeyValueBackend?

Copy link
Member Author

@zhangsoledad zhangsoledad May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is used for short-circuiting, when the count is 0 we can skip the db operation, this will be reflected in the profiling.
If you mean can we use sled::len or sled::is_empty , although I haven't actually tried it, but it shouldn't work either, performs a full O(n) scan under the hood from docs

   pub fn len(&self) -> usize {
        self.iter().count()
    }

    /// Returns `true` if the `Tree` contains no elements.
    pub fn is_empty(&self) -> bool {
        self.iter().next().is_none()
    }

   pub(crate) fn next_inner(&mut self) -> Option<<Self as Iterator>::Item> {
        let guard = pin();
        let (mut pid, mut node) = if let (true, Some((pid, node))) =
            (self.going_forward, self.cached_node.take())
        {
            (pid, node)
        } else {
            let view =
                iter_try!(self.tree.view_for_key(self.low_key(), &guard));
            (view.pid, view.deref().clone())
        };

        for _ in 0..MAX_LOOPS {
            if self.bounds_collapsed() {
                return None;
            }

            if !node.contains_upper_bound(&self.lo) {
                ...........
            } else if !node.contains_lower_bound(&self.lo, true) {
               ........
            } else {
                  ........
            }
        }
   
    }

let mut count = 0;
for value in values {
let key = value.hash();
let last_value = self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you compared the performance difference between consecutive insertions with apply_batch?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply_batch has the same problem, it can't give feedback whether a key was overwritten or written directly,Here also tried to compare with transaction,transaction performs worse because of the extra overhead required to guarantee ACID.

@doitian
Copy link
Member

doitian commented May 31, 2022

Sled has the same memory usage after exiting sync, but has higher peak memory usage during sync.

CC @quake

@driftluo
Copy link
Collaborator

bors r+

@bors bors bot merged commit bb7f720 into nervosnetwork:develop Jun 24, 2022
@zhangsoledad zhangsoledad deleted the zhangsoledad/header-map branch June 28, 2022 06:45
bors bot added a commit that referenced this pull request Jun 29, 2022
3470: fix(tests): get_ancestor_use_skip_list r=zhangsoledad a=zhangsoledad

<!--
Thank you for contributing to nervosnetwork/ckb!

If you haven't already, please read [CONTRIBUTING](https://github.com/nervosnetwork/ckb/blob/develop/CONTRIBUTING.md) document.

If you're unsure about anything, just ask; somebody should be along to answer within a day or two.

PR Title Format:
1. module [, module2, module3]: what's changed
2. *: what's changed
-->

### What problem does this PR solve?

test `get_ancestor_use_skip_list` was broken since #3391

### What is changed and how it works?

Make sure genesis skip_hash is none.


### Check List <!--REMOVE the items that are not applicable-->

Tests <!-- At least one of them must be included. -->

- Unit test
- Integration test
- Manual test (add detailed scripts or steps below)
- No code ci-runs-only: [ quick_checks,linters ]

### Release note <!-- Choose from None, Title Only and Note. Bugfixes or new features need a release note. -->

```release-note
None: Exclude this PR from the release note.
```



Co-authored-by: zhangsoledad <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants