Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: increase trie cache size #5706

Merged
merged 6 commits into from
Dec 8, 2021
Merged

fix: increase trie cache size #5706

merged 6 commits into from
Dec 8, 2021

Conversation

Longarithm
Copy link
Member

@Longarithm Longarithm commented Dec 7, 2021

Increase trie cache size to mitigate receipts undercharging issue.

Validity

We can increase cache size to 50_000, so the theoretical occupied size is 50_000 * 4 (num_shards) * 4_000 = 800 MB.

In reality, /usr/bin/time -v shows 520 -> 1070 MB RAM growth which seem to be caused by #5212. This change have an impact like 30 MB.
I've additionally checked that cache is full on my runs.

Speedup

On the sample of 30 receipts we considered, we have the following characteristics:

default settings: summary time = 3.3s, avg undercharging = 101
increase state cache size: summary time = 0.67s, avg undercharging = 21
increase state cache size + trie cache size = 50k: summary time = 0.5s, avg undercharging = 11.5
increase state cache size + trie cache size = 100k + reduce max value size to 1k: summary time = 0.3s, avg undercharging = 6.9

For now, let's just increase cache size. On this sample we reduce undercharging 2x and summary execution time 1.3x.

Testing

  • checking that cache is full;
  • existing tests,
  • estimating /usr/bin/time -v ./state-viewer apply_range --start_index 54051433 --end_index 54053438 --shard_id 3

@Longarithm Longarithm added the A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) label Dec 7, 2021
@Longarithm Longarithm self-assigned this Dec 7, 2021
@Longarithm Longarithm marked this pull request as ready for review December 7, 2021 23:32
@Longarithm Longarithm requested a review from mzhangmzz as a code owner December 7, 2021 23:32
@@ -119,7 +119,7 @@ impl TrieStorage for TrieMemoryPartialStorage {

/// Maximum number of cache entries.
#[cfg(not(feature = "no_cache"))]
const TRIE_MAX_CACHE_SIZE: usize = 10000;
Copy link
Contributor

@pmnoxx pmnoxx Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think Self(Arc::new(Mutex::new(SizedCache::with_size(TRIE_MAX_CACHE_SIZE)))), could be the problem.

Replacing it with SyncLruCache could help a little bit #5632
Once, this gets approved, I'll work on even better version, as there are a few other issues with it, that could cause big issues.

It is theoretically, possible, that the current implementation of TrieCache causes the issues.
I have a plan to fix it, but I need to wait for #5632 to be approved first.

To improve:

  • use SyncLruCache, that uses a better implemented LruCache instead of cached.
  • Instead of using single Mutex, use a shared mutex, so we have for example, 16 smaller caches, and the chances of hitting a mutex that's busy, are much lower. Latency, spikes can happen, if potentially, a thread is swapped out, while hitting a mutex, etc.
  • There, are a few other issues with the cache, though we need to wait for review first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, that's a bit out of topic, not blocking this PR. I think increase cache size is an excellent idea!

@Longarithm Longarithm requested a review from mm-near December 8, 2021 14:23
@near-bulldozer near-bulldozer bot merged commit 5a8ae26 into master Dec 8, 2021
@near-bulldozer near-bulldozer bot deleted the cachesizefix branch December 8, 2021 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants