feat: trie cache factory to allow variable cache sizes #7022
Conversation
LGTM, but we need to wait for the mainnet shard that is going to have increased load before we merge.
LGTM! A couple of minor comments. It will be good to get reviews from someone more seasoned in the code base as well though.
core/store/src/lib.rs
```diff
@@ -67,8 +69,9 @@ impl Store {
     /// Caller must hold the temporary directory returned as first element of
     /// the tuple while the store is open.
     pub fn tmp_opener() -> (tempfile::TempDir, StoreOpener<'static>) {
         static CONFIG: Lazy<StoreConfig> = Lazy::new(StoreConfig::test_config);
```
I am not sure I understand why we need to cache the config here. Can we not always call `test_config`? Is calling `test_config` a very expensive operation, or can it return different values when called again and again?
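For context, the trade-off under discussion can be sketched as follows, with std's `OnceLock` standing in for `once_cell::sync::Lazy` and a simplified stand-in for `StoreConfig` (the real type has more fields; the capacity value here is illustrative):

```rust
use std::sync::OnceLock;

// Simplified stand-in for the real StoreConfig.
#[derive(Debug, PartialEq)]
struct StoreConfig {
    trie_cache_capacity: usize,
}

impl StoreConfig {
    // A cheap, deterministic constructor: it returns the same value on
    // every call, so caching it in a static is an optimization at best.
    fn test_config() -> StoreConfig {
        StoreConfig { trie_cache_capacity: 50_000 }
    }
}

// Cached variant: the config is built once and handed out by reference,
// analogous to `static CONFIG: Lazy<StoreConfig>` in the diff above.
fn cached_config() -> &'static StoreConfig {
    static CONFIG: OnceLock<StoreConfig> = OnceLock::new();
    CONFIG.get_or_init(StoreConfig::test_config)
}

fn main() {
    // Calling the constructor twice yields equal values...
    assert_eq!(StoreConfig::test_config(), StoreConfig::test_config());
    // ...while the cached variant always returns the same instance.
    assert!(std::ptr::eq(cached_config(), cached_config()));
    println!("ok");
}
```

If `test_config` is deterministic and cheap, as in this sketch, the cached static buys nothing except a `'static` lifetime for the reference, which is the reviewer's point.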
cc @mina86, this is the kind of side effect which made me avoid lifetime parameters by default (echoing back #6973 (review)).
However, the overall code does feel a bit "bolted on". I would like to fix the following things in a follow-up:
Co-authored-by: Aleksey Kladov <[email protected]>
Co-authored-by: Akhilesh Singhania <[email protected]>
Note: I think we shouldn't put all that in this PR immediately :)
Great points @matklad, I will take the moment and ask some questions:
I think that putting all capacities into the config leads to lots of repeated data in the config, especially if we have more shards.
I would like to, but logically we create the trie cache objects before creating …
So I think that users would want to put nothing in their config in either case -- we are relying on the default values to make sense. The problem isn't how the data in the config looks, but rather the fact that the actual size of the thing at runtime is determined by two completely separate sources of information:
I think it would be beneficial, purely from the code quality point of view, to make sure that there's a single place in the code which ultimately determines the size of the cache. Given that we want to make some aspects of this configurable, it makes sense for the config to be this source of truth. Other than that, I think even operationally it would be useful to configure the size of the cache for all shards without explicitly overriding each shard. But, again, this is not something we should be tackling in this PR: if there's time pressure, we can ship a minimal correct diff and work on making the API nicer separately.
We need some bag of parameters, but we don't need this bag to have doer semantics, I would think; an inert "plain old data" object would do. I'd imagine something like this would work:

```rust
pub struct ShardTriesParams {
    pub shard_version: ShardVersion,
    pub num_shards: NumShards,
    pub default_shard_capacity: usize,
    pub shard_capacities: HashMap<ShardUId, usize>,
}

impl ShardTriesParams {
    fn shard_capacity(&self, shard_uid: ShardUId) -> usize {
        *self.shard_capacities.get(&shard_uid).unwrap_or(&self.default_shard_capacity)
    }
}
```

```rust
// ...
let mut caches = caches_to_use.write().expect(POISONED_LOCK_ERR);
caches
    .entry(shard_uid)
    .or_insert_with(|| TrieCache::with_capacity(self.0.params.shard_capacity(shard_uid)))
    .clone()
// ...
```

Why I think this would be better:
But, again, that's probably better to be left for a future refactor.
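To make the idea concrete, here is a self-contained, runnable version of the default-plus-override capacity lookup sketched above (a plain `u32` stands in for the real `ShardUId` type, and the capacities are made-up numbers):

```rust
use std::collections::HashMap;

// Stand-in for ShardUId; the real nearcore type carries version + shard id.
type ShardUId = u32;

// Inert "plain old data" bag of parameters, as proposed in the review.
struct ShardTriesParams {
    default_shard_capacity: usize,
    shard_capacities: HashMap<ShardUId, usize>,
}

impl ShardTriesParams {
    // The single place that decides a shard's cache capacity:
    // an explicit override wins, otherwise the default applies.
    fn shard_capacity(&self, shard_uid: ShardUId) -> usize {
        *self
            .shard_capacities
            .get(&shard_uid)
            .unwrap_or(&self.default_shard_capacity)
    }
}

fn main() {
    let params = ShardTriesParams {
        default_shard_capacity: 50_000,
        // Hypothetical override: give shard 3 a much larger cache.
        shard_capacities: HashMap::from([(3, 3_000_000)]),
    };
    assert_eq!(params.shard_capacity(0), 50_000);
    assert_eq!(params.shard_capacity(3), 3_000_000);
    println!("ok");
}
```

The design keeps exactly one source of truth for the capacity, so both config defaults and per-shard overrides flow through the same method.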
Should this have the auto-merge label? It has collected all of the approvals, I think?
Guess I've found the answer: not yet
Revert default value (to 50K) because after #7022 we got more evidence that it doesn't help to speed up storage ops.

## Testing

Existing tests.
In the following release, we want to have variable sizes for trie caches, because shard 3 is going to get increased load. To do so, I introduce `TrieCacheFactory`, initialized from the store config data, and move the trie cache creation logic there. It's not clear if we can create all caches from the very beginning: from what I remember, new caches have to be created if the shard split logic is triggered.
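As an illustration of the factory described above, a minimal sketch might look like the following. The lazy per-shard creation mirrors why caches cannot all be built up front (shard splits introduce new shards later); all names and numbers here are simplified assumptions, not the actual nearcore API:

```rust
use std::collections::HashMap;

// Stand-in for the real ShardUId type.
type ShardUId = u32;

// Toy cache: only the capacity matters for this sketch.
#[derive(Clone)]
struct TrieCache {
    capacity: usize,
}

// Factory holding per-shard capacities taken from the store config.
struct TrieCacheFactory {
    default_capacity: usize,
    overrides: HashMap<ShardUId, usize>,
    caches: HashMap<ShardUId, TrieCache>,
}

impl TrieCacheFactory {
    fn new(default_capacity: usize, overrides: HashMap<ShardUId, usize>) -> Self {
        Self { default_capacity, overrides, caches: HashMap::new() }
    }

    // Create the cache for a shard on first use; shards that appear later
    // (e.g. after a shard split) get a fresh cache without the factory
    // needing to know every shard up front.
    fn get_or_create(&mut self, shard_uid: ShardUId) -> TrieCache {
        let capacity =
            *self.overrides.get(&shard_uid).unwrap_or(&self.default_capacity);
        self.caches
            .entry(shard_uid)
            .or_insert_with(|| TrieCache { capacity })
            .clone()
    }
}

fn main() {
    // Hypothetical config: default 50K entries, larger cache for shard 3.
    let mut factory =
        TrieCacheFactory::new(50_000, HashMap::from([(3, 3_000_000)]));
    assert_eq!(factory.get_or_create(0).capacity, 50_000);
    assert_eq!(factory.get_or_create(3).capacity, 3_000_000);
    println!("ok");
}
```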
We also reduce `TRIE_LIMIT_CACHED_VALUE_SIZE` to 1000 due to two reasons:

I partially use work from #7027 due to the urgency of the change; we want to try adding it to the next release.

Testing
Manual state-viewer run on shard 2: