
feat: trie cache factory to allow variable cache sizes #7022

Merged: 11 commits merged into near:master from trie-cache-factory on Jun 15, 2022

Conversation

Longarithm
Member

@Longarithm Longarithm commented Jun 13, 2022

In the upcoming release, we want to have variable sizes for trie caches, because shard 3 is going to get increased load. To do so, I introduce a TrieCacheFactory initialized from the store config data, and move the trie cache creation logic there.

It's not clear whether we can create all caches from the very beginning: from what I remember, new caches have to be created when the shard split logic is triggered.
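As a rough sketch of the factory approach (all names and types here are simplified stand-ins, not the actual nearcore API):

```rust
use std::collections::HashMap;

// Hypothetical simplified stand-in for nearcore's ShardUId: (version, shard_id).
type ShardUId = (u32, u32);

#[derive(Clone)]
struct TrieCache {
    capacity: usize,
}

/// Creates per-shard trie caches, using a configured per-shard override
/// when present and a default capacity otherwise.
struct TrieCacheFactory {
    default_capacity: usize,
    overrides: HashMap<ShardUId, usize>,
}

impl TrieCacheFactory {
    fn create_cache(&self, shard_uid: &ShardUId) -> TrieCache {
        let capacity = *self.overrides.get(shard_uid).unwrap_or(&self.default_capacity);
        TrieCache { capacity }
    }
}

fn main() {
    let factory = TrieCacheFactory {
        default_capacity: 50_000,
        // shard 3 gets a bigger cache in this illustration
        overrides: HashMap::from([((1, 3), 2_000_000)]),
    };
    assert_eq!(factory.create_cache(&(1, 3)).capacity, 2_000_000);
    assert_eq!(factory.create_cache(&(1, 0)).capacity, 50_000);
}
```

The real factory also has to handle shard layout versions and resharding; this only illustrates the capacity-override lookup.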

We also reduce TRIE_LIMIT_CACHED_VALUE_SIZE to 1000 for two reasons:

  • the most frequently occurring nodes have size < 1000; e.g. branch nodes use ~32 * 16 = 512 bytes
  • this makes the RAM increase smaller: the increase drops from 1.6 GB to 0.4 GB, for a total of 2 + 0.4 = 2.4 GB.
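The admission check behind TRIE_LIMIT_CACHED_VALUE_SIZE can be pictured like this (a hypothetical sketch; only the 1000-byte constant comes from this PR, while the function name and the exact comparison are illustrative):

```rust
// Illustrative sketch of a cache-admission check: values at or above
// the limit are not kept in the trie cache.
const TRIE_LIMIT_CACHED_VALUE_SIZE: usize = 1000;

fn should_cache_value(value: &[u8]) -> bool {
    value.len() < TRIE_LIMIT_CACHED_VALUE_SIZE
}

fn main() {
    // A branch node holds up to 16 child hashes of 32 bytes each: ~512 bytes.
    let branch_estimate = 32 * 16;
    assert!(should_cache_value(&vec![0u8; branch_estimate])); // 512 < 1000: cached
    assert!(!should_cache_value(&vec![0u8; 4096])); // large value: skipped
}
```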

I partially reuse work from #7027 due to the urgency of this change; we want to try adding it to the next release.

Testing

Manual state-viewer run on shard 2:

$ ./target/release/neard --unsafe-fast-startup view_state --readwrite apply_range --start-index 91713500 --end-index 91713700 --shard-id 2 --sequential
Applying chunks in the range 91713500..=91713700 for shard_id 2
Printing results including outcomes of applying receipts

Processed 50 blocks, 100443 ms passed, 0.4978 blocks per second (0 skipped), 303.34 secs remaining 3 empty blocks 204.21 avg gas per non-empty block
Processed 100 blocks, 72478 ms passed, 0.6899 blocks per second (0 skipped), 146.41 secs remaining 4 empty blocks 199.30 avg gas per non-empty block
Processed 150 blocks, 60252 ms passed, 0.8298 blocks per second (0 skipped), 61.46 secs remaining 17 empty blocks 252.45 avg gas per non-empty block
Processed 200 blocks, 65808 ms passed, 0.7598 blocks per second (0 skipped), 1.32 secs remaining 9 empty blocks 206.71 avg gas per non-empty block

@Longarithm Longarithm force-pushed the trie-cache-factory branch from 8757766 to 46b09f9 Compare June 13, 2022 20:17
@Longarithm Longarithm changed the title implement trie cache factory feat: trie cache factory to allow variable cache sizes Jun 13, 2022
@Longarithm Longarithm force-pushed the trie-cache-factory branch from dfd2705 to 6da2d66 Compare June 13, 2022 20:49
@Longarithm Longarithm marked this pull request as ready for review June 13, 2022 20:50
@Longarithm Longarithm requested a review from a team as a code owner June 13, 2022 20:50
@Longarithm Longarithm requested a review from matklad June 13, 2022 20:50
@Longarithm Longarithm requested review from firatNEAR and akhi3030 June 13, 2022 21:00
@Longarithm Longarithm self-assigned this Jun 13, 2022
@Longarithm Longarithm added the T-core Team: issues relevant to the core team label Jun 13, 2022
Contributor

@firatNEAR firatNEAR left a comment


LGTM, we need to wait to find out which mainnet shard id is going to get increased load before we merge.

Collaborator

@akhi3030 akhi3030 left a comment


LGTM! A couple of minor comments. It would be good to get reviews from someone more seasoned in the code base as well, though.

core/store/src/config.rs Outdated Show resolved Hide resolved
@@ -67,8 +69,9 @@ impl Store {
/// Caller must hold the temporary directory returned as first element of
/// the tuple while the store is open.
pub fn tmp_opener() -> (tempfile::TempDir, StoreOpener<'static>) {
static CONFIG: Lazy<StoreConfig> = Lazy::new(StoreConfig::test_config);
Collaborator


I am not sure I understand why we need to cache the config here. Can we not always call test_config? Is calling test_config a very expensive operation or can it return different values when called again and again?

Contributor


cc @mina86, this is the kind of side effect which made me avoid lifetime parameters by default (echoing back #6973 (review)).

Member Author


This is one of the changes I took from #7027, and I don't fully understand the reasoning either. Is it fine to proceed with this PR and discuss this point in #7027?

core/store/src/trie/shard_tries.rs Outdated Show resolved Hide resolved
core/store/src/config.rs Outdated Show resolved Hide resolved
core/store/src/trie/shard_tries.rs Outdated Show resolved Hide resolved
@matklad
Contributor

matklad commented Jun 14, 2022

With the or_insert -> or_insert_with change, this looks good to me as a quick fix.
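For context, the difference matters because `or_insert` evaluates its argument eagerly even when the entry already exists, while `or_insert_with` takes a closure that only runs on a miss. A minimal illustration (generic Rust, not nearcore code):

```rust
use std::collections::HashMap;

fn expensive_cache(counter: &mut u32) -> Vec<u8> {
    *counter += 1; // count how many times the "cache" is actually built
    vec![0u8; 1024] // stand-in for an expensive TrieCache allocation
}

fn main() {
    let mut built = 0;
    let mut caches: HashMap<u64, Vec<u8>> = HashMap::new();

    // or_insert_with: the closure runs only on a cache miss.
    caches.entry(7).or_insert_with(|| expensive_cache(&mut built));
    caches.entry(7).or_insert_with(|| expensive_cache(&mut built));
    assert_eq!(built, 1); // second lookup hit the existing entry

    // or_insert: its argument is evaluated eagerly, even on a hit.
    caches.entry(7).or_insert(expensive_cache(&mut built));
    assert_eq!(built, 2); // a throwaway cache was constructed anyway
}
```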

However, the overall code does feel a bit "bolted on". I would like to fix the following things in a follow-up:

  • make sure that default and overridden cache capacity is handled in the same way. It doesn't make sense that the former is hard-coded, but the latter is overridable

  • Clean up defaults in StoreConfig; it feels messy that the source of truth is a set of awkward functions which exist solely for serde

  • Replace TrieFactory with just TrieConfig -- we don't need a doer object here, just a bag of values

  • revisit the format of the config:

    "trie_cache_capacities": [
      [
        {
          "version": 1,
          "shard_id": 2
        },
        2000000
      ]
    ]
    

    doesn't look like a good config format; something like { "1:2": "2000000" } would be much more user-friendly.
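For illustration, the proposed "version:shard_id" key could be parsed with a few lines of plain Rust (a hypothetical sketch, not actual nearcore code):

```rust
// Illustrative parser for a "version:shard_id" config key like "1:2";
// the function name and return type are made up for this sketch.
fn parse_shard_key(key: &str) -> Option<(u32, u32)> {
    let (version, shard_id) = key.split_once(':')?;
    Some((version.parse().ok()?, shard_id.parse().ok()?))
}

fn main() {
    assert_eq!(parse_shard_key("1:2"), Some((1, 2)));
    assert_eq!(parse_shard_key("bad"), None);
}
```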

Longarithm and others added 2 commits June 14, 2022 15:22
@Longarithm Longarithm marked this pull request as draft June 14, 2022 11:32
@matklad
Contributor

matklad commented Jun 14, 2022

However, the overall code does feel a bit "bolted on". I would like to fix the following things in the follow up:

Note: I think we shouldn't push all that into this PR immediately :)

@Longarithm Longarithm force-pushed the trie-cache-factory branch from e0d7d12 to c60e8c5 Compare June 14, 2022 12:19
@Longarithm Longarithm marked this pull request as ready for review June 14, 2022 12:46
@Longarithm
Member Author

Great points @matklad, I will use the moment to ask some questions:

make sure that default and overridden cache capacity is handled in the same way. It doesn't make sense that the former is hard-coded, but the latter is overridable

I think that putting all capacities into the config leads to lots of repeated data in the config, especially if we have more shards.

Replace TrieFactory with just TrieConfig -- we don't need doer object here, just a bag of values

I would like to, but logically we create the trie cache objects before creating the ShardTries object, so we need some doer. Having a create_cache method on TrieConfig looks weird to me.

@Longarithm Longarithm requested a review from matklad June 14, 2022 12:58
@matklad
Contributor

matklad commented Jun 14, 2022

I think that putting all capacities into the config leads to lots of repeated data in the config, especially if we have more shards.

So I think that users would want to put nothing in their config in either case -- we are relying on default values making sense. The problem isn't how the data in the config looks, but rather the fact that the actual size of the thing at runtime is determined by two completely separate sources of information:

  • a hard-coded constant in the code
  • a value from the config object

I think it would be beneficial, purely from the code quality point of view, to make sure that there's a single place in the code which ultimately determines the size of the cache. Given that we want to make some aspect of this configurable, it makes sense for the config to be that source of truth. Other than that, I think even operationally it would be useful to configure the size of the cache for all shards without explicitly overriding each shard. But, again, this is not something we should be tackling in this PR: if there's time pressure, we can ship a minimal correct diff and work on making the API nicer separately.

I would like to, but logically we create trie caches' objects before creating ShardTries object, so we need some doer.

We need some bag of parameters, but we don't need this bag to have doer semantics, I would think; an inert "plain old data" object would do. I'd imagine something like this would work:

pub struct ShardTriesParams {
    pub shard_version: ShardVersion,
    pub num_shards: NumShards,

    pub default_shard_capacity: usize,
    pub shard_capacities: HashMap<ShardUId, usize>,
}

impl ShardTriesParams {
    fn shard_capacity(&self, shard_uid: ShardUId) -> usize {
        *self.shard_capacities.get(&shard_uid).unwrap_or(&self.default_shard_capacity)
    }
}

...
            let mut caches = caches_to_use.write().expect(POISONED_LOCK_ERR);
            caches
                .entry(shard_uid)
                .or_insert_with(|| TrieCache::with_capacity(self.0.params.shard_capacity(shard_uid)))
                .clone()
...

Why I think this would be better:

  • when designing an API, it's always beneficial to think from the call-site perspective. At the call-site, we don't care that the thing is used internally to create caches; at that level of abstraction we only care that there's a bunch of knobs on the tries, and we want to set those knobs to particular values.
  • we don't actually need a factory object here: there's no meaningful state the factory would hold. A simple fn create_shard_cache(params: ShardTriesParams, shard: ShardId) -> TrieCache would do.
  • Between "object with state and behavior" and "plain old data", data is usually the simpler of the two, and should be the default choice.

But, again, that's probably better to be left for future refactor.
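The ShardTriesParams sketch above can be made self-contained with simplified stand-in types (ShardUId reduced to a (version, shard_id) tuple; field names kept from the sketch) to show the capacity-fallback behavior:

```rust
use std::collections::HashMap;

// Simplified stand-in: nearcore's ShardUId is a struct, here a tuple.
type ShardUId = (u32, u32);

pub struct ShardTriesParams {
    pub default_shard_capacity: usize,
    pub shard_capacities: HashMap<ShardUId, usize>,
}

impl ShardTriesParams {
    /// Per-shard override if configured, otherwise the default capacity.
    fn shard_capacity(&self, shard_uid: ShardUId) -> usize {
        *self.shard_capacities.get(&shard_uid).unwrap_or(&self.default_shard_capacity)
    }
}

fn main() {
    let params = ShardTriesParams {
        default_shard_capacity: 50_000,
        shard_capacities: HashMap::from([((1, 2), 2_000_000)]),
    };
    assert_eq!(params.shard_capacity((1, 2)), 2_000_000); // overridden shard
    assert_eq!(params.shard_capacity((1, 0)), 50_000); // falls back to default
}
```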

@matklad
Contributor

matklad commented Jun 14, 2022

Should this have the auto-merge label? It has collected all of the approvals, I think?

@matklad
Contributor

matklad commented Jun 14, 2022

LGTM, we need to wait to find out which mainnet shard id is going to get increased load before we merge.

Guess I've found the answer: not yet

@Longarithm Longarithm changed the title feat: trie cache factory to allow variable cache sizes [do not merge] feat: trie cache factory to allow variable cache sizes Jun 14, 2022
@Longarithm Longarithm changed the title [do not merge] feat: trie cache factory to allow variable cache sizes feat: trie cache factory to allow variable cache sizes Jun 15, 2022
@near-bulldozer near-bulldozer bot merged commit fc16eb2 into near:master Jun 15, 2022
nikurt added a commit that referenced this pull request Jun 20, 2022
near-bulldozer bot pushed a commit that referenced this pull request Jun 24, 2022
Revert default value (to 50K) because after #7022 we got more evidence that it doesn't help to speed up storage ops.

## Testing

Existing tests.
mina86 pushed a commit to mina86/nearcore that referenced this pull request Jun 24, 2022
Revert default value (to 50K) because after near#7022 we got more evidence that it doesn't help to speed up storage ops.

## Testing

Existing tests.