feat(params-estimator): Touching trie node est. #6336

jakmeier · 2022-02-22T17:11:16Z

Replace old placeholder estimation for touching_trie_node
with an actual estimation.

jakmeier · 2022-02-22T17:31:49Z

This introduces estimation logic for TouchingTrieNode. (Related to #3193) It produces estimates around 28% of the current parameter.

TouchingTrieNode           4_532_429_869 gas [    36259.44i 0.00r 0.00w]           (computed in 138.397667916s)
TouchingTrieNodeRead       1_575_849_586 gas [    12606.80i 0.00r 0.00w]           (computed in 211.415µs)
TouchingTrieNodeWrite      4_532_429_869 gas [    36259.44i 0.00r 0.00w]           (computed in 107.584µ

There is one obvious issue: as we can see, there is 0 observed IO here. That's because we currently don't clear the RocksDB write buffer and the OS page cache between blocks. I already know one way to do it, but it's quite hacky because it requires running on Linux with write access to /proc/sys/vm/drop_caches. This only works with sudo and not at all inside Docker. I will fix that together with #3097.
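For readers unfamiliar with the mechanism, the page-cache part of that hack can be sketched roughly as follows. This is an illustration only, not code from this PR: the function name drop_os_page_cache and its path parameter are hypothetical, chosen so the write can also be pointed at a scratch file. On Linux, writing 3 to /proc/sys/vm/drop_caches asks the kernel to drop the page cache plus dentries and inodes. The sketch does nothing about the RocksDB write buffer, which would need to be flushed separately.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not part of the estimator): writes "3" to the given
/// control file, which is normally `/proc/sys/vm/drop_caches`.
/// Only clean pages are dropped, so callers usually run `sync` beforehand.
fn drop_os_page_cache(path: &Path) -> io::Result<()> {
    fs::write(path, "3\n")
}

fn main() {
    // Needs Linux and root privileges; the error branch is expected otherwise.
    match drop_os_page_cache(Path::new("/proc/sys/vm/drop_caches")) {
        Ok(()) => println!("page cache dropped"),
        Err(err) => eprintln!("could not drop page cache: {}", err),
    }
}
```

Passing the path as a parameter keeps the privileged location out of the function body, which makes the failure mode (no root, or running in a container) easy to handle at the call site.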

FWIW, in my quick local tests the time-based estimates increased by about 30% when clearing the cache and the write buffer. So it seems that even with today's inaccuracy, we are already in the vicinity of a correct estimate. At least it should be an improvement over using a ratio of the read base cost.

matklad · 2022-02-23T12:12:18Z

runtime/runtime-params-estimator/src/lib.rs

+    let key = "j".repeat(final_key_len);
+    let mut setup_block = Vec::new();
+    for key_len in 0..final_key_len {
+        setup_block.push(tb.account_insert_key(signer.clone(), &key.as_str()[..key_len], "0"));


Suggested change

-        setup_block.push(tb.account_insert_key(signer.clone(), &key.as_str()[..key_len], "0"));
+        let key = &key[..key_len];
+        let value = "0";
+        setup_block.push(tb.account_insert_key(signer.clone(), key, value));

mostly to make "0" more obvious
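To see which keys the loop in the diff generates, here is a standalone sketch. The helper prefix_keys is hypothetical and replaces the transaction builder call with a plain vector; only the "j".repeat(...) key and the 0..final_key_len slicing are taken from the diff.

```rust
/// Hypothetical helper: returns the prefixes produced by the
/// `0..final_key_len` loop in the diff. The empty prefix is included and
/// the full-length key is not, because the range end is exclusive.
fn prefix_keys(final_key_len: usize) -> Vec<String> {
    let key = "j".repeat(final_key_len);
    (0..final_key_len)
        .map(|key_len| key[..key_len].to_string())
        .collect()
}

fn main() {
    // Named binding for the placeholder value, as in the suggestion.
    let value = "0";
    for key in prefix_keys(4) {
        println!("insert {:?} -> {:?}", key, value);
    }
}
```

Slicing by byte index is safe here only because the key consists of single-byte ASCII characters.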

matklad · 2022-02-23T12:14:23Z

runtime/runtime-params-estimator/src/lib.rs

+    let nodes_touched_delta = ext_cost_long_key[&ExtCosts::touching_trie_node]
+        - ext_cost_short_key[&ExtCosts::touching_trie_node];


Let's assert here that we've touched the right amount of nodes

jakmeier · 2022-02-23

Well, it's not quite as simple as it sounds. While it is true that each additional key produces exactly two extra nodes, the exact number of touched nodes is trickier.

The numbers observed are:

Operation   Key length   Touched nodes
Read        1 byte       6
Write       1 byte       7
Read        1000 bytes   2001
Write       1000 bytes   2001

So the diffs to assert on would be 1995 and 1994. I can explain 1995 but 1994 still doesn't make sense to me.
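The two differences follow directly from the observed numbers above. A minimal sketch of the subtraction, using constants copied from those observations (the function and constant names are made up for illustration):

```rust
/// Touched-node counts copied from the observations above.
const READ_SHORT: u64 = 6;
const WRITE_SHORT: u64 = 7;
const READ_LONG: u64 = 2001;
const WRITE_LONG: u64 = 2001;

/// Difference in touched nodes between the long-key and the short-key run.
fn nodes_touched_delta(long_key: u64, short_key: u64) -> u64 {
    long_key - short_key
}

fn main() {
    println!("read delta:  {}", nodes_touched_delta(READ_LONG, READ_SHORT));
    println!("write delta: {}", nodes_touched_delta(WRITE_LONG, WRITE_SHORT));
}
```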

The bottom line for me is, since these numbers are not trivial to understand, maybe we should not have these assertions. They might fail when a small implementation detail for the trie changes.

Expand only if you are curious; here is my explanation for 1995.

So the trie shape created looks a bit like this:

[Account root]
      |
[extension for a half-byte]
      |
[Branch] --- [ Leaf ] // j
      |
[extension for a half-byte]
      |
[Branch] --- [ Leaf ] // jj
      |
     ...
      |
[extension for a half-byte]
      |
[Branch] --- [ Leaf ] // j*999
      |
[extension for *three* half-bytes]
      |
[ Leaf ] // j*1000

The account root needs to be loaded in all cases, X nodes are touched for that.

Reading a one-byte-key touches 1 branch + 1 extension + 1 leaf + X.

The key with n bytes will look up (n-1) branches + (n-1) extensions + 1 leaf + X. (It is n-1 and not n because the last extension is longer than the others.)

The difference between the two simplifies to 2*n-5, thus for n=1000 we get 1995.
(Also, we find that X=3 but this is irrelevant for the formula.)

For writes, it should be the exact same story. But nope, somehow there is an additional lookup but only for the short key. Order doesn't matter here, it is always the short key that has this extra lookup. Maybe I will figure it out eventually. But the complexity alone is reason enough for me to question assertions for exact numbers.

matklad · 2022-03-09T17:16:57Z

Can we assert "around 2k" then?

jakmeier · 2022-03-10

Yes, that's reasonable. I've added an assertion on +/- 10 nodes with a comment that explains it.
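An approximate assertion of this kind could look roughly like the sketch below. It is not the code that was merged: the helper is_close is hypothetical, and the expected value of two nodes per key byte (2000 for a 1000-byte key) is an assumption derived from the "two extra nodes per additional key" remark earlier in the thread.

```rust
/// Hypothetical check: true when `delta` is within `tolerance` of `expected`.
fn is_close(delta: u64, expected: u64, tolerance: u64) -> bool {
    let diff = if delta > expected { delta - expected } else { expected - delta };
    diff <= tolerance
}

fn main() {
    let final_key_len: u64 = 1000;
    // Assumption: roughly two extra trie nodes per additional key byte.
    let expected = 2 * final_key_len;
    // Deltas observed for reads and writes in this thread.
    for &delta in &[1995u64, 1994u64] {
        assert!(is_close(delta, expected, 10), "unexpected delta {}", delta);
    }
    println!("both deltas are within 10 nodes of {}", expected);
}
```

A tolerance keeps the check from failing when a small implementation detail of the trie changes, while still catching a measurement that is off by an order of magnitude.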

matklad · 2022-02-23T12:16:10Z

runtime/runtime-params-estimator/src/transaction_builder.rs

+        let arg = (key.len() as u64)
+            .to_le_bytes()
+            .into_iter()
+            .chain(key.as_bytes().into_iter().cloned())


Suggested change

-            .chain(key.as_bytes().into_iter().cloned())
+            .chain(key.bytes())
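The encoding built by this iterator chain is an 8-byte little-endian length prefix followed by the raw key bytes. A self-contained sketch, where the function name encode_key_arg is hypothetical and the surrounding builder code is left out:

```rust
/// Hypothetical helper: 8-byte little-endian length prefix, then the key bytes.
fn encode_key_arg(key: &str) -> Vec<u8> {
    let len_prefix = (key.len() as u64).to_le_bytes();
    len_prefix.iter().copied().chain(key.bytes()).collect()
}

fn main() {
    println!("{:?}", encode_key_arg("jj"));
}
```

str::bytes already yields u8 values, which is why the suggested form needs no cloned() call.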

matklad · 2022-02-23T12:23:09Z

It produces estimates around 28% of the current parameter.

Actually, I think it matches the current parameter more or less exactly -- that 3x difference comes from the safety multiplier.
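As a rough illustration of that point, the sketch below applies a multiplier of 3 to the TouchingTrieNode estimate from the PR description. The multiplier value is taken from the "3x" in the comment above and is treated as an assumption; the names are hypothetical.

```rust
/// TouchingTrieNode estimate reported in the PR description, in gas.
const ESTIMATED_GAS: u64 = 4_532_429_869;
/// Assumed safety multiplier, taken from the "3x" mentioned in the comment.
const SAFETY_MULTIPLIER: u64 = 3;

/// Scales a raw estimate by the safety multiplier.
fn with_safety_margin(estimate: u64, multiplier: u64) -> u64 {
    estimate * multiplier
}

fn main() {
    let padded = with_safety_margin(ESTIMATED_GAS, SAFETY_MULTIPLIER);
    println!("estimate with safety margin: {} gas", padded);
}
```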

jakmeier · 2022-02-23T22:22:26Z

I've now already addressed the suggestions to simplify the code. But personally I still feel like the functions fn touching_trie_node_write and fn touching_trie_node_read are too hard to read. Ideally, all these estimation functions are easy to grasp for an outsider.

@matklad If you share the same sentiment, I can give it another shuffle to make it more readable and maybe avoid some copy-pasting.

matklad

LGTM!

I think the functions are rather OK!

They are not suuuper simple themselves (there's quite a bunch of logic in there), but they don't use a tonne of abstractions. You just need to understand the functions themselves, there's little context involved. So, I don't think we should try to go out of our way here.

In other words, complexity of a single, isolated function is not a problem -- only sprawling complexity is problematic.

matklad · 2022-03-09T17:16:57Z

runtime/runtime-params-estimator/src/lib.rs

+    let nodes_touched_delta = ext_cost_long_key[&ExtCosts::touching_trie_node]
+        - ext_cost_short_key[&ExtCosts::touching_trie_node];


Can we assert "around 2k" then?

jakmeier requested review from olonho and matklad as code owners February 22, 2022 17:11

matklad reviewed Feb 22, 2022

jakmeier mentioned this pull request Feb 22, 2022

Ensure touch_trie_node fee is correct #3193

Closed

matklad reviewed Feb 23, 2022

matklad approved these changes Mar 9, 2022

jakmeier added 4 commits March 10, 2022 10:59

feat(params-estimator): Touching trie node est.

37c5009

Replace old placeholder estimation for touching_trie_node with an actual estimation.

use str::repeat instead of manual reimplementation

a88b64e

Small simplifications pointed out in review

bb30e26

Assert on approximate number of nodes touched

122e67d

jakmeier force-pushed the jakmeier-estimate-touch-trie-node branch from 24fec09 to 122e67d Compare March 10, 2022 10:21

jakmeier added the S-automerge label Mar 10, 2022

near-bulldozer bot merged commit d014743 into near:master Mar 10, 2022
