-
Notifications
You must be signed in to change notification settings - Fork 684
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(params-estimator): Touching trie node est. #6336
feat(params-estimator): Touching trie node est. #6336
Conversation
This introduces estimation logic for
There is one obvious issue, as we can see there is 0 observed IO here. That's because we currently don't clear RocksDB write buffer and OS page cache between blocks. I already know one way to do it but it's quite hacky as it requires to be on Linux with write access to FWIW, my quick local tests had time based estimates increased by about 30% when clearing the cache and the write buffer. So it seems that even with today's inaccuracy, we are already in the vicinity of a correct estimate. At least it should be an improvement over using a ratio of read base cost. |
let key = "j".repeat(final_key_len); | ||
let mut setup_block = Vec::new(); | ||
for key_len in 0..final_key_len { | ||
setup_block.push(tb.account_insert_key(signer.clone(), &key.as_str()[..key_len], "0")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
setup_block.push(tb.account_insert_key(signer.clone(), &key.as_str()[..key_len], "0")); | |
let key = &key[..key_len]; | |
let value = "0"; | |
setup_block.push(tb.account_insert_key(signer.clone(), key, value)); |
mostly to make "0"
more obvious
let nodes_touched_delta = ext_cost_long_key[&ExtCosts::touching_trie_node] | ||
- ext_cost_short_key[&ExtCosts::touching_trie_node]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's assert here that we've touched the right amount of nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's not quite as simple as it sounds. While it is true that each additional key produces exactly two extra nodes, the exact number of touched nodes is trickier.
The numbers observed are:
Key length | Touched nodes | |
---|---|---|
Read | 1 byte | 6 |
Write | 1 byte | 7 |
Read | 1000 byte | 2001 |
Write | 1000 byte | 2001 |
So the diffs to assert on would be 1995 and 1994. I can explain 1995 but 1994 still doesn't make sense to me.
The bottom line for me is, since these numbers are not trivial to understand, maybe we should not have these assertions. They might fail when a small implementation detail for the trie changes.
expand only IF you are curious, here is my explanation for 1995.
So the trie shape created looks a bit lik this:
[Account root]
|
[extension for a half-byte]
|
[Branch] --- [ Leaf ] // j
|
[extension for a half-byte]
|
[Branch] --- [ Leaf ] // jj
|
...
|
[extension for a half-byte]
|
[Branch] --- [ Leaf ] // j*999
|
[extension for *three* half-bytes]
|
[ Leaf ] // j*1000
The account root needs to be loaded in all cases, X
nodes are touched for that.
Reading a one-byte-key touches 1 branch + 1 extension + 1 leaf + X
.
The key with n
bytes will lookup (n-1)
branches + (n-1)
extensions + 1 leaf + X
. (It is n-1
and not n
because of the last extension being longer than the others.)
The difference between the two simplifies to 2*n-5
, thus for n=1000
we get 1995.
(Also, we find that X=3
but this is irrelevant for the formula.)
For writes, it should be the exact same story. But nope, somehow there is an additional lookup but only for the short key. Order doesn't matter here, it is always the short key that has this extra lookup. Maybe I will figure it out eventually. But the complexity alone is reason enough for me to question assertions for exact numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assert "around 2k" then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that's reasonable. I've added an assertion on +/- 10 nodes with a comment that explains it.
let arg = (key.len() as u64) | ||
.to_le_bytes() | ||
.into_iter() | ||
.chain(key.as_bytes().into_iter().cloned()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.chain(key.as_bytes().into_iter().cloned()) | |
.chain(key.bytes()) |
Actually, I think it matches the current parameter more or less exactly -- that 3x difference comes from the safety multiplier. |
I've now already addressed the suggestions to simplify the code. But personally I still feel like the functions @matklad If you share the same sentiment, I can give it another shuffle to make it more readable and maybe avoid some copy-pasting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I think the functions are rather OK!
They are not suuuper simple themselves (there's quite a bunch of logic in there), but they don't use a tonne of abstractions. You just need to understand the functions themselves, there's little context involved. So, I don't thing we should try to go out of our way here.
In other words, complexity of a single, isolated function is not a problem -- only sprawling complexity is problematic.
let nodes_touched_delta = ext_cost_long_key[&ExtCosts::touching_trie_node] | ||
- ext_cost_short_key[&ExtCosts::touching_trie_node]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assert "around 2k" then?
Replace old placeholder estimation for touching_trie_node with an actual estimation.
24fec09
to
122e67d
Compare
Replace old placeholder estimation for touching_trie_node
with an actual estimation.