core: prefetch next block state concurrently #19328
Conversation
Interesting idea, looking forward to some benchmarks -- might go either way, since we'll effectively execute most things twice. I'm not 100% sure that there won't be any concurrent access errors for some cache somewhere, personally.
There should be no error, because this essentially emulates doing a CALL for each transaction and throwing away the results. If this fails, our RPC APIs are also borked. Or fast sync. As for the benchmarks, I'm genuinely curious what happens. Restarted mon08/09 with your PoC cleaner PR (09) and this one on top (08) in full sync mode. Let's see what happens.
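For readers following along, a minimal sketch of what this "throwaway CALL" pattern amounts to, assuming go-ethereum's core types; `applyTx` is a hypothetical stand-in for the real transaction application (`core.ApplyTransaction` in go-ethereum), whose exact signature has changed across versions, so the details are elided:

```go
package core

import (
	"sync/atomic"

	"github.com/ethereum/go-ethereum/core/state"
	"github.com/ethereum/go-ethereum/core/types"
	"github.com/ethereum/go-ethereum/core/vm"
)

// applyTx stands in for the real transaction application (core.ApplyTransaction
// in go-ethereum); the details are elided in this sketch.
func applyTx(statedb *state.StateDB, header *types.Header, tx *types.Transaction, cfg vm.Config) {
	// ... execute tx against statedb, ignoring the receipt and any error ...
}

// prefetchBlock speculatively executes the peeked next block on a throwaway
// copy of the current state. Nothing is committed: the only point is that
// every account, storage slot and contract the block touches gets pulled
// into the trie-node and code caches before the real import needs them.
func prefetchBlock(block *types.Block, statedb *state.StateDB, cfg vm.Config, interrupt *uint32) {
	throwaway := statedb.Copy() // private copy, all results are discarded
	for i, tx := range block.Transactions() {
		// The main import goroutine flips interrupt once the current
		// block is done, so the speculation never holds up the chain.
		if atomic.LoadUint32(interrupt) == 1 {
			return
		}
		throwaway.Prepare(tx.Hash(), block.Hash(), i)
		applyTx(throwaway, block.Header(), tx, cfg)
	}
}
```

Because the throwaway copy never touches the canonical state, this is the same isolation the RPC CALL path relies on, which is the point of the comment above.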
Force-pushed from 20a6f37 to 63b8e87.
I expect this to produce a pretty nice performance win! A few pprof runs a few days ago showed that account storage trie nodes were the biggest bottleneck throughout block processing. Speculatively executing a block should do a good job of prefetching a lot of these nodes.

I do wonder how doing the speculation at transaction-level granularity compares to block-level granularity. I played around with a similar idea that prefetched just the transaction sender account data, recipient account data, and recipient contract code (sketched below), and didn't get outstanding results. This was for a few reasons: […]

My only nit would be to call it a […]
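For concreteness, the transaction-granularity variant described above might look roughly like this -- a sketch assuming go-ethereum types, not the code that was actually benchmarked. The elided reasons aside, one limitation is visible in the sketch itself: only the sender/recipient account trie paths and the code get warmed, not the storage slots the contract will actually read.

```go
package core

import (
	"github.com/ethereum/go-ethereum/core/state"
	"github.com/ethereum/go-ethereum/core/types"
)

// warmTx touches just the sender account, recipient account and recipient
// code, so their trie paths land in the cache ahead of real execution.
// Storage slots read by the contract are NOT covered, which caps the win.
func warmTx(statedb *state.StateDB, signer types.Signer, tx *types.Transaction) {
	if from, err := types.Sender(signer, tx); err == nil {
		statedb.GetBalance(from) // resolves the sender's account trie path
		statedb.GetNonce(from)
	}
	if to := tx.To(); to != nil { // nil for contract creations
		statedb.GetBalance(*to) // resolves the recipient's account trie path
		statedb.GetCode(*to)    // pulls the recipient's contract code
	}
}
```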
It's a bit hard to quantify the win; I guess the amount of read/write cache directly competes with the optimizations introduced here. For a 4GB archive sync, I think it took until block 4.xM until this PR's effect was visible. I'm running a […]

Regarding the bottlenecks, we've added some new metrics (a bit flawed, but useful) that show exactly what EVM execution spends its time on.

This PR on an archive sync: [chart omitted]

This PR on full sync: [chart omitted]
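The metrics mentioned here follow go-ethereum's metrics package pattern. A sketch of how such per-stage timers are wired up -- the metric names are illustrative and may differ from what geth actually registers in any given version:

```go
package core

import (
	"time"

	"github.com/ethereum/go-ethereum/metrics"
)

// Illustrative timers in the style geth registers for block processing.
var (
	accountReadTimer = metrics.NewRegisteredTimer("chain/account/reads", nil)
	storageReadTimer = metrics.NewRegisteredTimer("chain/storage/reads", nil)
)

// timed wraps one category of work in a timer, turning pprof guesswork
// into a per-stage number that can be charted across a whole sync.
func timed(t metrics.Timer, work func()) {
	start := time.Now()
	work() // e.g. resolve an account or storage trie path from disk
	t.Update(time.Since(start))
}
```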
3 day mark followup: after crossing Byzantium (and Ethereum picking up tx volume), this PR seems to produce around a 25% performance gain. Let's see how it evolves towards the head of the chain, but it looks promising. I'd hoped for a bit more really, but I'm not complaining either. Perhaps we need some closer investigation into exactly what the bottleneck is now (just to see whether this PR indeed cannot do more, or whether it can be tweaked further).

One thing to investigate is how frequently the concurrent execution is aborted prematurely. It might be that moving the interruption around a bit gives it more time to cache more useful data. Maybe not; we need a number on it (see the sketch below).

@matthalp I did transaction-level granularity a while back, but that doesn't seem to have worked too well, because a single tx is relatively light. Concurrent processing frequently hits the case where the "current" tx is done fast, so the background execution is just pointless. The block version is less "optimal", but it's a bit more stable imho. An alternative would be to try a mixture of both: concurrently prefetch hopefully-useful data from the peeked next block, and at the same time prefetch probably-useful data from the current block's future transactions. That said, benchmark, benchmark, benchmark :P It's a PITA to do these optimizations.
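To put a number on the premature aborts, one option is a pair of meters around the interrupt check. This builds on the hypothetical `prefetchBlock`/`applyTx` sketch earlier in the thread (adding `github.com/ethereum/go-ethereum/metrics` to its imports); the metric names are made up:

```go
// Hypothetical meters; comparing their rates gives the requested number on
// how often the speculative run finishes versus how often it is cut short.
var (
	prefetchDoneMeter  = metrics.NewRegisteredMeter("prefetch/done", nil)
	prefetchAbortMeter = metrics.NewRegisteredMeter("prefetch/abort", nil)
)

// Same loop as the earlier prefetchBlock sketch, instrumented.
func prefetchBlockWithStats(block *types.Block, statedb *state.StateDB, cfg vm.Config, interrupt *uint32) {
	throwaway := statedb.Copy()
	for i, tx := range block.Transactions() {
		if atomic.LoadUint32(interrupt) == 1 {
			prefetchAbortMeter.Mark(1) // the real import beat us to it
			return
		}
		throwaway.Prepare(tx.Hash(), block.Hash(), i)
		applyTx(throwaway, block.Header(), tx, cfg) // hypothetical, as before
	}
	prefetchDoneMeter.Mark(1) // whole block prefetched before the interrupt
}
```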
It generally looks good, but I think I would prefer a CLI switch to disable this functionality (see the flag sketch below). There might be reasons not to run this, such as: […]
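The kind of switch being asked for, in the urfave/cli v1 style geth's flags use. The name here is illustrative, though geth did later ship a `--cache.noprefetch` flag for exactly this purpose:

```go
package utils

import "gopkg.in/urfave/cli.v1"

// NoPrefetchFlag turns the speculative pre-execution off; it is off by
// default so the optimization stays opt-out rather than opt-in.
var NoPrefetchFlag = cli.BoolFlag{
	Name:  "cache.noprefetch",
	Usage: "Disable heuristic state prefetch during block import",
}
```

Wiring it up is then a matter of checking `ctx.GlobalBool(NoPrefetchFlag.Name)` when configuring the blockchain and skipping the prefetch goroutine if it is set.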
@holiman PTAL
LGTM
LGTM
I need the clean cache fix (#19307) in first, then this PR can be benchmarked on top.