Stateless witness prefetcher changes #29519
core/state/trie_prefetcher.go
Outdated
case ch := <-sf.copy: | ||
// Somebody wants a copy of the current trie, grant them | ||
ch <- sf.db.CopyTrie(sf.trie) | ||
|
||
case <-sf.stop: | ||
// Termination is requested, abort and leave remaining tasks | ||
// Termination is requested, abort |
Changing the comment is nice, but the code doesn't reflect it :P
The code should check whether sf.tasks is nil and, if it is not, keep looping until it becomes so. Otherwise we run the risk of receiving a last task and immediately shutting down, with the close being executed first (remember, select branch evaluation is non-deterministic if multiple channels are ready).
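The drain-before-exit behavior asked for above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the worker type, its fields, and the use of a buffered channel as the task queue are all assumptions made for the example (the real subfetcher keeps its tasks in a slice). The point is only that, once stop fires, the loop keeps consuming whatever is still queued instead of returning right away.

```go
package main

// worker is a hypothetical, simplified stand-in for a subfetcher.
type worker struct {
	tasks chan []byte   // queued keys (assumption: a buffered channel)
	stop  chan struct{} // closed to request termination
	done  chan struct{} // closed once the loop has exited
	seen  [][]byte      // keys processed by the loop
}

func newWorker() *worker {
	w := &worker{
		tasks: make(chan []byte, 16),
		stop:  make(chan struct{}),
		done:  make(chan struct{}),
	}
	go w.loop()
	return w
}

func (w *worker) loop() {
	defer close(w.done)
	for {
		select {
		case key := <-w.tasks:
			w.seen = append(w.seen, key)

		case <-w.stop:
			// tasks and stop may both have been ready, and select picks
			// one of them at random. Drain the queue before exiting so a
			// task queued just before the stop is not silently dropped.
			for {
				select {
				case key := <-w.tasks:
					w.seen = append(w.seen, key)
				default:
					return
				}
			}
		}
	}
}

// terminate requests shutdown, waits for the loop to exit and returns
// the processed keys. It must be called at most once in this sketch.
func (w *worker) terminate() [][]byte {
	close(w.stop)
	<-w.done
	return w.seen
}
```

With this shape, every task queued before terminate is called is processed, regardless of which select branch wins.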
core/state/trie_prefetcher.go
Outdated
func (p *triePrefetcher) used(owner common.Hash, root common.Hash, used [][]byte) { | ||
if p.closed { | ||
return | ||
} |
These are IMO not good changes. They make things harder to reason about, and close becomes this magic thing that nukes the prefetcher offline, but I'm not sure that's the intended behavior, since close is also the thing that waits for data to be finished. So we need to figure out what close does: kill it, or wait on it.
core/state/trie_prefetcher.go
Outdated
return nil | ||
} | ||
return sf.db.CopyTrie(sf.trie) | ||
return nil |
I don't really see the point of this change: it makes peek useless after close, but close is the thing that waits for all the data to be loaded, so it's kind of ... weird
core/state/trie_prefetcher.go
Outdated
// abort interrupts the subfetcher immediately. It is safe to call abort multiple | ||
// times but it is not thread safe. | ||
func (sf *subfetcher) abort() { | ||
// close waits for the subfetcher to finish its tasks. It cannot be called multiple times |
But it can be called multiple times. Close might also not be the best name since we're waiting for it to finish but should AFAIK not kill the thing.
core/state/trie_prefetcher.go
Outdated
} | ||
|
||
case ch := <-sf.copy: |
I'm unsure about this code path here with the rewrite. Do we want to allow retrieval from a live prefetcher? If yes, why? Perhaps for tx boundaries? We should really document somewhere why, if at all, it's needed. It's a very specific use case.
When we call updateTrie on a state object, we attempt to source the trie from the prefetcher. So copying from a live prefetcher is used here to preserve that functionality.
I think one issue that the PR does not address, but must, is what the new lifecycle of the prefetcher is. Previously it was just something we threw data at, and then at some point we aborted it, pulled all the useful data it had pre-loaded, and built our stuff on top.
The new logic seems to push it towards a witness, where we wait for all data to be loaded before pulling and operating on it. But the code doesn't seem to reflect that; many paths instead become duds after a close.
Either this PR is only half the code needed and the rest actually uses the prefetcher as is, or something's kind of borked. Either way, we must define what the intended behavior is, document it, and make sure the prefetcher adheres to it.
I'm kind of wondering whether close is needed; perhaps we should rather have a wait method which just ensures everything is loaded. Whether we're between txs or at block end, waiting for prefetching to finish makes sense. I guess close might be needed to nuke the loop goroutine, but we should still have a wait before peeking at stuff. Ah, I guess the "implicit" behavioral thing this PR is aiming for is that the prefetcher is not thread safe, so by the time we call peek, any scheduled data is already prefetched. I don't think that's the case; at the very least, it's a dangerous data race to assume that events fired on 2 different channels will arrive in the exact order one expects. If this is the intended behavior, I'd rather make it ever so slightly more explicit than hoping for a good order of events.
As I see it, the prefetcher needs a couple of phases.
Perhaps we need something more elaborate than this, but, whatever we need, we would be well served by first jotting down the description in human language before doing some lock/mutex/channel-based implementation of "something".
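To make the wait-versus-close distinction discussed above concrete, here is one possible shape for such a lifecycle. It is only a sketch under stated assumptions: the prefetcher type, the schedule/wait/peek/close method set, the string keys and the errClosed error are all hypothetical, and the "loading" is simulated by setting a map entry. It also assumes, like the real prefetcher, that schedule, wait and peek are driven from a single goroutine.

```go
package main

import (
	"errors"
	"sync"
)

// errClosed is a hypothetical error for scheduling after close.
var errClosed = errors.New("prefetcher closed")

// prefetcher is a hypothetical type with an explicit lifecycle:
// schedule work, wait for it to finish, peek at results, close.
type prefetcher struct {
	mu     sync.Mutex
	closed bool
	wg     sync.WaitGroup
	loaded map[string]bool
}

func newPrefetcher() *prefetcher {
	return &prefetcher{loaded: make(map[string]bool)}
}

// schedule starts loading key in the background. It is refused after close.
func (p *prefetcher) schedule(key string) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.closed {
		return errClosed
	}
	p.wg.Add(1)
	go func() {
		defer p.wg.Done()
		p.mu.Lock()
		p.loaded[key] = true // stand-in for the actual trie load
		p.mu.Unlock()
	}()
	return nil
}

// wait blocks until everything scheduled so far has been loaded.
// The prefetcher stays usable afterwards.
func (p *prefetcher) wait() {
	p.wg.Wait()
}

// peek reports whether key has been loaded. Callers that need a complete
// view call wait first instead of relying on event ordering.
func (p *prefetcher) peek(key string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.loaded[key]
}

// close refuses further scheduling and waits for in-flight loads.
// Already loaded data stays readable through peek.
func (p *prefetcher) close() {
	p.mu.Lock()
	p.closed = true
	p.mu.Unlock()
	p.wg.Wait()
}
```

In this sketch wait is the explicit synchronization point before peeking, and close only ends the scheduling phase; neither makes previously loaded data unavailable.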
As I understand it, the difference between the old and new prefetcher is (should be) as follows:
|
It's actually meant to gather witnesses for read values. In the stateless witness builder PR, I gather write witnesses from committing the tries. But iirc, earlier on the call today you mentioned not tying the retrieval of write witnesses to the commit operation, which would change the assumptions from my original code.
…ssociated subfetcher Co-authored-by: Martin HS <[email protected]> Co-authored-by: Péter Szilágyi <[email protected]>
139448f
to
f5ec2e7
Compare
core/state/state_object.go
Outdated
// if a prefetcher is available. This path is used if snapshots are unavailable, | ||
// since that requires reading the trie *during* execution, when the prefetchers | ||
// cannot yet return data. | ||
func (s *stateObject) getTrie(skipPrefetcher bool) (Trie, error) { |
FWIW, skipPrefetcher is kind of an ugly hack, I just wanted to avoid the lack-of-snapshot poking into the prefetcher. Open to cleaner suggestions.
if s.data.Root == types.EmptyRootHash || s.db.prefetcher == nil { | ||
return nil, nil | ||
} | ||
// Attempt to retrieve the trie from the pretecher |
typo: pretecher => prefetcher
…ereum#29519) * core/state: trie prefetcher change: calling trie() doesn't stop the associated subfetcher Co-authored-by: Martin HS <[email protected]> Co-authored-by: Péter Szilágyi <[email protected]> * core/state: improve prefetcher * core/state: restore async prefetcher stask scheduling * core/state: finish prefetching async and process storage updates async * core/state: don't use the prefetcher for missing snapshot items * core/state: remove update concurrency for Verkle tries * core/state: add some termination checks to prefetcher async shutdowns * core/state: differentiate db tries and prefetched tries * core/state: teh teh teh --------- Co-authored-by: Jared Wasinger <[email protected]> Co-authored-by: Martin HS <[email protected]> Co-authored-by: Gary Rong <[email protected]>
@@ -17,6 +17,7 @@ | |||
package state | |||
|
|||
import ( |
BTC 2
Supersedes #29035 because OP didn't permit modifications from maintainers...