-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(store): track store's contiguous head #239
base: feat-adj
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are getting there
Besides the comments, there are two more cases for head/subscription handling we need to support:
- Getting headers above contigious head
Height == 100
Append 150
GetByHeight 150 -> no error
- Subscribing for headers above contigious head
Height == 100
goA GetByHeight(150) -> block
Append(150) -> no error
goA -> unblock, no error
We can also do them in a follow up
Coming out as a result of reviewing #239
Let's add more more info on what has changed in the description |
for h := range hs.heightSubs { | ||
if h < height { | ||
hs.notify(h, true) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which situation would there be active subscriptions on an un-initialised heightsub instance (which means an uninitialised header store)? Just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have test TestBatch_GetByHeightBeforeInit
which can invoke a method on non-inited store. To keep this behaviour (at least in this RP) we need to handle this case properly in heightSub
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary that heightsub should be able to be re-initialised with pre-existing subscriptions still on it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its not see #243
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean yes, it's just still unclear to me in which case there are waiters on a unintialised heightsub in current use of the lib
store/store.go
Outdated
|
||
newHeight := s.advanceContiguousHead(ctx, h.Height()) | ||
if newHeight >= h.Height() { | ||
s.contiguousHead.Store(&h) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't this already happen in updateContiguousHead
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might not happen if we will not find a higher head due to if currHeight > prevHeight {
.
That's why we do here >=
and update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't this check be de-duplicated and left only to advanceContiguousHead
where it can check for >=
on #L541 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, can be. The only things is that we will do a bit more often headKey
updates. I don't see a problem with that, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no. We need a real header (newHead
) to do updateContiguousHead
. Adding >=
will complicate code for not a little reason. Reverted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be resolved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still do not understand why we are doing this check, in different formats, in 2 different places. It's complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After today's changes this can be simplified even more. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another pass with multiple simplifications
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #239 +/- ##
==========================================
+ Coverage 62.80% 64.21% +1.41%
==========================================
Files 39 38 -1
Lines 3589 3641 +52
==========================================
+ Hits 2254 2338 +84
+ Misses 1160 1133 -27
+ Partials 175 170 -5 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Massive
Lets make sure we keep PR description with necessary details and have next PR to fix batch
scheduled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve with a bunch of nits
func (s *Store[H]) nextContiguousHead(ctx context.Context, height uint64) H { | ||
var newHead H | ||
for { | ||
height++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we fetch just height
first to see that the header exists at all and start looking for the next after?
No idea, just have realised that there is a window of opportunity that we might look for a contiguous headers when initial doesn't exist (which sounds like a disk datastore corruption or dunno).
if err == nil { | ||
return head, nil | ||
func (s *Store[H]) Head(_ context.Context, _ ...header.HeadOption[H]) (H, error) { | ||
if head := s.contiguousHead.Load(); head != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure this will break the initialisation behaviour in celestia-node. Did you test this PR with celestia-node?
Celestia-node calls the Init
function from inside the store package here in go-header, which relies on reading the Head from the store (s) before it calls s.Init (which is unsafe btw - as Init can technically take a header whose height is < actual head height on disk, initialises heightsub with it, stores it in contiguousHead
field, and then flushes it to disk which replaces the actual head key).
This means that if the node is trying to use the store.Init
function to check whether the node's header store is actually initialised or not (which happens before the store is actually started), then Head will always return ErrNoHead
as contiguousHead
is not set at that point, causing the node to re-fetch the genesis header and unsafe init the header store with it again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a helper Init
function in store
pkg that we use instead of calling the Store.Init
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment to #243 (which stoles idea from 246)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cristaloleg This still does not address the problem that this PR breaks previous behaviour of Head and is incomplete without addressing the issue of unsafe Init + ordering of initialisation vs. Start
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The catch makes sense. Propose to return lazy loading in Head
, yet keeping it on Start
as well, or reworking Init
helper to call Start
instead of Head
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, fixed.
// closed s.writesDn means that store was stopped before, recreate chan. | ||
select { | ||
case <-s.writesDn: | ||
s.writesDn = make(chan struct{}) | ||
default: | ||
} | ||
|
||
if err := s.loadContiguousHead(ctx); err != nil { | ||
// we might start on an empty datastore, no key is okay. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it necessary to change this behaviour in this PR?
Scope is large and this feels like an unnecessary change to group in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, @Wondertan your word
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is, otherwise there is a bug explained in some earlier thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In short we need to have contiguousHead
set on start, 'cause we need to update it without gaps. That's why this should happen here.
(Sorry, somehow this wasn't send but msg above was)
if head := s.contiguousHead.Load(); head != nil { | ||
return *head, nil | ||
func (s *Store[H]) Head(ctx context.Context, _ ...header.HeadOption[H]) (H, error) { | ||
head, err := s.GetByHeight(ctx, s.heightSub.Height()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also try check loading contiguousHead here and if nil, then try readHead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve from my side :)
Massive work. Thank you @cristaloleg!
Store
field which tracks the highest contiguous header observed.heightSub
to work only with height and not headers (drastically simplified internals and API)headKey
onStore.Start
Store.Head
is much simpler, againbatch
will be reworked in the next PR.Fixes #201