fix(store): track store's contiguous head #239

Open
wants to merge 44 commits into
base: feat-adj
from

cristaloleg
Contributor

@cristaloleg cristaloleg commented Jan 8, 2025

  • Introduce a new unexported Store field which tracks the highest contiguous header observed.
  • Rework heightSub to work only with height and not headers (drastically simplified internals and API)
  • Load headKey on Store.Start
  • Store.Head is much simpler, again
  • batch will be reworked in the next PR.

Fixes #201
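
To illustrate what "highest contiguous header" means in the description above, here is a minimal sketch. It is not the code from this PR: the `advanceContiguousHead` signature and the `map[uint64]bool` standing in for the datastore are assumptions made for the example.

```go
package main

import "fmt"

// advanceContiguousHead is a hypothetical helper: it walks upward from head
// while the next height is present in stored, and returns the highest height
// reachable without a gap. The map stands in for the real datastore.
func advanceContiguousHead(stored map[uint64]bool, head uint64) uint64 {
	for stored[head+1] {
		head++
	}
	return head
}

func main() {
	// Heights 1..3 and 5 are stored; 4 is missing.
	stored := map[uint64]bool{1: true, 2: true, 3: true, 5: true}

	head := advanceContiguousHead(stored, 1)
	fmt.Println(head) // 3: height 5 is stored but not contiguous

	// Filling the gap lets the head advance over everything already stored.
	stored[4] = true
	fmt.Println(advanceContiguousHead(stored, head)) // 5
}
```

The point of the sketch is that a header stored above a gap does not move the head until the gap is filled.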

@cristaloleg cristaloleg marked this pull request as draft January 10, 2025 13:09
@cristaloleg cristaloleg marked this pull request as ready for review January 13, 2025 13:09
Member

@Wondertan Wondertan left a comment

We are getting there

Besides the comments, there are two more cases for head/subscription handling that we need to support:

  • Getting headers above the contiguous head
Height == 100
Append 150
GetByHeight 150 -> no error
  • Subscribing for headers above the contiguous head
Height == 100
goA GetByHeight(150) -> block
Append(150) -> no error
goA -> unblock, no error

We can also do them in a follow-up.
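
The two cases above can be sketched with a toy store. Everything here (`toyStore`, string payloads, the helper names) is hypothetical and only mirrors the behaviour requested in the comment; it is not the go-header implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// toyStore is a hypothetical stand-in for the header store.
type toyStore struct {
	mu      sync.Mutex
	headers map[uint64]string          // height -> header payload
	waiters map[uint64][]chan struct{} // height -> blocked GetByHeight callers
}

func newToyStore() *toyStore {
	return &toyStore{
		headers: make(map[uint64]string),
		waiters: make(map[uint64][]chan struct{}),
	}
}

// Append stores a header at any height, even above the contiguous head,
// and wakes every caller waiting for exactly that height.
func (s *toyStore) Append(height uint64, hdr string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.headers[height] = hdr
	for _, ch := range s.waiters[height] {
		close(ch)
	}
	delete(s.waiters, height)
}

// GetByHeight returns the header if it is present; otherwise it blocks until
// Append delivers that height or ctx is cancelled.
func (s *toyStore) GetByHeight(ctx context.Context, height uint64) (string, error) {
	s.mu.Lock()
	if hdr, ok := s.headers[height]; ok {
		s.mu.Unlock()
		return hdr, nil
	}
	ch := make(chan struct{})
	s.waiters[height] = append(s.waiters[height], ch)
	s.mu.Unlock()

	select {
	case <-ch:
		s.mu.Lock()
		defer s.mu.Unlock()
		return s.headers[height], nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// getAboveHead: Height == 100, Append 150, GetByHeight 150 -> no error.
func getAboveHead() (string, error) {
	s := newToyStore()
	s.Append(100, "h100")
	s.Append(150, "h150")
	return s.GetByHeight(context.Background(), 150)
}

// subscribeAboveHead: a goroutine asks for 150 and is released by Append(150).
// The result is the same whether the goroutine blocks first or Append wins.
func subscribeAboveHead() (string, error) {
	s := newToyStore()
	s.Append(100, "h100")
	type result struct {
		hdr string
		err error
	}
	done := make(chan result, 1)
	go func() {
		hdr, err := s.GetByHeight(context.Background(), 150)
		done <- result{hdr, err}
	}()
	s.Append(150, "h150")
	r := <-done
	return r.hdr, r.err
}

func main() {
	fmt.Println(getAboveHead())
	fmt.Println(subscribeAboveHead())
}
```

In this sketch a cancelled waiter's channel stays registered until the next Append at that height; a real implementation would likely clean it up.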

cristaloleg pushed a commit that referenced this pull request Jan 14, 2025
@Wondertan
Member

Let's add more info on what has changed in the description.

store/batch.go
Comment on lines +42 to +45
for h := range hs.heightSubs {
	if h < height {
		hs.notify(h, true)
	}
Member

In which situation would there be active subscriptions on an un-initialised heightsub instance (which means an uninitialised header store)? Just curious

Contributor Author

We have the test TestBatch_GetByHeightBeforeInit, which can invoke a method on an uninitialised store. To keep this behaviour (at least in this PR), we need to handle this case properly in heightSub.

Member

Is it necessary that heightsub should be able to be re-initialised with pre-existing subscriptions still on it?

Member

It's not; see #243.

Contributor Author

Can be resolved?

Member

I mean yes, it's just still unclear to me in which case there are waiters on an uninitialised heightsub in the current use of the lib.
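
For readers following this thread, here is a rough sketch of the situation being discussed: waiters registered before initialisation, then released when the height becomes known. The `heightSub` type below is a simplified assumption, not the real one; it mirrors only the `h < height` comparison from the quoted hunk and ignores the second argument of `notify`, whose meaning is not shown in this conversation.

```go
package main

import (
	"fmt"
	"sync"
)

// heightSub is a simplified, hypothetical version of the subscription helper.
type heightSub struct {
	mu     sync.Mutex
	height uint64
	subs   map[uint64][]chan struct{} // height -> waiters
}

func newHeightSub() *heightSub {
	return &heightSub{subs: make(map[uint64][]chan struct{})}
}

// Wait registers a waiter for height and returns a channel that is closed
// once the waiter is notified. It may be called before Init.
func (hs *heightSub) Wait(height uint64) <-chan struct{} {
	hs.mu.Lock()
	defer hs.mu.Unlock()
	ch := make(chan struct{})
	hs.subs[height] = append(hs.subs[height], ch)
	return ch
}

// Init records the known height and releases waiters registered below it,
// following the h < height check from the quoted diff.
func (hs *heightSub) Init(height uint64) {
	hs.mu.Lock()
	defer hs.mu.Unlock()
	hs.height = height
	for h, chans := range hs.subs {
		if h < height {
			for _, ch := range chans {
				close(ch)
			}
			delete(hs.subs, h) // deleting during range is safe in Go
		}
	}
}

// released reports whether a waiter channel has been closed.
func released(ch <-chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	hs := newHeightSub()
	low := hs.Wait(5)   // registered before Init
	high := hs.Wait(20) // still above the initial height
	hs.Init(10)
	fmt.Println(released(low), released(high)) // true false
}
```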

store/store.go Outdated

newHeight := s.advanceContiguousHead(ctx, h.Height())
if newHeight >= h.Height() {
	s.contiguousHead.Store(&h)
Member

Wouldn't this already happen in updateContiguousHead?

Contributor Author

It might not happen if we don't find a higher head, because of the if currHeight > prevHeight { check.

That's why we check >= here and update.

Member

Why can't this check be de-duplicated and left only to advanceContiguousHead where it can check for >= on #L541 ?

Contributor Author

Well, it can be. The only thing is that we will update headKey a bit more often. I don't see a problem with that, done.

Contributor Author

Ah, no. We need a real header (newHead) to do updateContiguousHead. Adding >= would complicate the code for little benefit. Reverted.

Contributor Author

Can be resolved?

Member

I still do not understand why we are doing this check, in different formats, in 2 different places. It's complicated.

Contributor Author

After today's changes this can be simplified even more. Done.

Member

@Wondertan Wondertan left a comment

Another pass with multiple simplifications

@codecov-commenter

Codecov Report

Attention: Patch coverage is 90.29851% with 13 lines in your changes missing coverage. Please review.

Project coverage is 64.21%. Comparing base (88c5b8c) to head (ac051e4).
Report is 27 commits behind head on main.

Files with missing lines | Patch % | Lines
store/heightsub.go | 86.66% | 6 Missing and 2 partials ⚠️
store/store.go | 92.75% | 3 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #239      +/-   ##
==========================================
+ Coverage   62.80%   64.21%   +1.41%     
==========================================
  Files          39       38       -1     
  Lines        3589     3641      +52     
==========================================
+ Hits         2254     2338      +84     
+ Misses       1160     1133      -27     
+ Partials      175      170       -5     

Wondertan
Wondertan previously approved these changes Feb 4, 2025
Member

@Wondertan Wondertan left a comment

Massive

Let's make sure we keep the PR description updated with the necessary details and have the next PR to fix batch scheduled.

Wondertan
Wondertan previously approved these changes Feb 4, 2025
Member

@Wondertan Wondertan left a comment

Approve with a bunch of nits

func (s *Store[H]) nextContiguousHead(ctx context.Context, height uint64) H {
	var newHead H
	for {
		height++
Contributor Author

Should we first fetch just height to check that the header exists at all, and only then start looking for the next one?

No idea, I just realised there is a window in which we might look for contiguous headers when the initial one doesn't exist (which sounds like disk datastore corruption, or I don't know what).
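
One way the guard suggested here could look, as a sketch only: the function name, the `map[uint64]bool` datastore stand-in, and the `errStartMissing` error are all hypothetical and do not come from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errStartMissing is a hypothetical error for a missing starting header.
var errStartMissing = errors.New("header at starting height is missing")

// nextContiguousHeight first verifies that the starting height itself is
// stored, and only then scans upward for the highest gap-free height.
func nextContiguousHeight(stored map[uint64]bool, height uint64) (uint64, error) {
	if !stored[height] {
		return 0, fmt.Errorf("height %d: %w", height, errStartMissing)
	}
	for stored[height+1] {
		height++
	}
	return height, nil
}

func main() {
	// Heights 11 and 12 exist, but the starting height 10 does not.
	stored := map[uint64]bool{11: true, 12: true}
	_, err := nextContiguousHeight(stored, 10)
	fmt.Println(err)

	// Once the starting header exists, the scan proceeds normally.
	stored[10] = true
	h, err := nextContiguousHeight(stored, 10)
	fmt.Println(h, err)
}
```

The trade-off is one extra read per scan in exchange for surfacing a corrupted or missing base header instead of silently advancing past it.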

if err == nil {
	return head, nil
func (s *Store[H]) Head(_ context.Context, _ ...header.HeadOption[H]) (H, error) {
	if head := s.contiguousHead.Load(); head != nil {
Member

I'm pretty sure this will break the initialisation behaviour in celestia-node. Did you test this PR with celestia-node?

celestia-node calls the Init helper function from the store package here in go-header, which relies on reading the Head from the store (s) before it calls s.Init. (That is unsafe, by the way: Init can technically take a header whose height is below the actual head height on disk, initialise heightsub with it, store it in the contiguousHead field, and then flush it to disk, replacing the actual head key.)

This means that if the node is trying to use the store.Init function to check whether the node's header store is actually initialised or not (which happens before the store is actually started), then Head will always return ErrNoHead as contiguousHead is not set at that point, causing the node to re-fetch the genesis header and unsafe init the header store with it again.

Member

There is a helper Init function in store pkg that we use instead of calling the Store.Init directly.

Contributor Author

I added a comment to #243 (which steals the idea from 246).

Member

@cristaloleg This still does not address the problem that this PR breaks previous behaviour of Head and is incomplete without addressing the issue of unsafe Init + ordering of initialisation vs. Start

Member

The catch makes sense. I propose bringing back lazy loading in Head while keeping the load on Start as well, or reworking the Init helper to call Start instead of Head.

Contributor Author

So, fixed.

// closed s.writesDn means that store was stopped before, recreate chan.
select {
case <-s.writesDn:
	s.writesDn = make(chan struct{})
default:
}

if err := s.loadContiguousHead(ctx); err != nil {
	// we might start on an empty datastore, no key is okay.
Member

Is it necessary to change this behaviour in this PR?

The scope is already large, and this feels like an unnecessary change to group in.

Contributor Author

Well, @Wondertan your word

Member

It is, otherwise there is a bug explained in some earlier thread.

Contributor Author

In short, we need to have contiguousHead set on start, because we need to update it without gaps. That's why this should happen here.

(Sorry, somehow this wasn't sent, but the message above was.)
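
The start-up behaviour discussed in this thread (recreate a closed `writesDn`, load the head, tolerate an empty datastore) might be sketched as follows. `toyStore`, `errNotFound`, and the `"head"` map key are assumptions for illustration; only the `select` on `writesDn` follows the quoted hunk.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical "no head key" error.
var errNotFound = errors.New("head key not found")

// toyStore is a hypothetical, single-goroutine stand-in for the store.
type toyStore struct {
	writesDn chan struct{}     // closed by Stop
	disk     map[string]uint64 // stands in for the datastore
	head     uint64            // in-memory contiguous head height; 0 means unset
}

func newToyStore(disk map[string]uint64) *toyStore {
	return &toyStore{writesDn: make(chan struct{}), disk: disk}
}

// loadContiguousHead reads the persisted head into memory.
func (s *toyStore) loadContiguousHead() error {
	h, ok := s.disk["head"]
	if !ok {
		return errNotFound
	}
	s.head = h
	return nil
}

// Start prepares the store for use, including after a previous Stop.
func (s *toyStore) Start() error {
	// A closed writesDn means the store was stopped before, so recreate it.
	select {
	case <-s.writesDn:
		s.writesDn = make(chan struct{})
	default:
	}
	// Starting on an empty datastore is fine; any other error is returned.
	if err := s.loadContiguousHead(); err != nil && !errors.Is(err, errNotFound) {
		return err
	}
	return nil
}

// Stop signals shutdown by closing writesDn.
func (s *toyStore) Stop() { close(s.writesDn) }

// stopped reports whether writesDn is currently closed.
func stopped(s *toyStore) bool {
	select {
	case <-s.writesDn:
		return true
	default:
		return false
	}
}

func main() {
	s := newToyStore(map[string]uint64{"head": 100})
	fmt.Println(s.Start(), s.head)
	s.Stop()
	fmt.Println(stopped(s))
	fmt.Println(s.Start(), stopped(s))
}
```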

if head := s.contiguousHead.Load(); head != nil {
	return *head, nil
func (s *Store[H]) Head(ctx context.Context, _ ...header.HeadOption[H]) (H, error) {
	head, err := s.GetByHeight(ctx, s.heightSub.Height())
Member

You can also try loading contiguousHead here and, if it is nil, fall back to readHead.

Contributor Author

Done
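
A minimal sketch of the fallback suggested in this thread, assuming a toy `store` whose `disk` map stands in for the datastore and whose `readHead` is a hypothetical helper. Caching the lazily read head with `CompareAndSwap` is a choice made for this example; the conversation does not say whether the real code caches it.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// errNoHead is a hypothetical error returned when no head is persisted.
var errNoHead = errors.New("no head")

type header struct{ height uint64 }

// store is a toy model: contiguousHead is the in-memory head and disk stands
// in for the datastore that persists the head key.
type store struct {
	contiguousHead atomic.Pointer[header]
	disk           map[string]*header
}

// readHead loads the persisted head, if any.
func (s *store) readHead() (*header, error) {
	h, ok := s.disk["head"]
	if !ok {
		return nil, errNoHead
	}
	return h, nil
}

// Head prefers the in-memory head and lazily falls back to disk, so it also
// works before Start has populated contiguousHead.
func (s *store) Head() (*header, error) {
	if h := s.contiguousHead.Load(); h != nil {
		return h, nil
	}
	h, err := s.readHead()
	if err != nil {
		return nil, err
	}
	// Cache the value, but never overwrite a head that was set concurrently.
	s.contiguousHead.CompareAndSwap(nil, h)
	return s.contiguousHead.Load(), nil
}

func main() {
	empty := &store{disk: map[string]*header{}}
	_, err := empty.Head()
	fmt.Println(err)

	s := &store{disk: map[string]*header{"head": {height: 42}}}
	h, err := s.Head()
	fmt.Println(h.height, err)
}
```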

Member

@Wondertan Wondertan left a comment

Approve from my side :)

Massive work. Thank you @cristaloleg!

@cristaloleg cristaloleg changed the base branch from main to feat-adj February 14, 2025 11:08

Successfully merging this pull request may close these issues.

store: Head of store needs to be the *contiguous* head of the store
4 participants