-
Notifications
You must be signed in to change notification settings - Fork 455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[dbnode] Gracefully handle reads including documents with stale index state #3905
Conversation
072a3c2
to
b8e3286
Compare
src/m3ninx/doc/types.go
Outdated
// Attempts to retrieve the most recent index entry from the shard if this entry | ||
// was never inserted there. If there is an error, returns the value for this entry. | ||
// Additionally, returns a boolean to denote if result came from reconciliation. | ||
ReconciledIndexedForBlockStart(blockStart xtime.UnixNano) (bool, bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, this API works, but it'd be much friendlier, I think, if we could just return the storage.Entry
from the lookup map as an OnIndexSeries
here. The problem is we'd need to expose methods on this interface for incrementing and decrementing the reader+writer count which ends up feeling equally unfriendly and more error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True yeah.
Although things we could do to make that nicer:
type CleanupFn func()
ReconciledOnIndexSeries() (OnIndexSeries, CleanupFn)
Then to call it:
onIndexSeries, cleanup := entry.ReconciledOnIndexSeries()
defer cleanup()
// whatever with onIndexSeries
Or something which could be more allocation intensive is passing a function/lambda and doing the lookup and the decrement/etc afterwards explicitly (which might need to be dynamically allocated):
WithReconciledOnIndexSeries(callback func (OnIndexSeries))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the cleanup option since that's a convention we follow in a few different places. Going to rework this PR to do that since I think it'll allow us to get rid of specific Reconciled...
functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just saw this comment now - but my vote for making this less error prone for the consumer of metadata.OnIndexSeries
is to instead have a separate interface that just provides a method for properly retrieving OnIndexSeries
so then whoever has Metadata
is forced to go through this reconciliation path when consuming
require.NoError(t, err) | ||
} | ||
|
||
func generateTestSetup(t *testing.T) TestSetup { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Would be nice if we were able to namespace these method names. I guess if other tests want to use the same kind of thing they can in the future pass in their own overrides and we can move this to a shared place. e.g. func generateTestSetup(t *testing.T, customOpts func(storage.Options) storage.Options) { ... }
.
b8e3286
to
a07684d
Compare
Codecov Report
@@ Coverage Diff @@
## master #3905 +/- ##
======================================
Coverage 56.6% 56.6%
======================================
Files 553 553
Lines 63275 63296 +21
======================================
+ Hits 35815 35863 +48
+ Misses 24256 24238 -18
+ Partials 3204 3195 -9
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
StartInclusive: xtime.Now(), | ||
EndExclusive: xtime.Now(), | ||
StartInclusive: queryTime, | ||
EndExclusive: queryTime.Add(time.Nanosecond), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this have an additional nanosecond?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that the query will actually yield results. If StartInclusive
and EndExclusive
are the exact same value then nothing will be returned. This was happening implicitly by calling xtime.Now
twice in the previous form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see so that double Now
call was functionally doing the same thing. Kinda scary lol but agreed w/ making this more explicit.
src/dbnode/storage/entry.go
Outdated
// is an error during retrieval, simply returns the current entry. Additionally, | ||
// returns a cleanup function to run once finished using the reconciled entry and | ||
// a boolean value indicating whether the result came from reconciliation or not. | ||
func (entry *Entry) ReconciledOnIndexSeries() (doc.OnIndexSeries, doc.ReconciledOnIndexSeriesCleanupFn, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe instead of a func + bool, we just return a func and caller checks if it is non-nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or always return a func and caller always calls close (and something that func is just noop)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bool is just so we can know if we need to increment the reconciled metric. The caller always calls cleanup which is a noop if there's nothing to reconcile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit of a Go anti-pattern to check for nilability instead of just returning a bool as the last arg (since most folks will know based on Go APIs to check a bool if it's returned, where as they might not know whether to check if something is nil since it's not a common pattern).
@@ -543,15 +545,20 @@ func (b *block) queryWithSpan( | |||
// Ensure that the block contains any of the relevant time segments for the query range. | |||
doc := iter.Current() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole range-check code block has gotten pretty big so maybe split to another func. Then we can also defer / make more readable the cleanup func?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. Refactored this a bit. PTAL
src/dbnode/storage/index/block.go
Outdated
@@ -543,15 +545,20 @@ func (b *block) queryWithSpan( | |||
// Ensure that the block contains any of the relevant time segments for the query range. | |||
doc := iter.Current() | |||
if md, ok := doc.Metadata(); ok && md.OnIndexSeries != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it'd be worth, for preventing mistaken calls directly to metadata.OnIndexSeries
in the future, having a separate interface that Entry
implements called something like OnIndexSeriesReference
that just has a single func for GetOnIndexSeries
(i.e. ReconciledOnIndexSeries
). And then we have Metadata
have that so a caller is forced to always go through this reconciled API instead of referencing the field directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we always want to reconcile though? I'm not sure that's true from looking at other users of metadata.OnIndexSeries
. For example, when we call NeedsIndexGarbageCollected
it seems like it specifically doesn't reconcile because it's not worth the extra cost:
// NeedsIndexGarbageCollected checks if the entry is eligible to be garbage collected
// from the index. Otherwise returns false.
func (entry *Entry) NeedsIndexGarbageCollected() bool {
// This is a cheaper check that loading the entry from the shard again
// which makes it cheaper to run frequently.
// It may not be as accurate, but it's fine for an approximation since
// only a single series in a segment needs to return true to trigger an
// index segment to be garbage collected.
if entry.insertTime.Load() == 0 {
return false // Not inserted, does not need garbage collection.
}
// Check that a write is not potentially pending and the series is empty.
return entry.ReaderWriterCount() == 0 && entry.Series.IsEmpty()
}
If we change the interface here, we're always going to reconcile when insertTime is 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm yeah good point that we intentionally don't want to reconcile in other cases.
src/dbnode/storage/index/block.go
Outdated
@@ -543,15 +545,20 @@ func (b *block) queryWithSpan( | |||
// Ensure that the block contains any of the relevant time segments for the query range. | |||
doc := iter.Current() | |||
if md, ok := doc.Metadata(); ok && md.OnIndexSeries != nil { | |||
onIndexSeries, cleanup, reconciled := md.OnIndexSeries.ReconciledOnIndexSeries() | |||
if reconciled { | |||
b.metrics.reconciledEntryOnQuery.Inc(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on the metric
… state In db nodes, state is kept for cheap lookups to determine if a series is indexed for a specific block start. This state exists on the document in the index and in the shard object. On concurrent writes to the same time series, it's possible for these two sources to get out of sync. Basically, the state associated with the document on the index is a different instance than the one that gets placed in the shard's lookup map. This can result in issues where we don't accurately track that a series in indexed for a specific block start as only the entry in the lookup map will get updated in the future. To resolve this, we introduce reconciliation methods when querying these objects. If it's noticed on read that we're querying index state that never got inserted into the shard lookup map, then we query the shard for the appropriate entry and use the results on that object.
a07684d
to
dd56923
Compare
src/m3ninx/doc/types.go
Outdated
|
||
// ReconciledOnIndexSeriesCleanupFn is a function for performing cleanup when | ||
// ReconciledOnIndexSeries is called. | ||
type ReconciledOnIndexSeriesCleanupFn func() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This could just be CleanupFn
given it could be used for other returns/etc since it's just a func()
?
To be even cleaner you could return a resource.SimpleCloser
.
https://pkg.go.dev/github.com/m3db/[email protected]/src/x/resource#SimpleCloser
e.g.
import (
"github.com/m3db/m3/src/x/resource"
)
// then definition
ReconciledOnIndexSeries() (OnIndexSeries, resource.SimpleCloser, bool)
// then when returning e.g. cast the func() to a resource.SimplerCloserFunc which implements resource.SimpleCloser
return e, resource.SimpleCloserFn(func() {
e.DecrementReaderWriterCount()
}), true
// then using the results
entry, closer, ok := entry.ReconciledOnIndexSeries()
defer closer.Close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I like this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than minor nits to either address or not asynchronously
onIndexSeries, cleanup, reconciled := md.OnIndexSeries.ReconciledOnIndexSeries() | ||
if reconciled { | ||
b.metrics.reconciledEntryOnQuery.Inc(1) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe increment a metric when it's not reconciled too? That way we can tell the magnitude of entries that are reconciled vs not-reconciled (i.e. is 1k/sec big or small number? hard to know if we don't know the total).
src/dbnode/storage/entry.go
Outdated
// is an error during retrieval, simply returns the current entry. Additionally, | ||
// returns a cleanup function to run once finished using the reconciled entry and | ||
// a boolean value indicating whether the result came from reconciliation or not. | ||
func (entry *Entry) ReconciledOnIndexSeries() (doc.OnIndexSeries, doc.ReconciledOnIndexSeriesCleanupFn, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit of a Go anti-pattern to check for nilability instead of just returning a bool as the last arg (since most folks will know based on Go APIs to check a bool if it's returned, where as they might not know whether to check if something is nil since it's not a common pattern).
dd56923
to
a5d486f
Compare
a5d486f
to
9fdd964
Compare
What this PR does / why we need it:
In db nodes, state is kept for cheap lookups to determine if a series is
indexed for a specific block start. This state exists on the document in the
index and in the shard object. On concurrent writes to the same time series,
it's possible for these two sources to get out of sync. Basically, the
state associated with the document on the index is a different instance
than the one that gets placed in the shard's lookup map. This can
result in issues where we don't accurately track that a series in
indexed for a specific block start as only the entry in the lookup
map will get updated in the future.
To resolve this, we introduce reconciliation methods when querying these
objects. If it's noticed on read that we're querying index state that
never got inserted into the shard lookup map, then we query the shard
for the appropriate entry and use the results on that object.
Special notes for your reviewer:
Does this PR introduce a user-facing and/or backwards incompatible change?:
Does this PR require updating code package or user-facing documentation?: