[dbnode] Series ref resolver #3316
* master:
  [aggregator] Move placement checks to a background job (#3315)
  [tests] Remove a few more usages of NoOpAllBootstrapper from tests (#3314)
  [coordinator] make drop timestamp apply to the metric rather than specific rule (#3310)
  Update server.go (#3298)
  Fix Data Race in checkoutSeriesWithLock (#3300)
  [query] Allow configuration of placement options (#3304)
optimize commit log snapshot load.
added UniqueIndex to SeriesRef.
removed unnecessary resolvers.
Codecov Report
@@ Coverage Diff @@
## master #3316 +/- ##
=======================================
Coverage 72.6% 72.6%
=======================================
Files 1099 1099
Lines 103579 103579
=======================================
Hits 75210 75210
Misses 23210 23210
Partials 5159 5159
* master: (22 commits)
  Remove deprecated fields (#3327)
  Add quotas to Permits (#3333)
  [aggregator] Drop messages that have a drop policy applied (#3341)
  Fix NPE due to race with a closing series (#3056)
  [coordinator] Apply auto-mapping rules if-and-only-if no drop policies are in effect (#3339)
  [aggregator] Add validation in AddTimedWithStagedMetadatas (#3338)
  [coordinator] Fix panic in Ready endpoint for admin coordinator (#3335)
  [instrument] Config option to emit detailed Go runtime metrics only (#3332)
  [aggregator] Sort heap in one go, instead of iterating one-by-one (#3331)
  [pool] Add support for dynamic, sync.Pool backed, object pools (#3334)
  Enable PANIC_ON_INVARIANT_VIOLATED for tests (#3326)
  [aggregator] CanLead for unflushed window takes BufferPast into account (#3328)
  Optimize StagedMetadatas conversion (#3330)
  [m3msg] Improve message scan performance (#3319)
  [dbnode] Add reason tag to bootstrap retries metric (#3317)
  [coordinator] Enable rule filtering on prom metric type (#3325)
  Update m3dbnode-all-config.yml (#3204)
  [coordinator] Include Type in RollupOp.Equal (#3322)
  [coordinator] Simplify iteration logic of matchRollupTarget (#3321)
  [coordinator] Add rollup type to remove specific dimensions (#3318)
  ...
@@ -28,11 +28,14 @@ import (
	"sync"
	"time"

	"golang.org/x/sync/errgroup"
Perhaps move the other third-party imports at the bottom up here, next to the errgroup import?
Done.
errs, _ := errgroup.WithContext(ctx.GoContext())
errs.Go(worker.readSeriesBlocks)
if err := s.loadBlocks(worker.dataCh, writeType); err != nil {
	close(worker.dataCh)
Hm, doesn't it look like this is already closed on line 119?
Would be better to close it from one place, either in the worker itself always (with defer), or from the outside always.
Maybe just remove this and rely on the defer in the worker itself?
Agree. Refactored to use a cancellable context so that readSeriesBlocks can be cancelled if loadBlocks() returns an error unexpectedly.
Left some comments (mostly nits and clarity suggestions).
* master:
  [dbnode] Remove unused shardBlockVolume (#3347)
  Fix new Go 1.15+ vet check failures (#3345)
  [coordinator] Add config option to make rollup rules untimed (#3343)
  [aggregator] Raw TCP Client write queueing/buffering refactor (#3342)
  [dbnode] Fail M3TSZ encoding on DeltaOfDelta overflow (#3329)
* master: [dtest] ns update/delete api (#3344)
LGTM. Any thoughts on putting storage.seriesResolver into its own file rather than further growing this monster shard.go file (see #3316 (comment))?
I'd also support moving it into its own file.
tags.Close()

select {
case <-ctx.Done():
This might be expensive to check on every single series (since reading from a channel requires synchronization). Perhaps we can check it only every N series? Say every 1024, using a top-level var here: if i%1024==0 { /* check ctx.Done() */ }
LGTM other than minor comments (which would be good to address before merging, but stamping so that you can address those asynchronously and then merge PR)
* master: [dbnode] Fix clock options not propagated where needed (#3353)
* master: [x] Add OffsetClock constructor that uses given time delta (#3354)
What this PR does / why we need it:
This PR implements a series ref resolver that writes new series asynchronously, reducing lock contention during dbnode bootstrapping. Instead of writing new series immediately, bootstrappers can now use a resolver to retrieve a series ref. The resolver ensures that the given series is written asynchronously: if the series already exists, its ref is returned immediately; otherwise, resolving the ref waits for the pending insert to complete.
Because new series are now written asynchronously, data should ideally be written through series refs only after a number of series ref resolvers have been accumulated; otherwise each data write waits for the background insert to complete. Some code in the commit log and peers bootstrappers was therefore updated to retain the previous performance.
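The resolver idea described above can be illustrated with the following minimal sketch. It is not the PR's implementation: `shard`, `seriesRef`, `resolver`, `resolve`, and the single-goroutine insert loop are all hypothetical simplifications, and `uniqueIndex` is assumed here to be a per-shard insertion counter. The sketch shows the two paths (existing series returned immediately, new series queued for a background insert) and why callers benefit from accumulating resolvers before resolving them.

```go
package main

import (
	"fmt"
	"sync"
)

// seriesRef is a hypothetical, simplified series reference.
type seriesRef struct {
	id          string
	uniqueIndex uint64 // assumed: assigned in insertion order
}

// pendingInsert is a queued insert; done is closed once ref is set.
type pendingInsert struct {
	id   string
	done chan struct{}
	ref  *seriesRef
}

// resolver yields a seriesRef, waiting for the async insert if needed.
type resolver struct {
	ref     *seriesRef     // set when the series already existed
	pending *pendingInsert // set when an insert was queued
}

func (r *resolver) SeriesRef() *seriesRef {
	if r.ref != nil {
		return r.ref
	}
	<-r.pending.done // wait for the background insert
	return r.pending.ref
}

type shard struct {
	mu        sync.RWMutex
	series    map[string]*seriesRef
	nextIndex uint64
	insertCh  chan *pendingInsert
}

func newShard() *shard {
	s := &shard{series: make(map[string]*seriesRef), insertCh: make(chan *pendingInsert, 128)}
	go s.insertLoop()
	return s
}

// insertLoop applies queued inserts in the background, in FIFO order.
func (s *shard) insertLoop() {
	for p := range s.insertCh {
		s.mu.Lock()
		ref, ok := s.series[p.id]
		if !ok { // a duplicate request may have inserted it already
			ref = &seriesRef{id: p.id, uniqueIndex: s.nextIndex}
			s.nextIndex++
			s.series[p.id] = ref
		}
		s.mu.Unlock()
		p.ref = ref
		close(p.done)
	}
}

// resolve returns immediately: either with an existing ref or with a
// resolver backed by a queued insert.
func (s *shard) resolve(id string) *resolver {
	s.mu.RLock()
	ref, ok := s.series[id]
	s.mu.RUnlock()
	if ok {
		return &resolver{ref: ref}
	}
	p := &pendingInsert{id: id, done: make(chan struct{})}
	s.insertCh <- p
	return &resolver{pending: p}
}

func main() {
	s := newShard()
	ids := []string{"cpu", "mem", "cpu"}
	// Accumulate resolvers first, so inserts proceed in the background...
	resolvers := make([]*resolver, 0, len(ids))
	for _, id := range ids {
		resolvers = append(resolvers, s.resolve(id))
	}
	// ...then resolve them, by which time most inserts have completed.
	for i, r := range resolvers {
		fmt.Println(ids[i], r.SeriesRef().uniqueIndex)
	}
	close(s.insertCh) // stop the insert loop
}
```

Calling `SeriesRef()` right after each `resolve` would serialize every write behind its own insert, which is the slowdown the description warns about; batching the `resolve` calls lets the background loop work ahead of the readers.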
Special notes for your reviewer:
Does this PR introduce a user-facing and/or backwards incompatible change?:
Does this PR require updating code package or user-facing documentation?: