
Refactor FetchTagged to return an Iterator of results #3141

Merged: 5 commits merged into master from rhall-fetch-results-iterator on Jan 31, 2021

Conversation

@ryanhall07 ryanhall07 (Collaborator) commented Jan 29, 2021

This is the first step in several refactorings to limit how many series blocks can be loaded at once. This will prevent large queries from overwhelming the system and give all queries fairer access to system resources.

This first refactoring creates the interface callers use to iterate through series blocks one at a time. The series blocks are still all loaded at once; that will be fixed in a future PR with another Iterator.
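To make the intended shape concrete, here is a minimal sketch of the consumption pattern such an iterator enables; the interface and type names are illustrative stand-ins, not the exact ones added in this PR.

package fetchexample

import "context"

// SeriesResult stands in for the per-series payload (ID, tags, block readers).
type SeriesResult struct{}

// FetchTaggedResultsIter sketches the iterator shape: callers pull one series
// at a time instead of receiving every series block up front.
type FetchTaggedResultsIter interface {
  // Next advances the iterator, returning false when it is exhausted or an
  // error occurred.
  Next(ctx context.Context) bool
  // Current returns the result for the series the iterator is positioned on.
  Current() SeriesResult
  // Err returns the first error encountered while iterating, if any.
  Err() error
}

// consume shows the intended calling pattern: one series at a time.
func consume(ctx context.Context, iter FetchTaggedResultsIter) error {
  for iter.Next(ctx) {
    _ = iter.Current() // encode and stream this series' blocks
  }
  return iter.Err()
}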

@ryanhall07 ryanhall07 force-pushed the rhall-fetch-results-iterator branch from a4a111f to 3175ecd on January 29, 2021 at 23:38
@ryanhall07 ryanhall07 force-pushed the rhall-fetch-results-iterator branch from be04580 to 2633aec on January 30, 2021 at 01:01
Elements: make([]*rpc.FetchTaggedIDResult_, 0, results.Size()),
}
nsIDBytes := ns.Bytes()
tagEncoder := s.pools.tagEncoder.Get()
ryanhall07 (Collaborator, Author):

@robskillington does this make sense for getting a tag encoder?

robskillington (Collaborator):

Yeah, this is the best way to do it, and return it at the end of the request.

}
tagBytes := make([]byte, len(encodedTags.Bytes()))
copy(tagBytes, encodedTags.Bytes())
ryanhall07 (Collaborator, Author):

@robskillington I had to copy the bytes because the same tag encoder is used for the entire request.

if err != nil {
return nil, err
}
ctx.RegisterCloser(xresource.SimpleCloserFn(func() {
ryanhall07 (Collaborator, Author):

@robskillington does this do what I think it does, i.e. call the completion function when the RPC is closed?

robskillington (Collaborator):

Yeah this should work just fine 👍
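For reference, the pattern being confirmed here, checking an encoder out of the pool once per request and returning it when the request context closes, looks roughly like the sketch below. The types are local stand-ins for m3's request context, xresource.SimpleCloserFn, and the tag encoder pool; in particular, Put as the name of the pool's return method is an assumption rather than the exact m3 API.

package fetchexample

// SimpleCloserFn mirrors xresource.SimpleCloserFn from the diff: a plain func
// adapted into a Closer so it can be registered on the request context.
type SimpleCloserFn func()

func (f SimpleCloserFn) Close() { f() }

// Closer is what the request context accepts via RegisterCloser.
type Closer interface{ Close() }

// RequestContext stands in for the m3 request context; every registered
// closer is invoked once the RPC has completed.
type RequestContext struct{ closers []Closer }

func (c *RequestContext) RegisterCloser(cl Closer) {
  c.closers = append(c.closers, cl)
}

// TagEncoder and TagEncoderPool are stand-ins for the pooled encoder types.
type TagEncoder interface{}

type TagEncoderPool interface {
  Get() TagEncoder
  Put(TagEncoder)
}

// fetchTagged sketches the lifecycle: one encoder is checked out for the
// whole request and only returned to the pool when the RPC is closed.
func fetchTagged(ctx *RequestContext, pool TagEncoderPool) TagEncoder {
  enc := pool.Get()
  ctx.RegisterCloser(SimpleCloserFn(func() {
    pool.Put(enc) // runs when the request context closes
  }))
  return enc
}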

var encodedDataResults [][][]xio.BlockReader
if fetchData {
encodedDataResults = make([][][]xio.BlockReader, results.Size())
return newFetchTaggedResultsIter(&fetchTaggedResultsIterOpts{
robskillington (Collaborator):

Any reason to pass fetchTaggedResultsIterOpts as a pointer?

I wouldn't worry about passing these values along on the stack, considering they're already on the stack in the current function.

robskillington (Collaborator):

Also, if you passed each of these one by one in the method call, gocritic wouldn't complain... Either way I wouldn't worry about growing the stack here; as a general rule of thumb it's much cheaper to grow the stack than to heap allocate.

ryanhall07 (Collaborator, Author):

I was just avoiding the gocritic linter. So just add //nolint:gocritic?
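A minimal sketch of that resolution, assuming the options struct is passed by value and gocritic's hugeParam check is silenced with a nolint directive; the struct fields and constructor shown are illustrative, not the exact ones in the PR.

package fetchexample

// fetchTaggedResultsIterOpts is illustrative; the real struct in the PR
// carries different per-request values.
type fetchTaggedResultsIterOpts struct {
  fetchData bool
  // ... other values copied from the enclosing function's stack
}

type fetchTaggedResultsIter struct {
  opts fetchTaggedResultsIterOpts
}

// newFetchTaggedResultsIter takes the options by value; gocritic's hugeParam
// check may flag a large struct passed by value, hence the nolint directive.
//nolint:gocritic
func newFetchTaggedResultsIter(opts fetchTaggedResultsIterOpts) *fetchTaggedResultsIter {
  return &fetchTaggedResultsIter{opts: opts}
}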

Comment on lines 820 to 825
// HasNextIDIter returns true if there is another series ID to process.
HasNextIDIter() bool

// NextIDIter returns an iterator to process the results for a series ID.
// HasNextIDIter must be called before each call of this method.
NextIDIter(ctx context.Context) (IDIter, error)
@robskillington robskillington (Collaborator) commented Jan 30, 2021:

nit: If you check everywhere else in the codebase, our iterators usually use the naming:

Next() bool // or NextIDIter() for this current example
Current() MyNextResult // or CurrentIDIter() for this current example

Perhaps for consistency it's better to use Next..() and Current..() instead of HasNext..() and Next..()?

ryanhall07 (Collaborator, Author):

you can leave java, but java can't leave you
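For context, the rename being suggested reshapes the HasNextIDIter/NextIDIter pair quoted above into the Next/Current style used elsewhere in the codebase; the sketch below is illustrative, not the exact signatures that were merged.

package fetchexample

import "context"

// IDIter stands in for the per-series iterator each ID resolves to.
type IDIter interface{}

// idsIter sketches the suggested naming: an advance method that reports
// availability, an accessor for the current element, and a separate Err.
type idsIter interface {
  // NextIDIter advances to the next series ID, returning false when there
  // are no more IDs to process or an error occurred.
  NextIDIter(ctx context.Context) bool
  // CurrentIDIter returns the iterator for the series ID most recently
  // advanced to by NextIDIter.
  CurrentIDIter() IDIter
  // Err returns the first error encountered during iteration.
  Err() error
}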

Comment on lines 836 to 841
// HasNextSegments returns true if there are more Segments to process.
HasNextSegments() bool

// NextSegments returns the next Segments.
// HasNextSegments must be called before each call of this method.
NextSegments(ctx context.Context) (*rpc.Segments, error)
robskillington (Collaborator):

nit: Same as above about HasNext/Next vs. the rest of the codebase's Next/Current.

Comment on lines 867 to 868
tagBytes := make([]byte, len(encodedTags.Bytes()))
copy(tagBytes, encodedTags.Bytes())
@robskillington robskillington (Collaborator) commented Jan 30, 2021:

This is equivalent in speed, by the way, and with compiler optimizations it most often comes out better (@arnikola will attest to the many discussions and benchmarks we've walked through) with the cleaner version:

tagBytes := append(make([]byte, 0, len(encodedTags.Bytes())), encodedTags.Bytes()...)
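For clarity, the two equivalent forms under discussion side by side; this is just an illustration of copying the encoded tag bytes, with src standing in for encodedTags.Bytes().

package fetchexample

// copyTagBytes shows the two equivalent ways of copying the encoded tag
// bytes. Both allocate exactly once, sized to the source slice.
func copyTagBytes(src []byte) (viaCopy, viaAppend []byte) {
  // Form in the diff: allocate a full-length slice, then copy into it.
  viaCopy = make([]byte, len(src))
  copy(viaCopy, src)

  // Suggested form: allocate capacity only, then append the source bytes.
  viaAppend = append(make([]byte, 0, len(src)), src...)
  return viaCopy, viaAppend
}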

// TODO(rhall): don't request all series blocks at once.
if i.idx == 0 && i.fetchData {
for _, idResult := range i.idResults {
id := ident.BytesID(idResult.queryResult.Key())
robskillington (Collaborator):

Can you copy the old comment to here? Starts with:

// NB(r): Use a bytes ID here so that this ID doesn't need to be...

@robskillington robskillington (Collaborator) left a comment:

LGTM other than minor nits

Comment on lines 961 to 963
iter.idResults = append(iter.idResults, &idResult{
queryResult: &entry,
})
@robskillington robskillington (Collaborator) commented Jan 30, 2021:

Can we make both idResult and index.ResultsMapEntry not use pointers here?

Both will cause these structs to be individually heap allocated (due to escape analysis being unsure of the lifetimes of these two structs).

Using by-value, non-pointer types will ensure that (1) idResult in the idResults slice can be just part of the slice's allocation, and (2) index.ResultsMapEntry won't need to be heap allocated, since a memcpy can move the struct from the results map to this value.
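A minimal sketch of the allocation difference being described; idResult here is a placeholder struct, not the one in the PR.

package fetchexample

// idResult is a placeholder for the per-series result struct.
type idResult struct {
  key string
}

// appendByPointer forces each idResult to be heap allocated individually:
// escape analysis cannot prove the pointee's lifetime, so every element is
// its own allocation and the slice only stores pointers.
func appendByPointer(results []*idResult, key string) []*idResult {
  return append(results, &idResult{key: key})
}

// appendByValue stores the struct inline in the slice's backing array: the
// element is copied into the slice and shares its single allocation.
func appendByValue(results []idResult, key string) []idResult {
  return append(results, idResult{key: key})
}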

@ryanhall07 ryanhall07 marked this pull request as ready for review January 30, 2021 22:34
@ryanhall07 ryanhall07 (Collaborator, Author) commented:
@robskillington addressed comments in 40e2322

@codecov codecov bot commented Jan 30, 2021

Codecov Report

Merging #3141 (635c973) into master (bef2564) will increase coverage by 0.0%.
The diff coverage is 90.4%.


@@           Coverage Diff           @@
##           master    #3141   +/-   ##
=======================================
  Coverage    72.2%    72.2%           
=======================================
  Files        1084     1084           
  Lines      100236   100279   +43     
=======================================
+ Hits        72428    72497   +69     
+ Misses      22755    22739   -16     
+ Partials     5053     5043   -10     
Flag         Coverage Δ
aggregator   75.9% <ø> (+0.1%) ⬆️
cluster      84.8% <ø> (ø)
collector    84.3% <ø> (ø)
dbnode       78.7% <90.4%> (+<0.1%) ⬆️
m3em         74.4% <ø> (ø)
m3ninx       73.1% <ø> (-0.1%) ⬇️
metrics      20.0% <ø> (ø)
msg          74.1% <ø> (+0.2%) ⬆️
query        67.2% <ø> (ø)
x            80.3% <ø> (ø)

Flags with carried forward coverage won't be shown.


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bef2564...635c973.

Comment on lines 737 to 741
if err != nil {
s.metrics.fetchTagged.ReportError(s.nowFn().Sub(callStart))
} else {
s.metrics.fetchTagged.ReportSuccess(s.nowFn().Sub(callStart))
}
robskillington (Collaborator):

You can use .ReportSuccessOrError(err, s.nowFn().Sub(callStart)) here and it will do the nil check on the error for you.

ryanhall07 (Collaborator, Author):

Nice, I just had the code from before.
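The suggested change, sketched against the lines quoted above; methodMetrics is a stand-in interface, while ReportSuccessOrError is the helper named by the reviewer.

package fetchexample

import "time"

// methodMetrics is a stand-in exposing the reporting helpers referenced in
// the review comment.
type methodMetrics interface {
  ReportSuccess(d time.Duration)
  ReportError(d time.Duration)
  ReportSuccessOrError(err error, d time.Duration)
}

// reportBefore is the original shape: an explicit nil check on the error.
func reportBefore(m methodMetrics, err error, callStart, now time.Time) {
  if err != nil {
    m.ReportError(now.Sub(callStart))
  } else {
    m.ReportSuccess(now.Sub(callStart))
  }
}

// reportAfter uses the helper, which performs the nil check internally.
func reportAfter(m methodMetrics, err error, callStart, now time.Time) {
  m.ReportSuccessOrError(err, now.Sub(callStart))
}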

Comment on lines 756 to 757
for iter.Next(ctx) {
if iter.Err() != nil {
robskillington (Collaborator):

Hm, usually we have Next(...) return false if there's an iterator error, then we just check if err := iter.Err(); err != nil after the iterator has finished.

i.e.

for iter.Next() { // return false if no more or an error

}
if err := iter.Err(); err != nil {
  return nil, err
}

Comment on lines 761 to 762
tagBytes := make([]byte, 0)
tagBytes, err = cur.WriteTags(tagBytes)
@robskillington robskillington (Collaborator) commented Jan 31, 2021:

Typically, if you're not reusing a byte slice being passed into a method that takes a dst []byte to write to, we just pass nil. That way it's not allocated before calling the function (i.e. it reduces two allocs to just one alloc inside WriteTags(...)).

e.g.

tagBytes, err := cur.WriteTags(nil)
// ...

Comment on lines 773 to 774
for segIter.Next(ctx) {
if segIter.Err() != nil {
robskillington (Collaborator):

Same here: I would opt to more consistently check for the error afterwards and have the .Next(...) call return false if there is an error, so that the loop breaks.

for segIter.Next(ctx) {
  // inner
}
if err := segIter.Err(); err != nil {
  return nil, err
}

id := ident.BytesID(result.queryResult.Key())
result.blockReaders, i.err = i.db.ReadEncoded(ctx, i.nsID, id, i.startInclusive, i.endExclusive)
if i.err != nil {
return true
robskillington (Collaborator):

Yeah, to be consistent with our other iterators I would make an error return false to break the for loop, then allow the caller to check the error after the for loop has broken.

This is consistent with our other iterators (the easiest way to find them is to search for for iter.Next() { or for it.Next() {, which should surface quite a few results).
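A minimal sketch of that convention, with placeholder fields: Next records the error and returns false so the loop breaks, and the caller inspects Err once iteration has finished.

package fetchexample

// resultsIter sketches the error-handling convention described above.
type resultsIter struct {
  idx     int
  results []string
  err     error
}

func (i *resultsIter) Next() bool {
  if i.err != nil || i.idx >= len(i.results) {
    return false
  }
  if err := i.loadResult(i.results[i.idx]); err != nil {
    i.err = err // store the error and break the loop rather than returning true
    return false
  }
  i.idx++
  return true
}

// Err returns the first error encountered; callers check it after the loop.
func (i *resultsIter) Err() error { return i.err }

// loadResult is a placeholder for per-result work such as db.ReadEncoded.
func (i *resultsIter) loadResult(id string) error {
  _ = id
  return nil
}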

if err != nil { // This is an invariant, should never happen
return nil, tterrors.NewInternalError(err)
}
result = append(result, encodedTags.Bytes()...)
@robskillington robskillington (Collaborator) commented Jan 31, 2021:

nit: A slightly more defensive programming approach here is to slice the result into result[:0], so that if the caller accidentally passed a full buffer for reuse that hadn't been resized, it is overwritten with the length reset while still reusing the allocated capacity. Also very nitty: we usually use the name dst for a byte slice that is to be written into.

e.g.

func (i *IDResult) WriteTags(dst []byte) ([]byte, error) {
  // .... other
  dst = append(dst[:0], encodedTags.Bytes()...)
  return dst, nil
}

type fetchTaggedResultsIter struct {
queryResults map[index.ResultsMapHash]index.ResultsMapEntry
robskillington (Collaborator):

Q: Why not take the *index.ResultsMap itself here? Not that this is wrong, it just breaks the abstraction a little bit, since our map type wraps the underlying map itself (in case we ever wanted to change the way the .Iter() method works for the map type).

@ryanhall07 ryanhall07 merged commit 25fbe60 into master Jan 31, 2021
@ryanhall07 ryanhall07 deleted the rhall-fetch-results-iterator branch January 31, 2021 22:51
soundvibe added a commit that referenced this pull request Feb 1, 2021
* master:
  Refactor FetchTagged to return an Iterator of results (#3141)
soundvibe added a commit that referenced this pull request Feb 1, 2021
* master:
  [dtest] endpoint to fetch tagged (#3138)
  Refactor FetchTagged to return an Iterator of results (#3141)
  [dbnode] Add aggregate term limit regression test (#3135)
  [DOCS] Adding Prometheus steps to quickstart (#3043)
  [dbnode] Revert AggregateQuery changes (#3133)
  Fix TestSessionFetchIDs flaky test (#3132)
  [dbnode] Alter multi-segments builder to order by size before processing (#3128)
  [dbnode] Emit aggregate usage metrics (#3123)
  [dbnode] Add Shard.OpenStreamingReader method (#3119)
SokolAndrey pushed a commit to SokolAndrey/m3 that referenced this pull request Feb 2, 2021