Add cluster namespace fanout heuristics supporting queries greater than retention #908
Conversation
src/query/storage/local/storage.go
			existing.attrs.Resolution <= attrs.Resolution
		existsBetter = longerRetention || sameRetentionAndSameOrMoreGranularResolution
	default:
		panic(fmt.Sprintf("unknown query fanout type: %d", r.queryFanoutType))
hm could you error instead of panicking here?
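For illustration, a minimal standalone sketch of returning an error from the default case instead of panicking; the enclosing helper and its signature are assumptions for this sketch, not the PR's actual code:

package main

import "fmt"

type queryFanoutType int

const (
	namespacesCoverAllRetention queryFanoutType = iota
	namespacesCoverPartialRetention
)

// existsBetterFor is a hypothetical helper standing in for the switch in the
// diff above; the default case surfaces an error rather than panicking.
func existsBetterFor(fanout queryFanoutType) (bool, error) {
	switch fanout {
	case namespacesCoverAllRetention:
		return true, nil
	case namespacesCoverPartialRetention:
		return false, nil
	default:
		return false, fmt.Errorf("unknown query fanout type: %d", fanout)
	}
}

func main() {
	if _, err := existsBetterFor(queryFanoutType(42)); err != nil {
		fmt.Println(err) // unknown query fanout type: 42
	}
}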
src/query/storage/local/storage.go
@@ -72,26 +227,19 @@ func (s *localStorage) Fetch(ctx context.Context, query *storage.FetchQuery, opt
	// cluster that can completely fulfill this range and then prefer the
	// highest resolution (most fine grained) results.
	// This needs to be optimized, however this is a start.
	fanout, namespaces, err := s.resolveClusterNamespacesForQuery(query.Start, query.End)
couple thoughts:
- maybe it'd be good to add a way for users to see what data source was hit for each series
- do you want to add tests for the various combinations - just ensuring we override to the right storage based on the heuristics in code
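On the second point, a standalone sketch of what such a table-driven test could look like. The resolve function here is a heavily simplified stand-in for resolveClusterNamespacesForQuery, and all names and retention values are assumptions for illustration only:

package fanout

import (
	"testing"
	"time"
)

type namespace struct {
	name      string
	retention time.Duration
}

// resolve is a simplified stand-in: if the unaggregated namespace's retention
// reaches back to the query start, use it alone; otherwise fan out to the
// unaggregated namespace plus all aggregated namespaces.
func resolve(now, start time.Time, unaggregated namespace, aggregated []namespace) []string {
	if !now.Add(-unaggregated.retention).After(start) {
		return []string{unaggregated.name}
	}
	out := []string{unaggregated.name}
	for _, ns := range aggregated {
		out = append(out, ns.name)
	}
	return out
}

func TestResolveFanoutCombinations(t *testing.T) {
	now := time.Now()
	unagg := namespace{name: "unaggregated", retention: 48 * time.Hour}
	agg := []namespace{{name: "agg_30d", retention: 30 * 24 * time.Hour}}

	tests := []struct {
		name           string
		start          time.Time
		wantNamespaces int
	}{
		{"query within unaggregated retention", now.Add(-time.Hour), 1},
		{"query beyond unaggregated retention", now.Add(-7 * 24 * time.Hour), 2},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			got := resolve(now, tt.start, unagg, agg)
			if len(got) != tt.wantNamespaces {
				t.Fatalf("got %d namespaces, want %d", len(got), tt.wantNamespaces)
			}
		})
	}
}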
Codecov Report
@@ Coverage Diff @@
## master #908 +/- ##
==========================================
+ Coverage 76.72% 76.75% +0.02%
==========================================
Files 436 436
Lines 37002 37086 +84
==========================================
+ Hits 28391 28464 +73
- Misses 6585 6590 +5
- Partials 2026 2032 +6
Continue to review full report at Codecov.
@@ -51,8 +59,16 @@ type m3storage struct {

// NewStorage creates a new local m3storage instance.
// TODO: Consider combining readWorkerPool and writeWorkerPool
func NewStorage(clusters Clusters, workerPool pool.ObjectPool, writeWorkerPool xsync.PooledWorkerPool) Storage {
	return &m3storage{clusters: clusters, readWorkerPool: workerPool, writeWorkerPool: writeWorkerPool}
func NewStorage(
I thought Artem had removed the TODO above in another PR. Mind rebasing on master once?
Sure thing.
I've rebased, it's still there.
Sigh, it must have been skipped in the last review. Mind deleting it, please?
	if len(r.seenIters) == 0 {
		// store the first attributes seen
		r.seenFirstAttrs = attrs
	}
	r.seenIters = append(r.seenIters, iterators)
	if !r.err.Empty() {
should this be moved up to the first thing after deferring the unlock?
No, this still needs to be here; otherwise we won't free/close the iters for successful requests that come back.
I'm adding a note about it so that it's obvious that's why this is below.
		iter: iter,
	})
	if len(r.seenIters) == 2 {
		// need to build backfill the dedupe map from the first result first
maybe remove the word build
Sure thing.
		i++
	}

	return iter, nil
	return r.finalResult, nil
}

func (r *multiResult) Add(
super nit: maybe call the iterators arg newIterators, I got a little lost between seenIters and iterators the first time
Sure thing.
		serieses[idx] = multiResultSeries{}
		var existsBetter bool
		switch r.fanout {
		case namespacesCoverAllRetention:
is this more like namespacesCoverQueryDuration?
Yeah I can rename this. Probably to namespaceCoversQueryRange.
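For reference, the constants after the rename might look like the following; namespaceCoversPartialQueryRange appears in later snippets in this review, while the name of the "all" variant is an assumption:

package m3

// queryFanoutType describes how the resolved namespaces cover the query range.
type queryFanoutType int

const (
	// A single namespace covers the entire query range, so no cross-namespace
	// deduplication is required.
	namespaceCoversAllQueryRange queryFanoutType = iota
	// Namespaces each cover only part of the query range, so results must be
	// merged using the retention/resolution heuristics discussed below.
	namespaceCoversPartialQueryRange
)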
		case namespacesCoverPartialRetention:
			// Already exists and either has longer retention, or the same retention
			// and result we are adding is not as precise
			longerRetention := existing.attrs.Retention > attrs.Retention
nit: existsLongerRetention
Sure thing.
	for idx := range serieses {
		serieses[idx] = multiResultSeries{}
		var existsBetter bool
		switch r.fanout {
nit: I feel like this logic would be easier to follow if it was framed from the perspective of determining whether the incoming attributes were better as opposed to vice versa, but that might just be personal preference
Hm, I kind of prefer to keep the existing one if it's there, so that's why I found it better to frame it that way.
One thing I'm realizing now is that there is no concept of how old a namespace is. Say you start with a single namespace, then you decide you want higher resolution so you start dual writing to a second one. Your queries will immediately begin going to namespace 2 even though it doesn't really have any data. Fixing that is probably out of scope for this PR, but maybe we can track it in an issue or something, although I'm not sure what the solution is other than configuring "cutoffs" or something.
@richardartoul yeah, I'm explicitly not tackling that. We can add a readable/writeable flag later to the namespaces - and then manually flip it on once the namespace is filled up. Then eventually we'll get to the all singing, all dancing, dynamic and automated version of this all.
	r.seenIters = nil

	if r.finalResult != nil {
		// NB(r): Since all the series iterators in the final result are held onto
Mind adding a unit test w/ mocks to ensure we free the underlying iterators exactly once, assuming the API is used correctly. I think it's 100% right the way it is, but the test will be good for when someone comes and attempts to refactor this.
Sure thing.
As discussed, let's revisit this in a followup change.
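For reference, a standalone sketch of the kind of test being requested; both the fake iterators and the result wrapper below are simplified stand-ins rather than M3's actual types, just enough to assert exactly-once close semantics:

package m3

import (
	"sync"
	"testing"
)

// fakeIterators counts Close calls so the test can assert exactly-once freeing.
type fakeIterators struct{ closes int }

func (f *fakeIterators) Close() { f.closes++ }

// fakeMultiResult mimics the ownership rules under discussion: every added
// iterator is recorded so a single Close frees each one exactly once.
type fakeMultiResult struct {
	sync.Mutex
	seen []*fakeIterators
}

func (r *fakeMultiResult) Add(iters *fakeIterators, err error) {
	r.Lock()
	defer r.Unlock()
	// Record before any error handling so successful results are still freed.
	r.seen = append(r.seen, iters)
}

func (r *fakeMultiResult) Close() {
	r.Lock()
	defer r.Unlock()
	for _, it := range r.seen {
		it.Close()
	}
	r.seen = nil
}

func TestIteratorsClosedExactlyOnce(t *testing.T) {
	r := &fakeMultiResult{}
	iters := []*fakeIterators{{}, {}, {}}
	for _, it := range iters {
		r.Add(it, nil)
	}
	r.Close()
	r.Close() // closing twice must not double-free
	for i, it := range iters {
		if it.closes != 1 {
			t.Fatalf("iterator %d closed %d times, want exactly 1", i, it.closes)
		}
	}
}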
		// need to backfill the dedupe map from the first result first
		existing := r.seenIters[0]
		r.dedupeMap = make(map[string]multiResultSeries, existing.Len())
		for _, iter := range existing.Iters() {
nit: could you refactor this loop into func (r *multiResult) addOrUpdateMap(attrs, iters) - would be re-usable here and in dedupe()
Sure thing.
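A rough sketch of the helper being suggested; the types and the comparison here are simplified assumptions (only the partial-coverage heuristic is shown), not the actual implementation:

package m3

import "time"

// Simplified stand-ins for illustration.
type Attributes struct {
	Retention  time.Duration
	Resolution time.Duration
}

type SeriesIterator interface {
	ID() string
}

type multiResultSeries struct {
	attrs Attributes
	iter  SeriesIterator
}

type multiResult struct {
	dedupeMap map[string]multiResultSeries
}

// addOrUpdateMap inserts each iterator keyed by series ID, keeping the
// existing entry when it has longer retention, or the same retention with an
// equal or finer resolution (mirroring the existsBetter checks above). It
// could be called both when backfilling from the first result and from
// dedupe(), which is the reuse the reviewer is after.
func (r *multiResult) addOrUpdateMap(attrs Attributes, iters []SeriesIterator) {
	if r.dedupeMap == nil {
		r.dedupeMap = make(map[string]multiResultSeries, len(iters))
	}
	for _, iter := range iters {
		id := iter.ID()
		existing, exists := r.dedupeMap[id]
		if exists {
			existsLongerRetention := existing.attrs.Retention > attrs.Retention
			existsSameRetentionAtLeastAsFine :=
				existing.attrs.Retention == attrs.Retention &&
					existing.attrs.Resolution <= attrs.Resolution
			if existsLongerRetention || existsSameRetentionAtLeastAsFine {
				// The entry we already have is at least as good; keep it.
				continue
			}
		}
		r.dedupeMap[id] = multiResultSeries{attrs: attrs, iter: iter}
	}
}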
src/query/storage/m3/storage.go
	start time.Time,
	end time.Time,
) (queryFanoutType, ClusterNamespaces, error) {
	now := time.Now()
maybe just set a nowFn on *m3Storage and initialize it to time.Now for now, so at least this is slightly easier to work with if someone needs to control it in a test later
Sure thing.
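A minimal sketch of the suggestion, using a simplified storage struct; the field and method names here are illustrative assumptions:

package main

import (
	"fmt"
	"time"
)

// Keep a nowFn on the storage type so tests can inject a fixed clock instead
// of calling time.Now directly inside the resolution logic.
type m3storage struct {
	nowFn func() time.Time
}

func newStorage() *m3storage {
	return &m3storage{nowFn: time.Now}
}

// retentionStart returns the earliest time still covered by a namespace with
// the given retention, relative to the injected clock.
func (s *m3storage) retentionStart(retention time.Duration) time.Time {
	return s.nowFn().Add(-retention)
}

func main() {
	s := newStorage()
	// In a test, override the clock deterministically:
	s.nowFn = func() time.Time {
		return time.Date(2018, time.August, 1, 0, 0, 0, 0, time.UTC)
	}
	fmt.Println(s.retentionStart(48 * time.Hour)) // 2018-07-30 00:00:00 +0000 UTC
}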
	unaggregated := s.clusters.UnaggregatedClusterNamespace()
	unaggregatedRetention := unaggregated.Options().Attributes().Retention
	unaggregatedStart := now.Add(-1 * unaggregatedRetention)
Do you not need to truncate to a blockSize here? I guess that's just kind of an implementation detail of M3DB and not super relevant here...
Not here thankfully.
src/query/storage/m3/storage.go
	unaggregated := s.clusters.UnaggregatedClusterNamespace()
	unaggregatedRetention := unaggregated.Options().Attributes().Retention
	unaggregatedStart := now.Add(-1 * unaggregatedRetention)
	if !unaggregatedStart.After(start) {
Is this the same as unaggregatedStart.Before(start)? If so, that's easier to reason about to me.
Oh I see, you would have to add || .Equal() as well, but it still seems nicer to me.
Hm, yeah I did it so you don't have to call twice. It's not a hot path though so I'll refactor to this.
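For reference, the equivalence being discussed, in a runnable form (retention and query times are illustrative values):

package main

import (
	"fmt"
	"time"
)

func main() {
	now := time.Now()
	unaggregatedRetention := 48 * time.Hour
	unaggregatedStart := now.Add(-unaggregatedRetention)
	start := now.Add(-24 * time.Hour)

	// Both conditions read "unaggregatedStart is at or before start", i.e. the
	// unaggregated namespace's retention reaches back far enough to cover the
	// query's start time.
	original := !unaggregatedStart.After(start)
	refactored := unaggregatedStart.Before(start) || unaggregatedStart.Equal(start)
	fmt.Println(original, refactored) // true true
}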
src/query/storage/m3/storage.go
		return namespacesCoverAllRetention, ClusterNamespaces{unaggregated}, nil
	}

	// First determine if any aggregated clusters that can span the whole query
"First determine if any aggregated clusters span the whole query range, if so..."
Good catch, thanks.
		case namespaceCoversPartialQueryRange:
			// Already exists and either has longer retention, or the same retention
			// and result we are adding is not as precise
			existsLongerRetention := existing.attrs.Retention > attrs.Retention
All this is to avoid consolidation... :sigh:
Yup =]
	downsampleOpts, err := opts.DownsampleOptions()
	if err != nil || !downsampleOpts.All {
		// Cluster does not contain all data, include as part of fan out
Confused about the downsampling thing, how is this different than saying something is unaggregated?
Basically it's aggregated, but data will only go to this cluster if rules are set up to do so.
LGTM with the nits
I need to add tests; however, I want to get buy-in that this meets our current needs before moving on to writing extensive tests.
This fixes #866.