-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tsdb/inverted index wiring #6252
Conversation
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. - ingester -0.1%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0.3% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few nits/questions
i.Delete(labels, fp) | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
|
||
func (m *Multi) Add(labels []logproto.LabelAdapter, fp model.Fingerprint) (result labels.Labels) { | ||
for _, i := range m.indices { | ||
if i != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for curiosity, why would an index inside m.indices be nil?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case where there's only tsdb
period configs or non-tsdb period configs (such as boltdb-shipper
), only one of the sub-indices will be needed. If that's the case, we can avoid the overhead of running an unnecessary one.
return m.indexFor(t).LabelValues(name, shard) | ||
} | ||
|
||
// Query planning is responsible for ensuring no query spans more than one inverted index. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this comment out of place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I intended to put it here because most queries in loki use a from, through
, but in this case we're using another part of the code base to ensure we'd never query more than one period at a time, so we don't need to account for the range from->through
, only the first point (from
).
pkg/ingester/index/multi.go
Outdated
if bitPrefixed == nil { | ||
bitPrefixed, err = NewBitPrefixWithShards(indexShards) | ||
if err != nil { | ||
return nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe you could wrap this error with errors.Wrap
and add more details to the message, like which period erroed and its index type. Motivation:
there's a chain of calls that aren't wrapping errors that will end up here, so it might be hard to identify the root of the problem. The chain is:
GetOrCreateInstance
is invoked at multiple places without wrapping errors and triggersnewInstance
newInstance
will invokeindex.NewMultiInvertedIndex
without wrapping errors- we end up here, with
NewMultiInvertedIndex
invokingNewBitPrefixWithShards
- we end up here, with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good Owen, I have a few comments which are mostly questions. Thanks!
var ( | ||
err error | ||
|
||
ii Interface // always stored in 0th index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does ii
stand for invertedIndex
? if so I might opt for the longer version of that variable as it's not that long.
|
||
type periodIndex struct { | ||
time.Time | ||
idx int // address of the index to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a bit confused about what this represents. it seems to be a type more than an address to me, ie 0 is a regular inverted index, whereas 1 is a tsdb inverted index. could we reflect that in the name and/or comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or is this the index in the indices
property on Multi
? If so, and there can only be 2, I'm curious if instead of storing these in a fixed length slice with magic indices, maybe Multi
should have a field for each type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point. I wrote it like this in case we had period configs like [boltdb, tsdb, boltdb, tsdb]
, in which case we'd want the first period to point to the index of the boltdb
inverted index, the second to point to the tsdb
inverted index and so on. Decoupling the period from the index it points to allows us to instantiate a maximum of one inverted index for each type, regardless of the number of period configs.
The way it's written now will allow us to add other index types in the future if necessary. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does, but we could also come back and update this struct with the necessary types when we get to that point. just feels like we're overloading slice indexing as a mechanism for expressing type, whereas there are clearer ways to do that.
that being said I have no interest in dying on this hill, and I'm cool with keeping it this way if you think it's more flexible.
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell. - ingester -0.2%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0.4% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Followup to #6233
This allows us to run two incompatible inverted indices in the ingesters, one for bit-prefixed sharding (tsdb) and one for modulo sharding (everything else). It also adds some more config validation and fixes a bug in the config-wrapper that hasn't surfaced yet, but will as soon as you try to run a non-current boltdb-shipper period config: d9f1e06
ref #5428