Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix postings list cache causing incorrect results #1461

Merged
merged 9 commits into from
Mar 15, 2019

Conversation

richardartoul
Copy link
Contributor

@richardartoul richardartoul commented Mar 15, 2019

The postings list cache was broken in that the key did not include the field that the query was executed against. This meant that for a given segment UUID the results of a query for one term could be mixed up with the results of the same query but for a different term. This would lead to incorrect results being returned when the index block was queried.

This P.R introduces the field as part of the key in the postings list cache to resolve the issue. In addition, it introduces a property test that generates two index blocks, one with the postings list cache enabled and one without, and then executes hundreds of different queries against both blocks and makes sure that they return the exact same result in all cases.

@CLAassistant
Copy link

CLAassistant commented Mar 15, 2019

CLA assistant check
All committers have signed the CLA.

@richardartoul richardartoul marked this pull request as ready for review March 15, 2019 00:13
@codecov
Copy link

codecov bot commented Mar 15, 2019

Codecov Report

Merging #1461 into master will increase coverage by 18.6%.
The diff coverage is 85.2%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1461      +/-   ##
=========================================
+ Coverage    48.2%   66.9%   +18.6%     
=========================================
  Files         736     741       +5     
  Lines       62963   64770    +1807     
=========================================
+ Hits        30384   43355   +12971     
+ Misses      29787   18499   -11288     
- Partials     2792    2916     +124
Flag Coverage Δ
#aggregator 78.7% <ø> (+20.4%) ⬆️
#cluster 84.1% <ø> (+37.4%) ⬆️
#collector 63.7% <ø> (+24.5%) ⬆️
#dbnode 74.4% <93.5%> (+6.7%) ⬆️
#m3em 66.7% <ø> (+16.1%) ⬆️
#m3ninx 68.4% <0%> (+10.8%) ⬆️
#m3nsch 28.4% <ø> (-71.6%) ⬇️
#metrics 17.6% <ø> (?)
#msg 74.9% <ø> (+3.6%) ⬆️
#query 63.8% <ø> (+62.3%) ⬆️
#x 72.4% <ø> (+6%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dc35308...5978480. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 15, 2019

Codecov Report

Merging #1461 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1461   +/-   ##
======================================
  Coverage    70.9%   70.9%           
======================================
  Files         841     841           
  Lines       72006   72006           
======================================
  Hits        51108   51108           
  Misses      17557   17557           
  Partials     3341    3341
Flag Coverage Δ
#aggregator 82.3% <0%> (ø) ⬆️
#cluster 85.8% <0%> (ø) ⬆️
#collector 63.7% <0%> (ø) ⬆️
#dbnode 80.8% <0%> (ø) ⬆️
#m3em 73.2% <0%> (ø) ⬆️
#m3ninx 74.4% <0%> (ø) ⬆️
#m3nsch 51.1% <0%> (ø) ⬆️
#metrics 17.6% <0%> (ø) ⬆️
#msg 74.9% <0%> (ø) ⬆️
#query 66% <0%> (ø) ⬆️
#x 76.3% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 55d8b41...6e37b95. Read the comment docs.

@richardartoul richardartoul changed the title Fix postings list cache bug Fix postings list cache causing incorrect results Mar 15, 2019
// PostingsListCacheQuery represents a query that we want to cache the result of
// for a given segment. Note that it include the field in the segment that the
// query was executed on, as well as the pattern of the query itself.
type PostingsListCacheQuery struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm i think you need to include query/pattern type here. i.e. a term query of "abc:1*" is not the same as regexp query of "abc:1*".

Could you ensure the tests catch this too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is handled. They go through separate APIs and end up with separate keys in the LRU map. See postings_list_cache_lru.go key struct

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Mind including pattern here anyway? A query isn’t completely defined without it so this struct feels incomplete.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d have to change the APIs though and then you have a potentially invalid iota coming from the caller

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you just opposing the name? We could call it TermPattern or FieldPattern

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you just opposing the name? We could call it TermPattern or FieldPattern

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s not the name as much as it is the struct itself. Like without the pattern included, the struct is an incomplete description of a query.

How would you feel about migrating to using search.Query here? The thought is if we add new types of queries we’d still get this part for free.

Copy link
Contributor Author

@richardartoul richardartoul Mar 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Search.Query is an interface :/ I don't think that makes a lot of sense.

I don't like the struct either, I'd prefer to keep all three fields as arguments to the methods in the postings list cache. I only did it this way because you suggested a struct in our previous discussion

Copy link
Collaborator

@prateek prateek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like there's a failure case that still needs to be addressed.

Copy link
Collaborator

@arnikola arnikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me in general, but take that with a grain of salt since not exactly familiar with all the implications here yet

src/dbnode/storage/index/block_prop_test.go Show resolved Hide resolved
@@ -63,6 +63,14 @@ type PostingsListCache struct {
metrics *postingsListCacheMetrics
}

// PostingsListCacheQuery represents a query that we want to cache the result of
// for a given segment. Note that it include the field in the segment that the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: include->includes

src/dbnode/storage/index/postings_list_cache_lru.go Outdated Show resolved Hide resolved
patternType PatternType,
) (postings.List, bool) {
newKey := newKey(query, patternType)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't you create this here for no reason if uuidArray is not present in items? Looks like in Add you always have to generate newKey so it's ok there, but might be unnecessary both here and in Remove

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah but its a stack alloc so I think it doesnt matter much

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no alloc > stack alloc ;)

src/dbnode/storage/index/postings_list_cache_lru.go Outdated Show resolved Hide resolved
src/dbnode/storage/index/postings_list_cache_test.go Outdated Show resolved Hide resolved
src/dbnode/storage/index/postings_list_cache_test.go Outdated Show resolved Hide resolved
src/dbnode/storage/index/read_through_segment.go Outdated Show resolved Hide resolved
src/dbnode/storage/index/read_through_segment.go Outdated Show resolved Hide resolved
src/m3ninx/idx/query.go Show resolved Hide resolved
Copy link
Collaborator

@arnikola arnikola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me as long as Prateek is happy :)

Copy link
Collaborator

@prateek prateek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending test change

@richardartoul richardartoul merged commit 753eb9e into master Mar 15, 2019
@richardartoul richardartoul deleted the ra/fix-pl-cache-bug branch March 15, 2019 19:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants