opt: index accelerate && (overlaps) expressions for array inverted indexes #76212

RajivTS · 2022-02-08T00:08:28Z

Previously, we did not support index acceleration for queries involving
array overlaps (&&).

This change adds support for using the inverted index with && expressions on
Array columns. When there is an inverted index available, a scan will be done on
the Array column using the spans found from the constant value.

Fixes: #75477

Release note (performance improvement): Expressions using the overlaps (&&)
operator for arrays now support index-acceleration for faster execution in
some cases.

blathers-crl · 2022-02-08T00:08:31Z

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

cockroach-teamcity · 2022-02-08T00:08:31Z

All committers have signed the CLA.

cockroach-teamcity · 2022-02-08T00:08:38Z

This change is

blathers-crl · 2022-02-09T00:36:24Z

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

rytaft

Thanks for the contribution, @RajivTS! This is looking great. A few comments below.

Reviewed 7 of 8 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RajivTS)

-- commits, line 12 at r2:
nit: Mention that this is specifically array expressions

pkg/sql/opt/invertedidx/json_array.go, line 366 at r2 (raw file):

		}
	case *memo.OverlapsExpr:
		invertedExpr = getInvertedExprForArrayIndexForOverlaps(evalCtx, memo.ExtractConstDatum(t.Right))

I'm surprised you don't need an extract function similar to the above. Are we doing extra work if t.Left is not actually an index column? Is it possible that ExtractConstDatum will panic if it cannot extract a constant from t.Right? It would be good to make sure we have test cases that exercise this.

pkg/sql/rowenc/index_encoding.go, line 866 at r2 (raw file):

// inKey is prefixed to all returned keys.
//
// Returns unique=true if the spans are guaranteed not to produce

I see that you copied this from above, but it looks like this comment has gotten stale so maybe you can fix it in both places?

We no longer return a value for unique, instead we set Unique appropriately on the returned inverted.SpanExpression. Feel free to just remove this comment, and/or maybe add a comment where we set Unique=true below. (A previous version of this code had a separate return value for unique)

pkg/sql/rowenc/index_encoding.go, line 872 at r2 (raw file):

) (invertedExpr inverted.Expression, err error) {
	emptyArrSpanExpr := inverted.ExprForSpan(
		inverted.MakeSingleValSpan(encoding.EncodeEmptyArray(inKey)), false, /* tight */

do we need this for overlaps? ARRAY[]::int[] && ARRAY[]::int[]; is false

pkg/sql/rowenc/index_encoding.go, line 890 at r2 (raw file):

	for _, key := range keys {
		spanExpr := inverted.ExprForSpan(
			inverted.MakeSingleValSpan(key), false, /* tight */

The comment above on line 713 said that the span expression will be tight, but it looks like that may not be the case? Do you think this should be true?

pkg/sql/sem/tree/eval.go, line 434 at r2 (raw file):

}

// ArrayOverlaps return true if there is even one element common between the left and right arrays.

nit: comments should be <= 80 columns wide

pkg/sql/sem/tree/eval.go, line 435 at r2 (raw file):

// ArrayOverlaps return true if there is even one element common between the left and right arrays.
func ArrayOverlaps(ctx *EvalContext, left, right *DArray) (*DBool, error) {

You should make sure that this is the same function we are actually using to evaluate &&. I think you'll want to extract the code from here for this helper function, and call it from both places:

cockroach/pkg/sql/sem/tree/eval.go

Lines 2598 to 2616 in 63988a4

    
           Fn: func(ctx *EvalContext, left Datum, right Datum) (Datum, error) { 
        
           	array := MustBeDArray(left) 
        
           	other := MustBeDArray(right) 
        
           	if !array.ParamTyp.Equivalent(other.ParamTyp) { 
        
           		return nil, pgerror.New(pgcode.DatatypeMismatch, "cannot compare arrays with different element types") 
        
           	} 
        
           	for _, needle := range array.Array { 
        
           		// Nulls don't compare to each other in && syntax. 
        
           		if needle == DNull { 
        
           			continue 
        
           		} 
        
           		for _, hay := range other.Array { 
        
           			if needle.Compare(ctx, hay) == 0 { 
        
           				return DBoolTrue, nil 
        
           			} 
        
           		} 
        
           	} 
        
           	return DBoolFalse, nil 
        
           },

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1572 at r2 (raw file):

----
{rat}
{rat,NULL,""}

nit: remove extra space (I think if you rewrite this file as follows it should remove the space):
make test PKG=./pkg/sql/logictest/... TESTS='TestLogic//inverted_index' TESTFLAGS='-rewrite=true'

pkg/sql/opt/exec/execbuilder/testdata/inverted_index, line 1446 at r2 (raw file):

            │ columns: (a, b_inverted_key)
            │ inverted column: b_inverted_key
            │ num spans: 2

the formatting looks weird -- you might try to run this to fix it:
make test PKG=./pkg/sql/opt/exec/execbuilder... TESTS='TestExecBuild//inverted_index' TESTFLAGS='-rewrite=true'

pkg/sql/opt/xform/testdata/rules/select, line 9398 at r2 (raw file):

           └── const-agg [as=b:5, outer=(5)]
                └── b:5

nit: remove this

pkg/sql/rowenc/index_encoding_test.go, line 749 at r2 (raw file):

		// to determine if the spans produced from the second Array value will
		// correctly overlap or be distinct from the first value, indicated by
		// overlapsKeys. Then, if indexedValue && value, expected is true.

I don't see overlapsKeys

pkg/sql/rowenc/index_encoding_test.go, line 754 at r2 (raw file):

		// unless the value produces a single empty array span.

		// First we test that the spans will include expected value.

Nice tests!

pkg/sql/rowenc/index_encoding_test.go, line 808 at r2 (raw file):

		}

		// Array spans for && are never tight.

why not? (as I wrote elsewhere, I think you might be able to make the spans tight)

pkg/sql/rowenc/index_encoding_test.go, line 813 at r2 (raw file):

		}

		// Array spans for && are never unique unless the result is a singular empty array span.

nit: comments <= 80 columns wide

pkg/sql/rowenc/index_encoding_test.go, line 845 at r2 (raw file):

		// Generate two random arrays and evaluate the result of `left && right`.
		left := randgen.RandArray(rng, typ, 9 /* nullChance */)
		right := randgen.RandArray(rng, typ, 9 /* nullChance */)

why are you using 9 here for nullChance? (maybe just add a comment)

mgartner

Great start! I left a few comments as well.

Reviewed 4 of 8 files at r1, 1 of 1 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RajivTS and @rytaft)

pkg/sql/rowenc/index_encoding.go, line 872 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

do we need this for overlaps? ARRAY[]::int[] && ARRAY[]::int[]; is false

Hmmm, this is an interesting case. So what should we do when val is an empty array, generate no spans? I don't know if there's precedent for scans over an empty range... maybe this can't be index accelerated?

Maybe we can add a normalization rule in a follow-up PR for this case. We have to be careful though. We can't simply normalize a && ARRAY[]::INT[] to false, because if a is NULL, it evaluates to NULL. But I think we could normalize a filter with a && ARRAY[]::INT[] to false.

pkg/sql/rowenc/index_encoding.go, line 887 at r2 (raw file):

		return nil, err
	}
	invertedExpr = emptyArrSpanExpr

Why do we include the empty array span in this case? It seems like it shouldn't be needed given that an empty array doesn't overlap with any other array.

pkg/sql/rowenc/index_encoding.go, line 890 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

The comment above on line 713 said that the span expression will be tight, but it looks like that may not be the case? Do you think this should be true?

@RajivTS FYI there's some explanation of tightness here:

cockroach/pkg/sql/inverted/expression.go

Lines 280 to 282 in a8fc111

    
           // IsTight returns whether the inverted expression is tight, i.e., will the 
        
           // original expression not need to be reevaluated on each row output by the 
        
           // query evaluation over the inverted index.

You can see examples of tight and non-tight expressions here:

cockroach/pkg/sql/opt/invertedidx/json_array_test.go

Line 237 in b2c04be

testCases := []struct {

This is a good example of a non-tight expression:

cockroach/pkg/sql/opt/invertedidx/json_array_test.go

Lines 266 to 273 in b2c04be

    
           { 
        
           	filters:          `j <@ '{"a": 1}'`, 
        
           	indexOrd:         jsonOrd, 
        
           	ok:               true, 
        
           	tight:            false, 
        
           	unique:           false, 
        
           	remainingFilters: `j <@ '{"a": 1}'`, 
        
           },

We can scan the inverted span for '{"a": 1}', to get rows that may match, but we need to re-evaluate the expression to ensure that the JSON columns don't contain other values that aren't contained by '{"a": 1}'.

I think these overlaps expressions should be tight.

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1491 at r2 (raw file):

statement ok
INSERT INTO cb VALUES
  (0, ARRAY[], ARRAY[]),

Can you add a row where numbers and words are NULL?

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1504 at r2 (raw file):

SELECT numbers FROM cb@n WHERE numbers && ARRAY[]::INT[]
----

nit: remove extra lines after the empty test responses here and below

pkg/sql/opt/invertedidx/json_array_test.go, line 743 at r2 (raw file):

		},
		{
			filters:          "a && '{1}' OR str && '{hello}'",

It'd be interesting to add expressions that combine && expressions with @> and <@ with ANDs and ORs.

pkg/sql/rowenc/index_encoding_test.go, line 752 at r2 (raw file):

		// The expression is a union of spans, so unique should never be true
		// unless the value produces a single empty array span.

Is this true for overlaps? I think the spans for a single array overlap would be unique, like `a &&

pkg/sql/rowenc/index_encoding_test.go, line 784 at r2 (raw file):

		{`{NULL}`, `{1, NULL}`, false, false},
		{`{1,NULL}`, `{NULL}`, false, true},
		{`{2, NULL}`, `{1, NULL}`, false, false},

Can you add test with NULL arrays?

mgartner · 2022-02-09T16:37:18Z

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1491 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you add a row where numbers and words are NULL?

I think arrays of tuples would be interesting test cases too, like ARRAY[(1, 2)].

rytaft · 2022-02-09T17:29:00Z

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1491 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I think arrays of tuples would be interesting test cases too, like ARRAY[(1, 2)].

I don't think we support columns with arrays of tuples

RajivTS

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @RajivTS, and @rytaft)

pkg/sql/opt/invertedidx/json_array.go, line 366 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'm surprised you don't need an extract function similar to the above. Are we doing extra work if t.Left is not actually an index column? Is it possible that ExtractConstDatum will panic if it cannot extract a constant from t.Right? It would be good to make sure we have test cases that exercise this.

Good catch. On adding new tests, I can confirm that having non-constant array column on the right leads to panic. Observing the implementation of EqExpr (which is also commutative), I believe that in case of both t.Left and t.Right being indexed columns we should simply return a NonInvertedColExpression indicating that index acceleration is not possible in this case. What do you think?

pkg/sql/rowenc/index_encoding.go, line 866 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I see that you copied this from above, but it looks like this comment has gotten stale so maybe you can fix it in both places?

We no longer return a value for unique, instead we set Unique appropriately on the returned inverted.SpanExpression. Feel free to just remove this comment, and/or maybe add a comment where we set Unique=true below. (A previous version of this code had a separate return value for unique)

Done.

pkg/sql/rowenc/index_encoding.go, line 872 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Hmmm, this is an interesting case. So what should we do when val is an empty array, generate no spans? I don't know if there's precedent for scans over an empty range... maybe this can't be index accelerated?

Maybe we can add a normalization rule in a follow-up PR for this case. We have to be careful though. We can't simply normalize a && ARRAY[]::INT[] to false, because if a is NULL, it evaluates to NULL. But I think we could normalize a filter with a && ARRAY[]::INT[] to false.

I have kept this as is (except making tight=true for overlap) since I don't see a consensus on the expected behavior here. Please let me know if changes are needed.

pkg/sql/rowenc/index_encoding.go, line 887 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Why do we include the empty array span in this case? It seems like it shouldn't be needed given that an empty array doesn't overlap with any other array.

Yes strictly speaking its not needed and hence I have updated the code here. It did help in the scenario when the input array is non-empty but contains only NULL values which then get filtered out leaving us again with an empty array (and 0 keys). invertedExpr = emptyArrSpanExpr makes sure an empty span expression gets returned in this case. It also helps prevent an explicit nil check within the loop.

pkg/sql/rowenc/index_encoding.go, line 890 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

@RajivTS FYI there's some explanation of tightness here:

cockroach/pkg/sql/inverted/expression.go

Lines 280 to 282 in a8fc111

// IsTight returns whether the inverted expression is tight, i.e., will the

// original expression not need to be reevaluated on each row output by the

// query evaluation over the inverted index.

You can see examples of tight and non-tight expressions here:

cockroach/pkg/sql/opt/invertedidx/json_array_test.go

Line 237 in b2c04be

testCases := []struct {

This is a good example of a non-tight expression:

cockroach/pkg/sql/opt/invertedidx/json_array_test.go

Lines 266 to 273 in b2c04be

{

filters: `j <@ '{"a": 1}'`,

indexOrd: jsonOrd,

ok: true,

tight: false,

unique: false,

remainingFilters: `j <@ '{"a": 1}'`,

},

We can scan the inverted span for '{"a": 1}', to get rows that may match, but we need to re-evaluate the expression to ensure that the JSON columns don't contain other values that aren't contained by '{"a": 1}'.

I think these overlaps expressions should be tight.

That's the understanding with which I began this implementation. However, if the generated inverted expression is marked tight then the execution plan created in inverted_index test shows a direct scan of all the rows for evaluating overlaps instead of filtering using the inverted index. I do not know the reason behind this but to ensure that the inverted index is used for array overlaps, I kept tight as false for now.

pkg/sql/sem/tree/eval.go, line 434 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: comments should be <= 80 columns wide

Done.

pkg/sql/sem/tree/eval.go, line 435 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

You should make sure that this is the same function we are actually using to evaluate &&. I think you'll want to extract the code from here for this helper function, and call it from both places:

cockroach/pkg/sql/sem/tree/eval.go

Lines 2598 to 2616 in 63988a4

Fn: func(ctx *EvalContext, left Datum, right Datum) (Datum, error) {

array := MustBeDArray(left)

other := MustBeDArray(right)

if !array.ParamTyp.Equivalent(other.ParamTyp) {

return nil, pgerror.New(pgcode.DatatypeMismatch, "cannot compare arrays with different element types")

}

for _, needle := range array.Array {

// Nulls don't compare to each other in && syntax.

if needle == DNull {

continue

}

for _, hay := range other.Array {

if needle.Compare(ctx, hay) == 0 {

return DBoolTrue, nil

}

}

}

return DBoolFalse, nil

},

Done.

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1491 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I don't think we support columns with arrays of tuples

Done.

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1504 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: remove extra lines after the empty test responses here and below

Done.

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1572 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: remove extra space (I think if you rewrite this file as follows it should remove the space):
make test PKG=./pkg/sql/logictest/... TESTS='TestLogic//inverted_index' TESTFLAGS='-rewrite=true'

Done.

pkg/sql/opt/exec/execbuilder/testdata/inverted_index, line 1446 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

the formatting looks weird -- you might try to run this to fix it:
make test PKG=./pkg/sql/opt/exec/execbuilder... TESTS='TestExecBuild//inverted_index' TESTFLAGS='-rewrite=true'

Done.

pkg/sql/opt/xform/testdata/rules/select, line 9398 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: remove this

Done.

pkg/sql/opt/invertedidx/json_array_test.go, line 743 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

It'd be interesting to add expressions that combine && expressions with @> and <@ with ANDs and ORs.

Done.

pkg/sql/rowenc/index_encoding_test.go, line 749 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I don't see overlapsKeys

Done.

pkg/sql/rowenc/index_encoding_test.go, line 754 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Nice tests!

Done.

pkg/sql/rowenc/index_encoding_test.go, line 784 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Can you add test with NULL arrays?

Tried with NULL::INT[] and NULL but they could not be parsed while running the test. Got the below error:
Failed to parse array NULL::INT[]: could not parse "NULL::INT[]" as type int[]: array must be enclosed in { and }

pkg/sql/rowenc/index_encoding_test.go, line 808 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

why not? (as I wrote elsewhere, I think you might be able to make the spans tight)

Same concern as mentioned earlier.

pkg/sql/rowenc/index_encoding_test.go, line 813 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: comments <= 80 columns wide

Done.

Code quote:

// Array spans for && are never unique unless the result

pkg/sql/rowenc/index_encoding_test.go, line 845 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

why are you using 9 here for nullChance? (maybe just add a comment)

Done.

rytaft

Reviewed 8 of 8 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @RajivTS, and @rytaft)

pkg/sql/opt/invertedidx/json_array.go, line 366 at r2 (raw file):

Previously, RajivTS (Rajiv Sharma) wrote…

Good catch. On adding new tests, I can confirm that having non-constant array column on the right leads to panic. Observing the implementation of EqExpr (which is also commutative), I believe that in case of both t.Left and t.Right being indexed columns we should simply return a NonInvertedColExpression indicating that index acceleration is not possible in this case. What do you think?

Yep, looks good to me. Thanks!

pkg/sql/rowenc/index_encoding.go, line 872 at r2 (raw file):

Previously, RajivTS (Rajiv Sharma) wrote…

I have kept this as is (except making tight=true for overlap) since I don't see a consensus on the expected behavior here. Please let me know if changes are needed.

I think it's important that we don't scan spans that we don't need to. I think it's ok to just return &inverted.SpanEpression{Tight: true}. If that causes problems, try &inverted.SpanEpression{Tight: true, SpansToRead: []Span{}, FactoredUnionSpans: []Span{}}}. And if that still causes problems, you could just add a check in extractArrayOverlapsCondition that the const datum is not an empty array (and if it is, return inverted.NonInvertedColExpression{}).

We'll also want to add the normalization rule that @mgartner suggested, but I agree that can be done in a separate PR.

pkg/sql/rowenc/index_encoding.go, line 890 at r2 (raw file):

Previously, RajivTS (Rajiv Sharma) wrote…

That's the understanding with which I began this implementation. However, if the generated inverted expression is marked tight then the execution plan created in inverted_index test shows a direct scan of all the rows for evaluating overlaps instead of filtering using the inverted index. I do not know the reason behind this but to ensure that the inverted index is used for array overlaps, I kept tight as false for now.

Can you try forcing the use of the inverted index using the table@idx syntax in the inverted_index test? It's possible that we need to improve our statistics estimation to choose the correct plan for this case (that could be done in a follow up PR).

I do think we'll want to change tight to true here (we can validate that this is correct with the randomized test that you've included, although you'll need to modify the test to not apply the additional filter on top).

pkg/sql/sem/tree/eval.go, line 438 at r3 (raw file):

func ArrayOverlaps(ctx *EvalContext, left, right *DArray) (*DBool, error) {
	array := MustBeDArray(left)
	other := MustBeDArray(right)

you don't need these two lines -- the parameters are already type DArray

RajivTS

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)

-- commits, line 12 at r2:

Previously, rytaft (Rebecca Taft) wrote…

nit: Mention that this is specifically array expressions

Done

pkg/sql/opt/invertedidx/json_array.go, line 366 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Yep, looks good to me. Thanks!

👍

pkg/sql/rowenc/index_encoding.go, line 872 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think it's important that we don't scan spans that we don't need to. I think it's ok to just return &inverted.SpanEpression{Tight: true}. If that causes problems, try &inverted.SpanEpression{Tight: true, SpansToRead: []Span{}, FactoredUnionSpans: []Span{}}}. And if that still causes problems, you could just add a check in extractArrayOverlapsCondition that the const datum is not an empty array (and if it is, return inverted.NonInvertedColExpression{}).

We'll also want to add the normalization rule that @mgartner suggested, but I agree that can be done in a separate PR.

Done. Returned inverted.NonInvertedColExpression{} in case of empty arrays or arrays with only NULL, in all other cases it produces a tight inverted expression. However this is being done in encodeOverlapsArrayInvertedIndexSpans instead of extractArrayOverlapsCondition since the null check for array elements can only be done once the value datums type has been confired as an Array.
The tests have been modified accordingly.

pkg/sql/rowenc/index_encoding.go, line 890 at r2 (raw file):

Can you try forcing the use of the inverted index using the table@idx syntax in the inverted_index test? It's possible that we need to improve our statistics estimation to choose the correct plan for this case (that could be done in a follow up PR).

Yes this did the trick :)
tight is now true and the tests are working as expected.

pkg/sql/sem/tree/eval.go, line 438 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

you don't need these two lines -- the parameters are already type DArray

My bad. Done.

pkg/sql/rowenc/index_encoding_test.go, line 752 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Is this true for overlaps? I think the spans for a single array overlap would be unique, like `a &&

I am not so sure. Since the same element can be present in multiple rows for the same array, can unique be true? i.e. Will the spans produce no duplicate primary keys?

rytaft

Getting close! Thanks for the updates.

Reviewed 7 of 7 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RajivTS)

pkg/sql/rowenc/index_encoding.go, line 861 at r4 (raw file):

// the given array, one slice of spans per entry in the array. The input
// inKey is prefixed to all returned keys.
//

nit: remove blank line (here and above where you removed the last bit of the function comment)

pkg/sql/rowenc/index_encoding.go, line 867 at r4 (raw file):

	// If the given array is directly empty (i.e. Len == 0),
	// or contains only NULLs and thus has effective length 0,
	// we cannot generate an inverted expression.

We support the notion of a "contradiction" in regular constraints -- I think we might want to do the same for inverted constraints. A contradiction for inverted constraints would effectively be an empty SpanExpression (&inverted.SpanEpression{Tight: true, SpansToRead: []Span{}, FactoredUnionSpans: []Span{}}}). Does returning that expression cause an error? If so, we may need to update the inverted.SpanEpression code to explicitly support contradictions (can be a separate PR).

In the mean time, if that causes an error, returning inverted.NonInvertedColExpression{} is fine. Could you just add a TODO here that this should be a contradiction?

pkg/sql/opt/exec/execbuilder/testdata/inverted_index, line 1398 at r4 (raw file):

                  spans: /"f"-/"f"/PrefixEnd /[]-/{} /Arr/"f"-/Arr/"f"/PrefixEnd /Arr/{}-/Arr/{}/PrefixEnd /Arr/"a"/"b"-/Arr/"a"/"b"/PrefixEnd /Arr/"c"/{}-/Arr/"c"/{}/PrefixEnd /Arr/"c"/"d"/[]-/Arr/"c"/"d"/{} /Arr/"c"/"d"/Arr/"e"-/Arr/"c"/"d"/Arr/"e"/PrefixEnd

# Test that queries with overlaps && operator use the inverted index

This comment is no longer valid for this test case, so please add a TODO for this case that we should have a normalization rule that turns this into a no-op, since the predicate is a contradiction.

pkg/sql/opt/exec/execbuilder/testdata/inverted_index, line 1444 at r4 (raw file):


query T
EXPLAIN (VERBOSE) SELECT * FROM e WHERE b && ARRAY[NULL]::INT[]

ditto here -- add a TODO

pkg/sql/opt/xform/testdata/rules/select, line 3365 at r3 (raw file):


opt expect=GenerateInvertedIndexScans
SELECT * FROM c WHERE a && ARRAY[]::INT[]

I think these test cases are good, so it would be better to leave them in but use expect-not=GenerateInvertedIndexScans and explain in a TODO that they should be addressed with a normalization rule rather than GenerateInvertedIndexScans

pkg/sql/opt/xform/testdata/rules/select, line 3338 at r4 (raw file):


opt expect=GenerateInvertedIndexScans
SELECT * FROM c WHERE a && ARRAY[1]

you can use c@a_inv_idx here to show the plan with the inverted index

pkg/sql/opt/xform/testdata/rules/select, line 3353 at r4 (raw file):


opt expect=GenerateInvertedIndexScans
SELECT * FROM c WHERE a && ARRAY[1] OR a && ARRAY[2]

use c@a_inv_idx

pkg/sql/opt/invertedidx/json_array_test.go, line 744 at r3 (raw file):

		{
			// When operations affecting two different variables are OR-ed, we cannot
			// constrain either index.

why did you remove this comment? I think it's useful

pkg/sql/opt/invertedidx/json_array_test.go, line 718 at r4 (raw file):

		},
		{
			// Overlaps with an empty array produces a non-inverted expression

nit: end comments with a period (here and below)

pkg/sql/opt/invertedidx/json_array_test.go, line 724 at r4 (raw file):

		},
		{
			// Overlaps with conjunction of both tight expression with

nit: conjunction of both tight expression -> conjunction of two tight expressions (here and below)

pkg/sql/rowenc/index_encoding_test.go, line 742 at r4 (raw file):

		indexedValue string
		value        string
		inverted     bool

I'd call this ok instead

pkg/sql/rowenc/index_encoding_test.go, line 751 at r4 (raw file):

		// correctly overlap or be distinct from the first value.

		// The expression is a union of spans, so unique should never be true

In this case, I don't think we need to include it as a parameter of the test case. It would be better to just unconditionally add a test below that the resulting SpanExpression has Unique=false

pkg/sql/rowenc/index_encoding_test.go, line 817 at r4 (raw file):

		// Array spans for && are never unique unless the result
		// is a singular empty array span.

I think you can change this to test that they are never unique and remove the expectUnique param

pkg/sql/rowenc/index_encoding_test.go, line 826 at r4 (raw file):

		}

		actual, _ := tree.ArrayOverlaps(&evalCtx, value.(*tree.DArray), indexedValue.(*tree.DArray))

you shouldn't need to apply this filter now, since the spans are tight

pkg/sql/rowenc/index_encoding_test.go, line 854 at r4 (raw file):

		// We cannot check for false positives with these tests (due to the fact that
		// the spans are not tight), so we will only test for false negatives.

Since now these spans are always tight, I think you should change this test to check for false positives as well. This test should look very similar to TestEncodeContainingArrayInvertedIndexSpans

pkg/sql/rowenc/index_encoding_test.go, line 864 at r4 (raw file):

		// An inverted expression can only be generated if the value array is
		// non-empty or contains atleast one non-NULL element.
		isInverted := rightArr.Len() > 0 && rightArr.HasNonNulls

nit: I'd call this ok

rytaft

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RajivTS)

pkg/sql/rowenc/index_encoding.go, line 867 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

We support the notion of a "contradiction" in regular constraints -- I think we might want to do the same for inverted constraints. A contradiction for inverted constraints would effectively be an empty SpanExpression (&inverted.SpanEpression{Tight: true, SpansToRead: []Span{}, FactoredUnionSpans: []Span{}}}). Does returning that expression cause an error? If so, we may need to update the inverted.SpanEpression code to explicitly support contradictions (can be a separate PR).

In the mean time, if that causes an error, returning inverted.NonInvertedColExpression{} is fine. Could you just add a TODO here that this should be a contradiction?

Another reason I'd really like to just return an empty span (i.e. contradiction) here instead of inverted.NonInvertedColExpression is that if we want to use this function to support inverted joins with && in the future, returning inverted.NonInvertedColExpression won't work (and any normalization rule we add won't help).

Anyway, as I said, this is fine for now if returning an empty span doesn't work, but we'll need to change it in the future.

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1512 at r4 (raw file):

----

query T

For all of these tests that return multiple rows, you'll either need to add an ORDER BY clause or use the rowsort directive (e.g., query T rowsort). That will prevent the CI build failures we're seeing.

rytaft

Great work! Thanks again for the contribution.

Do you want to take another look, @mgartner? Otherwise I'll go ahead and merge this.

Reviewed 6 of 6 files at r5, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @mgartner)

mgartner

Nicely done!

I left some minor optional suggestions, and one question for @rytaft.

Reviewed 1 of 8 files at r3, 1 of 7 files at r4, 6 of 6 files at r5, all commit messages.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner, @RajivTS, and @rytaft)

pkg/sql/opt/invertedidx/json_array.go, line 398 at r5 (raw file):

		// inverted expression with the constant value picked from right.
		constantVal = right
	} else if isIndexColumn(j.tabID, j.index, right, j.computedColumns) && memo.CanExtractConstDatum(left) {

This is fine for now because it's following a pattern already established in this file, but in the long term I think it'd be better to make a normalization rule to commute && expressions so that variables are always on the left and constants are always on the right. I'm not sure why we didn't do that for contains @> and contained by <@. Maybe those aren't commutative?

pkg/sql/rowenc/index_encoding_test.go, line 752 at r2 (raw file):

Previously, RajivTS (Rajiv Sharma) wrote…

I am not so sure. Since the same element can be present in multiple rows for the same array, can unique be true? i.e. Will the spans produce no duplicate primary keys?

It would be impossible to have duplicate entries in the inverted index for the same primary key - they index keys would be identical. Here's a kvtrace test that proves it:

statement ok
CREATE TABLE t (
  k INT PRIMARY KEY,
  a INT[],
  INVERTED INDEX (a),
  FAMILY (k, a)
)

query T kvtrace
INSERT INTO t VALUES (1, ARRAY[1, 2])
----
CPut /Table/106/1/1/0 -> /TUPLE/
InitPut /Table/106/2/1/1/0 -> /BYTES/
InitPut /Table/106/2/2/1/0 -> /BYTES/

query T kvtrace
INSERT INTO t VALUES (2, ARRAY[1, 2, 1])
----
CPut /Table/106/1/2/0 -> /TUPLE/
InitPut /Table/106/2/1/2/0 -> /BYTES/
InitPut /Table/106/2/2/2/0 -> /BYTES/

Regardless, I think setting unique=true for an expression like a && ARRAY[1] would be a minor performance improvement. Leaving it as unique=false wouldn't cause incorrect results, is that correct @rytaft?

pkg/sql/rowenc/index_encoding_test.go, line 784 at r2 (raw file):

Previously, RajivTS (Rajiv Sharma) wrote…

Tried with NULL::INT[] and NULL but they could not be parsed while running the test. Got the below error:
Failed to parse array NULL::INT[]: could not parse "NULL::INT[]" as type int[]: array must be enclosed in { and }

You could add a check for "NULL" in parseArray like:

if s == "NULL" {
    return tree.DNull
}

rytaft

Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner and @RajivTS)

pkg/sql/rowenc/index_encoding.go, line 882 at r5 (raw file):

			inverted.MakeSingleValSpan(key), true, /* tight */
		)
		if invertedExpr == nil {

I think you should add spanExpr.Unique = true above this line.

pkg/sql/rowenc/index_encoding_test.go, line 752 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

It would be impossible to have duplicate entries in the inverted index for the same primary key - they index keys would be identical. Here's a kvtrace test that proves it:
statement ok
CREATE TABLE t (
  k INT PRIMARY KEY,
  a INT[],
  INVERTED INDEX (a),
  FAMILY (k, a)
)

query T kvtrace
INSERT INTO t VALUES (1, ARRAY[1, 2])
----
CPut /Table/106/1/1/0 -> /TUPLE/
InitPut /Table/106/2/1/1/0 -> /BYTES/
InitPut /Table/106/2/2/1/0 -> /BYTES/

query T kvtrace
INSERT INTO t VALUES (2, ARRAY[1, 2, 1])
----
CPut /Table/106/1/2/0 -> /TUPLE/
InitPut /Table/106/2/1/2/0 -> /BYTES/
InitPut /Table/106/2/2/2/0 -> /BYTES/
Regardless, I think setting unique=true for an expression like a && ARRAY[1] would be a minor performance improvement. Leaving it as unique=false wouldn't cause incorrect results, is that correct @rytaft?

Yea -- I think you're totally right, @mgartner. I added a comment in the code where I think unique should be set to true. So unfortunately you'll need to undo the change where you removed unique from the test case, @RajivTS.

RajivTS

Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner, @RajivTS, and @rytaft)

pkg/sql/rowenc/index_encoding.go, line 867 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Another reason I'd really like to just return an empty span (i.e. contradiction) here instead of inverted.NonInvertedColExpression is that if we want to use this function to support inverted joins with && in the future, returning inverted.NonInvertedColExpression won't work (and any normalization rule we add won't help).

Anyway, as I said, this is fine for now if returning an empty span doesn't work, but we'll need to change it in the future.

Currently it does cause an error and the tests fail which is why I reverted to inverted.NonInvertedColExpression. Will add the TODO for now.

pkg/sql/logictest/testdata/logic_test/inverted_index, line 1512 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

For all of these tests that return multiple rows, you'll either need to add an ORDER BY clause or use the rowsort directive (e.g., query T rowsort). That will prevent the CI build failures we're seeing.

Done.

pkg/sql/opt/exec/execbuilder/testdata/inverted_index, line 1398 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This comment is no longer valid for this test case, so please add a TODO for this case that we should have a normalization rule that turns this into a no-op, since the predicate is a contradiction.

Done.

pkg/sql/opt/exec/execbuilder/testdata/inverted_index, line 1444 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

ditto here -- add a TODO

Done.

pkg/sql/opt/xform/testdata/rules/select, line 3365 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think these test cases are good, so it would be better to leave them in but use expect-not=GenerateInvertedIndexScans and explain in a TODO that they should be addressed with a normalization rule rather than GenerateInvertedIndexScans

Done.

pkg/sql/opt/xform/testdata/rules/select, line 3338 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

you can use c@a_inv_idx here to show the plan with the inverted index

Done.

pkg/sql/opt/xform/testdata/rules/select, line 3353 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

use c@a_inv_idx

Done.

pkg/sql/opt/invertedidx/json_array_test.go, line 744 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

why did you remove this comment? I think it's useful

Done.

pkg/sql/rowenc/index_encoding_test.go, line 752 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Yea -- I think you're totally right, @mgartner. I added a comment in the code where I think unique should be set to true. So unfortunately you'll need to undo the change where you removed unique from the test case, @RajivTS.

Unfortunately, I don't know how to read a KV trace so am unable to figure out how the above trace confirms that there cannot be a duplicate. If there are any resources that you can share to better understand the above terms, it would really help.

Let me describe how I think how this whole scenario is working out. Please rectify wherever I am incorrect.
A table T contains an indexed INT array column, let's call it A. Few of the entries in A are: [1], [1, 2], [2, 3], [3, 5, 6, 7], [6, 8], [9,0] with auto-incrementing primary key starting at 1. The inverted index on A would be a table roughly like below:

Array Value (ID)	Primary Key in T (VAL)
1	1, 2
2	2, 3
3	3, 4
5	4
6	4, 5
7	4
8	5
9	6
0	6

Assume that we are running a query as follows:
SELECT * FROM T WHERE A && [0,1,2,8]

After generating the inverted span expression for the constant value on the right hand-side, we get a modified query like:
SELECT * FROM T WHERE T_Idx IN (SELECT VAL FROM InvertedIdx_A WHERE ID == 0 OR ID == 1 OR ID == 2 OR ID == 8)

Assuming everything up until this point has been (roughly) correct, we can make the following claims about uniqueness:

The above inverted span expression is not unique due to the ORing of the keys pulled from the array, i.e. A && [1] would be unique but not A && [0,1,2,8]. Thus, overlaps between an indexed array and a single element array would always be unique.
ORing of two different overlaps will not be unique even if the individual overlaps themselves are unique, i.e. A && [1] OR A && [2] won't be unique.
ANDing of two different overlaps will be unique provided the individual overlaps themselves are unique, i.e. A && [1] AND A && [2] will be unique but A && [1, 2, 3] AND A && [5, 6] will not be unique.

pkg/sql/rowenc/index_encoding_test.go, line 742 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'd call this ok instead

Done.

pkg/sql/rowenc/index_encoding_test.go, line 751 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

In this case, I don't think we need to include it as a parameter of the test case. It would be better to just unconditionally add a test below that the resulting SpanExpression has Unique=false

Done.

pkg/sql/rowenc/index_encoding_test.go, line 817 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think you can change this to test that they are never unique and remove the expectUnique param

Done.

pkg/sql/rowenc/index_encoding_test.go, line 826 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

you shouldn't need to apply this filter now, since the spans are tight

Done.

pkg/sql/rowenc/index_encoding_test.go, line 854 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Since now these spans are always tight, I think you should change this test to check for false positives as well. This test should look very similar to TestEncodeContainingArrayInvertedIndexSpans

Done.

RajivTS

Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner, @RajivTS, and @rytaft)

pkg/sql/rowenc/index_encoding_test.go, line 752 at r2 (raw file):

Previously, RajivTS (Rajiv Sharma) wrote…

Unfortunately, I don't know how to read a KV trace so am unable to figure out how the above trace confirms that there cannot be a duplicate. If there are any resources that you can share to better understand the above terms, it would really help.

Let me describe how I think how this whole scenario is working out. Please rectify wherever I am incorrect.
A table T contains an indexed INT array column, let's call it A. Few of the entries in A are: [1], [1, 2], [2, 3], [3, 5, 6, 7], [6, 8], [9,0] with auto-incrementing primary key starting at 1. The inverted index on A would be a table roughly like below:

Array Value (ID) Primary Key in T (VAL)

1 1, 2

2 2, 3

3 3, 4

5 4

6 4, 5

7 4

8 5

9 6

0 6

Assume that we are running a query as follows:
SELECT * FROM T WHERE A && [0,1,2,8]

After generating the inverted span expression for the constant value on the right hand-side, we get a modified query like:
SELECT * FROM T WHERE T_Idx IN (SELECT VAL FROM InvertedIdx_A WHERE ID == 0 OR ID == 1 OR ID == 2 OR ID == 8)

Assuming everything up until this point has been (roughly) correct, we can make the following claims about uniqueness:

The above inverted span expression is not unique due to the ORing of the keys pulled from the array, i.e. A && [1] would be unique but not A && [0,1,2,8]. Thus, overlaps between an indexed array and a single element array would always be unique.

ORing of two different overlaps will not be unique even if the individual overlaps themselves are unique, i.e. A && [1] OR A && [2] won't be unique.

ANDing of two different overlaps will be unique provided the individual overlaps themselves are unique, i.e. A && [1] AND A && [2] will be unique but A && [1, 2, 3] AND A && [5, 6] will not be unique.

Once the above logic gets confirmed, I will make changes to the unique part and associated tests.

rytaft

Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @mgartner and @RajivTS)

pkg/sql/rowenc/index_encoding_test.go, line 752 at r2 (raw file):

Previously, RajivTS (Rajiv Sharma) wrote…

Once the above logic gets confirmed, I will make changes to the unique part and associated tests.

@RajivTS your understanding is correct, as described in this comment:

cockroach/pkg/sql/inverted/expression.go

Lines 300 to 305 in a8fc111

    
           // Unique is true if the spans are guaranteed not to produce duplicate 
        
           // primary keys. Otherwise, Unique is false. Unique may be true for certain 
        
           // JSON or Array SpanExpressions, and it holds when unique SpanExpressions 
        
           // are combined with And. It does not hold when these SpanExpressions are 
        
           // combined with Or. 
        
           Unique bool

I think your examples above would make great test cases.

The logic for AND and OR is already taken care of in expression.go, so you don't need to worry about implementing it.

The logic for AND is implemented here:

cockroach/pkg/sql/inverted/expression.go

Line 655 in a8fc111

Unique: left.Unique && right.Unique,

And the logic for OR is implemented here:

cockroach/pkg/sql/inverted/expression.go

Lines 680 to 687 in a8fc111

    
           expr := &SpanExpression{ 
        
           	Tight:              left.Tight && right.Tight, 
        
           	SpansToRead:        unionSpans(left.SpansToRead, right.SpansToRead), 
        
           	FactoredUnionSpans: unionSpans(left.FactoredUnionSpans, right.FactoredUnionSpans), 
        
           	Operator:           SetUnion, 
        
           	Left:               left, 
        
           	Right:              right, 
        
           }

(notice that the resulting SpanExpression has the default value for Unique, which is false for booleans in go.)

I think you should be able to just set spanExpr.Unique = true where I indicated in index_encoding.go and it should all "just work".

RajivTS

Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @rytaft)

pkg/sql/opt/invertedidx/json_array.go, line 398 at r5 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This is fine for now because it's following a pattern already established in this file, but in the long term I think it'd be better to make a normalization rule to commute && expressions so that variables are always on the left and constants are always on the right. I'm not sure why we didn't do that for contains @> and contained by <@. Maybe those aren't commutative?

Will work on that PR next. IMHO, contains and contained by don't appear to be commutative (i.e. A @> B isn't always the same as B @> A)

pkg/sql/rowenc/index_encoding.go, line 882 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think you should add spanExpr.Unique = true above this line.

Done.

pkg/sql/rowenc/index_encoding_test.go, line 784 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

You could add a check for "NULL" in parseArray like:
if s == "NULL" {
    return tree.DNull
}

Done.

ajwerner · 2022-02-28T14:13:12Z

bors r-

rytaft

Thanks, @ajwerner

Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RajivTS)

pkg/sql/rowenc/index_encoding.go, line 874 at r6 (raw file):

Previously, ajwerner wrote…

This needs a rebase.

bors r-

@RajivTS I don't have permission to push to your branch, so can you please rebase from cockroach/master and change
descpb.LatestNonPrimaryIndexDescriptorVersion here to descpb.PrimaryIndexWithStoredColumnsVersion. Thanks!

blathers-crl · 2022-02-28T22:35:28Z

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
Please ensure your git commit message contains a release note.
When CI has completed, please ensure no errors have appeared.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

…dexes. Previously, we did not support index acceleration for queries involving array overlaps (&&). This change adds support for using the inverted index with && expressions on Array columns. When there is an inverted index available, a scan will be done on the Array column using the spans found from the constant value. Release note (performance improvement): Expressions using the overlaps (&&) operator for arrays now support index-acceleration for faster execution in some cases.

otan · 2022-03-01T08:57:04Z

unfortunately, i think the rebase here was done incorrectly. there are 2436 files changed.

i would suggest git checkout -b <branch_name> c368e32733bb370dfe70c09e444c4700b2f324b6 && git pull --rebase origin master and pushing that branch here.

RajivTS · 2022-03-01T08:59:08Z

unfortunately, i think the rebase here was done incorrectly. there are 2436 files changed.

Yes, I will try to do this fresh with a new branch, avoiding this entire hassle. Will close this PR and create a new one (with link to the original) once I am done with my changes.

RajivTS requested a review from a team as a code owner February 8, 2022 00:08

blathers-crl bot added the O-community Originated from the community label Feb 8, 2022

yuzefovich requested a review from a team February 8, 2022 06:07

RajivTS force-pushed the RajivTS/IssueFix75477 branch from 5299bf4 to f21d1a5 Compare February 9, 2022 00:36

rytaft reviewed Feb 9, 2022

View reviewed changes

mgartner requested changes Feb 9, 2022

View reviewed changes

RajivTS force-pushed the RajivTS/IssueFix75477 branch from f21d1a5 to 79b434c Compare February 10, 2022 00:18

RajivTS commented Feb 10, 2022

View reviewed changes

rytaft reviewed Feb 10, 2022

View reviewed changes

RajivTS force-pushed the RajivTS/IssueFix75477 branch from 79b434c to 3430798 Compare February 13, 2022 22:59

RajivTS commented Feb 13, 2022

View reviewed changes

RajivTS force-pushed the RajivTS/IssueFix75477 branch from 3430798 to 1683881 Compare February 13, 2022 23:54

rytaft reviewed Feb 14, 2022

View reviewed changes

RajivTS force-pushed the RajivTS/IssueFix75477 branch from 1683881 to e8e33a3 Compare February 20, 2022 22:39

rytaft approved these changes Feb 21, 2022

View reviewed changes

mgartner approved these changes Feb 21, 2022

View reviewed changes

rytaft reviewed Feb 21, 2022

View reviewed changes

RajivTS commented Feb 27, 2022

View reviewed changes

rytaft reviewed Feb 27, 2022

View reviewed changes

RajivTS force-pushed the RajivTS/IssueFix75477 branch 2 times, most recently from fa3c6d9 to c368e32 Compare February 27, 2022 21:45

RajivTS commented Feb 27, 2022

View reviewed changes

rytaft reviewed Feb 28, 2022

View reviewed changes

RajivTS force-pushed the RajivTS/IssueFix75477 branch from 68e2d7e to d62a8f2 Compare March 1, 2022 08:55

RajivTS requested review from a team as code owners March 1, 2022 08:55

RajivTS requested review from a team March 1, 2022 08:55

RajivTS requested a review from a team as a code owner March 1, 2022 08:55

RajivTS requested a review from a team March 1, 2022 08:55

RajivTS requested a review from a team as a code owner March 1, 2022 08:55

RajivTS requested a review from a team March 1, 2022 08:55

RajivTS requested review from a team as code owners March 1, 2022 08:55

RajivTS requested review from dt, shermanCRL and erikgrinaker and removed request for a team March 1, 2022 08:55

RajivTS closed this Mar 1, 2022

RajivTS deleted the RajivTS/IssueFix75477 branch March 6, 2022 22:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

opt: index accelerate && (overlaps) expressions for array inverted indexes #76212

opt: index accelerate && (overlaps) expressions for array inverted indexes #76212

RajivTS commented Feb 8, 2022 •

edited

Loading

blathers-crl bot commented Feb 8, 2022

cockroach-teamcity commented Feb 8, 2022 •

edited

Loading

cockroach-teamcity commented Feb 8, 2022

blathers-crl bot commented Feb 9, 2022

rytaft left a comment

mgartner left a comment

mgartner commented Feb 9, 2022

rytaft commented Feb 9, 2022

RajivTS left a comment

rytaft left a comment

RajivTS left a comment

rytaft left a comment

rytaft left a comment

rytaft left a comment

mgartner left a comment

rytaft left a comment

RajivTS left a comment

RajivTS left a comment

rytaft left a comment

RajivTS left a comment

ajwerner commented Feb 28, 2022

rytaft left a comment

blathers-crl bot commented Feb 28, 2022

otan commented Mar 1, 2022 •

edited

Loading

RajivTS commented Mar 1, 2022

	Fn: func(ctx *EvalContext, left Datum, right Datum) (Datum, error) {
	array := MustBeDArray(left)
	other := MustBeDArray(right)
	if !array.ParamTyp.Equivalent(other.ParamTyp) {
	return nil, pgerror.New(pgcode.DatatypeMismatch, "cannot compare arrays with different element types")
	}
	for _, needle := range array.Array {
	// Nulls don't compare to each other in && syntax.
	if needle == DNull {
	continue
	}
	for _, hay := range other.Array {
	if needle.Compare(ctx, hay) == 0 {
	return DBoolTrue, nil
	}
	}
	}
	return DBoolFalse, nil
	},

	// IsTight returns whether the inverted expression is tight, i.e., will the
	// original expression not need to be reevaluated on each row output by the
	// query evaluation over the inverted index.

	{
	filters: `j <@ '{"a": 1}'`,
	indexOrd: jsonOrd,
	ok: true,
	tight: false,
	unique: false,
	remainingFilters: `j <@ '{"a": 1}'`,
	},

	// Unique is true if the spans are guaranteed not to produce duplicate
	// primary keys. Otherwise, Unique is false. Unique may be true for certain
	// JSON or Array SpanExpressions, and it holds when unique SpanExpressions
	// are combined with And. It does not hold when these SpanExpressions are
	// combined with Or.
	Unique bool

	expr := &SpanExpression{
	Tight: left.Tight && right.Tight,
	SpansToRead: unionSpans(left.SpansToRead, right.SpansToRead),
	FactoredUnionSpans: unionSpans(left.FactoredUnionSpans, right.FactoredUnionSpans),
	Operator: SetUnion,
	Left: left,
	Right: right,
	}

opt: index accelerate && (overlaps) expressions for array inverted indexes #76212

opt: index accelerate && (overlaps) expressions for array inverted indexes #76212

Conversation

RajivTS commented Feb 8, 2022 • edited Loading

blathers-crl bot commented Feb 8, 2022

cockroach-teamcity commented Feb 8, 2022 • edited Loading

cockroach-teamcity commented Feb 8, 2022

blathers-crl bot commented Feb 9, 2022

rytaft left a comment

Choose a reason for hiding this comment

mgartner left a comment

Choose a reason for hiding this comment

mgartner commented Feb 9, 2022

rytaft commented Feb 9, 2022

RajivTS left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

RajivTS left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

mgartner left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

RajivTS left a comment

Choose a reason for hiding this comment

RajivTS left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

RajivTS left a comment

Choose a reason for hiding this comment

ajwerner commented Feb 28, 2022

rytaft left a comment

Choose a reason for hiding this comment

blathers-crl bot commented Feb 28, 2022

otan commented Mar 1, 2022 • edited Loading

RajivTS commented Mar 1, 2022

RajivTS commented Feb 8, 2022 •

edited

Loading

cockroach-teamcity commented Feb 8, 2022 •

edited

Loading

otan commented Mar 1, 2022 •

edited

Loading