sql: Tighter spans for column families in lookup and index joins #38301

rohany · 2019-06-19T20:24:29Z

When only specific families are needed for a query, point lookups
in joiners can issue family specific spans to retrieve only
the data that they need, instead of querying for an entire row.

Release note: None

cockroach-teamcity · 2019-06-19T20:24:36Z

This change is

rohany · 2019-06-19T20:26:38Z

if this works, i believe a very similar strategy can be used to do the same optimization in the index joiner.

solongordon

Testing is an interesting question. When I did this work for tablereader I added tests which looked at the spans in the explain plans, but that won't work here.

Right now the best thing I can think of would be to add a logic test where you use the SHOW TRACE FOR SESSION feature to see what scans are happening.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rohany, and @solongordon)

pkg/sql/distsqlrun/indexjoiner.go, line 55 at r4 (raw file):

	spans roachpb.Spans

	// Hold onto what families we need to query from if our

Nit: The Go convention is to prefix comments with the name of the thing they're describing, so neededFamilies is.... That way if you grep the code for neededFamilies you can see both the definition and the description.

pkg/sql/distsqlrun/indexjoiner.go, line 226 at r4 (raw file):

	// looking up a full primary key
	if len(ij.neededFamilies) > 0 &&
		len(ij.desc.Families) > 1 &&

I think this condition is redundant. It's implied by the other two.

pkg/sql/distsqlrun/joinreader.go, line 293 at r4 (raw file):

// TODO: maybe want to move this into the joinerbase?
func (jr *joinReader) canSplitSpanIntoSeparateFamilies(span roachpb.Span) (bool, roachpb.Spans) {

Usually the convention in Go is to return the bool "ok" value second, not first. But actually I'm thinking it would be simpler here to do away with it altogether. If the span can't be split, you could still return roachpb.Spans but only containing the input span. Then just call the function maybeSplitSpanIntoFamilies or whatever.

pkg/sql/distsqlrun/joinreader.go, line 296 at r4 (raw file):

	// check the following:
	// - we have more than one needed family
	// - we are looking at the primary index

I think this is too restrictive. We can also split if we have a unique secondary index.

pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):

	// up on is equal to the number of columns in the primary index, then
	// we are guaranteed to be looking at a single row, so we don't need to
	// be checking if span.Key.Equals(span.EndKey)

This sounds plausible except I suspect it doesn't hold true for interleaved tables. I would recommend revisiting this after you've done some reading on the interleaved encoding stuff. You also might want to take a look at MakeSpanFromEncDatums, which is what the lookup joiner uses to make a span for a single key lookup.

pkg/sql/distsqlrun/joinreader.go, line 305 at r4 (raw file):

	if len(jr.neededFamilies) > 0 &&
		jr.index.ID == jr.desc.PrimaryIndex.ID &&
		len(jr.desc.Families) > 1 &&

Redundant condition

pkg/sql/sqlbase/index_encoding.go, line 272 at r4 (raw file):

// into separate spans that request particular families from neededFamilies
// instead of requesting all the families. It is up to the client to verify
// whether the requested span represents a single row lookup, and when

Can you make it clearer that this is only appropriate for point lookups?

pkg/sql/sqlbase/index_encoding.go, line 283 at r4 (raw file):

		tempSpan.EndKey = tempSpan.Key.PrefixEnd()
		if i > 0 && familyID == neededFamilies[i-1]+1 {
			// the family ID is the same, so collapse these spans

I think the previous comment was more accurate:

// This column family is adjacent to the previous one. We can merge
// the two spans into one.

solongordon

This is the general kind of test I'm thinking of:

cockroach/pkg/sql/logictest/testdata/planner_test/update

Lines 277 to 294 in 27fe883

    
           # Make sure that the scan for the update actually gets parallelized. 
        
           statement ok 
        
           SET tracing = on; UPDATE a SET b=10 WHERE a = 0 OR a = 10; SET tracing = off 
        
           # The span "sending partial batch" means that the scan was parallelized. 
        
           # If this test is failing and doesn't have that span, it means that the scanNode 
        
           # was improperly configured to add a limit to the ScanRequest batch. 
        
           # See #30943 for more details. 
        
           query T 
        
           SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message IN 
        
               ('querying next range at /Table/58/1/0', 
        
                'querying next range at /Table/58/1/10', 
        
                '=== SPAN START: kv.DistSender: sending partial batch ===' 
        
               ) 
        
           ---- 
        
           querying next range at /Table/58/1/0 
        
           === SPAN START: kv.DistSender: sending partial batch === 
        
           querying next range at /Table/58/1/10

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)

rohany

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @solongordon)

pkg/sql/distsqlrun/indexjoiner.go, line 226 at r4 (raw file):

Previously, solongordon (Solon) wrote…

I think this condition is redundant. It's implied by the other two.

Done.

pkg/sql/distsqlrun/joinreader.go, line 293 at r4 (raw file):

Previously, solongordon (Solon) wrote…

Usually the convention in Go is to return the bool "ok" value second, not first. But actually I'm thinking it would be simpler here to do away with it altogether. If the span can't be split, you could still return roachpb.Spans but only containing the input span. Then just call the function maybeSplitSpanIntoFamilies or whatever.

Done.

pkg/sql/distsqlrun/joinreader.go, line 296 at r4 (raw file):

Previously, solongordon (Solon) wrote…

I think this is too restrictive. We can also split if we have a unique secondary index.

Done.

pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):

Previously, solongordon (Solon) wrote…

This sounds plausible except I suspect it doesn't hold true for interleaved tables. I would recommend revisiting this after you've done some reading on the interleaved encoding stuff. You also might want to take a look at MakeSpanFromEncDatums, which is what the lookup joiner uses to make a span for a single key lookup.

I believe this should work even for interleaved tables -- MakeSpanFromEncDatums generates all the ancestors of the interleaving that are needed, and then child interleaved tables keys must have the '#' separator instead of a family id, so there isn't a way that we get back a span that contains child interleaved rows.

pkg/sql/distsqlrun/joinreader.go, line 305 at r4 (raw file):

Previously, solongordon (Solon) wrote…

Redundant condition

Done.

pkg/sql/sqlbase/index_encoding.go, line 272 at r4 (raw file):

Previously, solongordon (Solon) wrote…

Can you make it clearer that this is only appropriate for point lookups?

Done.

pkg/sql/sqlbase/index_encoding.go, line 283 at r4 (raw file):

Previously, solongordon (Solon) wrote…

I think the previous comment was more accurate:
// This column family is adjacent to the previous one. We can merge
// the two spans into one.

Done.

rohany · 2019-06-25T21:24:00Z

I added an example logic test, take a look at let me know if its what you had in mind @solongordon

solongordon

Great to see that test! Could we get one for index joins too?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rohany, and @solongordon)

pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):

Previously, rohany (Rohan Yadav) wrote…

I believe this should work even for interleaved tables -- MakeSpanFromEncDatums generates all the ancestors of the interleaving that are needed, and then child interleaved tables keys must have the '#' separator instead of a family id, so there isn't a way that we get back a span that contains child interleaved rows.

That sounds plausible... let's test it!

pkg/sql/logictest/testdata/logic_test/join, line 1081 at r7 (raw file):

query error aggregate functions are not allowed in ON
SELECT * FROM foo JOIN bar ON max(foo.c) < 2

Let's put this in the lookup_join test file. Then you shouldn't need to turn the optimizer on explicitly because it only uses configurations where the optimizer is already enabled.

pkg/sql/logictest/testdata/logic_test/join, line 1087 at r7 (raw file):


statement ok
CREATE TABLE familySplit1 (x INT, PRIMARY KEY (x)); INSERT INTO familySplit1 VALUES (1);

Nit: Could you make the INSERTs separate statements? Helps readability.
Another nit: Usually table names are in snake_case.

pkg/sql/logictest/testdata/logic_test/join, line 1093 at r7 (raw file):


statement ok
SET optimizer = on; SET tracing = on; SELECT familySplit2.x, familySplit2.z FROM familySplit1 INNER LOOKUP JOIN familySplit2 ON (familySplit1.x = familySplit2.x); SET tracing = off; set optimizer = off;

Nit: Extra parens around the ON condition.

pkg/sql/logictest/testdata/logic_test/join, line 1096 at r7 (raw file):


query T
select message from [SHOW TRACE FOR SESSION] where message in ('Scan /Table/81/1/1/{0-1}, /Table/81/1/1/2/{1-2}');

I would recommend changing the WHERE condition to WHERE message LIKE 'Scan%'. That way we can make sure there aren't any unnecessary scans happening.

pkg/sql/logictest/testdata/logic_test/join, line 1098 at r7 (raw file):

select message from [SHOW TRACE FOR SESSION] where message in ('Scan /Table/81/1/1/{0-1}, /Table/81/1/1/2/{1-2}');
----
Scan /Table/81/1/1/{0-1}, /Table/81/1/1/2/{1-2}

Nice! Yes this is exactly what we want to see.

rohany

Yup, added a test using interleaved tables as well.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @solongordon)

pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):

Previously, solongordon (Solon) wrote…

That sounds plausible... let's test it!

Done.

pkg/sql/logictest/testdata/logic_test/join, line 1081 at r7 (raw file):

Previously, solongordon (Solon) wrote…

Let's put this in the lookup_join test file. Then you shouldn't need to turn the optimizer on explicitly because it only uses configurations where the optimizer is already enabled.

Done.

pkg/sql/logictest/testdata/logic_test/join, line 1096 at r7 (raw file):

Previously, solongordon (Solon) wrote…

I would recommend changing the WHERE condition to WHERE message LIKE 'Scan%'. That way we can make sure there aren't any unnecessary scans happening.

Done.

pkg/sql/logictest/testdata/logic_test/join, line 1098 at r7 (raw file):

Previously, solongordon (Solon) wrote…

Nice! Yes this is exactly what we want to see.

Done.

solongordon

I added some picky comments about your logic tests. Otherwise this is good to go.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rohany, and @solongordon)

pkg/sql/logictest/testdata/logic_test/lookup_join, line 510 at r8 (raw file):

1     1  2  2  NULL  2
1     1  2  2  1     1

Nit: Extra blank line

pkg/sql/logictest/testdata/logic_test/lookup_join, line 517 at r8 (raw file):


statement ok
CREATE TABLE family_split_1 (x INT, PRIMARY KEY (x));

Nit: You don't need semicolons at the end of statements in logic tests.

pkg/sql/logictest/testdata/logic_test/lookup_join, line 529 at r8 (raw file):


statement ok
SET tracing = on; SELECT family_split_2.x, family_split_2.z FROM family_split_1 INNER LOOKUP JOIN family_split_2 ON family_split_1.x = family_split_2.x; SET tracing = off;

Minor: While we're running this query we might as well assert that the output is as expected. Like

query II
SET tracing = on; SELECT y,w FROM family_index_join WHERE y = 2
----
2 4

query T
SET tracing = off; SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/71/%, /Table/71/%';
----
Scan /Table/71/1/1/1/{1-2}, /Table/71/1/1/3/{1-2}

pkg/sql/logictest/testdata/logic_test/lookup_join, line 532 at r8 (raw file):


query T
SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/70/%, /Table/70/%'

I still think we should make the LIKE a bit more general: WHERE message LIKE 'Scan /Table/70/%. Just in case there are extra scans you're not catching.

pkg/sql/logictest/testdata/logic_test/lookup_join, line 535 at r8 (raw file):

----
Scan /Table/70/1/1/{0-1}, /Table/70/1/1/2/{1-2}

Nit: Extra blank line

pkg/sql/logictest/testdata/logic_test/lookup_join, line 544 at r8 (raw file):


statement ok
SET tracing = on; SELECT y,w FROM family_index_join WHERE y = 2; SET tracing = off;

Same thing about asserting the output.

Also, just to be safe we might want to force it to use the y index so that the index join will definitely happen: SELECT y, w FROM family_index_join@family_index_join_y_idx WHERE y = 2.

pkg/sql/logictest/testdata/logic_test/lookup_join, line 547 at r8 (raw file):


query T
SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/71/%, /Table/71/%';

Ditto about the more general LIKE.

pkg/sql/logictest/testdata/logic_test/lookup_join, line 551 at r8 (raw file):

Scan /Table/71/1/1/1/{1-2}, /Table/71/1/1/3/{1-2}

# test tight spans with interleaved tables

Proper grammar for comments please

pkg/sql/logictest/testdata/logic_test/lookup_join, line 568 at r8 (raw file):


query II
select family_interleave_1.x, family_interleave_1.z from family_interleave_2 INNER LOOKUP JOIN family_interleave_1 ON family_interleave_1.x = family_interleave_2.x;

Nit: upper case SELECT and FROM would be nice.

pkg/sql/logictest/testdata/logic_test/lookup_join, line 576 at r8 (raw file):


query T
SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/72/%, /Table/72/%';

Ditto about the more general LIKE.

rohany · 2019-07-08T20:50:18Z

RFAL

solongordon

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)

joiners When only specific families are needed for a query, point lookups in joiners can issue family specific spans to retrieve only the data that they need, instead of querying for an entire row. Release note: None

rohany · 2019-07-09T14:58:32Z

bors r+

craig · 2019-07-09T15:10:36Z

Build failed (retrying...)

GitHub CI (Cockroach)

craig · 2019-07-09T15:41:04Z

Build failed (retrying...)

GitHub CI (Cockroach)

38301: sql: Tighter spans for column families in lookup and index joins r=rohany a=rohany When only specific families are needed for a query, point lookups in joiners can issue family specific spans to retrieve only the data that they need, instead of querying for an entire row. Release note: None Co-authored-by: Rohan Yadav <[email protected]>

craig · 2019-07-09T16:05:11Z

Build succeeded

GitHub CI (Cockroach)

solongordon

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis)

pkg/sql/distsqlrun/joinreader.go, line 296 at r4 (raw file):

Previously, rohany (Rohan Yadav) wrote…

Done.

Oops, I think I was wrong here. This optimization only applies to the primary index, because there's no such thing as column families on a secondary index.

PR cockroachdb#38301 inadvertently introduced a bug that unfortunately wasn't caught by any tests - in the process of improving the tight span generation for tables with column families, it accidentally started throwing away all but the spans for the final constraint passed in from the optimizer. This small omission has serious consequences: queries that have a disjunction of spans to examine (like `SELECT * FROM t WHERE key IN (values...)`) will return incorrect results if the table being queried has multiple column families (in certain cases) - it'll look like rows are missing. This will be problematic for mutations as well. Note that this bug was never present in a non-alpha release. Release note (bug fix): restore correct result generation for queries with index disjunctions on tables with multiple column families. Release justification: critical bugfix, low risk

40898: sql: column family span generation bugfix r=jordanlewis a=jordanlewis PR #38301 inadvertently introduced a bug that unfortunately wasn't caught by any tests - in the process of improving the tight span generation for tables with column families, it accidentally started throwing away all but the spans for the final constraint passed in from the optimizer. This small omission has serious consequences: queries that have a disjunction of spans to examine (like `SELECT * FROM t WHERE key IN (values...)`) will return incorrect results if the table being queried has multiple column families (in certain cases) - it'll look like rows are missing. This will be problematic for mutations as well. Note that this bug was never present in a non-alpha release. Fixes #40890. Release note (bug fix): restore correct result generation for queries with index disjunctions on tables with multiple column families. Release justification: critical bugfix, low risk Co-authored-by: Jordan Lewis <[email protected]>

rohany requested review from jordanlewis, solongordon and a team June 19, 2019 20:24

rohany changed the title ~~[WIP] tighter spans for column families in lookup joins~~ [WIP] tighter spans for column families in lookup and index joins Jun 20, 2019

rohany changed the title ~~[WIP] tighter spans for column families in lookup and index joins~~ Tighter spans for column families in lookup and index joins Jun 20, 2019

rohany changed the title ~~Tighter spans for column families in lookup and index joins~~ sql: Tighter spans for column families in lookup and index joins Jun 20, 2019

solongordon requested a review from a team June 24, 2019 15:48

solongordon suggested changes Jun 25, 2019

View reviewed changes

solongordon reviewed Jun 25, 2019

View reviewed changes

rohany commented Jun 25, 2019

View reviewed changes

rohany requested a review from a team June 25, 2019 20:35

rohany requested a review from a team June 25, 2019 20:45

solongordon suggested changes Jun 26, 2019

View reviewed changes

rohany commented Jun 26, 2019

View reviewed changes

rohany requested a review from solongordon July 2, 2019 16:51

solongordon suggested changes Jul 3, 2019

View reviewed changes

solongordon approved these changes Jul 9, 2019

View reviewed changes

sql: Produce family specific spans for point lookups in index+lookup

14390c7

joiners When only specific families are needed for a query, point lookups in joiners can issue family specific spans to retrieve only the data that they need, instead of querying for an entire row. Release note: None

rohany force-pushed the tight-spans branch from 2c0602b to 14390c7 Compare July 9, 2019 14:32

craig bot merged commit 14390c7 into cockroachdb:master Jul 9, 2019

solongordon reviewed Jul 16, 2019

View reviewed changes

jordanlewis mentioned this pull request Jul 31, 2019

sql: foreign key checks should only scan the primary column family of a row #30852

Closed

jordanlewis mentioned this pull request Sep 18, 2019

sql: DELETE FROM is broken for system.jobs table #40890

Closed

jordanlewis mentioned this pull request Sep 18, 2019

sql: column family span generation bugfix #40898

Merged

knz mentioned this pull request Nov 10, 2019

User-facing changes in 19.2 that were not picked up in release notes cockroachdb/docs#5819

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: Tighter spans for column families in lookup and index joins #38301

sql: Tighter spans for column families in lookup and index joins #38301

rohany commented Jun 19, 2019 •

edited

Loading

cockroach-teamcity commented Jun 19, 2019

rohany commented Jun 19, 2019

solongordon left a comment

solongordon left a comment

rohany left a comment

rohany commented Jun 25, 2019

solongordon left a comment

rohany left a comment

solongordon left a comment

rohany commented Jul 8, 2019

solongordon left a comment

rohany commented Jul 9, 2019

craig bot commented Jul 9, 2019

craig bot commented Jul 9, 2019

craig bot commented Jul 9, 2019

solongordon left a comment

	# Make sure that the scan for the update actually gets parallelized.
	statement ok
	SET tracing = on; UPDATE a SET b=10 WHERE a = 0 OR a = 10; SET tracing = off

	# The span "sending partial batch" means that the scan was parallelized.
	# If this test is failing and doesn't have that span, it means that the scanNode
	# was improperly configured to add a limit to the ScanRequest batch.
	# See #30943 for more details.
	query T
	SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message IN
	('querying next range at /Table/58/1/0',
	'querying next range at /Table/58/1/10',
	'=== SPAN START: kv.DistSender: sending partial batch ==='
	)
	----
	querying next range at /Table/58/1/0
	=== SPAN START: kv.DistSender: sending partial batch ===
	querying next range at /Table/58/1/10

sql: Tighter spans for column families in lookup and index joins #38301

sql: Tighter spans for column families in lookup and index joins #38301

Conversation

rohany commented Jun 19, 2019 • edited Loading

cockroach-teamcity commented Jun 19, 2019

rohany commented Jun 19, 2019

solongordon left a comment

Choose a reason for hiding this comment

solongordon left a comment

Choose a reason for hiding this comment

rohany left a comment

Choose a reason for hiding this comment

rohany commented Jun 25, 2019

solongordon left a comment

Choose a reason for hiding this comment

rohany left a comment

Choose a reason for hiding this comment

solongordon left a comment

Choose a reason for hiding this comment

rohany commented Jul 8, 2019

solongordon left a comment

Choose a reason for hiding this comment

rohany commented Jul 9, 2019

craig bot commented Jul 9, 2019

Build failed (retrying...)

craig bot commented Jul 9, 2019

Build failed (retrying...)

craig bot commented Jul 9, 2019

Build succeeded

solongordon left a comment

Choose a reason for hiding this comment

rohany commented Jun 19, 2019 •

edited

Loading