-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: Tighter spans for column families in lookup and index joins #38301
Conversation
if this works, i believe a very similar strategy can be used to do the same optimization in the index joiner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing is an interesting question. When I did this work for tablereader I added tests which looked at the spans in the explain plans, but that won't work here.
Right now the best thing I can think of would be to add a logic test where you use the SHOW TRACE FOR SESSION feature to see what scans are happening.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rohany, and @solongordon)
pkg/sql/distsqlrun/indexjoiner.go, line 55 at r4 (raw file):
spans roachpb.Spans // Hold onto what families we need to query from if our
Nit: The Go convention is to prefix comments with the name of the thing they're describing, so neededFamilies is...
. That way if you grep the code for neededFamilies
you can see both the definition and the description.
pkg/sql/distsqlrun/indexjoiner.go, line 226 at r4 (raw file):
// looking up a full primary key if len(ij.neededFamilies) > 0 && len(ij.desc.Families) > 1 &&
I think this condition is redundant. It's implied by the other two.
pkg/sql/distsqlrun/joinreader.go, line 293 at r4 (raw file):
// TODO: maybe want to move this into the joinerbase? func (jr *joinReader) canSplitSpanIntoSeparateFamilies(span roachpb.Span) (bool, roachpb.Spans) {
Usually the convention in Go is to return the bool "ok" value second, not first. But actually I'm thinking it would be simpler here to do away with it altogether. If the span can't be split, you could still return roachpb.Spans
but only containing the input span. Then just call the function maybeSplitSpanIntoFamilies
or whatever.
pkg/sql/distsqlrun/joinreader.go, line 296 at r4 (raw file):
// check the following: // - we have more than one needed family // - we are looking at the primary index
I think this is too restrictive. We can also split if we have a unique secondary index.
pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):
// up on is equal to the number of columns in the primary index, then // we are guaranteed to be looking at a single row, so we don't need to // be checking if span.Key.Equals(span.EndKey)
This sounds plausible except I suspect it doesn't hold true for interleaved tables. I would recommend revisiting this after you've done some reading on the interleaved encoding stuff. You also might want to take a look at MakeSpanFromEncDatums
, which is what the lookup joiner uses to make a span for a single key lookup.
pkg/sql/distsqlrun/joinreader.go, line 305 at r4 (raw file):
if len(jr.neededFamilies) > 0 && jr.index.ID == jr.desc.PrimaryIndex.ID && len(jr.desc.Families) > 1 &&
Redundant condition
pkg/sql/sqlbase/index_encoding.go, line 272 at r4 (raw file):
// into separate spans that request particular families from neededFamilies // instead of requesting all the families. It is up to the client to verify // whether the requested span represents a single row lookup, and when
Can you make it clearer that this is only appropriate for point lookups?
pkg/sql/sqlbase/index_encoding.go, line 283 at r4 (raw file):
tempSpan.EndKey = tempSpan.Key.PrefixEnd() if i > 0 && familyID == neededFamilies[i-1]+1 { // the family ID is the same, so collapse these spans
I think the previous comment was more accurate:
// This column family is adjacent to the previous one. We can merge
// the two spans into one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the general kind of test I'm thinking of:
cockroach/pkg/sql/logictest/testdata/planner_test/update
Lines 277 to 294 in 27fe883
# Make sure that the scan for the update actually gets parallelized. | |
statement ok | |
SET tracing = on; UPDATE a SET b=10 WHERE a = 0 OR a = 10; SET tracing = off | |
# The span "sending partial batch" means that the scan was parallelized. | |
# If this test is failing and doesn't have that span, it means that the scanNode | |
# was improperly configured to add a limit to the ScanRequest batch. | |
# See #30943 for more details. | |
query T | |
SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message IN | |
('querying next range at /Table/58/1/0', | |
'querying next range at /Table/58/1/10', | |
'=== SPAN START: kv.DistSender: sending partial batch ===' | |
) | |
---- | |
querying next range at /Table/58/1/0 | |
=== SPAN START: kv.DistSender: sending partial batch === | |
querying next range at /Table/58/1/10 |
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @rohany)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @solongordon)
pkg/sql/distsqlrun/indexjoiner.go, line 226 at r4 (raw file):
Previously, solongordon (Solon) wrote…
I think this condition is redundant. It's implied by the other two.
Done.
pkg/sql/distsqlrun/joinreader.go, line 293 at r4 (raw file):
Previously, solongordon (Solon) wrote…
Usually the convention in Go is to return the bool "ok" value second, not first. But actually I'm thinking it would be simpler here to do away with it altogether. If the span can't be split, you could still return
roachpb.Spans
but only containing the input span. Then just call the functionmaybeSplitSpanIntoFamilies
or whatever.
Done.
pkg/sql/distsqlrun/joinreader.go, line 296 at r4 (raw file):
Previously, solongordon (Solon) wrote…
I think this is too restrictive. We can also split if we have a unique secondary index.
Done.
pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):
Previously, solongordon (Solon) wrote…
This sounds plausible except I suspect it doesn't hold true for interleaved tables. I would recommend revisiting this after you've done some reading on the interleaved encoding stuff. You also might want to take a look at
MakeSpanFromEncDatums
, which is what the lookup joiner uses to make a span for a single key lookup.
I believe this should work even for interleaved tables -- MakeSpanFromEncDatums generates all the ancestors of the interleaving that are needed, and then child interleaved tables keys must have the '#' separator instead of a family id, so there isn't a way that we get back a span that contains child interleaved rows.
pkg/sql/distsqlrun/joinreader.go, line 305 at r4 (raw file):
Previously, solongordon (Solon) wrote…
Redundant condition
Done.
pkg/sql/sqlbase/index_encoding.go, line 272 at r4 (raw file):
Previously, solongordon (Solon) wrote…
Can you make it clearer that this is only appropriate for point lookups?
Done.
pkg/sql/sqlbase/index_encoding.go, line 283 at r4 (raw file):
Previously, solongordon (Solon) wrote…
I think the previous comment was more accurate:
// This column family is adjacent to the previous one. We can merge // the two spans into one.
Done.
I added an example logic test, take a look at let me know if its what you had in mind @solongordon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see that test! Could we get one for index joins too?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rohany, and @solongordon)
pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):
Previously, rohany (Rohan Yadav) wrote…
I believe this should work even for interleaved tables -- MakeSpanFromEncDatums generates all the ancestors of the interleaving that are needed, and then child interleaved tables keys must have the '#' separator instead of a family id, so there isn't a way that we get back a span that contains child interleaved rows.
That sounds plausible... let's test it!
pkg/sql/logictest/testdata/logic_test/join, line 1081 at r7 (raw file):
query error aggregate functions are not allowed in ON SELECT * FROM foo JOIN bar ON max(foo.c) < 2
Let's put this in the lookup_join
test file. Then you shouldn't need to turn the optimizer on explicitly because it only uses configurations where the optimizer is already enabled.
pkg/sql/logictest/testdata/logic_test/join, line 1087 at r7 (raw file):
statement ok CREATE TABLE familySplit1 (x INT, PRIMARY KEY (x)); INSERT INTO familySplit1 VALUES (1);
Nit: Could you make the INSERTs separate statements? Helps readability.
Another nit: Usually table names are in snake_case
.
pkg/sql/logictest/testdata/logic_test/join, line 1093 at r7 (raw file):
statement ok SET optimizer = on; SET tracing = on; SELECT familySplit2.x, familySplit2.z FROM familySplit1 INNER LOOKUP JOIN familySplit2 ON (familySplit1.x = familySplit2.x); SET tracing = off; set optimizer = off;
Nit: Extra parens around the ON condition.
pkg/sql/logictest/testdata/logic_test/join, line 1096 at r7 (raw file):
query T select message from [SHOW TRACE FOR SESSION] where message in ('Scan /Table/81/1/1/{0-1}, /Table/81/1/1/2/{1-2}');
I would recommend changing the WHERE condition to WHERE message LIKE 'Scan%'
. That way we can make sure there aren't any unnecessary scans happening.
pkg/sql/logictest/testdata/logic_test/join, line 1098 at r7 (raw file):
select message from [SHOW TRACE FOR SESSION] where message in ('Scan /Table/81/1/1/{0-1}, /Table/81/1/1/2/{1-2}'); ---- Scan /Table/81/1/1/{0-1}, /Table/81/1/1/2/{1-2}
Nice! Yes this is exactly what we want to see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, added a test using interleaved tables as well.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis and @solongordon)
pkg/sql/distsqlrun/joinreader.go, line 302 at r4 (raw file):
Previously, solongordon (Solon) wrote…
That sounds plausible... let's test it!
Done.
pkg/sql/logictest/testdata/logic_test/join, line 1081 at r7 (raw file):
Previously, solongordon (Solon) wrote…
Let's put this in the
lookup_join
test file. Then you shouldn't need to turn the optimizer on explicitly because it only uses configurations where the optimizer is already enabled.
Done.
pkg/sql/logictest/testdata/logic_test/join, line 1096 at r7 (raw file):
Previously, solongordon (Solon) wrote…
I would recommend changing the WHERE condition to
WHERE message LIKE 'Scan%'
. That way we can make sure there aren't any unnecessary scans happening.
Done.
pkg/sql/logictest/testdata/logic_test/join, line 1098 at r7 (raw file):
Previously, solongordon (Solon) wrote…
Nice! Yes this is exactly what we want to see.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some picky comments about your logic tests. Otherwise this is good to go.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @rohany, and @solongordon)
pkg/sql/logictest/testdata/logic_test/lookup_join, line 510 at r8 (raw file):
1 1 2 2 NULL 2 1 1 2 2 1 1
Nit: Extra blank line
pkg/sql/logictest/testdata/logic_test/lookup_join, line 517 at r8 (raw file):
statement ok CREATE TABLE family_split_1 (x INT, PRIMARY KEY (x));
Nit: You don't need semicolons at the end of statements in logic tests.
pkg/sql/logictest/testdata/logic_test/lookup_join, line 529 at r8 (raw file):
statement ok SET tracing = on; SELECT family_split_2.x, family_split_2.z FROM family_split_1 INNER LOOKUP JOIN family_split_2 ON family_split_1.x = family_split_2.x; SET tracing = off;
Minor: While we're running this query we might as well assert that the output is as expected. Like
query II
SET tracing = on; SELECT y,w FROM family_index_join WHERE y = 2
----
2 4
query T
SET tracing = off; SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/71/%, /Table/71/%';
----
Scan /Table/71/1/1/1/{1-2}, /Table/71/1/1/3/{1-2}
pkg/sql/logictest/testdata/logic_test/lookup_join, line 532 at r8 (raw file):
query T SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/70/%, /Table/70/%'
I still think we should make the LIKE
a bit more general: WHERE message LIKE 'Scan /Table/70/%
. Just in case there are extra scans you're not catching.
pkg/sql/logictest/testdata/logic_test/lookup_join, line 535 at r8 (raw file):
---- Scan /Table/70/1/1/{0-1}, /Table/70/1/1/2/{1-2}
Nit: Extra blank line
pkg/sql/logictest/testdata/logic_test/lookup_join, line 544 at r8 (raw file):
statement ok SET tracing = on; SELECT y,w FROM family_index_join WHERE y = 2; SET tracing = off;
Same thing about asserting the output.
Also, just to be safe we might want to force it to use the y index so that the index join will definitely happen: SELECT y, w FROM family_index_join@family_index_join_y_idx WHERE y = 2
.
pkg/sql/logictest/testdata/logic_test/lookup_join, line 547 at r8 (raw file):
query T SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/71/%, /Table/71/%';
Ditto about the more general LIKE
.
pkg/sql/logictest/testdata/logic_test/lookup_join, line 551 at r8 (raw file):
Scan /Table/71/1/1/1/{1-2}, /Table/71/1/1/3/{1-2} # test tight spans with interleaved tables
Proper grammar for comments please
pkg/sql/logictest/testdata/logic_test/lookup_join, line 568 at r8 (raw file):
query II select family_interleave_1.x, family_interleave_1.z from family_interleave_2 INNER LOOKUP JOIN family_interleave_1 ON family_interleave_1.x = family_interleave_2.x;
Nit: upper case SELECT and FROM would be nice.
pkg/sql/logictest/testdata/logic_test/lookup_join, line 576 at r8 (raw file):
query T SELECT message FROM [SHOW TRACE FOR SESSION] WHERE message LIKE 'Scan /Table/72/%, /Table/72/%';
Ditto about the more general LIKE
.
RFAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)
joiners When only specific families are needed for a query, point lookups in joiners can issue family specific spans to retrieve only the data that they need, instead of querying for an entire row. Release note: None
bors r+ |
Build failed (retrying...) |
Build failed (retrying...) |
38301: sql: Tighter spans for column families in lookup and index joins r=rohany a=rohany When only specific families are needed for a query, point lookups in joiners can issue family specific spans to retrieve only the data that they need, instead of querying for an entire row. Release note: None Co-authored-by: Rohan Yadav <[email protected]>
Build succeeded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis)
pkg/sql/distsqlrun/joinreader.go, line 296 at r4 (raw file):
Previously, rohany (Rohan Yadav) wrote…
Done.
Oops, I think I was wrong here. This optimization only applies to the primary index, because there's no such thing as column families on a secondary index.
PR cockroachdb#38301 inadvertently introduced a bug that unfortunately wasn't caught by any tests - in the process of improving the tight span generation for tables with column families, it accidentally started throwing away all but the spans for the final constraint passed in from the optimizer. This small omission has serious consequences: queries that have a disjunction of spans to examine (like `SELECT * FROM t WHERE key IN (values...)`) will return incorrect results if the table being queried has multiple column families (in certain cases) - it'll look like rows are missing. This will be problematic for mutations as well. Note that this bug was never present in a non-alpha release. Release note (bug fix): restore correct result generation for queries with index disjunctions on tables with multiple column families. Release justification: critical bugfix, low risk
40898: sql: column family span generation bugfix r=jordanlewis a=jordanlewis PR #38301 inadvertently introduced a bug that unfortunately wasn't caught by any tests - in the process of improving the tight span generation for tables with column families, it accidentally started throwing away all but the spans for the final constraint passed in from the optimizer. This small omission has serious consequences: queries that have a disjunction of spans to examine (like `SELECT * FROM t WHERE key IN (values...)`) will return incorrect results if the table being queried has multiple column families (in certain cases) - it'll look like rows are missing. This will be problematic for mutations as well. Note that this bug was never present in a non-alpha release. Fixes #40890. Release note (bug fix): restore correct result generation for queries with index disjunctions on tables with multiple column families. Release justification: critical bugfix, low risk Co-authored-by: Jordan Lewis <[email protected]>
When only specific families are needed for a query, point lookups
in joiners can issue family specific spans to retrieve only
the data that they need, instead of querying for an entire row.
Release note: None