Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gen4 Planner: AxB vs BxA #7274

Merged
merged 16 commits into from
Jan 28, 2021
Merged

Gen4 Planner: AxB vs BxA #7274

merged 16 commits into from
Jan 28, 2021

Conversation

systay
Copy link
Collaborator

@systay systay commented Jan 9, 2021

This PR teaches the Gen4 planner how to re-evaluate routes after a predicate has been pushed during the join tree comparisons.

Better plans

I'll use an example to show what the planner does:

SELECT user.col 
FROM user JOIN user_extra ON user.id = user_extra.col

There is a primary vindex on user.id, but for user_extra, there are no applicable vindexes.
This means first of all that we can't merge the two tables into a single route - for certain user.id values, the matching user_extra.col might well be placed in a different shard.

So, we are forced to do the join on the vtgate - we can't delegate this to mysql. Now the question is: should we first query the user table and push down the user.id to the user_extra route, or the other way around?

Given the available vindexes, we have to evaluate user_extra using a scatter query no matter if we have user.id available or not. For the user table on the other hand, if we have user_extra.col available, we could use a SelectEqualUnique which is way better than a scatter query.

Implementation comments

Because the planner now takes the same predicate (user.id = user_extra.col) and rewrites it in two different ways depending on which route is on the LHS, I needed to make it possible to clone expressions. I have hand-written the clone method for expressions, but I think it makes sense to use the visitorgen to generate this code, and do it for all AST structs and not just Expr.

I also had to move the pushing of predicates to the joinTree phase instead of waiting until we have logical plans.

Planner performance

I was afraid that all this new logic that is being done during joinTree selection would lead to much slower planning, but the benchmarks don't look too scary. 😅

BenchmarkPlanner/from_cases.txt-v3-16       	  	    	    1198	    853119 ns/op	  393968 B/op	    7319 allocs/op
BenchmarkPlanner/from_cases.txt-v4-16         	    		    1436	    796068 ns/op	  366321 B/op	    7553 allocs/op
BenchmarkPlanner/from_cases.txt-v4left2right-16         	    1545	    747779 ns/op	  342363 B/op	    6999 allocs/op
BenchmarkPlanner/filter_cases.txt-v3-16                 	    1544	    763470 ns/op	  340849 B/op	    6820 allocs/op
BenchmarkPlanner/filter_cases.txt-v4-16                 	    1749	    656536 ns/op	  291623 B/op	    5779 allocs/op
BenchmarkPlanner/filter_cases.txt-v4left2right-16       	    1764	    658813 ns/op	  288869 B/op	    5705 allocs/op
BenchmarkPlanner/large_cases.txt-v3-16                  	    8616	    126044 ns/op	   54688 B/op	    1189 allocs/op
BenchmarkPlanner/large_cases.txt-v4-16                  	    6630	    163335 ns/op	   72650 B/op	    1564 allocs/op
BenchmarkPlanner/large_cases.txt-v4left2right-16        	   13299	     88614 ns/op	   41088 B/op	    1003 allocs/op
BenchmarkPlanner/aggr_cases.txt-v3-16                   	    2864	    433544 ns/op	  148570 B/op	    4068 allocs/op
BenchmarkPlanner/aggr_cases.txt-v4-16                   	    3622	    337122 ns/op	  110093 B/op	    3157 allocs/op
BenchmarkPlanner/aggr_cases.txt-v4left2right-16         	    3675	    336505 ns/op	  110174 B/op	    3157 allocs/op
BenchmarkPlanner/select_cases.txt-v3-16                 	    1092	    967935 ns/op	  328761 B/op	    8525 allocs/op
BenchmarkPlanner/select_cases.txt-v4-16                 	    1335	    911256 ns/op	  285991 B/op	    7859 allocs/op
BenchmarkPlanner/select_cases.txt-v4left2right-16       	    1269	    900011 ns/op	  279665 B/op	    7769 allocs/op
BenchmarkPlanner/union_cases.txt-v3-16                  	    1315	    948248 ns/op	  338814 B/op	    9025 allocs/op
BenchmarkPlanner/union_cases.txt-v4-16                  	    1221	    925400 ns/op	  338622 B/op	    9025 allocs/op
BenchmarkPlanner/union_cases.txt-v4left2right-16        	    1280	    899773 ns/op	  338773 B/op	    9025 allocs/op
BenchmarkPlanner/wireup_cases.txt-v3-16                 	   19597	     58601 ns/op	   24786 B/op	     586 allocs/op
BenchmarkPlanner/wireup_cases.txt-v4-16                 	   25574	     45954 ns/op	   18512 B/op	     470 allocs/op
BenchmarkPlanner/wireup_cases.txt-v4left2right-16       	   27835	     42890 ns/op	   15732 B/op	     429 allocs/op

@systay systay mentioned this pull request Jan 11, 2021
13 tasks
Signed-off-by: Andres Taylor <[email protected]>
@systay systay changed the title V4 Gen4 Planner: AxB vs BxA Jan 16, 2021
Signed-off-by: Andres Taylor <[email protected]>
@systay systay marked this pull request as ready for review January 16, 2021 13:57
When picking the vindex to use, we must also check that the value
it is reading from can be used by vtgate for this purpose - the
PlanValue must support the expression type.

Signed-off-by: Andres Taylor <[email protected]>
if node == nil {
return nil
}
panic(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be changed to "subquery clone not supported"

if node == nil {
return nil
}
panic(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

if node == nil {
return nil
}
panic(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

if node == nil {
return nil
}
panic(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

if node == nil {
return nil
}
panic(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

return nil
}

panic(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

"Sharded": true
},
"FieldQuery": "select id from music where 1 != 1",
"Query": "select id from music",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either we add the query to the plan with where clause that shows it will select 0 results or we do not add query to the plan to avoid confusion.

Comment on lines +1000 to +1033
"QueryType": "SELECT",
"Original": "select unsharded.id from user join unsharded where unsharded.id = user.id",
"Instructions": {
"OperatorType": "Join",
"Variant": "Join",
"JoinColumnIndexes": "-2",
"TableName": "unsharded_user",
"Inputs": [
{
"OperatorType": "Route",
"Variant": "SelectUnsharded",
"Keyspace": {
"Name": "main",
"Sharded": false
},
"FieldQuery": "select unsharded.id, unsharded.id from unsharded where 1 != 1",
"Query": "select unsharded.id, unsharded.id from unsharded",
"Table": "unsharded"
},
{
"OperatorType": "Route",
"Variant": "SelectEqualUnique",
"Keyspace": {
"Name": "user",
"Sharded": true
},
"FieldQuery": "select 1 from user where 1 != 1",
"Query": "select 1 from user where user.id = :unsharded_id",
"Table": "user",
"Values": [
":unsharded_id"
],
"Vindex": "user_index"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for this query the better plan would be select scatter followed by query on selectunsharded.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe. would you mind if we look deeper into this later, when a larger percentage of the gen4 planner is done, and we can get a better overview of the options available?

Copy link
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code looks good to be. I think there is need to document the logic for the important methods involved in the planning.

@@ -104,3 +104,23 @@ func TestMergeJoins(t *testing.T) {
})
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this file and I see the default planner is greedy. Is it intentional to miss the V4Left2Right planner test?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably add to it. it was not a conscious decision, no

go/vt/vtgate/planbuilder/route_planning.go Show resolved Hide resolved
Comment on lines 208 to 221
engine.SelectEqualUnique,
engine.SelectNext,
engine.SelectNone,
engine.SelectReference,
engine.SelectUnsharded:
return 1
case engine.SelectEqual:
return 5
case engine.SelectIN:
return 10
case engine.SelectMultiEqual:
return 10
case engine.SelectScatter:
return 20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a todo to revisit these cost.

Comment on lines 289 to 302
if newVindexFound {
rp.pickBestAvailableVindex()
}

// any predicates that cover more than a single table need to be added here
rp.predicates = append(rp.predicates, predicates...)

return nil
}

// pickBestAvailableVindex goes over the available vindexes for this route and picks the best one available.
func (rp *routePlan) pickBestAvailableVindex() {
//TODO (Manan,Andres): Improve cost metric for vindexes
for _, v := range vindexPreds {
for _, v := range rp.vindexPreds {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do not need to loop over on rp.vindexPreds.
On a given routePlan there will only we one vindex used so we can just check the cost of the best selected vindex with the current matched vindex.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea is that when we add a table to a routePlan, we also add all possible vindexes. later, when predicates are added, new vindexes can become available (in other words, covered as the struct field).

I tried removing vindexes that had already lost and did not need rechecking, but that didn't make any diff in the benchmarks, so I didn't keep that complexity in here

}

// pickBestAvailableVindex goes over the available vindexes for this route and picks the best one available.
func (rp *routePlan) pickBestAvailableVindex() {
//TODO (Manan,Andres): Improve cost metric for vindexes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this TODO to the place where cost metric is present :)

Signed-off-by: Andres Taylor <[email protected]>
Before optimisation:
BenchmarkPlanner/large_cases.txt-v4-16  70786    167749 ns/op   74315 B/op    1618 allocs/op
BenchmarkPlanner/large_cases.txt-v4-16  70998    168734 ns/op   74303 B/op    1618 allocs/op

After:
BenchmarkPlanner/large_cases.txt-v4-16  76375    156411 ns/op   67752 B/op    1376 allocs/op
BenchmarkPlanner/large_cases.txt-v4-16  76207    157553 ns/op   67816 B/op    1376 allocs/op

Signed-off-by: Andres Taylor <[email protected]>
Signed-off-by: Andres Taylor <[email protected]>
@harshit-gangal harshit-gangal merged commit 51f1c14 into vitessio:master Jan 28, 2021
@harshit-gangal harshit-gangal deleted the v4 branch January 28, 2021 11:30
@askdba askdba added the P3 label Feb 8, 2021
@askdba askdba added this to the v10.0 milestone Feb 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants