Optimizer does not propagate `col is not null` for join #8587

morgo · 2018-12-05T17:55:56Z

Bug Report

Please answer these questions before submitting your issue. Thanks!

What did you do?

(Edit: Updated description; it looks like the bug is constant propagation, not consideration of attached conditions.)

Similar problem to: https://bugs.mysql.com/bug.php?id=93491

An inner join on a.b_id = b.id semantically means that neither side can be NULL. The optimizer seems to understand this during execution:

mysql> EXPLAIN SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id;
+--------------------------+--------+------+---------------------------------------------------------+
| id                       | count  | task | operator info                                           |
+--------------------------+--------+------+---------------------------------------------------------+
| StreamAgg_11             | 1.00   | root | funcs:count(1)                                          |
| └─MergeJoin_30           | 258.00 | root | inner join, left key:test.a.b_id, right key:test.b.id   |
|   ├─IndexReader_23       | 258.00 | root | index:IndexScan_22                                      |
|   │ └─IndexScan_22       | 258.00 | cop  | table:a, index:b_id, range:[NULL,+inf], keep order:true |
|   └─TableReader_25       | 256.00 | root | data:TableScan_24                                       |
|     └─TableScan_24       | 256.00 | cop  | table:b, range:[-inf,+inf], keep order:true             |
+--------------------------+--------+------+---------------------------------------------------------+
6 rows in set (0.00 sec)

i.e. the range starting at -inf shows nulls are not valid.

What did you expect to see?

But it doesn't look like the b.id IS NOT NULL is propagated back to a.b_id is NOT NULL. If I do this, I get a much better plan and a change from MergeJoin to IndexJoin:

mysql> EXPLAIN SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id WHERE b_id IS NOT NULL;
+--------------------------+-------+------+------------------------------------------------------------------------------+
| id                       | count | task | operator info                                                                |
+--------------------------+-------+------+------------------------------------------------------------------------------+
| StreamAgg_12             | 1.00  | root | funcs:count(1)                                                               |
| └─IndexJoin_38           | 9.16  | root | inner join, inner:TableReader_37, outer key:test.a.b_id, inner key:test.b.id |
|   ├─IndexReader_49       | 9.16  | root | index:IndexScan_48                                                           |
|   │ └─IndexScan_48       | 9.16  | cop  | table:a, index:b_id, range:[-inf,+inf], keep order:false                     |
|   └─TableReader_37       | 1.00  | root | data:TableScan_36                                                            |
|     └─TableScan_36       | 1.00  | cop  | table:b, range: decided by [test.a.b_id], keep order:false                   |
+--------------------------+-------+------+------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

What did you see instead?

(Answered above.) Full test case:

DROP TABLE IF EXISTS a,b;

CREATE TABLE a (
 id INT NOT NULL PRIMARY KEY auto_increment,
 cola BLOB,
 b_id INT,
 INDEX (b_id)
);

CREATE TABLE b (
 id INT NOT NULL PRIMARY KEY auto_increment,
 cola BLOB 
);

INSERT INTO a VALUES (NULL, RANDOM_BYTES(1024), NULL);
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;
INSERT INTO a SELECT NULL, RANDOM_BYTES(1024), NULL FROM a;

INSERT INTO b SELECT NULL, RANDOM_BYTES(1024) FROM a;

SELECT COUNT(*) FROM a;
SELECT COUNT(*) FROM b;

INSERT INTO a VALUES (NULL, RANDOM_BYTES(1024), 1),(NULL, RANDOM_BYTES(1024), 1);

ANALYZE TABLE a;
ANALYZE TABLE b;

EXPLAIN SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id;
EXPLAIN SELECT COUNT(*) FROM a INNER JOIN b ON a.b_id = b.id WHERE b_id IS NOT NULL;

What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?

mysql> SELECT tidb_version()\G
*************************** 1. row ***************************
tidb_version(): Release Version: v2.1.0-rc.3-257-gc49eb6259-dirty
Git Commit Hash: c49eb62590554168a2de426771bc8e95cead6c27
Git Branch: analyze
UTC Build Time: 2018-12-05 05:44:27
GoVersion: go version go1.11 linux/amd64
Race Enabled: false
TiKV Min Version: 2.1.0-alpha.1-ff3dd160846b7d1aed9079c389fc188f7f5ea13e
Check Table Before Drop: false
1 row in set (0.00 sec)

The text was updated successfully, but these errors were encountered:

morgo · 2018-12-05T18:10:04Z

(I've updated the description. My test case was slightly wrong, and I misunderstood why it was not working. It looks like the bug is different from MySQL, and the issue is constant propagation. A = B and B IS NOT NULL, should infer A IS NOT NULL in an inner join).

zz-jason · 2018-12-06T02:32:17Z

a simple case to reproduce this issue:

TiDB(root@127.0.0.1:test) > create table t1(a bigint, b bigint);
Query OK, 0 rows affected (0.01 sec)

TiDB(root@127.0.0.1:test) > create table t2(a bigint, b bigint);
Query OK, 0 rows affected (0.02 sec)

TiDB(root@127.0.0.1:test) > desc select * from t1 inner join t2 on t1.a=t2.a and t1.a is not null;
+-------------------------+----------+------+--------------------------------------------------------------------+
| id                      | count    | task | operator info                                                      |
+-------------------------+----------+------+--------------------------------------------------------------------+
| HashRightJoin_7         | 12487.50 | root | inner join, inner:TableReader_10, equal:[eq(test.t1.a, test.t2.a)] |
| ├─TableReader_10        | 9990.00  | root | data:Selection_9                                                   |
| │ └─Selection_9         | 9990.00  | cop  | not(isnull(test.t1.a))                                             |
| │   └─TableScan_8       | 10000.00 | cop  | table:t1, range:[-inf,+inf], keep order:false, stats:pseudo        |
| └─TableReader_12        | 10000.00 | root | data:TableScan_11                                                  |
|   └─TableScan_11        | 10000.00 | cop  | table:t2, range:[-inf,+inf], keep order:false, stats:pseudo        |
+-------------------------+----------+------+--------------------------------------------------------------------+
6 rows in set (0.00 sec)

As you can see, the filter t1.a is not null is not propagated to t2.a based on the join equal condition t1.a=t2.a to produce a t2.a is not null filter.

eurekaka · 2018-12-06T02:50:56Z

I will take a look.

eurekaka · 2018-12-06T05:02:31Z

RCA: in order to not derive mistaken filter t2.a is null for condition t1.a = t2.a and t1.a is null, I marked isnull as unable to be propagated for column equal condition, and is not null is transformed to expression like not(isnull()) internally, so it is not propagated either.

We have 2 alternatives to solve this:

For inner join, we derive t1.a is not null and t2.a is not null for t1.a = t2.a unconditionally;
Push down Not first, then we check whether isnull is in Not context to decide propagate it or not;

Pros of first alternative: we can remove t1.a is not null from SQL because t1.a = t2.a implies that implicitly in inner join;
Cons of first alternative: we may derive many duplicate or useless is not null filters;

I prefer second alternative, because it applies to outer join also.

morgo · 2018-12-06T05:23:07Z

The first approach sounds more-similar to MySQL. In EXPLAIN FORMAT=JSON you can see something they called "attached conditions". From an inner join it will have one for is not null.

eurekaka · 2018-12-06T05:36:51Z

If we want to apply first approach to outer join, we need this adjustment: only derive is not null filter for inner table, and put it into join condition, not filter condition. Here is an example:

t1 left join t2 on t1.a = t2.a

logically equals to:

t1 left join t2 on t1.a = t2.a and t2.a is not null

Then only one problem left for first approach:

we may derive many duplicate or useless is not null filters;

For example:

t1 join t2 on t1.a = t2.a and t1.a is not null

or this star shape join:

select * from t1, t2, t3, t4, t5 where t1.a = t2.a and t1.a = t3.a and t1.a = t4.a and t1.a = t5.a

it is because we don't have general expression simplification or de-duplication now.

eurekaka · 2018-12-06T06:14:38Z

Seems https://github.com/pingcap/tidb/blob/master/expression/constraint_propagation.go can be used for de-duplication, I will give it a try and see whether it works.

winoros · 2018-12-06T07:12:58Z

In Spark, each plan has a member variable named constraint and they have a rule InferFiltersFromConstraints. They use the constraint to detect whether the newly created filter already exists. 😂

zz-jason · 2018-12-06T07:15:14Z

In Spark, each plan has a member variable named constraint and they have a rule InferFiltersFromConstraints. They use the constraint to detect whether the newly created filter already exists. 😂

Sounds interesting, could you post the code link here so we can have a deeper look and discussing?

morgo added type/bug The issue is confirmed as a bug. sig/planner SIG: Planner type/performance and removed type/bug The issue is confirmed as a bug. labels Dec 5, 2018

morgo mentioned this issue Dec 5, 2018

Optimizer selects merge join over nested loop join incorrectly #8563

Closed

morgo changed the title ~~Optimizer does not correctly consider attached conditions in planning~~ Optimizer does not correctly apply constant propagation in INNER JOIN Dec 5, 2018

eurekaka self-assigned this Dec 6, 2018

eurekaka mentioned this issue Dec 6, 2018

*: derive col is not null from col op col conditions #8603

Merged

eurekaka changed the title ~~Optimizer does not correctly apply constant propagation in INNER JOIN~~ Optimizer does not propagate col is not null for join Dec 6, 2018

eurekaka closed this as completed in #8603 Jan 30, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimizer does not propagate `col is not null` for join #8587

Optimizer does not propagate `col is not null` for join #8587

morgo commented Dec 5, 2018 •

edited

Loading

morgo commented Dec 5, 2018 •

edited

Loading

zz-jason commented Dec 6, 2018

eurekaka commented Dec 6, 2018

eurekaka commented Dec 6, 2018

morgo commented Dec 6, 2018 •

edited

Loading

eurekaka commented Dec 6, 2018

eurekaka commented Dec 6, 2018

winoros commented Dec 6, 2018

zz-jason commented Dec 6, 2018

Optimizer does not propagate col is not null for join #8587

Optimizer does not propagate col is not null for join #8587

Comments

morgo commented Dec 5, 2018 • edited Loading

Bug Report

morgo commented Dec 5, 2018 • edited Loading

zz-jason commented Dec 6, 2018

eurekaka commented Dec 6, 2018

eurekaka commented Dec 6, 2018

morgo commented Dec 6, 2018 • edited Loading

eurekaka commented Dec 6, 2018

eurekaka commented Dec 6, 2018

winoros commented Dec 6, 2018

zz-jason commented Dec 6, 2018

Optimizer does not propagate `col is not null` for join #8587

Optimizer does not propagate `col is not null` for join #8587

morgo commented Dec 5, 2018 •

edited

Loading

morgo commented Dec 5, 2018 •

edited

Loading

morgo commented Dec 6, 2018 •

edited

Loading