[fix](Nereids) fill miss slot in having subquery #27177

Merged: 3 commits, Nov 21, 2023


@keanji-x keanji-x commented Nov 17, 2023

Proposed changes

Fill in the missing slot in a HAVING subquery. For example, in

select * from t group by k having max(k) in (select k from t2)

max(k) appears only in the HAVING predicate, so it should be pushed down into the aggregate.
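
The idea in the description above can be sketched with a toy plan model. This is only an illustration, not the Nereids implementation: `Plan`, `is_agg_call`, `fill_missing_slot`, and the `AGG_FUNCS` set are hypothetical names invented for this sketch, and expressions are plain strings rather than real expression trees.

```python
from dataclasses import dataclass

# Assumed set of aggregate function names, for illustration only.
AGG_FUNCS = ("max", "min", "sum", "count", "avg")


@dataclass(frozen=True)
class Plan:
    agg_outputs: tuple  # expressions computed by the aggregate node
    having_lhs: str     # left side of `<lhs> IN (subquery)` in HAVING
    project: tuple      # columns finally returned to the user


def is_agg_call(expr: str) -> bool:
    """Return True if expr looks like a call to a known aggregate function."""
    name, sep, _ = expr.partition("(")
    return bool(sep) and name.strip().lower() in AGG_FUNCS


def fill_missing_slot(plan: Plan) -> Plan:
    """Push an aggregate call used only in HAVING down into the aggregate."""
    lhs = plan.having_lhs
    if not is_agg_call(lhs) or lhs in plan.agg_outputs:
        return plan  # nothing is missing
    # The aggregate now computes the call; the projection hides the extra column.
    return Plan(plan.agg_outputs + (lhs,), lhs, plan.project)


# select * from t group by k having max(k) in (select k from t2)
before = Plan(agg_outputs=("k",), having_lhs="max(k)", project=("k",))
after = fill_missing_slot(before)
print(after.agg_outputs)  # ('k', 'max(k)')
print(after.project)      # ('k',)
```

In this sketch the aggregate gains `max(k)` as an extra output so the HAVING predicate has a slot to refer to, while the user-visible projection stays `k`.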


@keanji-x changed the title from "[fix](Nereids) fill miss in subquery slots" to "[fix](Nereids) fill miss slot in having subquery slots" on Nov 17, 2023
@keanji-x
Contributor Author

run buildall

@keanji-x changed the title from "[fix](Nereids) fill miss slot in having subquery slots" to "[fix](Nereids) fill miss slot in having subquery" on Nov 17, 2023
@morrySnow
Contributor

add a case in the description

@wm1581066 added the usercase (Important user case type label) and dev/2.0.3 labels on Nov 17, 2023
@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit e4dac84bc8b02cc45359eaf8754adedd4dc94ce1, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4940	4699	4688	4688
q2	366	147	158	147
q3	2060	1967	1971	1967
q4	1404	1276	1257	1257
q5	4013	3979	4044	3979
q6	252	131	133	131
q7	1433	875	893	875
q8	2780	2786	2796	2786
q9	9783	9784	9689	9689
q10	3476	3557	3536	3536
q11	379	241	236	236
q12	439	292	299	292
q13	4631	3826	3800	3800
q14	321	289	292	289
q15	582	542	519	519
q16	662	587	581	581
q17	1133	990	945	945
q18	7870	7326	7489	7326
q19	1680	1695	1659	1659
q20	562	304	294	294
q21	4391	3998	4012	3998
q22	485	384	364	364
Total cold run time: 53642 ms
Total hot run time: 49358 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4644	4593	4576	4576
q2	345	262	261	261
q3	4032	4021	4011	4011
q4	2713	2710	2728	2710
q5	9756	9734	9713	9713
q6	245	121	125	121
q7	2615	2283	2323	2283
q8	4458	4461	4463	4461
q9	13300	13244	13258	13244
q10	4098	4188	4181	4181
q11	793	662	661	661
q12	982	827	823	823
q13	4288	3655	3590	3590
q14	371	353	367	353
q15	568	521	522	521
q16	739	662	662	662
q17	3821	3857	3858	3857
q18	9613	9101	9126	9101
q19	1860	1789	1798	1789
q20	2408	2068	2053	2053
q21	8976	8811	8747	8747
q22	861	772	869	772
Total cold run time: 81486 ms
Total hot run time: 78490 ms
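
The report does not label its columns. Reading them as query, cold run, two hot runs, and the better of the two hot runs is an assumption. It is consistent with the printed totals: in the first table of this comment the first numeric column sums to 53642 and the last to 49358. A minimal sketch of that arithmetic, using three rows copied from that table (`summarize` is a hypothetical helper, not part of the bot):

```python
# Three rows copied from the first table: query, cold, hot 1, hot 2, best hot (ms).
ROWS = """\
q2	366	147	158	147
q6	252	131	133	131
q11	379	241	236	236
"""


def summarize(text: str) -> tuple:
    """Return (cold total, hot total) in ms for whitespace-separated rows."""
    cold_total = hot_total = 0
    for line in text.splitlines():
        _query, cold, hot1, hot2, best = line.split()
        # Under the assumed reading, the last column is the faster hot run.
        assert int(best) == min(int(hot1), int(hot2))
        cold_total += int(cold)
        hot_total += int(best)
    return cold_total, hot_total


print(summarize(ROWS))  # (997, 514)
```

Feeding all 22 rows of a table through the same function should reproduce its "Total cold run time" and "Total hot run time" lines if the assumed column reading holds.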

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.76 seconds
stream load tsv: 567 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17099045886 Bytes
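
The "about N MB/s" figures above appear to be bytes divided by seconds, expressed in units of 1024 × 1024 bytes and truncated to a whole number. That reading is an inference from the numbers, not something the bot documents. A small sketch that reproduces the figures in this comment (`mib_per_s` and `kilo_rows_per_s` are hypothetical helper names):

```python
MIB = 1024 * 1024


def mib_per_s(num_bytes: int, seconds: float) -> int:
    """Throughput in whole MiB/s, truncated rather than rounded."""
    return int(num_bytes / seconds / MIB)


def kilo_rows_per_s(rows: int, seconds: float) -> int:
    """Insert rate in thousands of rows per second, truncated."""
    return int(rows / seconds / 1000)


print(mib_per_s(74807831229, 567))        # tsv     -> 125
print(mib_per_s(2358488459, 18))          # json    -> 124
print(mib_per_s(1101869774, 65))          # orc     -> 16
print(mib_per_s(861443392, 32))           # parquet -> 25
print(kilo_rows_per_s(10_000_000, 28.7))  # insert into select -> 348
```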

@keanji-x force-pushed the fill_miss_apply_slots branch 3 times, most recently from c0b88aa to 2bdedeb, on November 17, 2023 09:01
@keanji-x
Contributor Author

run buildall

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.41 seconds
stream load tsv: 590 seconds loaded 74807831229 Bytes, about 120 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 34 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 29.3 seconds inserted 10000000 Rows, about 341K ops/s
storage size: 17103595647 Bytes

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 2bdedebe057b9bdbefa9c444aac3d6587a6d6855, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4978	4736	4678	4678
q2	367	149	158	149
q3	2080	1965	1963	1963
q4	1421	1292	1262	1262
q5	3985	3966	4039	3966
q6	252	130	132	130
q7	1412	885	895	885
q8	2780	2806	2787	2787
q9	9708	9635	9508	9508
q10	3483	3551	3513	3513
q11	383	250	244	244
q12	437	293	300	293
q13	4611	3794	3806	3794
q14	324	296	293	293
q15	580	528	528	528
q16	660	590	586	586
q17	1138	954	929	929
q18	7906	7493	7443	7443
q19	1675	1685	1684	1684
q20	545	311	307	307
q21	4453	4030	4068	4030
q22	480	382	387	382
Total cold run time: 53658 ms
Total hot run time: 49354 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4616	4597	4590	4590
q2	341	218	263	218
q3	4013	4030	4015	4015
q4	2733	2725	2707	2707
q5	9608	9572	9530	9530
q6	254	124	124	124
q7	3026	2501	2477	2477
q8	4412	4443	4402	4402
q9	12978	12845	12894	12845
q10	4062	4154	4144	4144
q11	772	651	683	651
q12	980	823	818	818
q13	4303	3568	3582	3568
q14	387	350	347	347
q15	576	516	523	516
q16	730	653	671	653
q17	3794	3822	3915	3822
q18	9589	9155	9221	9155
q19	1821	1804	1792	1792
q20	2423	2058	2064	2058
q21	9002	8591	8701	8591
q22	925	832	783	783
Total cold run time: 81345 ms
Total hot run time: 77806 ms

@keanji-x force-pushed the fill_miss_apply_slots branch from 2bdedeb to c83f6bb on November 20, 2023 02:21
@keanji-x force-pushed the fill_miss_apply_slots branch from c83f6bb to 55e235a on November 20, 2023 02:24
@keanji-x
Contributor Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 55e235a1802909e7a4e8de5a8691c6b1bc00b19e, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4885	4651	4708	4651
q2	365	162	163	162
q3	2054	1940	1897	1897
q4	1392	1283	1256	1256
q5	3943	3915	3930	3915
q6	260	144	128	128
q7	1396	891	891	891
q8	2746	2761	2761	2761
q9	9619	9446	9487	9446
q10	3425	3515	3484	3484
q11	377	243	238	238
q12	440	289	294	289
q13	4540	3837	3829	3829
q14	321	299	282	282
q15	583	528	518	518
q16	662	593	582	582
q17	1121	973	894	894
q18	7817	7428	7508	7428
q19	1682	1671	1681	1671
q20	537	323	296	296
q21	4404	4001	4003	4001
q22	473	377	386	377
Total cold run time: 53042 ms
Total hot run time: 48996 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4587	4567	4567	4567
q2	332	227	276	227
q3	4005	4018	3997	3997
q4	2725	2719	2704	2704
q5	9753	9653	9637	9637
q6	250	123	122	122
q7	3052	2459	2477	2459
q8	4445	4405	4465	4405
q9	12973	12872	12890	12872
q10	4068	4162	4144	4144
q11	833	724	654	654
q12	970	802	801	801
q13	4270	3564	3558	3558
q14	376	352	348	348
q15	561	524	517	517
q16	736	675	662	662
q17	3932	3846	3890	3846
q18	9636	9123	9194	9123
q19	1826	1776	1785	1776
q20	2405	2087	2064	2064
q21	8837	8565	8617	8565
q22	966	818	795	795
Total cold run time: 81538 ms
Total hot run time: 77843 ms

@doris-robot

(From new machine) TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.41 seconds
stream load tsv: 566 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17099283948 Bytes

@github-actions bot added the approved label (Indicates a PR has been approved by one committer.) on Nov 20, 2023

PR approved by at least one committer and no changes requested.


PR approved by anyone and no changes requested.

@jackwener jackwener merged commit 4454842 into apache:master Nov 21, 2023
keanji-x added a commit to keanji-x/doris that referenced this pull request Nov 22, 2023
Fill in the missing slot in a HAVING subquery.

For example, in
```
select * from t group by k having max(k) in (select k from t2)
```

max(k) should be pushed down into the aggregate.
keanji-x added a commit to keanji-x/doris that referenced this pull request Nov 23, 2023
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Nov 27, 2023
eldenmoon added a commit that referenced this pull request Nov 27, 2023
* [fix](stats) Fix update rows for unique table didn't get updated properly #26968 (#27337)

* [FIX](jsonb) fix jsonb in predict column #27325 (#27424)

* [fix](fe) slots in having clause should be set to need materialized(#27412) (#27429)

* [Bug](insert)fix insert wrong data on mv when stmt have multiple values (#27297) (#27382)

fix insert wrong data on mv when stmt have multiple values

* [fix](fe ut) Fix OlapQueryCacheTest failed (#27305) (#27406)

1.
```
java.lang.NullPointerException: null
        at org.apache.doris.catalog.Env.getCurrentSystemInfo(Env.java:793) ~[classes/:?]
        at org.apache.doris.qe.SimpleScheduler$UpdateBlacklistThread.run(SimpleScheduler.java:206) ~[classes/:?]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_382]

java.lang.NullPointerException
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:226)
```

2.
```
[ERROR] testSqlCacheKeyWithNestedViewForNereids  Time elapsed: 1.962 s  <<< FAILURE!
java.lang.AssertionError: SELECT command denied to user 'testCluster:testUser'@'192.168.1.1' for table 'internal: testCluster:testDb: appevent'
	at org.apache.doris.qe.OlapQueryCacheTest.parseSqlByNereids(OlapQueryCacheTest.java:579)
	at org.apache.doris.qe.OlapQueryCacheTest.testSqlCacheKeyWithNestedViewForNereids(OlapQueryCacheTest.java:1338)
```

3.
```
[ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 113.63 s <<< FAILURE! - in org.apache.doris.qe.OlapQueryCacheTest
[ERROR] testCacheModeTable  Time elapsed: 1.657 s  <<< ERROR!
java.lang.IllegalArgumentException: Value of type org.apache.doris.qe.QueryState incompatible with return type org.apache.doris.system.SystemInfoService of org.apache.doris.catalog.Env#getCurrentSystemInfo()
        at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:156)
```

* [regression test](schema change) add some schema change regression cases (#27112) (#27418)

* [fix](Nereids) result type of add precision is 1 more than expected (#27136) (#27426)

* [fix](Nereids): fill miss slot in having subquery (#27177) (#27394)

* [fix](memory) Fix make_top_consumption_snapshots heap-use-after-free #27434 (#27465)

* [fix](function) make TIMESTAMP function DEPEND_ON_ARGUMENT (#27343) (#27458)

* [fix](test) order by clause in test_map(#27390) (#27391)

pick #27390

* [performance](Planner): optimize getStringValue() in DateLiteral (#27363) (#27470)

- reduce cost of `getStringValue()`
- the original code doesn't consider the `microsecond` part in `getStringValue()`

(cherry picked from commit 044a295)

* [Chore](pick) do not push down agg on aggregate column (#27356) (#27498)

* [fix](stats) table not exists error msg not print objects name #27074 (#27463)

* [improve](nereids) support agg function of count(const value) pushdown #26677 (#27499)

Support SQL such as select count(1)-count(not null) from table, where the count aggregates can be pushed down.

* [test](fe-ut) fix unstable MysqlServerTest (#27459)

Need to find an unbound port for MysqlServerTest

* [opt](MergedIO) no need to merge large columns (#27315) (#27497)

1. Fix a profile bug of `MergeRangeFileReader`, and add a profile `ApplyBytes` to show the total bytes  of ranges.
2. There's no need to merge large columns, because `MergeRangeFileReader` will increase the copy time.

* [improvement](drop tablet)  impr gc shutdown tablet lock (#26151) (#27478)

* [doc](stats) SQL manual for stats (#27461)

* [chore](merge-on-write) disable rowid conversion check for mow table by default (#27482) (#27508)

* [fix](regression)Fix hive p2 case (#27466) (#27511)

* [fix](statistics)Fix auto analyze remove finished job bug #27486 (#27510)

* [Bug](bitmap) Fix heap-use-after-free in the bitmap functions #27411 (#27521)

* [Pick](nereids) Pick: partition prune fails in case of NOT expression (#27047) (#27507)

* [fix](clone) Fix engine_clone file exist (#27361) (#27536)

* [chore](case) adjust timeout of broker load case #27540

* Fix auto analyze doesn't filter unsupported type bug. (#27547)

Fix auto analyze doesn't filter unsupported type bug.
Catch throwable in auto analyze thread for each database, otherwise the thread will quit when one database failed to create jobs and all other databases will not get analyzed.
change FE config item full_auto_analyze_simultaneously_running_task_num to auto_analyze_simultaneously_running_task_num
backport #27559

* [chore](fe plugin) Upgrade dependency to doris 2.0-SNAPSHOT #27522 (#27558)

* [Bug](materialized-view) add limitation for duplicate expr on materialized view (#27523) (#27562)

* [fix](planner)join node should output required slot from parent node #27526 (#27551)

* [branch-2.0](hive) enable hive view by default (#27550)

* [pick](nereids) adjust bc join and shuffle join #27113 (#27566)

* [Fix](hive-transactional-table) Fix NPE when query empty hive transactional table. (#27567)

---------

Co-authored-by: AKIRA <[email protected]>
Co-authored-by: amory <[email protected]>
Co-authored-by: Jerry Hu <[email protected]>
Co-authored-by: Pxl <[email protected]>
Co-authored-by: Xinyi Zou <[email protected]>
Co-authored-by: Luwei <[email protected]>
Co-authored-by: morrySnow <[email protected]>
Co-authored-by: 谢健 <[email protected]>
Co-authored-by: Mryange <[email protected]>
Co-authored-by: jakevin <[email protected]>
Co-authored-by: zhangstar333 <[email protected]>
Co-authored-by: Mingyu Chen <[email protected]>
Co-authored-by: Ashin Gau <[email protected]>
Co-authored-by: yujun <[email protected]>
Co-authored-by: Xin Liao <[email protected]>
Co-authored-by: Jibing-Li <[email protected]>
Co-authored-by: xy720 <[email protected]>
Co-authored-by: minghong <[email protected]>
Co-authored-by: Jack Drogon <[email protected]>
Co-authored-by: Dongyang Li <[email protected]>
Co-authored-by: zhiqiang <[email protected]>
Co-authored-by: starocean999 <[email protected]>
Co-authored-by: Qi Chen <[email protected]>
seawinde pushed a commit to seawinde/doris that referenced this pull request Nov 28, 2023
gnehil pushed a commit to gnehil/doris that referenced this pull request Dec 4, 2023
@xiaokang xiaokang mentioned this pull request Dec 4, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
Labels
approved (Indicates a PR has been approved by one committer.), dev/2.0.3-merged, reviewed, usercase (Important user case type label)
7 participants