[fix](statistics)Fix auto analyze bugs. #27559

Jibing-Li · 2023-11-24T11:31:54Z

Fix a bug where auto analyze did not filter out unsupported types.
Catch Throwable for each database in the auto analyze thread. Otherwise the thread quits as soon as one database fails to create jobs, and none of the remaining databases get analyzed.
Rename the FE config item full_auto_analyze_simultaneously_running_task_num to auto_analyze_simultaneously_running_task_num.
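
The first two fixes can be illustrated with a minimal sketch. It is not the code from StatisticsAutoCollector.java: the class, the method names, the `JobCreator` interface, and the contents of `UNSUPPORTED_TYPES` are all hypothetical placeholders chosen for illustration. It only shows the two ideas from the description: skip columns whose type is not supported, and catch Throwable inside the per-database loop so that one failing database does not end the round.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names are taken from the Doris source.
class AutoAnalyzeSketch {
    // Placeholder set of types, assumed for illustration only.
    static final Set<String> UNSUPPORTED_TYPES = Set.of("JSONB", "MAP", "STRUCT");

    /** Returns the column names whose type is not in the unsupported set. */
    static List<String> analyzableColumns(Map<String, String> columnTypes) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : columnTypes.entrySet()) {
            if (!UNSUPPORTED_TYPES.contains(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    /** Hypothetical per-database job factory; it may throw for a broken database. */
    interface JobCreator {
        void createJobs(String dbName) throws Exception;
    }

    /** Runs one round over all databases and returns those that were processed. */
    static List<String> analyzeAll(List<String> dbNames, JobCreator creator) {
        List<String> succeeded = new ArrayList<>();
        for (String db : dbNames) {
            try {
                creator.createJobs(db);
                succeeded.add(db);
            } catch (Throwable t) {
                // Catching inside the loop lets the remaining databases proceed.
                System.err.println("Failed to create analyze jobs for " + db + ": " + t);
            }
        }
        return succeeded;
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "INT");
        columns.put("payload", "JSONB");
        columns.put("name", "VARCHAR");
        System.out.println(analyzableColumns(columns));

        List<String> ok = analyzeAll(List.of("db1", "db2", "db3"), db -> {
            if (db.equals("db2")) {
                throw new IllegalStateException("simulated failure");
            }
        });
        System.out.println(ok);
    }
}
```

With the try/catch placed outside the loop instead, the simulated failure in db2 would also prevent db3 from being visited, which is the behavior the description reports.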

github-actions · 2023-11-24T12:05:18Z

PR approved by anyone and no changes requested.

Backport of #27559: fix auto analyze not filtering out unsupported types; catch Throwable for each database in the auto analyze thread so that one database failing to create jobs does not stop the remaining databases from being analyzed; rename the FE config item full_auto_analyze_simultaneously_running_task_num to auto_analyze_simultaneously_running_task_num.

Jibing-Li · 2023-11-24T13:29:21Z

run buildall

doris-robot · 2023-11-24T14:06:27Z

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.48 seconds
stream load tsv: 567 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 27 seconds loaded 2358488459 Bytes, about 83 MB/s
stream load orc: 71 seconds loaded 1101869774 Bytes, about 14 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.4 seconds inserted 10000000 Rows, about 352K ops/s
storage size: 17101579227 Bytes
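
A note on the units: the reported "MB/s" figures are consistent with bytes / seconds / 2^20, rounded down, that is, MiB/s. This is an inference from the numbers above, not a statement about how the robot actually computes them. The sketch below (class and method names are hypothetical) reproduces the four stream load figures under that assumption.

```java
// Hypothetical helper; only the byte and second counts come from the report above.
class ThroughputSketch {
    /** Throughput in MiB/s (1 MiB = 1,048,576 bytes), rounded down. */
    static long mibPerSecond(long bytes, long seconds) {
        return bytes / seconds / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("tsv:     " + mibPerSecond(74807831229L, 567));
        System.out.println("json:    " + mibPerSecond(2358488459L, 27));
        System.out.println("orc:     " + mibPerSecond(1101869774L, 71));
        System.out.println("parquet: " + mibPerSecond(861443392L, 32));
    }
}
```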

doris-robot · 2023-11-24T14:15:50Z

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit b3dc97b7a71cf8a95cd046f1046b057ce54cdc1b, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4911	4650	4632	4632
q2	362	160	169	160
q3	2041	1892	1871	1871
q4	1401	1258	1273	1258
q5	3967	3936	4017	3936
q6	246	128	130	128
q7	1449	887	876	876
q8	2776	2803	2775	2775
q9	9946	9568	9458	9458
q10	3466	3508	3500	3500
q11	383	247	251	247
q12	452	282	290	282
q13	4583	3796	3834	3796
q14	325	290	292	290
q15	592	531	525	525
q16	666	579	579	579
q17	1131	969	945	945
q18	8001	7517	7552	7517
q19	1684	1724	1662	1662
q20	580	320	306	306
q21	4441	3980	4085	3980
q22	479	369	387	369
Total cold run time: 53882 ms
Total hot run time: 49092 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4596	4577	4617	4577
q2	356	222	277	222
q3	4058	4044	4035	4035
q4	2722	2713	2728	2713
q5	9813	9793	9804	9793
q6	245	123	123	123
q7	3036	2480	2460	2460
q8	4441	4427	4441	4427
q9	12915	12859	12776	12776
q10	4063	4147	4167	4147
q11	822	701	714	701
q12	985	808	800	800
q13	4294	3568	3531	3531
q14	379	352	336	336
q15	566	525	526	525
q16	735	672	675	672
q17	3881	3880	3846	3846
q18	9714	9049	9090	9049
q19	1805	1778	1813	1778
q20	2392	2058	2057	2057
q21	8899	8765	8716	8716
q22	882	797	802	797
Total cold run time: 81599 ms
Total hot run time: 78081 ms

fe/fe-core/src/main/java/org/apache/doris/statistics/StatisticsAutoCollector.java

morningman · 2023-11-24T15:10:42Z

run buildall

morningman

LGTM

github-actions · 2023-11-24T15:15:59Z

PR approved by at least one committer and no changes requested.

doris-robot · 2023-11-24T17:16:29Z

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.91 seconds
stream load tsv: 568 seconds loaded 74807831229 Bytes, about 125 MB/s
stream load json: 27 seconds loaded 2358488459 Bytes, about 83 MB/s
stream load orc: 71 seconds loaded 1101869774 Bytes, about 14 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17099285694 Bytes

doris-robot · 2023-11-24T17:19:12Z

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Tpch sf100 test result on commit e96698855497b7f52a9cf04607b320f697abbc7d, data reload: false

run tpch-sf100 query with default conf and session variables
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5004	4745	4649	4649
q2	362	155	140	140
q3	2054	1850	1853	1850
q4	1392	1257	1261	1257
q5	3940	3964	4037	3964
q6	251	137	132	132
q7	1432	903	874	874
q8	2777	2791	2791	2791
q9	9691	10727	9529	9529
q10	3470	3509	3529	3509
q11	369	246	240	240
q12	441	290	295	290
q13	4615	3860	3805	3805
q14	324	294	302	294
q15	598	545	524	524
q16	659	590	594	590
q17	1133	989	949	949
q18	7807	7362	7342	7342
q19	1712	1684	1673	1673
q20	537	334	286	286
q21	4407	3927	3969	3927
q22	473	371	375	371
Total cold run time: 53448 ms
Total hot run time: 48986 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4634	4598	4618	4598
q2	348	238	267	238
q3	4001	4015	3968	3968
q4	2697	2693	2690	2690
q5	9603	9597	9648	9597
q6	248	123	122	122
q7	3009	2513	2477	2477
q8	4403	4474	4457	4457
q9	12953	12826	12831	12826
q10	4066	4174	4160	4160
q11	803	630	655	630
q12	983	833	838	833
q13	4296	3580	3568	3568
q14	371	370	345	345
q15	577	519	528	519
q16	732	664	679	664
q17	3830	3934	3834	3834
q18	9358	8797	8806	8797
q19	1864	1789	1779	1779
q20	2400	2069	2071	2069
q21	8834	8428	8456	8428
q22	894	822	797	797
Total cold run time: 80904 ms
Total hot run time: 77396 ms

* [fix](stats) Fix update rows for unique table didn't get updated properly #26968 (#27337)

* [FIX](jsonb) fix jsonb in predict column #27325 (#27424)

* [fix](fe) slots in having clause should be set to need materialized(#27412) (#27429)

* [Bug](insert)fix insert wrong data on mv when stmt have multiple values (#27297) (#27382)

  fix insert wrong data on mv when stmt have multiple values

* [fix](fe ut) Fix OlapQueryCacheTest failed (#27305) (#27406)

  1.
  ```
  java.lang.NullPointerException: null
      at org.apache.doris.catalog.Env.getCurrentSystemInfo(Env.java:793) ~[classes/:?]
      at org.apache.doris.qe.SimpleScheduler$UpdateBlacklistThread.run(SimpleScheduler.java:206) ~[classes/:?]
      at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_382]
  java.lang.NullPointerException
      at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:226)
  ```
  2.
  ```
  [ERROR] testSqlCacheKeyWithNestedViewForNereids Time elapsed: 1.962 s <<< FAILURE!
  java.lang.AssertionError: SELECT command denied to user 'testCluster:testUser'@'192.168.1.1' for table 'internal: testCluster:testDb: appevent'
      at org.apache.doris.qe.OlapQueryCacheTest.parseSqlByNereids(OlapQueryCacheTest.java:579)
      at org.apache.doris.qe.OlapQueryCacheTest.testSqlCacheKeyWithNestedViewForNereids(OlapQueryCacheTest.java:1338)
  ```
  3.
  ```
  [ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 113.63 s <<< FAILURE! - in org.apache.doris.qe.OlapQueryCacheTest
  [ERROR] testCacheModeTable Time elapsed: 1.657 s <<< ERROR!
  java.lang.IllegalArgumentException: Value of type org.apache.doris.qe.QueryState incompatible with return type org.apache.doris.system.SystemInfoService of org.apache.doris.catalog.Env#getCurrentSystemInfo()
      at org.apache.doris.qe.OlapQueryCacheTest.setUp(OlapQueryCacheTest.java:156)
  ```

* [regression test](schema change) add some schema change regression cases (#27112) (#27418)

* [fix](Nereids) result type of add precision is 1 more than expected (#27136) (#27426)

* [fix](Nereids): fill miss slot in having subquery (#27177) (#27394)

* [fix](memory) Fix make_top_consumption_snapshots heap-use-after-free #27434 (#27465)

* [fix](function) make TIMESTAMP function DEPEND_ON_ARGUMENT (#27343) (#27458)

* [fix](test) order by clause in test_map(#27390) (#27391)

  pick #27390

* [performance](Planner): optimize getStringValue() in DateLiteral (#27363) (#27470)

  - reduce cost of `getStringValue()`
  - original code don't consider `microsecond` part in `getStringValue()`

  (cherry picked from commit 044a295)

* [Chore](pick) do not push down agg on aggregate column (#27356) (#27498)

* [fix](stats) table not exists error msg not print objects name #27074 (#27463)

* [improve](nereids) support agg function of count(const value) pushdown #26677 (#27499)

  support sql: select count(1)-count(not null) from table, the agg of count could push down.

* [test](fe-ut) fix unstable MysqlServerTest (#27459)

  Need to find a unbind port for MysqlServerTest

* [opt](MergedIO) no need to merge large columns (#27315) (#27497)

  1. Fix a profile bug of `MergeRangeFileReader`, and add a profile `ApplyBytes` to show the total bytes of ranges.
  2. There's no need to merge large columns, because `MergeRangeFileReader` will increase the copy time.

* [improvement](drop tablet) impr gc shutdown tablet lock (#26151) (#27478)

* [doc](stats) SQL manual for stats (#27461)

* [chore](merge-on-write) disable rowid conversion check for mow table by default (#27482) (#27508)

* [fix](regression)Fix hive p2 case (#27466) (#27511)

* [fix](statistics)Fix auto analyze remove finished job bug #27486 (#27510)

* [Bug](bitmap) Fix heap-use-after-free in the bitmap functions #27411 (#27521)

* [Pick](nereids) Pick: partition prune fails in case of NOT expression (#27047) (#27507)

* [fix](clone) Fix engine_clone file exist (#27361) (#27536)

* [chore](case) adjust timeout of broker load case #27540

* Fix auto analyze doesn't filter unsupported type bug. (#27547)

  Fix auto analyze doesn't filter unsupported type bug.
  Catch throwable in auto analyze thread for each database, otherwise the thread will quit when one database failed to create jobs and all other databases will not get analyzed.
  change FE config item full_auto_analyze_simultaneously_running_task_num to auto_analyze_simultaneously_running_task_num

  backport #27559

* [chore](fe plugin) Upgrade dependency to doris 2.0-SNAPSHOT #27522 (#27558)

* [Bug](materialized-view) add limitation for duplicate expr on materialized view (#27523) (#27562)

* [fix](planner)join node should output required slot from parent node #27526 (#27551)

* [branch-2.0](hive) enable hive view by default (#27550)

* [pick](nereids) adjust bc join and shuffle join #27113 (#27566)

* [Fix](hive-transactional-table) Fix NPE when query empty hive transactional table. (#27567)

---------

Co-authored-by: AKIRA
Co-authored-by: amory
Co-authored-by: Jerry Hu
Co-authored-by: Pxl
Co-authored-by: Xinyi Zou
Co-authored-by: Luwei
Co-authored-by: morrySnow
Co-authored-by: 谢健
Co-authored-by: Mryange
Co-authored-by: jakevin
Co-authored-by: zhangstar333
Co-authored-by: Mingyu Chen
Co-authored-by: Ashin Gau
Co-authored-by: yujun
Co-authored-by: Xin Liao
Co-authored-by: Jibing-Li
Co-authored-by: xy720
Co-authored-by: minghong
Co-authored-by: Jack Drogon
Co-authored-by: Dongyang Li
Co-authored-by: zhiqiang
Co-authored-by: starocean999
Co-authored-by: Qi Chen

Jibing-Li mentioned this pull request Nov 24, 2023

[fix](statistics)Fix auto analyze bugs. #27547

Merged

Jibing-Li marked this pull request as ready for review November 24, 2023 11:32

Kikyou1997 approved these changes Nov 24, 2023

github-actions bot added the reviewed label Nov 24, 2023

xiaokang added the dev/2.0.3-merged label Nov 24, 2023

morningman reviewed Nov 24, 2023

fe/fe-core/src/main/java/org/apache/doris/statistics/StatisticsAutoCollector.java (outdated, resolved)

Fix auto analyze doesn't filter unsupported type bug.

e966988

Jibing-Li force-pushed the fixauto branch from b3dc97b to e966988 Compare November 24, 2023 15:08

morningman approved these changes Nov 24, 2023

github-actions bot added the approved label Nov 24, 2023

yiguolei merged commit 6b1428d into apache:master Nov 25, 2023

Jibing-Li deleted the fixauto branch November 25, 2023 04:46
