Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[local exchange](fix) Fix correctness caused by local exchange #41555

Merged
merged 2 commits into from
Oct 9, 2024

Conversation

Gabriel39
Copy link
Contributor

@Gabriel39 Gabriel39 commented Oct 8, 2024

Proposed changes

For plan local exchange (hash shuffle) -> union -> colocated agg, we must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on multiple BEs so no case is added in this PR.

@doris-robot
Copy link

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@Gabriel39
Copy link
Contributor Author

run buildall

Copy link
Contributor

github-actions bot commented Oct 8, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot
Copy link

TPC-H: Total hot run time: 40984 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d6fd887acec38bfcbe5647a214b20d75fe31e688, data reload: false

------ Round 1 ----------------------------------
q1	18266	7546	7382	7382
q2	2475	167	193	167
q3	11252	1198	1174	1174
q4	10389	813	748	748
q5	7900	2945	2907	2907
q6	237	154	153	153
q7	1037	642	624	624
q8	9723	1975	1995	1975
q9	6568	6368	6400	6368
q10	6975	2324	2301	2301
q11	450	251	256	251
q12	403	223	223	223
q13	17787	3031	3014	3014
q14	236	215	212	212
q15	578	535	529	529
q16	654	623	585	585
q17	1011	595	532	532
q18	7207	6689	6550	6550
q19	1346	973	1046	973
q20	494	208	203	203
q21	3976	3301	3130	3130
q22	1090	992	983	983
Total cold run time: 110054 ms
Total hot run time: 40984 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7287	7291	7303	7291
q2	337	237	228	228
q3	2937	2765	2782	2765
q4	1916	1654	1717	1654
q5	5451	5462	5474	5462
q6	230	139	141	139
q7	2146	1735	1742	1735
q8	3249	3414	3444	3414
q9	8568	8562	8612	8562
q10	3494	3401	3446	3401
q11	610	474	472	472
q12	767	606	634	606
q13	7374	3015	3024	3015
q14	292	258	267	258
q15	573	513	514	513
q16	690	627	631	627
q17	1791	1577	1556	1556
q18	7958	7287	7406	7287
q19	1681	1453	1417	1417
q20	2085	1795	1827	1795
q21	5442	5167	5155	5155
q22	1099	1039	1011	1011
Total cold run time: 65977 ms
Total hot run time: 58363 ms

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.28% (9632/25835)
Line Coverage: 28.68% (79898/278590)
Region Coverage: 28.11% (41306/146930)
Branch Coverage: 24.73% (21045/85084)
Coverage Report: http://coverage.selectdb-in.cc/coverage/d6fd887acec38bfcbe5647a214b20d75fe31e688_d6fd887acec38bfcbe5647a214b20d75fe31e688/report/index.html

@doris-robot
Copy link

TPC-DS: Total hot run time: 192084 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d6fd887acec38bfcbe5647a214b20d75fe31e688, data reload: false

query1	979	376	365	365
query2	6529	2088	2088	2088
query3	6711	210	222	210
query4	33843	23316	23460	23316
query5	4276	456	460	456
query6	254	168	162	162
query7	4632	306	312	306
query8	286	230	208	208
query9	9710	2726	2717	2717
query10	425	282	275	275
query11	17934	15298	15320	15298
query12	154	101	97	97
query13	1678	491	437	437
query14	10408	7374	7597	7374
query15	315	181	178	178
query16	8005	436	469	436
query17	1714	571	570	570
query18	2078	308	308	308
query19	387	153	153	153
query20	118	107	108	107
query21	212	101	103	101
query22	4493	4348	4361	4348
query23	34862	33980	34103	33980
query24	11169	2815	2834	2815
query25	659	406	416	406
query26	1254	162	160	160
query27	2821	294	294	294
query28	8036	2436	2428	2428
query29	876	438	431	431
query30	317	162	159	159
query31	1022	801	814	801
query32	100	55	58	55
query33	781	296	317	296
query34	941	499	510	499
query35	842	751	706	706
query36	1105	945	953	945
query37	153	90	89	89
query38	4108	3886	3919	3886
query39	1497	1474	1447	1447
query40	218	97	100	97
query41	49	46	47	46
query42	129	96	97	96
query43	543	507	494	494
query44	1267	820	814	814
query45	197	168	172	168
query46	1154	715	730	715
query47	1953	1872	1816	1816
query48	448	368	356	356
query49	1138	418	410	410
query50	832	421	416	416
query51	7154	7047	6982	6982
query52	105	92	86	86
query53	260	191	187	187
query54	1335	488	490	488
query55	82	79	77	77
query56	297	278	263	263
query57	1292	1187	1135	1135
query58	256	268	265	265
query59	3255	2974	3012	2974
query60	311	283	284	283
query61	141	100	101	100
query62	881	661	669	661
query63	220	192	189	189
query64	5263	632	602	602
query65	3325	3178	3204	3178
query66	1426	301	308	301
query67	15941	15558	15592	15558
query68	4413	577	576	576
query69	452	297	293	293
query70	1196	1106	1108	1106
query71	339	273	259	259
query72	6343	4021	4000	4000
query73	763	343	352	343
query74	9986	9027	9139	9027
query75	3402	2647	2650	2647
query76	2659	940	899	899
query77	400	301	284	284
query78	10483	9638	9505	9505
query79	1704	583	598	583
query80	1288	456	445	445
query81	580	239	243	239
query82	969	137	137	137
query83	238	135	141	135
query84	262	84	82	82
query85	1309	293	297	293
query86	371	274	293	274
query87	4371	4383	4210	4210
query88	3831	2479	2455	2455
query89	400	287	292	287
query90	1883	190	189	189
query91	145	105	116	105
query92	64	49	46	46
query93	1225	538	531	531
query94	976	284	291	284
query95	355	251	259	251
query96	614	293	288	288
query97	3345	3165	3201	3165
query98	216	202	194	194
query99	1664	1307	1288	1288
Total cold run time: 301402 ms
Total hot run time: 192084 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 33.14 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d6fd887acec38bfcbe5647a214b20d75fe31e688, data reload: false

query1	0.05	0.04	0.04
query2	0.06	0.02	0.02
query3	0.23	0.06	0.07
query4	1.63	0.10	0.10
query5	0.52	0.51	0.51
query6	1.13	0.74	0.73
query7	0.03	0.02	0.01
query8	0.05	0.03	0.02
query9	0.56	0.50	0.50
query10	0.55	0.55	0.56
query11	0.14	0.12	0.11
query12	0.14	0.11	0.12
query13	0.62	0.60	0.59
query14	2.71	2.86	2.72
query15	0.90	0.83	0.83
query16	0.38	0.38	0.36
query17	1.05	1.10	1.07
query18	0.22	0.22	0.22
query19	1.93	1.86	2.05
query20	0.01	0.01	0.01
query21	15.40	0.59	0.58
query22	2.72	2.31	2.12
query23	16.92	1.10	0.76
query24	3.23	1.13	0.94
query25	0.19	0.28	0.32
query26	0.39	0.14	0.12
query27	0.04	0.03	0.04
query28	10.80	1.10	1.07
query29	12.56	3.23	3.23
query30	0.25	0.06	0.07
query31	2.87	0.39	0.38
query32	3.26	0.47	0.46
query33	2.98	3.04	3.08
query34	16.96	4.44	4.40
query35	4.49	4.60	4.47
query36	0.66	0.50	0.48
query37	0.08	0.06	0.06
query38	0.04	0.04	0.04
query39	0.03	0.02	0.02
query40	0.15	0.12	0.12
query41	0.09	0.02	0.02
query42	0.03	0.03	0.02
query43	0.03	0.02	0.02
Total cold run time: 107.08 s
Total hot run time: 33.14 s

@wm1581066 wm1581066 requested a review from HappenLee October 8, 2024 10:52
@Gabriel39
Copy link
Contributor Author

run buildall

Copy link
Contributor

github-actions bot commented Oct 8, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.28% (9632/25835)
Line Coverage: 28.67% (79885/278591)
Region Coverage: 28.10% (41290/146930)
Branch Coverage: 24.73% (21041/85084)
Coverage Report: http://coverage.selectdb-in.cc/coverage/b467bf297faf679d8d7c83ed77f280c3c5f1f5be_b467bf297faf679d8d7c83ed77f280c3c5f1f5be/report/index.html

@doris-robot
Copy link

TPC-H: Total hot run time: 40822 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b467bf297faf679d8d7c83ed77f280c3c5f1f5be, data reload: false

------ Round 1 ----------------------------------
q1	17576	7430	7251	7251
q2	2016	279	278	278
q3	12330	1083	1241	1083
q4	10566	693	724	693
q5	7757	2888	2843	2843
q6	237	153	151	151
q7	1026	629	631	629
q8	9344	1929	2023	1929
q9	6588	6435	6404	6404
q10	6950	2329	2333	2329
q11	436	251	251	251
q12	406	219	217	217
q13	17796	2964	2981	2964
q14	257	208	216	208
q15	570	518	523	518
q16	663	573	578	573
q17	977	515	629	515
q18	7293	6651	6775	6651
q19	1340	992	976	976
q20	476	208	204	204
q21	4069	3332	3147	3147
q22	1095	1008	1019	1008
Total cold run time: 109768 ms
Total hot run time: 40822 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7234	7224	7249	7224
q2	320	244	233	233
q3	3003	3005	2963	2963
q4	2075	1822	1764	1764
q5	5709	5743	5751	5743
q6	233	153	144	144
q7	2305	1870	1828	1828
q8	3381	3510	3476	3476
q9	8917	8891	8887	8887
q10	3605	3579	3592	3579
q11	606	485	499	485
q12	852	655	610	610
q13	9263	3224	3177	3177
q14	314	271	288	271
q15	589	535	511	511
q16	669	658	640	640
q17	1820	1623	1631	1623
q18	8287	7890	7635	7635
q19	1697	1473	1420	1420
q20	2112	1904	1874	1874
q21	5666	5493	5451	5451
q22	1170	1088	1066	1066
Total cold run time: 69827 ms
Total hot run time: 60604 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 192210 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b467bf297faf679d8d7c83ed77f280c3c5f1f5be, data reload: false

query1	859	386	408	386
query2	6255	2097	2097	2097
query3	8688	200	206	200
query4	34305	23633	23449	23449
query5	3425	458	457	457
query6	270	163	163	163
query7	4220	300	307	300
query8	290	216	219	216
query9	9549	2724	2709	2709
query10	471	277	292	277
query11	17911	15341	15272	15272
query12	165	99	94	94
query13	1583	470	440	440
query14	9566	7312	7157	7157
query15	246	168	173	168
query16	7856	485	468	468
query17	1602	632	607	607
query18	1949	315	324	315
query19	310	173	162	162
query20	123	123	115	115
query21	214	109	113	109
query22	4763	4760	4592	4592
query23	35023	34148	34232	34148
query24	11112	2893	2829	2829
query25	636	418	415	415
query26	1202	164	165	164
query27	2346	300	304	300
query28	7593	2446	2428	2428
query29	840	448	433	433
query30	274	160	153	153
query31	1024	799	856	799
query32	97	55	57	55
query33	775	310	307	307
query34	914	509	514	509
query35	880	742	728	728
query36	1116	939	988	939
query37	156	92	87	87
query38	4007	3858	3925	3858
query39	1469	1449	1444	1444
query40	200	98	98	98
query41	49	44	45	44
query42	114	93	98	93
query43	527	484	482	482
query44	1261	808	810	808
query45	193	163	170	163
query46	1164	709	721	709
query47	1928	1844	1862	1844
query48	434	350	349	349
query49	950	411	390	390
query50	843	419	413	413
query51	7152	6805	6963	6805
query52	103	86	87	86
query53	259	185	181	181
query54	1227	479	486	479
query55	83	78	80	78
query56	301	277	287	277
query57	1233	1137	1171	1137
query58	240	271	250	250
query59	3107	3135	2920	2920
query60	295	270	281	270
query61	106	100	102	100
query62	860	680	655	655
query63	218	187	185	185
query64	3947	623	632	623
query65	3231	3165	3188	3165
query66	786	300	295	295
query67	16031	15518	15542	15518
query68	4059	581	561	561
query69	538	297	301	297
query70	1194	1126	1123	1123
query71	406	277	282	277
query72	7596	4035	3955	3955
query73	766	342	358	342
query74	10074	9063	8871	8871
query75	4184	2661	2660	2660
query76	3234	945	967	945
query77	696	297	317	297
query78	10315	9647	9502	9502
query79	2022	597	601	597
query80	2527	465	453	453
query81	578	238	235	235
query82	705	136	140	136
query83	302	135	136	135
query84	281	80	72	72
query85	1597	293	284	284
query86	432	303	296	296
query87	4580	4290	4308	4290
query88	3912	2473	2440	2440
query89	409	282	288	282
query90	2083	189	189	189
query91	156	113	109	109
query92	64	48	49	48
query93	1587	562	547	547
query94	1143	310	292	292
query95	349	254	257	254
query96	624	291	286	286
query97	3252	3203	3175	3175
query98	217	207	192	192
query99	1590	1302	1302	1302
Total cold run time: 302398 ms
Total hot run time: 192210 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 33.34 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b467bf297faf679d8d7c83ed77f280c3c5f1f5be, data reload: false

query1	0.05	0.05	0.05
query2	0.06	0.03	0.02
query3	0.23	0.07	0.07
query4	1.64	0.10	0.11
query5	0.52	0.52	0.53
query6	1.14	0.72	0.72
query7	0.02	0.02	0.02
query8	0.04	0.02	0.03
query9	0.57	0.51	0.50
query10	0.54	0.55	0.55
query11	0.13	0.10	0.10
query12	0.13	0.11	0.11
query13	0.60	0.59	0.60
query14	2.72	2.82	2.78
query15	0.90	0.81	0.83
query16	0.38	0.37	0.38
query17	1.02	1.00	0.98
query18	0.21	0.19	0.20
query19	1.88	1.87	2.02
query20	0.01	0.02	0.01
query21	15.38	0.60	0.58
query22	2.54	2.21	2.29
query23	16.91	1.02	0.78
query24	3.50	1.52	1.28
query25	0.14	0.09	0.09
query26	0.65	0.15	0.14
query27	0.04	0.04	0.04
query28	9.89	1.10	1.06
query29	12.55	3.23	3.21
query30	0.24	0.06	0.05
query31	2.87	0.38	0.37
query32	3.28	0.48	0.46
query33	3.01	3.02	3.04
query34	17.03	4.50	4.45
query35	4.51	4.45	4.51
query36	0.67	0.48	0.49
query37	0.08	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.03
query40	0.16	0.12	0.12
query41	0.07	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 106.45 s
Total hot run time: 33.34 s

Copy link
Contributor

@HappenLee HappenLee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 9, 2024
Copy link
Contributor

github-actions bot commented Oct 9, 2024

PR approved by at least one committer and no changes requested.

Copy link
Contributor

github-actions bot commented Oct 9, 2024

PR approved by anyone and no changes requested.

@Gabriel39 Gabriel39 merged commit c065364 into apache:master Oct 9, 2024
24 of 28 checks passed
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Oct 10, 2024
…e#41555)

For plan `local exchange (hash shuffle) -> union -> colocated agg`, we
must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on
multiple BEs so no case is added in this PR.
Gabriel39 added a commit to Gabriel39/incubator-doris that referenced this pull request Oct 11, 2024
…e#41555)

For plan `local exchange (hash shuffle) -> union -> colocated agg`, we
must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on
multiple BEs so no case is added in this PR.
Gabriel39 added a commit to Gabriel39/incubator-doris that referenced this pull request Oct 11, 2024
…e#41555)

For plan `local exchange (hash shuffle) -> union -> colocated agg`, we
must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on
multiple BEs so no case is added in this PR.
cjj2010 pushed a commit to cjj2010/doris that referenced this pull request Oct 12, 2024
…e#41555)

For plan `local exchange (hash shuffle) -> union -> colocated agg`, we
must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on
multiple BEs so no case is added in this PR.
amorynan pushed a commit to amorynan/doris that referenced this pull request Oct 12, 2024
…e#41555)

For plan `local exchange (hash shuffle) -> union -> colocated agg`, we
must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on
multiple BEs so no case is added in this PR.
Gabriel39 added a commit to Gabriel39/incubator-doris that referenced this pull request Oct 14, 2024
…e#41555)

For plan `local exchange (hash shuffle) -> union -> colocated agg`, we
must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on
multiple BEs so no case is added in this PR.
Gabriel39 added a commit that referenced this pull request Oct 14, 2024
Gabriel39 added a commit to Gabriel39/incubator-doris that referenced this pull request Oct 14, 2024
…e#41555)

For plan `local exchange (hash shuffle) -> union -> colocated agg`, we
must ensure local exchange use the same hash algorithm as MPP shuffling.

This problem is covered by our test cases but only can be reproduced on
multiple BEs so no case is added in this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/2.1.7-merged dev/3.0.3-merged reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants