[fix](auto bucket) fix auto buckets calc using the first k partition #41675

Merged: 5 commits, Oct 14, 2024

@yujun777 (Collaborator) commented Oct 10, 2024

If the data sizes of the first k (at most 7) partitions are ascending, the result will be partition_size[k-1] + ema(deltas of the first k partitions).

This is a bug: the calculation should use the last k partitions, not the first k.
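
A minimal sketch of the difference, written in Python rather than taken from the project's code. The function names, the `use_last` flag, the EMA smoothing factor 2 / (period + 1), and the fallback for non-ascending sizes are all assumptions for illustration. Only the ascending-case formula and the cap of 7 partitions come from the description above.

```python
def ema(values, period):
    """Exponential moving average; alpha = 2 / (period + 1) is an assumed choice."""
    alpha = 2.0 / (period + 1)
    result = float(values[0])
    for v in values[1:]:
        result = alpha * v + (1 - alpha) * result
    return result


def predict_next_size(partition_sizes, k=7, use_last=True):
    """Estimate the next partition's data size from a window of k history partitions.

    use_last=False mimics the buggy behaviour (window = first k partitions);
    use_last=True mimics the fixed behaviour (window = last k partitions).
    """
    if not partition_sizes:
        raise ValueError("need at least one history partition")
    window = partition_sizes[-k:] if use_last else partition_sizes[:k]
    if len(window) < 2:
        return float(window[0])
    ascending = all(b >= a for a, b in zip(window, window[1:]))
    if ascending:
        # Ascending case from the description: last size in the window + ema(deltas).
        deltas = [b - a for a, b in zip(window, window[1:])]
        return window[-1] + ema(deltas, k)
    # Assumed fallback for non-ascending sizes: smooth the sizes themselves.
    return ema(window, k)


# Sizes are oldest first; the table grew sharply in its three newest partitions.
sizes = [1, 2, 3, 4, 5, 6, 7, 100, 200, 300]
print(predict_next_size(sizes, use_last=False))  # 8.0, stale estimate from the oldest partitions
print(predict_next_size(sizes, use_last=True))   # 357.25, follows the recent growth
```

With the first-k window the estimate ignores every partition after the seventh, so a table that has grown since then would get a prediction based on its oldest data.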

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@yujun777
Collaborator Author

run buildall

@yujun777 changed the title from "[fix](auto bucket) fix auto buckets calc error" to "[fix](auto bucket) fix auto buckets calc using the first k partition if their data size is ascending" Oct 10, 2024
@yujun777 changed the title from "[fix](auto bucket) fix auto buckets calc using the first k partition if their data size is ascending" to "[fix](auto bucket) fix auto buckets calc using the first k partition" Oct 10, 2024
@yujun777
Collaborator Author

run buildall

@yujun777
Collaborator Author

run buildall

@yujun777
Collaborator Author

run performance

@yujun777
Collaborator Author

run external

dataroaring previously approved these changes Oct 11, 2024
Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) Oct 11, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

Contributor

@dataroaring dataroaring left a comment

A UT (unit test) would be better.

@yujun777
Collaborator Author

A UT (unit test) would be better.

waiting

@yujun777 yujun777 marked this pull request as draft October 11, 2024 08:46
@yujun777
Collaborator Author

Waiting for a UT.

@github-actions bot removed the "approved" label (indicates a PR has been approved by one committer) Oct 11, 2024
@yujun777
Collaborator Author

run buildall

@yujun777 yujun777 marked this pull request as ready for review October 11, 2024 09:22
@yujun777
Collaborator Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40970 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 595e18cf3f419d6dd56b45dc3860f27d747e74e8, data reload: false

------ Round 1 ----------------------------------
q1	17618	7479	7293	7293
q2	2016	288	265	265
q3	12077	1069	1167	1069
q4	10567	791	709	709
q5	7780	2920	2792	2792
q6	245	153	153	153
q7	1000	636	617	617
q8	9362	1962	1954	1954
q9	7453	6429	6401	6401
q10	6998	2292	2319	2292
q11	440	255	252	252
q12	412	219	226	219
q13	17771	3018	3041	3018
q14	242	213	218	213
q15	573	529	524	524
q16	638	571	584	571
q17	977	639	506	506
q18	7088	6673	6653	6653
q19	1356	1136	994	994
q20	453	202	207	202
q21	3967	3295	3278	3278
q22	1100	1023	995	995
Total cold run time: 110133 ms
Total hot run time: 40970 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7294	7255	7333	7255
q2	331	231	235	231
q3	3036	2945	2951	2945
q4	2063	1850	1811	1811
q5	5771	5779	5720	5720
q6	239	148	147	147
q7	2248	1896	1833	1833
q8	3386	3568	3465	3465
q9	8923	8930	8844	8844
q10	3593	3549	3518	3518
q11	592	483	495	483
q12	841	671	664	664
q13	9438	3191	3189	3189
q14	316	274	272	272
q15	562	522	507	507
q16	701	639	637	637
q17	1844	1624	1629	1624
q18	8319	7655	7454	7454
q19	1736	1411	1507	1411
q20	2109	1835	1898	1835
q21	5599	5218	5325	5218
q22	1119	1028	1079	1028
Total cold run time: 70060 ms
Total hot run time: 60091 ms

@doris-robot

TPC-DS: Total hot run time: 191508 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 595e18cf3f419d6dd56b45dc3860f27d747e74e8, data reload: false

query1	936	403	413	403
query2	6263	2083	2033	2033
query3	8675	193	199	193
query4	33953	23499	23365	23365
query5	3513	483	481	481
query6	274	168	167	167
query7	4181	299	305	299
query8	279	224	228	224
query9	9272	2684	2679	2679
query10	450	305	269	269
query11	17929	15235	15167	15167
query12	137	102	97	97
query13	1576	466	466	466
query14	9610	7462	7038	7038
query15	254	173	174	173
query16	7732	447	443	443
query17	1649	613	591	591
query18	2076	315	315	315
query19	280	159	158	158
query20	126	113	116	113
query21	216	107	104	104
query22	4728	4363	4503	4363
query23	34620	33915	34135	33915
query24	11178	2940	2863	2863
query25	620	414	411	411
query26	1158	162	166	162
query27	2275	295	305	295
query28	7503	2440	2433	2433
query29	807	439	426	426
query30	261	161	156	156
query31	1027	807	801	801
query32	97	56	54	54
query33	772	298	304	298
query34	920	516	513	513
query35	882	727	713	713
query36	1114	962	976	962
query37	153	83	88	83
query38	4096	3868	3903	3868
query39	1501	1451	1418	1418
query40	205	97	100	97
query41	47	43	44	43
query42	116	99	99	99
query43	517	486	481	481
query44	1316	822	838	822
query45	199	165	163	163
query46	1140	737	735	735
query47	1936	1838	1824	1824
query48	452	365	358	358
query49	949	436	391	391
query50	810	425	426	425
query51	7176	7020	7021	7020
query52	96	84	86	84
query53	264	188	184	184
query54	1235	481	481	481
query55	84	78	79	78
query56	304	261	271	261
query57	1261	1174	1142	1142
query58	219	226	245	226
query59	3370	3130	2981	2981
query60	294	271	264	264
query61	107	107	100	100
query62	877	671	674	671
query63	227	192	189	189
query64	4019	630	596	596
query65	3225	3211	3189	3189
query66	835	304	303	303
query67	15778	15608	15594	15594
query68	4531	579	570	570
query69	529	298	296	296
query70	1081	1130	1063	1063
query71	382	273	309	273
query72	7399	3945	3945	3945
query73	774	346	352	346
query74	10255	9011	8881	8881
query75	3414	2680	2664	2664
query76	3274	903	955	903
query77	431	298	308	298
query78	10379	9554	9469	9469
query79	1591	602	599	599
query80	1048	465	451	451
query81	600	232	233	232
query82	727	139	144	139
query83	234	152	135	135
query84	242	73	78	73
query85	1360	306	291	291
query86	391	304	307	304
query87	4481	4324	4340	4324
query88	3366	2434	2375	2375
query89	406	293	291	291
query90	1921	188	185	185
query91	153	105	101	101
query92	67	49	48	48
query93	1305	539	553	539
query94	1005	285	283	283
query95	357	256	254	254
query96	612	279	275	275
query97	3247	3093	3124	3093
query98	207	203	202	202
query99	1525	1293	1296	1293
Total cold run time: 297445 ms
Total hot run time: 191508 ms

@doris-robot

ClickBench: Total hot run time: 33.55 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 595e18cf3f419d6dd56b45dc3860f27d747e74e8, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.02
query3	0.23	0.07	0.07
query4	1.64	0.10	0.10
query5	0.52	0.49	0.51
query6	1.13	0.73	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.02
query9	0.57	0.50	0.49
query10	0.55	0.53	0.55
query11	0.14	0.10	0.11
query12	0.14	0.10	0.11
query13	0.60	0.60	0.63
query14	2.74	2.75	2.75
query15	0.91	0.83	0.83
query16	0.39	0.39	0.38
query17	1.04	1.00	1.04
query18	0.20	0.20	0.20
query19	1.98	1.81	2.02
query20	0.01	0.01	0.01
query21	15.37	0.59	0.60
query22	2.27	2.85	2.75
query23	17.17	0.99	0.88
query24	2.83	2.00	0.99
query25	0.35	0.12	0.04
query26	0.51	0.13	0.13
query27	0.05	0.03	0.03
query28	10.01	1.10	1.07
query29	12.50	3.26	3.23
query30	0.24	0.05	0.06
query31	2.87	0.38	0.38
query32	3.29	0.47	0.47
query33	3.00	2.99	3.07
query34	17.23	4.44	4.42
query35	4.51	4.50	4.53
query36	0.68	0.49	0.47
query37	0.09	0.06	0.06
query38	0.04	0.04	0.03
query39	0.04	0.02	0.02
query40	0.15	0.14	0.12
query41	0.08	0.02	0.03
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.29 s
Total hot run time: 33.55 s

@yujun777
Collaborator Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41297 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 92efa8475a4406d63b886872d2b60d88fd7258fd, data reload: false

------ Round 1 ----------------------------------
q1	17573	7352	7277	7277
q2	2020	285	272	272
q3	12096	1062	1150	1062
q4	10578	822	825	822
q5	7748	3097	3071	3071
q6	238	152	149	149
q7	1042	621	591	591
q8	9353	1928	1928	1928
q9	6589	6482	6402	6402
q10	7030	2426	2424	2424
q11	432	242	246	242
q12	406	216	212	212
q13	17776	3004	3039	3004
q14	256	225	209	209
q15	582	515	525	515
q16	659	599	575	575
q17	981	580	637	580
q18	7236	6685	6774	6685
q19	1354	951	1010	951
q20	480	192	193	192
q21	3945	3141	3353	3141
q22	1112	1001	993	993
Total cold run time: 109486 ms
Total hot run time: 41297 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7282	7258	7241	7241
q2	333	235	236	235
q3	3093	2939	2936	2936
q4	2095	1813	1787	1787
q5	5799	5796	5786	5786
q6	236	141	143	141
q7	2298	1851	1865	1851
q8	3435	3595	3494	3494
q9	8963	8968	8891	8891
q10	3613	3541	3545	3541
q11	584	488	489	488
q12	857	637	618	618
q13	10635	3180	3195	3180
q14	311	289	270	270
q15	578	537	527	527
q16	673	670	649	649
q17	1874	1635	1599	1599
q18	8413	7869	7618	7618
q19	1753	1546	1567	1546
q20	2125	1857	1895	1857
q21	5564	5468	5350	5350
q22	1186	1038	1056	1038
Total cold run time: 71700 ms
Total hot run time: 60643 ms

@doris-robot

TPC-DS: Total hot run time: 191608 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 92efa8475a4406d63b886872d2b60d88fd7258fd, data reload: false

query1	867	390	401	390
query2	6289	2097	2038	2038
query3	8698	194	202	194
query4	34162	23679	23667	23667
query5	3736	468	464	464
query6	309	169	166	166
query7	4194	292	297	292
query8	271	212	212	212
query9	9207	2637	2632	2632
query10	463	280	264	264
query11	17939	15160	15358	15160
query12	147	101	95	95
query13	1606	432	425	425
query14	9487	7159	7110	7110
query15	253	172	177	172
query16	7669	440	442	440
query17	1573	606	622	606
query18	1511	295	318	295
query19	266	167	161	161
query20	123	120	117	117
query21	206	100	105	100
query22	4685	4686	4603	4603
query23	35168	34813	33789	33789
query24	11044	2756	2722	2722
query25	609	397	397	397
query26	1077	152	160	152
query27	2493	300	286	286
query28	7435	2442	2431	2431
query29	750	462	426	426
query30	256	157	154	154
query31	1037	783	803	783
query32	98	52	58	52
query33	768	273	295	273
query34	927	503	523	503
query35	868	764	725	725
query36	1115	937	963	937
query37	148	90	90	90
query38	4060	4020	3969	3969
query39	1526	1465	1418	1418
query40	206	96	98	96
query41	47	44	46	44
query42	114	96	97	96
query43	529	500	484	484
query44	1242	809	827	809
query45	194	177	170	170
query46	1129	715	688	688
query47	1944	1822	1853	1822
query48	445	334	330	330
query49	909	407	418	407
query50	807	392	379	379
query51	7209	7097	6988	6988
query52	97	93	88	88
query53	260	179	179	179
query54	1243	417	435	417
query55	88	75	77	75
query56	297	272	266	266
query57	1275	1171	1175	1171
query58	233	238	238	238
query59	3255	3110	2958	2958
query60	294	261	255	255
query61	102	100	105	100
query62	844	665	679	665
query63	217	185	186	185
query64	4095	631	619	619
query65	3364	3208	3184	3184
query66	726	307	312	307
query67	15904	15841	15725	15725
query68	4468	555	542	542
query69	512	279	291	279
query70	1173	1110	1151	1110
query71	366	263	311	263
query72	7233	3948	3965	3948
query73	775	353	357	353
query74	10365	9004	8985	8985
query75	3447	2681	2674	2674
query76	2933	918	912	912
query77	605	304	287	287
query78	10593	9701	9564	9564
query79	1865	595	595	595
query80	1103	449	451	449
query81	563	240	238	238
query82	648	144	137	137
query83	288	132	135	132
query84	283	71	76	71
query85	1308	288	278	278
query86	414	308	287	287
query87	4453	4417	4353	4353
query88	3362	2205	2148	2148
query89	403	289	294	289
query90	1960	186	181	181
query91	143	103	102	102
query92	66	50	50	50
query93	2320	537	536	536
query94	903	288	277	277
query95	352	246	243	243
query96	624	278	276	276
query97	3258	3108	3176	3108
query98	224	200	188	188
query99	1577	1298	1306	1298
Total cold run time: 298696 ms
Total hot run time: 191608 ms

@doris-robot

ClickBench: Total hot run time: 33.97 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 92efa8475a4406d63b886872d2b60d88fd7258fd, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.06	0.06
query4	1.65	0.10	0.10
query5	0.49	0.52	0.51
query6	1.13	0.74	0.72
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.56	0.50	0.50
query10	0.56	0.53	0.54
query11	0.13	0.11	0.10
query12	0.13	0.11	0.10
query13	0.61	0.59	0.59
query14	2.70	2.75	2.78
query15	0.89	0.82	0.81
query16	0.38	0.39	0.39
query17	1.06	1.06	1.07
query18	0.20	0.19	0.20
query19	1.95	1.87	2.04
query20	0.01	0.01	0.00
query21	15.35	0.63	0.60
query22	2.65	2.65	2.64
query23	16.84	1.24	0.77
query24	2.94	1.39	1.73
query25	0.17	0.14	0.13
query26	0.58	0.13	0.14
query27	0.04	0.05	0.05
query28	9.93	1.10	1.07
query29	12.61	3.22	3.19
query30	0.24	0.06	0.06
query31	2.89	0.40	0.37
query32	3.27	0.46	0.46
query33	3.00	3.02	3.02
query34	16.90	4.46	4.51
query35	4.50	4.52	4.49
query36	0.67	0.48	0.51
query37	0.08	0.07	0.06
query38	0.04	0.03	0.04
query39	0.03	0.02	0.02
query40	0.16	0.11	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.02	0.03
Total cold run time: 105.87 s
Total hot run time: 33.97 s

@yujun777
Collaborator Author

run external

@yujun777
Collaborator Author

run feut

Contributor

@dataroaring dataroaring left a comment

LGTM

@github-actions bot added the "approved" label (indicates a PR has been approved by one committer) Oct 14, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

@deardeng deardeng left a comment

LGTM

@dataroaring dataroaring merged commit 77ec0a2 into apache:master Oct 14, 2024
25 of 27 checks passed
yujun777 added a commit to yujun777/doris that referenced this pull request Oct 14, 2024
…pache#41675)

If the data sizes of the first k (at most 7) partitions are ascending, the
result will be partition_size[k-1] + ema(deltas of the first k partitions).

This is a bug: the calculation should use the last k partitions, not the
first k.
yujun777 added a commit to yujun777/doris that referenced this pull request Oct 14, 2024
…pache#41675)
yujun777 added a commit to yujun777/doris that referenced this pull request Oct 15, 2024
…pache#41675)