[fix](array/map) fix resize impl in array/map #41595

Merged on Oct 10, 2024 (2 commits)

@amorynan amorynan commented Oct 9, 2024

Proposed changes

The resize function of the array/map columns is called from the shrink function defined in IColumn, and set_num_rows in Block.cpp calls shrink to change the number of rows in a column; the new row count may be smaller or larger than the current one.
When the row count gets smaller (the rows are cut), the nested data in the array/map column must also be resized to the smaller number of rows; otherwise we hit this exception:

mysql> /*set ShuffleSendBytes=0|ShuffleSendRows=0|SqlHash=c359573c8c7c33238ec34129e8ef0131|peakMemoryBytes=203840|SqlDigest=|cloudClusterName=UNKNOWN|TraceId=|WorkloadGroup=normal|FuzzyVariables=batch_size=4064,broker_load_batch_size=16352,disable_streaming_preaggregations=false,enable_distinct_streaming_aggregation=true,parallel_fragment_exec_instance_num=3,parallel_pipeline_task_num=5,profile_level=1,enable_pipeline_engine=true,enable_parallel_scan=true,parallel_scan_max_scanners_count=48,parallel_scan_min_rows_per_scanner=16384,parallel_prepare_threshold=13,enable_fold_constant_by_be=true,enable_rewrite_element_at_to_slot=true,runtime_filter_type=12,enable_parallel_result_sink=true,sort_phase_num=0,rewrite_or_to_in_predicate_threshold=100000,enable_function_pushdown=false,enable_common_expr_pushdown=true,enable_local_exchange=false,partitioned_hash_join_rows_threshold=1048576,partitioned_hash_agg_rows_threshold=1048576,partition_pruning_expand_threshold=10,enable_share_hash_table_for_broadcast_join=false,enable_two_phase_read_opt=true,enable_common_expr_pushdown_for_inverted_index=true,enable_delete_sub_predicate_v2=false,min_revocable_mem=33554432,fetch_remote_schema_timeout_seconds=120,max_fetch_remote_schema_tablet_count=512,enable_join_spill=false,enable_sort_spill=false,enable_agg_spill=false,enable_force_spill=false,data_queue_max_blocks=1,spill_streaming_agg_mem_limit=268435456,external_agg_partition_bits=5*/ select col4 from table_7052055 limit 10;
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR][E6] nested_column's size 660, is not consistent with offsets_column's 100

	0#  doris::get_stack_trace_by_libunwind(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
	1#  doris::get_stack_trace(int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
	2#  doris::Exception::Exception(int, std::basic_string_view<char, std::char_traits<char> > const&)
	3#  doris::Exception::Excep
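
To make the failure mode concrete, here is a minimal C++ sketch of the idea. It is not the actual Doris code: ToyArrayColumn, append_row, resize_offsets_only and is_consistent are hypothetical names invented for illustration. The sketch assumes the common layout in which the offsets column stores the cumulative end position of each row, so the nested column's size must equal the last offset. Shrinking only the offsets leaves stale nested elements behind, which is the kind of mismatch the error above reports; truncating the nested data to the new last offset restores consistency.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for an array column (NOT Doris's ColumnArray).
// Row i owns nested[offsets[i-1] .. offsets[i]), with offsets[-1] taken as 0.
struct ToyArrayColumn {
    std::vector<int32_t> nested;    // flattened elements of all rows
    std::vector<uint64_t> offsets;  // cumulative end position of each row

    std::size_t rows() const { return offsets.size(); }

    // Assumed invariant: the nested size equals the last offset (0 when empty).
    bool is_consistent() const {
        const uint64_t expected = offsets.empty() ? 0 : offsets.back();
        return nested.size() == expected;
    }

    void append_row(const std::vector<int32_t>& values) {
        nested.insert(nested.end(), values.begin(), values.end());
        offsets.push_back(nested.size());
    }

    // Buggy variant: only the offsets change, so the nested data of the
    // removed rows is left behind when shrinking.
    void resize_offsets_only(std::size_t n) {
        const uint64_t last = offsets.empty() ? 0 : offsets.back();
        offsets.resize(n, last);
    }

    // Fixed variant: after resizing the offsets, cut the nested data down
    // to the new last offset. Growing pads with empty rows.
    void resize(std::size_t n) {
        const uint64_t last = offsets.empty() ? 0 : offsets.back();
        offsets.resize(n, last);
        const uint64_t end = offsets.empty() ? 0 : offsets.back();
        nested.resize(end);
        assert(is_consistent());
    }
};
```

With the rows {1, 2, 3}, {4} and {5, 6}, calling resize_offsets_only(1) leaves six nested elements against a last offset of 3, whereas resize(1) keeps exactly three.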

Issue Number: close #xxx

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

amorynan commented Oct 9, 2024

run buildall

github-actions bot commented Oct 9, 2024

Possible file(s) that should be tracked in LFS detected: 🚨

The following file(s) exceeds the file size limit: 1048576 bytes, as set in the .yml configuration files:

  • regression-test/data/datatype_p0/nested_types/query/test_nested_type_with_resize.csv

Consider using git-lfs to manage large files.

@github-actions github-actions bot added the lfs-detected! label (Warning Label for use when LFS is detected in the commits of a Pull Request) Oct 9, 2024

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -17,11 +17,12 @@

#include <gtest/gtest.h>

warning: 'gtest/gtest.h' file not found [clang-diagnostic-error]

#include <gtest/gtest.h>
         ^

@doris-robot

TPC-H: Total hot run time: 40780 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 45c3441f77b3b40c1d49b1434df87b543e1fc96c, data reload: false

------ Round 1 ----------------------------------
q1	17571	7426	7229	7229
q2	2016	293	274	274
q3	12039	1037	1175	1037
q4	10544	734	701	701
q5	7760	2877	2791	2791
q6	237	151	151	151
q7	993	628	601	601
q8	9352	1954	1964	1954
q9	7319	6451	6379	6379
q10	6959	2289	2314	2289
q11	446	250	253	250
q12	401	228	222	222
q13	17796	2969	2968	2968
q14	253	220	215	215
q15	561	511	527	511
q16	657	602	584	584
q17	969	484	489	484
q18	7217	6699	6778	6699
q19	1351	1055	1079	1055
q20	488	203	198	198
q21	4002	3207	3229	3207
q22	1088	981	983	981
Total cold run time: 110019 ms
Total hot run time: 40780 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7203	7206	7172	7172
q2	336	225	231	225
q3	3019	2919	2939	2919
q4	2090	1928	1810	1810
q5	5742	5769	5759	5759
q6	237	145	145	145
q7	2237	1887	1859	1859
q8	3372	3569	3468	3468
q9	8908	8934	8886	8886
q10	3622	3553	3554	3553
q11	594	512	491	491
q12	857	599	629	599
q13	9661	3174	3203	3174
q14	319	271	272	271
q15	576	517	511	511
q16	695	641	643	641
q17	1834	1623	1593	1593
q18	8299	7880	7439	7439
q19	1715	1400	1491	1400
q20	2113	1892	1875	1875
q21	5432	5421	5406	5406
q22	1125	1084	1089	1084
Total cold run time: 69986 ms
Total hot run time: 60280 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.28% (9631/25836)
Line Coverage: 28.66% (79832/278590)
Region Coverage: 28.09% (41272/146928)
Branch Coverage: 24.71% (21025/85078)
Coverage Report: http://coverage.selectdb-in.cc/coverage/45c3441f77b3b40c1d49b1434df87b543e1fc96c_45c3441f77b3b40c1d49b1434df87b543e1fc96c/report/index.html

@doris-robot

TPC-DS: Total hot run time: 191742 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 45c3441f77b3b40c1d49b1434df87b543e1fc96c, data reload: false

query1	909	408	413	408
query2	6284	2116	2059	2059
query3	8678	191	205	191
query4	34021	23485	23501	23485
query5	3565	495	464	464
query6	271	171	188	171
query7	4188	321	318	318
query8	294	245	224	224
query9	9542	2688	2670	2670
query10	475	286	279	279
query11	17845	15285	15355	15285
query12	163	99	97	97
query13	1597	451	455	451
query14	9577	7380	6920	6920
query15	247	165	169	165
query16	8054	470	476	470
query17	1676	619	564	564
query18	2181	301	310	301
query19	354	153	151	151
query20	120	109	119	109
query21	212	103	112	103
query22	4938	4734	4552	4552
query23	35404	34362	34001	34001
query24	10987	2869	2881	2869
query25	615	414	406	406
query26	1184	165	161	161
query27	2245	305	300	300
query28	7607	2437	2435	2435
query29	863	417	419	417
query30	257	149	148	148
query31	1013	815	796	796
query32	90	51	54	51
query33	745	290	298	290
query34	899	500	493	493
query35	884	733	743	733
query36	1077	948	934	934
query37	147	82	84	82
query38	4033	3916	3866	3866
query39	1477	1420	1424	1420
query40	206	96	93	93
query41	45	42	45	42
query42	112	93	94	93
query43	539	499	505	499
query44	1218	808	803	803
query45	197	161	169	161
query46	1145	717	702	702
query47	1947	1826	1845	1826
query48	453	346	357	346
query49	834	392	403	392
query50	837	421	416	416
query51	7089	6875	6748	6748
query52	103	87	89	87
query53	254	179	181	179
query54	1194	467	465	465
query55	78	75	73	73
query56	284	276	268	268
query57	1273	1148	1134	1134
query58	229	246	249	246
query59	3324	3002	2970	2970
query60	290	263	269	263
query61	106	99	101	99
query62	835	670	662	662
query63	219	183	184	183
query64	3918	640	614	614
query65	3233	3192	3181	3181
query66	732	295	303	295
query67	15888	15567	15580	15567
query68	3250	594	580	580
query69	431	296	299	296
query70	1103	1137	1048	1048
query71	335	280	288	280
query72	5984	4012	4029	4012
query73	778	358	351	351
query74	9727	9128	9078	9078
query75	3390	2711	2665	2665
query76	2023	972	920	920
query77	400	301	298	298
query78	10573	9561	9500	9500
query79	1121	600	615	600
query80	703	461	452	452
query81	545	242	241	241
query82	256	136	139	136
query83	163	139	143	139
query84	244	79	80	79
query85	1272	310	290	290
query86	358	304	275	275
query87	4379	4325	4259	4259
query88	3636	2434	2416	2416
query89	386	297	284	284
query90	2137	192	184	184
query91	138	108	105	105
query92	59	48	50	48
query93	1044	541	538	538
query94	946	300	295	295
query95	350	255	253	253
query96	605	281	280	280
query97	3227	3106	3165	3106
query98	216	209	191	191
query99	1525	1320	1297	1297
Total cold run time: 292604 ms
Total hot run time: 191742 ms

@doris-robot

ClickBench: Total hot run time: 32.24 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 45c3441f77b3b40c1d49b1434df87b543e1fc96c, data reload: false

query1	0.04	0.04	0.03
query2	0.07	0.03	0.03
query3	0.22	0.06	0.06
query4	1.64	0.10	0.11
query5	0.50	0.52	0.50
query6	1.12	0.72	0.72
query7	0.02	0.04	0.01
query8	0.04	0.03	0.03
query9	0.57	0.49	0.50
query10	0.56	0.53	0.55
query11	0.14	0.11	0.11
query12	0.14	0.11	0.10
query13	0.60	0.59	0.59
query14	2.69	2.77	2.82
query15	0.89	0.83	0.82
query16	0.37	0.38	0.38
query17	1.02	1.01	1.04
query18	0.20	0.20	0.20
query19	1.85	1.86	2.03
query20	0.01	0.01	0.01
query21	15.36	0.62	0.58
query22	2.43	1.85	1.66
query23	17.08	1.01	0.77
query24	2.42	0.75	2.00
query25	0.35	0.17	0.05
query26	0.37	0.14	0.14
query27	0.05	0.04	0.04
query28	10.53	1.09	1.07
query29	12.58	3.25	3.24
query30	0.24	0.06	0.06
query31	2.88	0.38	0.38
query32	3.26	0.46	0.46
query33	2.98	3.01	3.07
query34	17.11	4.44	4.45
query35	4.49	4.47	4.45
query36	0.66	0.49	0.47
query37	0.08	0.06	0.06
query38	0.04	0.04	0.04
query39	0.03	0.02	0.02
query40	0.16	0.13	0.13
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.03
Total cold run time: 105.94 s
Total hot run time: 32.24 s

amorynan commented Oct 9, 2024

run p0

@amorynan

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.32% (9642/25837)
Line Coverage: 28.70% (79952/278610)
Region Coverage: 28.10% (41328/147070)
Branch Coverage: 24.72% (21048/85140)
Coverage Report: http://coverage.selectdb-in.cc/coverage/60f781555d3ce4c1354ba2c87afec41f716b3ad5_60f781555d3ce4c1354ba2c87afec41f716b3ad5/report/index.html

@doris-robot

TPC-H: Total hot run time: 40794 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 60f781555d3ce4c1354ba2c87afec41f716b3ad5, data reload: false

------ Round 1 ----------------------------------
q1	17585	7450	7221	7221
q2	2027	284	269	269
q3	12203	1099	1187	1099
q4	10548	755	752	752
q5	7762	2896	2836	2836
q6	234	152	148	148
q7	972	630	604	604
q8	9360	1967	1969	1967
q9	6514	6414	6396	6396
q10	6967	2329	2282	2282
q11	450	240	245	240
q12	399	214	215	214
q13	17762	3033	2988	2988
q14	246	208	216	208
q15	564	522	525	522
q16	634	590	561	561
q17	991	527	528	527
q18	7354	6640	6833	6640
q19	1347	995	1033	995
q20	482	213	195	195
q21	4004	3134	3687	3134
q22	1104	996	1007	996
Total cold run time: 109509 ms
Total hot run time: 40794 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7382	7200	7259	7200
q2	340	232	228	228
q3	3097	2964	2976	2964
q4	2049	1846	1829	1829
q5	5806	5785	5786	5785
q6	229	151	149	149
q7	2261	1845	1808	1808
q8	3374	3583	3404	3404
q9	8963	8945	8971	8945
q10	3565	3546	3527	3527
q11	598	484	481	481
q12	875	626	612	612
q13	9159	3228	3207	3207
q14	315	292	280	280
q15	587	520	529	520
q16	678	652	645	645
q17	1832	1632	1611	1611
q18	8410	7824	7591	7591
q19	1760	1484	1491	1484
q20	2116	1862	1876	1862
q21	5599	5443	5440	5440
q22	1195	1086	1076	1076
Total cold run time: 70190 ms
Total hot run time: 60648 ms

@doris-robot

TPC-DS: Total hot run time: 191936 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 60f781555d3ce4c1354ba2c87afec41f716b3ad5, data reload: false

query1	891	409	414	409
query2	6276	2088	1984	1984
query3	8678	192	194	192
query4	34053	23499	23360	23360
query5	3496	497	452	452
query6	270	164	164	164
query7	4193	307	302	302
query8	292	224	222	222
query9	9506	2669	2672	2669
query10	472	277	273	273
query11	17867	15280	15259	15259
query12	152	95	97	95
query13	1570	483	430	430
query14	9834	7583	7739	7583
query15	256	167	178	167
query16	7785	473	497	473
query17	1689	668	588	588
query18	1935	333	333	333
query19	369	149	153	149
query20	126	111	109	109
query21	211	107	111	107
query22	4567	4485	4434	4434
query23	34906	34093	34007	34007
query24	11022	2893	2798	2798
query25	616	413	399	399
query26	1159	159	162	159
query27	2239	302	294	294
query28	7177	2411	2409	2409
query29	798	446	423	423
query30	257	165	150	150
query31	1040	800	811	800
query32	95	53	59	53
query33	760	294	297	294
query34	931	518	490	490
query35	876	759	729	729
query36	1096	935	969	935
query37	148	91	92	91
query38	4032	3909	3985	3909
query39	1472	1446	1417	1417
query40	210	95	96	95
query41	50	44	47	44
query42	125	99	100	99
query43	529	486	477	477
query44	1230	819	814	814
query45	194	167	167	167
query46	1170	756	701	701
query47	1916	1860	1819	1819
query48	428	345	355	345
query49	888	414	430	414
query50	820	425	422	422
query51	7171	6827	6907	6827
query52	100	86	92	86
query53	262	185	186	185
query54	1204	480	487	480
query55	78	76	79	76
query56	274	268	271	268
query57	1234	1144	1124	1124
query58	238	228	234	228
query59	3039	2966	2831	2831
query60	288	264	269	264
query61	109	99	102	99
query62	876	671	667	667
query63	221	184	183	183
query64	3962	634	625	625
query65	3224	3189	3189	3189
query66	847	306	313	306
query67	15982	15581	15689	15581
query68	4287	597	587	587
query69	504	290	301	290
query70	1229	1134	1120	1120
query71	355	271	259	259
query72	7382	3944	3971	3944
query73	783	338	362	338
query74	10054	9093	9126	9093
query75	3458	2663	2654	2654
query76	3058	903	956	903
query77	613	296	282	282
query78	10429	9590	9528	9528
query79	2502	582	599	582
query80	1043	450	463	450
query81	562	244	241	241
query82	630	140	135	135
query83	320	133	137	133
query84	272	77	73	73
query85	1264	297	289	289
query86	381	280	297	280
query87	4389	4306	4317	4306
query88	3875	2401	2354	2354
query89	416	286	285	285
query90	1986	183	183	183
query91	144	105	108	105
query92	64	47	48	47
query93	1994	571	550	550
query94	988	297	281	281
query95	358	253	255	253
query96	612	286	287	286
query97	3236	3102	3101	3101
query98	236	210	188	188
query99	1535	1289	1294	1289
Total cold run time: 298740 ms
Total hot run time: 191936 ms

@doris-robot

ClickBench: Total hot run time: 32.41 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 60f781555d3ce4c1354ba2c87afec41f716b3ad5, data reload: false

query1	0.03	0.03	0.03
query2	0.06	0.02	0.03
query3	0.23	0.06	0.06
query4	1.63	0.10	0.10
query5	0.52	0.50	0.49
query6	1.14	0.73	0.73
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.55	0.51	0.49
query10	0.56	0.53	0.54
query11	0.14	0.10	0.12
query12	0.14	0.11	0.11
query13	0.62	0.60	0.60
query14	2.71	2.73	2.68
query15	0.90	0.83	0.83
query16	0.39	0.40	0.38
query17	1.06	1.06	1.10
query18	0.20	0.19	0.20
query19	1.96	1.90	1.96
query20	0.01	0.01	0.01
query21	15.36	0.60	0.58
query22	2.59	3.12	1.47
query23	17.00	1.09	0.75
query24	2.73	1.47	1.19
query25	0.27	0.26	0.04
query26	0.36	0.14	0.13
query27	0.05	0.04	0.04
query28	10.38	1.10	1.07
query29	12.60	3.28	3.26
query30	0.24	0.06	0.06
query31	2.85	0.38	0.38
query32	3.29	0.46	0.46
query33	2.98	3.01	3.04
query34	16.74	4.42	4.50
query35	4.54	4.50	4.42
query36	0.69	0.48	0.49
query37	0.08	0.06	0.06
query38	0.04	0.03	0.03
query39	0.03	0.02	0.02
query40	0.15	0.12	0.12
query41	0.06	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.02
Total cold run time: 106 s
Total hot run time: 32.41 s

@amorynan

run p0

@github-actions github-actions bot added the approved label (Indicates a PR has been approved by one committer.) Oct 10, 2024

PR approved by at least one committer and no changes requested.

PR approved by anyone and no changes requested.

@HappenLee HappenLee left a comment

LGTM

@xiaokang xiaokang left a comment

LGTM

@xiaokang xiaokang merged commit 641c4dc into apache:master Oct 10, 2024
27 of 31 checks passed
eldenmoon pushed a commit to eldenmoon/incubator-doris that referenced this pull request Oct 10, 2024
The resize function of the array/map columns is called from the shrink
function defined in IColumn, and set_num_rows in Block.cpp calls shrink
to change the number of rows in a column; the new row count may be
smaller or larger than the current one.
When the row count gets smaller (the rows are cut), the nested data in
the array/map column must also be resized to the smaller number of rows;
otherwise an exception is raised.
amorynan added a commit to amorynan/doris that referenced this pull request Oct 11, 2024
cjj2010 pushed a commit to cjj2010/doris that referenced this pull request Oct 12, 2024
eldenmoon pushed a commit that referenced this pull request Oct 15, 2024
eldenmoon pushed a commit that referenced this pull request Oct 15, 2024
## Proposed changes
backport: #41595
Issue Number: close #xxx

<!--Describe your changes.-->
amorynan added a commit to amorynan/doris that referenced this pull request Oct 17, 2024
Labels: approved (Indicates a PR has been approved by one committer.), dev/2.0.16-merged, dev/2.1.7-merged, dev/3.0.3-merged, lfs-detected! (Warning Label for use when LFS is detected in the commits of a Pull Request), p0_c, reviewed

6 participants