[fix]Remove be special handling of date types and bugs in registration functions #45159

Open
wants to merge 26 commits into
base: master

koarz
Contributor

@koarz koarz commented Dec 8, 2024

What problem does this PR solve?

  1. The BE applies special handling to date types in get_function. This PR removes that code.
  2. Some functions take the same parameters but have multiple return types. Function registration did not handle this case, so only one copy of such a function was kept (the BE ended up with a single return type). Registration now appends the return type to key_str to tell these functions apart.
  3. If a function's return type depends on its parameters, and the parameter types can only be determined at runtime, the function should override dont_append_return_type_name_when_register_function to return true so that get_function still works.
  4. Some of the date functions are not always nullable on the BE, so the nullable handling of the corresponding FE functions has been changed.
  5. If a function has an explicit return type and dont_append_return_type_name_when_register_function returns false, it must override get_return_type(const DataTypes) const.
  6. If the function cannot return its type directly, it must also override the ColumnsWithTypeAndName version. For example, given
get_return_type(const DataTypes arguments) override {
  if (arguments[0] is null) {
    return Nullable(Type1)
  }
  if (...) {
    return ...Type1
  }
  return Type1
}
the function always returns Type1, but may first check the argument types.
Override it as
get_return_type(const ColumnsWithTypeAndName arguments) override {
  if (arguments[0].type is null) {
    return Nullable(Type1)
  }
  if (...) {
    return ...Type1
  }
  return Type1
}
get_return_type(const DataTypes arguments) override {
  return Type1
}
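
The two-overload pattern from item 6 could look roughly like the self-contained C++ sketch below. All of the types here (DataType, ColumnWithTypeAndName, FunctionSketch, make_type) are hypothetical stand-ins written for illustration, not the Doris classes, and the sketch assumes that "is null" in the pseudocode means "is nullable". The point is only that the types-only overload returns the fixed type unconditionally, while the argument-aware overload carries the checks.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the Doris types.
struct DataType {
    std::string name;
    bool nullable;
};
using DataTypePtr = std::shared_ptr<const DataType>;
using DataTypes = std::vector<DataTypePtr>;

struct ColumnWithTypeAndName {
    DataTypePtr type;
    std::string name;
};
using ColumnsWithTypeAndName = std::vector<ColumnWithTypeAndName>;

inline DataTypePtr make_type(std::string name, bool nullable = false) {
    return std::make_shared<DataType>(DataType{std::move(name), nullable});
}

class FunctionSketch {
public:
    // Types-only overload: unconditionally the fixed type, so it can be
    // queried without inspecting real arguments.
    DataTypePtr get_return_type(const DataTypes& /*arguments*/) const {
        return make_type("Type1");
    }

    // Argument-aware overload: same base type, plus the argument checks.
    // Assumption: "is null" in the description is read as "is nullable".
    DataTypePtr get_return_type(const ColumnsWithTypeAndName& arguments) const {
        if (!arguments.empty() && arguments[0].type->nullable) {
            return make_type("Type1", /*nullable=*/true);
        }
        return make_type("Type1");
    }
};

// Usage example: both overloads agree on the base type name.
inline void return_type_usage_example() {
    FunctionSketch fn;
    ColumnsWithTypeAndName args;
    args.push_back(ColumnWithTypeAndName{make_type("Int32", true), "c0"});
    assert(fn.get_return_type(DataTypes{})->name == "Type1");
    assert(fn.get_return_type(args)->name == "Type1");
    assert(fn.get_return_type(args)->nullable);
}
```

Keeping the types-only overload free of conditions means its result can be used as a stable part of a registration key, while nullability decisions stay in the overload that actually sees the arguments.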

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@koarz
Contributor Author

koarz commented Dec 8, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 38.99% (10174/26092)
Line Coverage: 29.65% (84356/284532)
Region Coverage: 28.71% (43296/150786)
Branch Coverage: 25.21% (21915/86932)
Coverage Report: http://coverage.selectdb-in.cc/coverage/ccad421c5072729e13a277000d4e9236994c607b_ccad421c5072729e13a277000d4e9236994c607b/report/index.html

@doris-robot

TPC-H: Total hot run time: 39869 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ccad421c5072729e13a277000d4e9236994c607b, data reload: false

------ Round 1 ----------------------------------
q1	17624	7493	7287	7287
q2	2042	179	162	162
q3	10769	1071	1215	1071
q4	10514	728	756	728
q5	7596	2712	2673	2673
q6	237	147	147	147
q7	984	611	607	607
q8	9232	1880	1924	1880
q9	6592	6501	6455	6455
q10	6988	2310	2325	2310
q11	462	262	241	241
q12	426	221	229	221
q13	17757	3018	3021	3018
q14	247	223	208	208
q15	570	548	529	529
q16	639	579	583	579
q17	970	576	556	556
q18	7143	6694	6579	6579
q19	1337	1066	965	965
q20	458	181	179	179
q21	4041	3157	3186	3157
q22	375	317	318	317
Total cold run time: 107003 ms
Total hot run time: 39869 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7254	7273	7232	7232
q2	327	229	229	229
q3	2862	2792	2916	2792
q4	2047	1805	1826	1805
q5	5659	5646	5642	5642
q6	228	142	144	142
q7	2241	1836	1808	1808
q8	3366	3558	3491	3491
q9	9009	9049	9094	9049
q10	3594	3522	3528	3522
q11	605	511	499	499
q12	817	639	612	612
q13	11381	3218	3213	3213
q14	322	286	275	275
q15	577	530	519	519
q16	693	649	655	649
q17	1861	1607	1598	1598
q18	8259	7663	7603	7603
q19	1685	1571	1523	1523
q20	2093	1867	1838	1838
q21	5578	5447	5386	5386
q22	655	563	567	563
Total cold run time: 71113 ms
Total hot run time: 59990 ms

@doris-robot

TPC-DS: Total hot run time: 197769 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ccad421c5072729e13a277000d4e9236994c607b, data reload: false

query1	1250	964	948	948
query2	6230	2080	2039	2039
query3	10951	4439	4404	4404
query4	67345	28907	23662	23662
query5	4949	470	468	468
query6	423	193	188	188
query7	5473	310	305	305
query8	325	250	247	247
query9	8617	2632	2627	2627
query10	418	249	244	244
query11	17068	15200	15974	15200
query12	156	111	103	103
query13	1455	447	431	431
query14	9929	7433	7393	7393
query15	214	174	187	174
query16	7068	457	477	457
query17	1421	563	569	563
query18	1774	308	296	296
query19	201	148	148	148
query20	121	118	111	111
query21	211	103	97	97
query22	4756	4695	4684	4684
query23	34865	34428	34470	34428
query24	5463	2491	2555	2491
query25	523	382	383	382
query26	652	155	159	155
query27	1837	279	288	279
query28	4539	2484	2453	2453
query29	691	412	426	412
query30	208	157	151	151
query31	1036	864	839	839
query32	65	53	56	53
query33	462	277	281	277
query34	952	506	534	506
query35	907	785	762	762
query36	1102	971	964	964
query37	121	71	76	71
query38	4481	4351	4380	4351
query39	1521	1477	1468	1468
query40	198	103	97	97
query41	42	43	42	42
query42	114	97	102	97
query43	536	495	488	488
query44	1248	826	833	826
query45	184	173	170	170
query46	1171	737	725	725
query47	2066	1949	1916	1916
query48	423	306	323	306
query49	717	416	392	392
query50	834	398	397	397
query51	7343	7062	7076	7062
query52	95	83	87	83
query53	252	178	179	178
query54	509	400	405	400
query55	78	74	73	73
query56	277	244	249	244
query57	1263	1100	1137	1100
query58	224	225	217	217
query59	3165	3087	2970	2970
query60	289	253	261	253
query61	133	123	128	123
query62	785	670	702	670
query63	222	197	193	193
query64	1339	674	669	669
query65	3308	3198	3246	3198
query66	691	301	296	296
query67	15941	15675	15737	15675
query68	3723	592	587	587
query69	425	255	246	246
query70	1176	1118	1048	1048
query71	387	252	245	245
query72	6377	4078	4083	4078
query73	765	350	354	350
query74	10204	9067	8964	8964
query75	3385	2664	2659	2659
query76	1839	1128	1126	1126
query77	461	273	265	265
query78	10531	9602	9509	9509
query79	1612	587	593	587
query80	1020	428	434	428
query81	483	231	245	231
query82	1240	120	116	116
query83	159	148	154	148
query84	282	68	69	68
query85	930	296	298	296
query86	330	292	307	292
query87	4771	4582	4696	4582
query88	3734	2158	2127	2127
query89	423	302	286	286
query90	2035	185	190	185
query91	137	100	106	100
query92	61	49	48	48
query93	1922	558	557	557
query94	760	294	282	282
query95	365	247	244	244
query96	617	278	273	273
query97	2895	2666	2675	2666
query98	226	188	195	188
query99	1603	1297	1305	1297
Total cold run time: 318105 ms
Total hot run time: 197769 ms

@doris-robot

ClickBench: Total hot run time: 33.24 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ccad421c5072729e13a277000d4e9236994c607b, data reload: false

query1	0.03	0.03	0.06
query2	0.07	0.03	0.04
query3	0.23	0.08	0.07
query4	1.61	0.10	0.11
query5	0.43	0.44	0.40
query6	1.15	0.64	0.65
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.58	0.51	0.50
query10	0.57	0.54	0.57
query11	0.13	0.10	0.10
query12	0.14	0.11	0.11
query13	0.60	0.61	0.58
query14	2.74	2.81	2.87
query15	0.90	0.82	0.82
query16	0.39	0.39	0.40
query17	1.05	1.04	1.02
query18	0.22	0.21	0.20
query19	1.90	1.85	1.95
query20	0.01	0.01	0.01
query21	15.37	0.61	0.60
query22	2.30	2.50	2.01
query23	16.88	1.13	0.79
query24	2.99	1.27	1.78
query25	0.27	0.24	0.13
query26	0.48	0.14	0.14
query27	0.06	0.04	0.04
query28	9.81	1.10	1.08
query29	12.56	3.28	3.27
query30	0.25	0.06	0.06
query31	2.88	0.38	0.37
query32	3.28	0.47	0.46
query33	2.97	3.06	3.09
query34	16.83	4.46	4.50
query35	4.50	4.48	4.47
query36	0.66	0.48	0.50
query37	0.09	0.06	0.06
query38	0.05	0.03	0.04
query39	0.03	0.02	0.02
query40	0.17	0.13	0.13
query41	0.08	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.39 s
Total hot run time: 33.24 s

@koarz
Contributor Author

koarz commented Dec 8, 2024

run p0

Contributor

@zclllyybb zclllyybb left a comment

Add some BE unit tests showing that the check fails when the wrong return type is used (that is, mock the FE sending a function signature with the wrong return type).

@@ -158,19 +167,21 @@ class SimpleFunctionFactory {
     FunctionBasePtr get_function(const std::string& name, const ColumnsWithTypeAndName& arguments,
                                  const DataTypePtr& return_type, const FunctionAttr& attr = {},
                                  int be_version = BeExecVersionManager::get_newest_version()) {
-        std::string key_str = name;
+        std::string key_str, ori_name = name;
Contributor

Split the definition into two statements, and use origin_name.

Contributor

Add a comment explaining these two variables.
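
As a rough illustration of the two review suggestions above, the sketch below declares the two variables separately, comments each one, and appends the return type name to the key. The helper make_function_key, its parameters, and the key layout (underscore-joined argument types followed by "->" and the return type) are all assumptions made for this example; the actual key format used by SimpleFunctionFactory is not shown in this conversation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not the Doris implementation. It builds a lookup key
// from the function name, the argument type names and, optionally, the
// return type name.
inline std::string make_function_key(const std::string& name,
                                     const std::vector<std::string>& argument_type_names,
                                     const std::string& return_type_name,
                                     bool append_return_type) {
    // origin_name: the name exactly as the caller passed it, kept unchanged
    // so it can still be reported in error messages.
    const std::string origin_name = name;
    // key_str: the registry key derived from origin_name.
    std::string key_str = origin_name;

    for (const auto& arg : argument_type_names) {
        key_str += '_';
        key_str += arg;
    }
    if (append_return_type) {
        // Assumed separator; it only needs to be unambiguous.
        key_str += "->";
        key_str += return_type_name;
    }
    return key_str;
}

// Usage example: same name and arguments, different return types.
inline void key_usage_example() {
    const std::string a = make_function_key("f", {"Int32"}, "Date", true);
    const std::string b = make_function_key("f", {"Int32"}, "DateV2", true);
    assert(a != b);  // distinct keys, so both variants can be registered
}
```

With the return type left out of the key, the two variants collapse onto one entry, which is the situation item 2 of the description refers to.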

@@ -299,6 +299,10 @@ class FunctionBuilderImpl : public IFunctionBuilder {

ColumnNumbers get_arguments_that_are_always_constant() const override { return {}; }

// if a function's get_variadic_argument_types() not override and get_return_type_impl()
// result is not compile time be sure, the function should override return true
virtual bool dont_append_return_type_name_when_register_function() { return false; }
Contributor

Implement it in IFunctionBase so that the check can be done in the function factory.
If that is not possible, add a comment explaining why.

Contributor Author

The Creator& ptr argument of register_function() is a FunctionBuilder, so this cannot be implemented in IFunctionBase. It has been moved to IFunctionBuilder.
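
The constraint described in this reply can be pictured with a miniature factory: because registration only receives a builder, the hook has to be reachable from the builder interface. In the sketch below, only the hook name dont_append_return_type_name_when_register_function comes from the PR; BuilderSketch, SimpleBuilder, RuntimeTypedBuilder, FactorySketch, the "->" key separator, and the function names "to_date" and "if_null" are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical miniature of the builder/factory relationship.
class BuilderSketch {
public:
    virtual ~BuilderSketch() = default;
    virtual std::string get_name() const = 0;
    virtual std::string return_type_name() const = 0;
    // Default: the return type is fixed, so registration appends it to the key.
    virtual bool dont_append_return_type_name_when_register_function() const { return false; }
};

// A builder whose return type is known up front.
class SimpleBuilder : public BuilderSketch {
public:
    SimpleBuilder(std::string name, std::string ret)
            : _name(std::move(name)), _ret(std::move(ret)) {}
    std::string get_name() const override { return _name; }
    std::string return_type_name() const override { return _ret; }

private:
    std::string _name;
    std::string _ret;
};

// A builder whose return type is only known at runtime: it opts out.
class RuntimeTypedBuilder : public SimpleBuilder {
public:
    explicit RuntimeTypedBuilder(std::string name) : SimpleBuilder(std::move(name), "") {}
    bool dont_append_return_type_name_when_register_function() const override { return true; }
};

class FactorySketch {
public:
    // The factory only ever sees the builder, so the hook is queried here.
    void register_function(std::shared_ptr<BuilderSketch> builder) {
        std::string key_str = builder->get_name();
        if (!builder->dont_append_return_type_name_when_register_function()) {
            key_str += "->" + builder->return_type_name();
        }
        _builders[key_str] = std::move(builder);
    }
    bool contains(const std::string& key_str) const { return _builders.count(key_str) > 0; }
    std::size_t size() const { return _builders.size(); }

private:
    std::map<std::string, std::shared_ptr<BuilderSketch>> _builders;
};

// Usage example: two fixed-type variants coexist, the runtime-typed one uses the bare name.
inline void factory_usage_example() {
    FactorySketch factory;
    factory.register_function(std::make_shared<SimpleBuilder>("to_date", "Date"));
    factory.register_function(std::make_shared<SimpleBuilder>("to_date", "DateV2"));
    factory.register_function(std::make_shared<RuntimeTypedBuilder>("if_null"));
    assert(factory.size() == 3);
    assert(factory.contains("to_date->Date"));
    assert(factory.contains("if_null"));
}
```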

Contributor

Add a regression test covering null.

Contributor

Revert this.


@Override
public Expression withConstantArgs(Expression literal) {
return new Date(literal);
Contributor

Why not return new ToDate(literal)? Adding Monotonic to Datev2, ToDate, and ToDateV2 may need some tests for partition pruning. Can we leave Monotonic out for Datev2, ToDate, and ToDateV2 in this PR?


@Override
public Expression withConstantArgs(Expression literal) {
return new Date(literal);
Contributor

Why not return new ToDateV2(literal)?

@koarz
Contributor Author

koarz commented Dec 13, 2024

run buildall

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -17,6 +17,7 @@

#pragma once

#include <glog/logging.h>
Contributor

warning: 'glog/logging.h' file not found [clang-diagnostic-error]

#include <glog/logging.h>
         ^

@koarz
Contributor Author

koarz commented Dec 13, 2024

run buildall

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

be/test/vec/function/function_test_util.h (resolved)
be/test/vec/function/function_time_test.cpp (outdated, resolved)
@koarz
Contributor Author

koarz commented Dec 13, 2024

run buildall

@koarz
Contributor Author

koarz commented Dec 13, 2024

run performance

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 39.27% (10277/26171)
Line Coverage: 29.84% (85294/285811)
Region Coverage: 28.87% (43720/151448)
Branch Coverage: 25.32% (22119/87342)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2c93a9b138e2cf40b39b9bba6afa41065258fa22_2c93a9b138e2cf40b39b9bba6afa41065258fa22/report/index.html

4 participants