[Enhancement] add partition scan num limit when query internal olap table #53747
Signed-off-by: MatthewH00 <[email protected]>
Signed-off-by: hmx <[email protected]>
@kevincai Hi, could you please review this PR when you have free time?
Please add a UT to enforce the behavior of the session variable. Sqltester is add-on testing for cases where the code path is too complicated to cover in a UT.
LOG.warn("fail to get variable scan_olap_partition_num_limit, set default value 0, msg: {}", e.getMessage());
}
if (scanOlapPartitionNumLimit > 0 && selectedPartitionNum > scanOlapPartitionNumLimit) {
    String msg = "Exceeded the limit of " + scanOlapPartitionNumLimit + " max scan olap partitions. " +
Exceeded the limit of number of olap table partitions to be scanned. Number of partitions allowed: {}, number of partitions to be scanned: {}. Please adjust the SQL or change the limit ...
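A minimal sketch of the check with the reworded message suggested above, assuming (as the PR description states) that a limit of 0 disables the check. The class name, the exception type, and the method signature are hypothetical stand-ins, not the StarRocks classes touched by this PR.

```java
// Hypothetical stand-alone sketch; names are illustrative, not StarRocks code.
final class PartitionScanLimitSketch {

    /** Thrown when a query would scan more partitions than allowed. */
    static final class ScanLimitExceededException extends RuntimeException {
        ScanLimitExceededException(String message) {
            super(message);
        }
    }

    /**
     * Rejects the scan when the limit is enabled (greater than 0) and the
     * number of selected partitions is above it.
     */
    static void checkScanPartitionLimit(long limit, int selectedPartitionNum) {
        if (limit > 0 && selectedPartitionNum > limit) {
            throw new ScanLimitExceededException(String.format(
                    "Exceeded the limit of number of olap table partitions to be scanned. "
                            + "Number of partitions allowed: %d, number of partitions to be scanned: %d.",
                    limit, selectedPartitionNum));
        }
    }
}
```

Putting both numbers in the message lets the user see how far the query is over the limit without looking up the session variable.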
    checkScanPartitionLimit(selectedPartitionIds.size());
} catch (AnalysisException e) {
    LOG.warn("{} queryId: {}", e.getMessage(), DebugUtil.printId(ConnectContext.get().getQueryId()));
    throw new StarRocksPlannerException(e.getMessage(), ErrorType.INTERNAL_ERROR);
INTERNAL_ERROR or USER_ERROR?
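One way to read this question: the query was rejected because of a limit the user configured, not because the planner failed. The sketch below tags the rejection as a user error on that assumption. The thread does not record which value was finally chosen, and `ErrorKind` and `PlannerError` are hypothetical stand-ins for the real StarRocks types.

```java
// Hypothetical sketch; ErrorKind and PlannerError are stand-ins, not StarRocks classes.
final class ErrorTypeSketch {

    enum ErrorKind { INTERNAL_ERROR, USER_ERROR }

    /** Planner failure that carries a classification for the caller. */
    static final class PlannerError extends RuntimeException {
        final ErrorKind kind;

        PlannerError(String message, ErrorKind kind) {
            super(message);
            this.kind = kind;
        }
    }

    static void planScan(long limit, int selectedPartitionNum) {
        if (limit > 0 && selectedPartitionNum > limit) {
            // The planner worked as intended; the query exceeded a user-configured
            // limit, so this sketch reports a user error rather than an internal one.
            throw new PlannerError(
                    "partition scan limit exceeded: allowed " + limit
                            + ", to be scanned " + selectedPartitionNum,
                    ErrorKind.USER_ERROR);
        }
    }
}
```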
You mean setting the variable to different values, like
Direct unit test cases are preferred in fe/fe-core/
Signed-off-by: hmx <[email protected]>
@kevincai Please review again. I added a UT to the PR and fixed the problems you raised above.
Wondering why
the function
@kevincai Could you please help push this PR forward and find other R&D engineers to review it? For the
Signed-off-by: hmx <[email protected]>
@kaijianding I also tend to keep both. In your PR #53916, you set the scan limit for each table, but the parameter name is partition_scan_number_limit_rule. Can the two parameters be unified? That is, set one scan limit that takes effect for all partitions instead of specifying it separately for each table.
I think every table should have its own limit. In a complex query, a bigger table should have a smaller limit, and a smaller table may not need a limit at all.
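The two positions in this exchange can be combined. A rough sketch, assuming per-table limits are available as a simple map and the session-wide limit acts as the fallback; the map, the class, and the method names are hypothetical, and the actual format of partition_scan_number_limit_rule is not shown in this thread.

```java
import java.util.Map;

// Hypothetical sketch; not the implementation from this PR or from #53916.
final class PerTableLimitSketch {

    /** Per-table limit if one is configured, otherwise the session-wide limit. */
    static long effectiveLimit(Map<String, Long> perTableLimits, String tableName, long sessionLimit) {
        return perTableLimits.getOrDefault(tableName, sessionLimit);
    }

    /** A limit of 0 or less means "no limit", matching the default described in this PR. */
    static boolean isScanAllowed(Map<String, Long> perTableLimits, String tableName,
                                 long sessionLimit, int selectedPartitionNum) {
        long limit = effectiveLimit(perTableLimits, tableName, sessionLimit);
        return limit <= 0 || selectedPartitionNum <= limit;
    }
}
```

With this shape, a known big table gets a tight limit of its own, while every other table is governed by the single session value.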
The row limit of large tables should be larger than that of small tables, so is it okay to use just one value directly? Setting different rules for different tables seems uncommon. Moreover, table data is dynamic. If different values are set for each table, does the user need to adjust the rules frequently? If we only consider limiting resource usage, it is more reasonable to set the same threshold for all tables.
This rule limits the number of partitions scanned; it is not a row limit. In my prod env, this rule has not been adjusted since its creation, because we knew from the beginning which big tables should have their partition scan number limited.
I think the resource group is bound to the computing resources, rather than to the table or even the partition. The partition limit seems to be applicable only when the user clearly knows the size of their table. It is difficult to set this value in scenarios where the size cannot be clearly estimated.
@kevincai The PR passed code review last week. Could you help merge it into the main branch when you have free time?
Users can modify this rule according to their query needs after their table has data. It's easy to know the size of a table or a partition by
Yes, I think the purpose of limiting the partition scan number is that computing resources are limited; a query should be rejected if it can occupy too many resources.
@Mergifyio backport branch-3.4
@Mergifyio backport branch-3.3
✅ Backports have been created
✅ Backports have been created
…able (#53747) Why I'm doing: when query big size internal olap table with full table scan or scan too many partitions, would cause BE/CN node high load, lead to cluster instability. What I'm doing: add a new FE session variable scan_olap_partition_num_limit to limit partition scan num when query internal olap table. (default value is 0, means no limitation) Signed-off-by: MatthewH00 <[email protected]> Signed-off-by: hmx <[email protected]> (cherry picked from commit a0a25b4)
…able (#53747) Why I'm doing: when query big size internal olap table with full table scan or scan too many partitions, would cause BE/CN node high load, lead to cluster instability. What I'm doing: add a new FE session variable scan_olap_partition_num_limit to limit partition scan num when query internal olap table. (default value is 0, means no limitation) Signed-off-by: MatthewH00 <[email protected]> Signed-off-by: hmx <[email protected]> (cherry picked from commit a0a25b4) # Conflicts: # fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
…able (backport #53747) (#54352) Co-authored-by: hmx <[email protected]>
…able (backport #53747) (#54353) Signed-off-by: Kevin Xiaohua Cai <[email protected]> Co-authored-by: hmx <[email protected]> Co-authored-by: Kevin Xiaohua Cai <[email protected]>
Refs StarRocks#54353, StarRocks#53747 Signed-off-by: Kevin Xiaohua Cai <[email protected]>
Why I'm doing:
Querying a large internal OLAP table with a full table scan, or scanning too many partitions, can put high load on BE/CN nodes and lead to cluster instability.
What I'm doing:
Add a new FE session variable scan_olap_partition_num_limit to limit the number of partitions scanned when querying an internal OLAP table. The default value is 0, which means no limit.