[Enhancement] check partition scan number limit in resource group #53916
base: main
if (partitionScanNumberLimitRule != null) {
    twg.setPartition_scan_number_limit_rule(partitionScanNumberLimitRule);
}

twg.setExclusive_cpu_cores(getNormalizedExclusiveCpuCores());

twg.setVersion(version);
The most risky bug in this code is:
The method getPartitionScanNumberLimitRule() may return null, which can lead to a NullPointerException if the returned value is used without a null check. This risk arises when clients of this method assume that the rule is never null and don't implement proper error handling.
You can modify the code like this:
public String getPartitionScanNumberLimitRule() {
return partitionScanNumberLimitRule != null ? partitionScanNumberLimitRule : "";
}
This modification ensures that the method returns an empty string instead of null, reducing the chance of a NullPointerException.
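One consequence of returning an empty string is that callers which test only for null would then pass the empty string on to the JSON parser. A minimal sketch of a caller-side guard that treats both null and blank as "no rule configured" follows; the class and the hasRule helper are hypothetical and not part of the PR.

```java
public class RuleGuardSketch {
    // Hypothetical helper: a rule counts as present only if it is
    // neither null nor blank, so both getter conventions are covered.
    static boolean hasRule(String ruleJson) {
        return ruleJson != null && !ruleJson.trim().isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(hasRule(null));                 // getter returned null
        System.out.println(hasRule(""));                   // getter returned ""
        System.out.println(hasRule("{\"db_1.tbl_1\":7}")); // a configured rule
    }
}
```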
public void setCheckPartitionScanNumberLimitWhenExplain(boolean checkPartitionScanNumberLimitWhenExplain) {
    this.checkPartitionScanNumberLimitWhenExplain = checkPartitionScanNumberLimitWhenExplain;
}

// Serialize to thrift object
// used for rest api
public TQueryOptions toThrift() {
The most risky bug in this code is:
Missing serialization for new variables checkPartitionScanNumberLimit and checkPartitionScanNumberLimitWhenExplain in toThrift() method.
You can modify the code like this:
// Serialize to thrift object
// used for rest api
public TQueryOptions toThrift() {
TQueryOptions queryOptions = new TQueryOptions();
// Existing serialization logic here
// Add serialization for new variables
queryOptions.setCheckPartitionScanNumberLimit(checkPartitionScanNumberLimit);
queryOptions.setCheckPartitionScanNumberLimitWhenExplain(checkPartitionScanNumberLimitWhenExplain);
return queryOptions;
}
                        " group " + workGroup.getName());
            }
        }
    }
}
The most risky bug in this code is:
Potential NullPointerException when accessing workGroup.getPartition_scan_number_limit_rule().
You can modify the code like this:
private void checkPartitionScanNumberLimit() throws StarRocksException {
    // Skip the check for EXPLAIN unless the session explicitly enables it.
    if (connectContext.isExplain()
            && !connectContext.getSessionVariable().isCheckPartitionScanNumberLimitWhenExplain()) {
        return;
    }
    if (!connectContext.getSessionVariable().isCheckPartitionScanNumberLimit()) {
        return;
    }
    // No resource group, or no rule configured on it: nothing to enforce.
    TWorkGroup workGroup = jobSpec.getResourceGroup();
    if (workGroup == null || workGroup.getPartition_scan_number_limit_rule() == null) {
        return;
    }
    // The rule is a JSON object mapping "db.table" to a maximum partition count.
    Map<String, Integer> rule = GsonUtils.GSON.fromJson(workGroup.getPartition_scan_number_limit_rule(),
            new TypeToken<Map<String, Integer>>() {
            }.getType());
    for (ScanNode scanNode : jobSpec.getScanNodes()) {
        // Only OLAP scans are subject to the limit.
        if (!(scanNode instanceof OlapScanNode)) {
            continue;
        }
        TableName tableName =
                connectContext.getResolvedTables().get(((OlapScanNode) scanNode).getOlapTable().getId());
        if (tableName == null) {
            continue;
        }
        String tblName = tableName.getDb() + "." + tableName.getTbl();
        Integer limit = rule.get(tblName);
        // Tables without an entry in the rule are not limited.
        if (limit == null) {
            continue;
        }
        if (((OlapScanNode) scanNode).getSelectedPartitionIds().size() > limit) {
            throw new StarRocksException(tblName + " scans more than " + limit +
                    " partition(s), which violates the limit defined in partition_scan_number_limit_rule in resource" +
                    " group " + workGroup.getName());
        }
    }
}
I saw a similar PR for a partition scan limit in #53747. It might be good to bring the two together and discuss how this feature should be designed.
Signed-off-by: kaijian.ding <[email protected]>
[Java-Extensions Incremental Coverage Report] ✅ pass: 0 / 0 (0%)
[FE Incremental Coverage Report] ✅ pass: 59 / 67 (88.06%)
[BE Incremental Coverage Report] ✅ pass: 0 / 0 (0%)
Why I'm doing:
Limit the number of partitions a query may scan, to protect StarRocks from heavy load.
What I'm doing:
Fixes #issue
Add a limit rule to the resource group.
For example:
partition_scan_number_limit_rule: {"db_1.tbl_1":7, "db_1.tbl_2":10}
If a table scan in a single sub-query selects more partitions than the limit configured for that table, the query is rejected.
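To make the rule semantics concrete, here is a minimal, self-contained sketch of the per-scan check described above. It is an illustration only, not the PR's code: the class and method names are hypothetical, the rule is built directly as a map instead of being parsed from JSON, and the partition count is passed in as a plain integer.

```java
import java.util.Map;

public class PartitionScanLimitSketch {
    // Hypothetical helper: returns an error message if the scan selects more
    // partitions than the table's limit, or null if the scan is allowed.
    static String check(Map<String, Integer> rule, String db, String tbl, int selectedPartitions) {
        String key = db + "." + tbl;
        Integer limit = rule.get(key);
        if (limit == null) {
            return null; // no entry for this table, so no limit applies
        }
        if (selectedPartitions > limit) {
            return key + " scans more than " + limit + " partition(s)";
        }
        return null;
    }

    public static void main(String[] args) {
        // Same limits as the example rule in the description.
        Map<String, Integer> rule = Map.of("db_1.tbl_1", 7, "db_1.tbl_2", 10);
        System.out.println(check(rule, "db_1", "tbl_1", 7));   // exactly at the limit: allowed
        System.out.println(check(rule, "db_1", "tbl_1", 8));   // over the limit: rejected
        System.out.println(check(rule, "db_1", "tbl_3", 100)); // table not in the rule: allowed
    }
}
```

Note that the comparison is strict, so a scan that selects exactly the configured number of partitions still passes.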
What type of PR is this:
Does this PR entail a change in behavior?
Checklist:
Bugfix cherry-pick branch check: