[fix](nereids)Solve the problem of pruning wrong partitions in multi-column partition pruning #43332

feiniaofeiafei · 2024-11-06T06:56:45Z

What problem does this PR solve?

For example, consider a partition defined as PARTITION BY RANGE (a, dt) with bounds [(0, '2024-01-01 00:00:00'), (10, '2024-01-10 00:00:00')). Given the predicate WHERE a = 0 AND date_trunc(dt, 'day') <= '2024-01-10 00:00:00', partition pruning expands the partition range into:
a = 0, dt in ['2024-01-01 00:00:00', +∞)
a = 1, dt in (-∞, +∞)
a = 2, dt in (-∞, +∞)
...
a = 10, dt in (-∞, '2024-01-10 00:00:00')
Each of these eleven ranges will be evaluated against the predicate. If all evaluations return False, the partition can be pruned.
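The expansion listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Nereids implementation: the function name `expand_partition` and the tuple layout are assumptions made for this example, and `None` stands for an unbounded side.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# (a, dt_lower, dt_upper); None means the side is unbounded.
# dt_lower is inclusive and dt_upper is exclusive, matching the [lower, upper) partition.
ExpandedRange = Tuple[int, Optional[datetime], Optional[datetime]]


def expand_partition(
    lower: Tuple[int, datetime], upper: Tuple[int, datetime]
) -> List[ExpandedRange]:
    """Hypothetical helper: enumerate the first column and derive the dt range for each value."""
    lo_a, lo_dt = lower
    hi_a, hi_dt = upper
    ranges: List[ExpandedRange] = []
    for a in range(lo_a, hi_a + 1):
        # Only the first value of a inherits the lower dt bound,
        # and only the last value of a inherits the upper dt bound.
        dt_lo = lo_dt if a == lo_a else None
        dt_hi = hi_dt if a == hi_a else None
        ranges.append((a, dt_lo, dt_hi))
    return ranges


expanded = expand_partition(
    (0, datetime(2024, 1, 1)),
    (10, datetime(2024, 1, 10)),
)
```

Here `expanded` holds the eleven ranges from the description, one per value of a from 0 to 10.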
During the evaluation of the first range (a = 0, dt in ['2024-01-01 00:00:00', +∞)), the range of date_trunc(dt, 'day') is computed as ['2024-01-01', +∞) and stored in rangeMap. Subsequent evaluations (e.g., for a = 2, dt in (-∞, +∞)) then reuse this cached range, which is incorrect: for a = 2, the range of date_trunc(dt, 'day') should be (-∞, +∞).
Because of this reuse, the range a = 2, dt in (-∞, +∞) incorrectly evaluates to False, and the partition is pruned when it should not be.
The fix is to move rangeMap into the evaluation context, so that a new rangeMap is constructed for each evaluation.
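The difference between a shared rangeMap and a per-evaluation rangeMap can be shown with a small Python sketch. Everything in it is an assumption made for illustration: the class names `SharedCacheEvaluator` and `EvalContext` are hypothetical (the commit list names the real context class as EvaluateRangeInput), and the predicate `date_trunc(dt, 'day') < '2024-01-01'` was chosen here only because it makes the stale entry visible; it is not the predicate from the description.

```python
from datetime import date, datetime
from typing import Dict, Optional, Tuple

# (lower, upper) bounds of an expression; None means unbounded.
Range = Tuple[Optional[date], Optional[date]]
KEY = "date_trunc(dt, 'day')"


def trunc_day_range(dt_lo: Optional[datetime], dt_hi: Optional[datetime]) -> Range:
    """Conservative range of date_trunc(dt, 'day') for dt between dt_lo and dt_hi."""
    return (dt_lo.date() if dt_lo else None, dt_hi.date() if dt_hi else None)


def may_be_less_than(rng: Range, bound: date) -> bool:
    """Can `value < bound` hold for some value in rng? False means 'definitely not'."""
    lower, _ = rng
    return lower is None or lower < bound


class SharedCacheEvaluator:
    """Buggy shape: one range_map outlives a single evaluation."""

    def __init__(self) -> None:
        self.range_map: Dict[str, Range] = {}

    def evaluate(self, dt_lo, dt_hi, bound: date) -> bool:
        if KEY not in self.range_map:  # entry from the first range is never refreshed
            self.range_map[KEY] = trunc_day_range(dt_lo, dt_hi)
        return may_be_less_than(self.range_map[KEY], bound)


class EvalContext:
    """Fixed shape: each evaluation carries its own range_map."""

    def __init__(self, dt_lo, dt_hi) -> None:
        self.dt_lo, self.dt_hi = dt_lo, dt_hi
        self.range_map: Dict[str, Range] = {}


def evaluate(ctx: EvalContext, bound: date) -> bool:
    if KEY not in ctx.range_map:
        ctx.range_map[KEY] = trunc_day_range(ctx.dt_lo, ctx.dt_hi)
    return may_be_less_than(ctx.range_map[KEY], bound)


# dt ranges for a = 0 and a = 2 from the expansion in the description.
dt_ranges = [(datetime(2024, 1, 1), None), (None, None)]
bound = date(2024, 1, 1)

shared = SharedCacheEvaluator()
buggy = [shared.evaluate(lo, hi, bound) for lo, hi in dt_ranges]
fixed = [evaluate(EvalContext(lo, hi), bound) for lo, hi in dt_ranges]
```

With the shared map, the second range inherits the lower bound of the first and evaluates to False, so every range is False and the partition would be pruned. With a fresh context per evaluation, the second range evaluates to True and the partition is kept.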

Issue Number: close #xxx

Related PR: introduced by #38849

Problem Summary:

Check List (For Committer)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes.
Does this need documentation?
- No.
- Yes.
Release note

None

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

doris-robot · 2024-11-06T06:56:50Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

feiniaofeiafei · 2024-11-06T06:56:56Z

run buildall

feiniaofeiafei · 2024-11-06T07:49:50Z

run buildall

github-actions · 2024-11-07T03:21:09Z

PR approved by at least one committer and no changes requested.

github-actions · 2024-11-07T03:21:12Z

PR approved by anyone and no changes requested.

feiniaofeiafei · 2024-11-07T04:17:29Z

run buildall

morrySnow · 2024-11-07T07:33:36Z

run p0

feiniaofeiafei · 2024-11-09T10:27:38Z

run p0

feiniaofeiafei · 2024-11-11T02:14:11Z

run buildall

github-actions · 2024-11-11T06:18:09Z

PR approved by at least one committer and no changes requested.

morrySnow added dev/2.1.x dev/3.0.x labels Nov 7, 2024

morrySnow previously approved these changes Nov 7, 2024

github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 7, 2024

github-actions bot added the reviewed label Nov 7, 2024

feiniaofeiafei dismissed morrySnow’s stale review via 85f09cd November 7, 2024 04:17

github-actions bot removed the approved Indicates a PR has been approved by one committer. label Nov 7, 2024

morrySnow added the p0_b label Nov 7, 2024

feiniaofeiafei added 3 commits November 11, 2024 10:13

fix multi column partition prune

c3434ea

put rangeMap in evaluate context:EvaluateRangeInput

3bf1320

fix regresion

cf9dd4c

feiniaofeiafei force-pushed the fix_multi_column_partition_prune_with_func branch from 85f09cd to cf9dd4c Compare November 11, 2024 02:14

morrySnow approved these changes Nov 11, 2024

github-actions bot added the approved Indicates a PR has been approved by one committer. label Nov 11, 2024

924060929 approved these changes Nov 12, 2024

morrySnow merged commit 1294bbb into apache:master Nov 12, 2024
27 of 28 checks passed

github-actions bot mentioned this pull request Nov 12, 2024

branch-3.0: [fix](nereids)Solve the problem of pruning wrong partitions in multi-column partition pruning #43657

Closed

github-actions bot mentioned this pull request Nov 12, 2024

branch-2.1: [fix](nereids)Solve the problem of pruning wrong partitions in multi-column partition pruning #43658

Merged

feiniaofeiafei mentioned this pull request Nov 12, 2024

[fix](nereids)Solve the problem of pruning wrong partitions in multi-column partition pruning (#43332) #43664

Merged

feiniaofeiafei mentioned this pull request Nov 12, 2024

[fix](nereids)Solve the problem of pruning wrong partitions in multi-column partition pruning (#43332) #43666

Closed

morrySnow pushed a commit that referenced this pull request Nov 12, 2024

[fix](nereids) Solve the problem of pruning wrong partitions in multi…

f650a16

…-column partition pruning (#43332) (#43664) cherry-pick from master #43332

morrySnow added dev/3.0.3-merged and removed dev/3.0.x labels Nov 12, 2024

yiguolei pushed a commit that referenced this pull request Nov 12, 2024

branch-2.1: [fix](nereids)Solve the problem of pruning wrong partitio…

7b33574

…ns in multi-column partition pruning (#43658) Cherry-picked from #43332 Co-authored-by: feiniaofeiafei

yiguolei added dev/2.1.8-merged and removed dev/2.1.x labels Nov 12, 2024

gavinchou mentioned this pull request Nov 26, 2024

Release Note 3.0.3 #44522

Open
