branch-2.1: [fix](nereids)Solve the problem of pruning wrong partitions in multi-column partition pruning #43658

Merged
yiguolei merged 1 commit into branch-2.1 from auto-pick-43332-branch-2.1
Nov 12, 2024

Conversation

github-actions[bot]
Contributor

Cherry-picked from #43332

…column partition pruning (#43332)

For example, with a partition defined as PARTITION BY RANGE (a, dt)
[(0, '2024-01-01 00:00:00'), (10, '2024-01-10 00:00:00')).
With the predicate:
WHERE a = 0 AND date_trunc(dt, 'day') <= '2024-01-10 00:00:00',

partition pruning will expand the partition ranges to:

a = 0, dt in ['2024-01-01 00:00:00', +∞)
a = 1, dt in (-∞, +∞)
a = 2, dt in (-∞, +∞)
...
a = 10, dt in (-∞, '2024-01-10 00:00:00')
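The expansion above can be sketched roughly as follows. This is an illustration only, not Doris's Java implementation: `expand_partition` and the tuple layout are hypothetical, `None` stands for an unbounded side, and the lower bound of `dt` is treated as inclusive and the upper bound as exclusive.

```python
from typing import List, Optional, Tuple

# (a, dt_lower, dt_upper); None means the side is unbounded.
# dt_lower is inclusive and dt_upper is exclusive (assumption for this sketch).
ExpandedRange = Tuple[int, Optional[str], Optional[str]]


def expand_partition(lower: Tuple[int, str], upper: Tuple[int, str]) -> List[ExpandedRange]:
    """Expand a two-column range [lower, upper) into one dt range per value of a."""
    (a_lo, dt_lo), (a_hi, dt_hi) = lower, upper
    if a_lo == a_hi:
        # Both bounds share the same a, so dt is bounded on both sides.
        return [(a_lo, dt_lo, dt_hi)]
    ranges: List[ExpandedRange] = []
    for a in range(a_lo, a_hi + 1):
        if a == a_lo:
            ranges.append((a, dt_lo, None))   # first value: only the lower bound applies
        elif a == a_hi:
            ranges.append((a, None, dt_hi))   # last value: only the upper bound applies
        else:
            ranges.append((a, None, None))    # values in between: dt is unconstrained
    return ranges


expanded = expand_partition((0, "2024-01-01 00:00:00"), (10, "2024-01-10 00:00:00"))
```

For the partition in the example, `expanded` holds eleven entries, one for each value of `a` from 0 to 10.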

Each of these eleven ranges will be evaluated against the predicate. If
all evaluations return False, the partition can be pruned.
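The pruning rule itself is small enough to sketch. The helper below is hypothetical (`can_prune` and `may_match` are not names from Doris); it only shows that a partition is dropped when no expanded range can satisfy the predicate.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def can_prune(ranges: Iterable[T], may_match: Callable[[T], bool]) -> bool:
    """A partition can be pruned only if every expanded range evaluates to False."""
    return not any(may_match(r) for r in ranges)


# Simplified illustration that looks only at column a, with values 0..10.
a_values = list(range(0, 11))
keep_partition = not can_prune(a_values, lambda a: a == 0)   # predicate a = 0
drop_partition = can_prune(a_values, lambda a: a == 42)      # predicate a = 42
```

A single range that wrongly evaluates to False does not by itself prune the partition, but it removes one of the chances to keep it, which is why a stale cached range matters.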
During the evaluation of the first range
(a = 0, dt in ['2024-01-01 00:00:00', +∞)),
the range of date_trunc(dt, 'day') is calculated as
['2024-01-01', +∞) and stored in rangeMap.

However, subsequent evaluations (e.g., for a = 2, dt in (-∞, +∞))
reuse this cached range ['2024-01-01', +∞),
which is incorrect. For a = 2, the correct range of
date_trunc(dt, 'day') is (-∞, +∞).

Due to this incorrect reuse, the range for a = 2, dt in (-∞, +∞) will
incorrectly evaluate to False, causing improper pruning of the
partition.
The correct approach is to place rangeMap within the context, so that a
new rangeMap is constructed for each evaluation.
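A minimal sketch of the difference, assuming a much simplified model: this is not the Nereids Java code, and `BuggyEvaluator`, `FixedEvaluator`, `Context`, and `trunc_day_range` are hypothetical names. Bounds are plain strings, `None` means unbounded, and bound inclusivity is ignored for brevity.

```python
from typing import Dict, Optional, Tuple

Range = Tuple[Optional[str], Optional[str]]  # (lower, upper); None = unbounded

TRUNC_EXPR = "date_trunc(dt, 'day')"


def trunc_day_range(dt_range: Range) -> Range:
    """Range of date_trunc(dt, 'day') given the range of dt (keeps 'YYYY-MM-DD')."""
    lo, hi = dt_range
    return (lo[:10] if lo else None, hi[:10] if hi else None)


class BuggyEvaluator:
    """range_map lives on the evaluator, so it survives across evaluations."""

    def __init__(self) -> None:
        self.range_map: Dict[str, Range] = {}

    def range_of_trunc(self, dt_range: Range) -> Range:
        if TRUNC_EXPR not in self.range_map:
            self.range_map[TRUNC_EXPR] = trunc_day_range(dt_range)
        return self.range_map[TRUNC_EXPR]  # may be stale for a later dt_range


class Context:
    """Per-evaluation state: each expanded range gets its own range_map."""

    def __init__(self, dt_range: Range) -> None:
        self.dt_range = dt_range
        self.range_map: Dict[str, Range] = {}


class FixedEvaluator:
    def range_of_trunc(self, ctx: Context) -> Range:
        if TRUNC_EXPR not in ctx.range_map:
            ctx.range_map[TRUNC_EXPR] = trunc_day_range(ctx.dt_range)
        return ctx.range_map[TRUNC_EXPR]


buggy = BuggyEvaluator()
first = buggy.range_of_trunc(("2024-01-01 00:00:00", None))  # a = 0
stale = buggy.range_of_trunc((None, None))                    # a = 2, reuses a = 0's range

fixed = FixedEvaluator()
fresh = fixed.range_of_trunc(Context((None, None)))           # a = 2, computed from scratch
```

In this sketch `stale` carries the lower bound computed for a = 0, while `fresh` is unbounded on both sides, matching the behavior described above.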
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@doris-robot

run buildall

@yiguolei yiguolei closed this Nov 12, 2024
@yiguolei yiguolei reopened this Nov 12, 2024
@yiguolei yiguolei merged commit 7b33574 into branch-2.1 Nov 12, 2024
20 of 21 checks passed
@dataroaring dataroaring deleted the auto-pick-43332-branch-2.1 branch December 27, 2024 07:11
3 participants