[BugFix] isLocalBucketShuffleJoin return wrong result (backport #51954) #51981
Why I'm doing:
1. No data is sent from the right side of the Bucket Shuffle Right Join build (ExchangeSink)
First, bucket seqs 3, 5, 7, 8, 9 of the left table in the bucket shuffle join were pruned.
Generally, when a bucket seq is pruned, the receiver-side instance id for it in the channel map is recorded as -1 to indicate that no data is sent. For a right outer join, however, the NULL values from the build side need to be output, so in such cases a valid instance id (any one will do) must be assigned to each pruned bucket seq.
2. NULL values on the HashJoin build side need to be output, which requires assigning an instance id to the pruned bucket seqs
Cases where NULL values on the build side need to be output:
a. Right join, full join, right anti join
b. Null-safe-eq join: `<=>`. This does not currently appear to be considered in the logic; further confirmation is required.
Whether NULL values on the build side need to be output is determined by `isRightOrFullBucketShuffleFragment`.
`isRightOrFullBucketShuffleFragment` is expected to be `true`, but is actually `false`.
Top-down determination of `isRightOrFullBucketShuffle` depends on the first join visited.
In the erroneous plan, there are three joins, and the bucket shuffle left join above is the first one visited, which directly sets `isRightOrFullBucketShuffle = false`.
What I'm doing:
Fix the function `isLocalBucketShuffleJoin` so that it returns the correct result.
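For illustration only, here is a minimal Java sketch of the behavior described above. It is not the StarRocks implementation or the actual patch. `BucketChannelSketch`, `JoinKind`, `isRightOrFullFragment`, and `buildChannelMap` are hypothetical names, and the channel map is simplified to an `int[]` indexed by bucket seq. It shows two ideas from the description: the fragment-level flag should not depend on which join is visited first, and when the flag is set, pruned bucket seqs (recorded as -1) get a valid instance id.

```java
import java.util.Arrays;
import java.util.List;

public class BucketChannelSketch {
    // Hypothetical join kinds for this sketch; not the real StarRocks enum.
    enum JoinKind { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER, RIGHT_ANTI }

    // Marker used in the channel map for a pruned bucket seq (no receiver).
    static final int NO_RECEIVER = -1;

    // Join kinds listed in the description as needing build-side NULL output.
    static boolean needsBuildSideOutput(JoinKind kind) {
        return kind == JoinKind.RIGHT_OUTER
                || kind == JoinKind.FULL_OUTER
                || kind == JoinKind.RIGHT_ANTI;
    }

    // Assumption: the flag is true if ANY join in the fragment needs
    // build-side output, so the result does not depend on visit order.
    static boolean isRightOrFullFragment(List<JoinKind> joinsInFragment) {
        for (JoinKind kind : joinsInFragment) {
            if (needsBuildSideOutput(kind)) {
                return true;
            }
        }
        return false;
    }

    // bucketToInstance[i] is the receiver instance id for bucket seq i,
    // or NO_RECEIVER if that bucket seq was pruned.
    static int[] buildChannelMap(int[] bucketToInstance, boolean rightOrFull) {
        int[] result = Arrays.copyOf(bucketToInstance, bucketToInstance.length);
        if (!rightOrFull) {
            return result; // pruned bucket seqs stay at -1: nothing is sent
        }
        // Pick any valid instance id as the fallback receiver.
        int fallback = NO_RECEIVER;
        for (int id : bucketToInstance) {
            if (id != NO_RECEIVER) {
                fallback = id;
                break;
            }
        }
        for (int i = 0; i < result.length; i++) {
            if (result[i] == NO_RECEIVER) {
                result[i] = fallback;
            }
        }
        return result;
    }
}
```

With bucket seqs 3, 5, 7, 8, 9 pruned and a fragment containing a left join, an inner join, and a right join (in that visit order), the sketch still sets the flag and every pruned bucket seq receives a valid instance id.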
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
This is an automatic backport of pull request #51954 done by [Mergify](https://mergify.com).