Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-43966: [Java] Check for nullabilities when comparing StructVector #43968

Merged
merged 1 commit into from
Sep 9, 2024
Merged

GH-43966: [Java] Check for nullabilities when comparing StructVector #43968

merged 1 commit into from
Sep 9, 2024

Conversation

hellishfire
Copy link
Contributor

@hellishfire hellishfire commented Sep 5, 2024

Rationale for this change

See #43966

What changes are included in this PR?

Check for nullabilities when comparing StructVector with RangeEqualsVisitor.

Are these changes tested?

Yes

Are there any user-facing changes?

No

Copy link

github-actions bot commented Sep 5, 2024

⚠️ GitHub issue #43966 has been automatically assigned in GitHub to PR creator.

@vibhatha
Copy link
Collaborator

vibhatha commented Sep 5, 2024

@hellishfire let's make the PR title as same as the issue title.

@@ -354,6 +355,18 @@ protected boolean compareStructVectors(Range range) {
return false;
}

if (leftVector instanceof StructVector || rightVector instanceof StructVector) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we do this check for all vectors not just StructVector?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing is this visitor interface is used for all vectors, and specifying a logic just for a single vector type doesn't make much sense. Further, if we are to check the range considering nulls, it should be for all vectors?

Copy link
Contributor Author

@hellishfire hellishfire Sep 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vibhatha
There's already null comparison for other types of vector, as in other compareXXXVectors methods.
The instanceof check here is to see whether either struct vector is nullable. If neither is nullable, there's no need to compare nullability.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I misinterpreted.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Sep 5, 2024
@hellishfire
Copy link
Contributor Author

hellishfire commented Sep 5, 2024

@hellishfire let's make the PR title as same as the issue title.

I renamed the issue title to be the same as PR title

@vibhatha
Copy link
Collaborator

vibhatha commented Sep 5, 2024

Overall LGTM!

Copy link
Collaborator

@vibhatha vibhatha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making this PR.

@vibhatha
Copy link
Collaborator

vibhatha commented Sep 6, 2024

@lidavidm appreciate your review and also please approve the workflows.

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do null slots still compare equal when the child values are different? I'd like to see a unit test for that. (They should, but it looks like we just directly dispatch to RangeEqualsVisitor.)

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Sep 6, 2024
@hellishfire
Copy link
Contributor Author

Do null slots still compare equal when the child values are different? I'd like to see a unit test for that. (They should, but it looks like we just directly dispatch to RangeEqualsVisitor.)

Currently not. Further code change is needed to achieve this.

@vibhatha
Copy link
Collaborator

vibhatha commented Sep 6, 2024

Probably we should address that as well. I think it is in scope. Sorry I missed that.

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Sep 6, 2024
@hellishfire
Copy link
Contributor Author

@vibhatha @lidavidm
I rewrote the compare logic and related tests

writeStructVector(writer2, 2, 20L);
writeStructVector(writer2, 3, 30L);
writeStructVector(writer2, 4, 40L);
writeStructVector(writer2, 0, 0L);
writer2.setValueCount(5);

RangeEqualsVisitor visitor = new RangeEqualsVisitor(vector1, vector2);
assertTrue(visitor.rangeEquals(new Range(1, 1, 3)));
assertTrue(visitor.rangeEquals(new Range(2, 1, 3)));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe write a new test to evaluate the new case?

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Sep 9, 2024
@lidavidm lidavidm merged commit d0f9f3e into apache:main Sep 9, 2024
16 of 17 checks passed
@lidavidm lidavidm removed the awaiting merge Awaiting merge label Sep 9, 2024
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit d0f9f3e.

There were 7 benchmark results indicating a performance regression:

The full Conbench report has more details. It also includes information about 309 possible false positives for unstable benchmarks that are known to sometimes produce them.

khwilson pushed a commit to khwilson/arrow that referenced this pull request Sep 14, 2024
…ector (apache#43968)

### Rationale for this change

See apache#43966

### What changes are included in this PR?

Check for nullabilities when comparing StructVector with RangeEqualsVisitor.

### Are these changes tested?

Yes

### Are there any user-facing changes?
No

* GitHub Issue: apache#43966

Authored-by: youming.whl <[email protected]>
Signed-off-by: David Li <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants