[BUG] Null Policy of EqualJoin on StructType #7911

Closed
sperlingxx opened this issue Apr 8, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@sperlingxx
Contributor

sperlingxx commented Apr 8, 2021

Describe the bug
Currently, cuDF flattens all nested columns before performing the join. After flattening, validity masks of nested type columns are lost. This means that cuDF will mistakenly regard a join key composed of a non-null struct that contains only a single null child value as a null key.

For instance, if the schema of the join key is StructType([("a", LongType(nullable=True))], nullable=False), the non-null record Struct([("a", null)]) will be regarded as a null record because of the flattening of nested columns.
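
The concern described above can be sketched in plain Python. This is only a model of the reported behavior, not cuDF code: `flatten_dropping_struct_validity` is a hypothetical helper that keeps the child column and discards the struct-level validity, which is the assumption the report makes about flattening.

```python
def flatten_dropping_struct_validity(rows, field):
    """Hypothetical flattening that keeps only the child column.

    A null struct (None) and a non-null struct with a null child
    both collapse to a null child value.
    """
    return [row[field] if row is not None else None for row in rows]


non_null_struct_with_null_child = {"a": None}  # Struct([("a", null)])
null_struct = None                             # a null struct row

flat = flatten_dropping_struct_validity(
    [non_null_struct_with_null_child, null_struct], "a"
)

# Under this assumed flattening the two keys become indistinguishable.
collide = flat[0] == flat[1]
```

If flattening really worked this way, the two rows could not be told apart by the join.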

@sperlingxx sperlingxx added bug Something isn't working Needs Triage Need team to review and classify labels Apr 8, 2021
@jrhemstad
Contributor

> After flattening, validity masks of nested type columns are lost.

No they aren't. The child validity masks are projected into a BOOL column.

> will regard join key composed by non-null struct who only contains single null child value as a null key

Avoiding this is precisely why we project the validity mask into a BOOL column.

{NULL} (a non-null struct with a null child) and NULL (a null struct) will not compare as equal, because they expand to (TRUE, NULL) and (FALSE, NULL) respectively.
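
A minimal sketch of that expansion in plain Python, assuming a one-field struct column. The helper name `flatten_struct_column` and the list-of-dicts representation are illustrative only and are not cuDF's API; the point is just that the struct-level validity survives as an extra BOOL column.

```python
def flatten_struct_column(rows, field):
    """Flatten a one-field struct column into (validity, child) columns.

    validity[i] is True when the struct itself is non-null; this models
    projecting the struct's validity mask into a BOOL column.
    """
    validity = [row is not None for row in rows]
    child = [row[field] if row is not None else None for row in rows]
    return validity, child


structs = [{"a": None}, None, {"a": 1}]  # {NULL}, NULL, {1}
validity, child = flatten_struct_column(structs, "a")

# Each flattened join key is the pair (struct validity, child value).
flat_keys = list(zip(validity, child))

# Even when null child values are treated as equal to each other,
# the BOOL column keeps {NULL} and NULL apart.
struct_null_child_vs_null_struct = flat_keys[0] == flat_keys[1]
```

Here `flat_keys[0]` is `(True, None)` and `flat_keys[1]` is `(False, None)`, so the comparison is false.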

@sperlingxx
Contributor Author

Hi @jrhemstad, thanks for correcting me! I have filed another issue (#7934) to describe my problem.

@bdice bdice removed the Needs Triage Need team to review and classify label Mar 4, 2024