[BUG] Left joins on struct key producing incorrect null results for right table #13109
Labels
2 - In Progress: Currently a work in progress
bug: Something isn't working
libcudf: Affects libcudf (C++/CUDA) code.
Spark: Functionality that helps Spark RAPIDS
Describe the bug
After #12787, some RAPIDS Accelerator tests started failing for joins on struct keys. See NVIDIA/spark-rapids#8061. Comparing the expected and actual data shows nulls in the GPU result that are not expected, implying the row comparison is not working properly in some cases.
Oddly, an inner join on the same data seems to do the right thing.
Steps/Code to reproduce bug
I'm attaching two Parquet files, left and rightnotnull, which are the inputs to the following test. There should be 4 rows that have matches in the join results, but the GPU produces zero rows that have matches, so the right gather map (incorrectly) marks every row as null.
left.gz
rightnotnull.gz
Expected behavior
Instead of producing all-null results (i.e., minint for every entry in the right gather map), there should be four non-null matches in the right gather map. Changing the join to an inner join shows the expected matches.
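To make the expected relationship between the two join types concrete, here is a minimal pure-Python reference model. It is only a sketch, not libcudf's implementation: the function names, the toy data, and the choice of the minimum 32-bit signed integer as the "no match" sentinel (the minint mentioned above) are assumptions for illustration, struct keys are modeled as tuples, and null fields are not handled. The property it checks is that the non-sentinel pairs of a left join should be exactly the pairs an inner join returns.

```python
# Assumed sentinel for "no match" in the right gather map (minimum int32).
NO_MATCH = -(2**31)


def _build_index(right_keys):
    """Map each right-side key to the list of row indices where it occurs."""
    index = {}
    for j, key in enumerate(right_keys):
        index.setdefault(key, []).append(j)
    return index


def inner_join_maps(left_keys, right_keys):
    """Return (left_map, right_map) gather maps for an inner equi-join."""
    index = _build_index(right_keys)
    left_map, right_map = [], []
    for i, key in enumerate(left_keys):
        for j in index.get(key, []):
            left_map.append(i)
            right_map.append(j)
    return left_map, right_map


def left_join_maps(left_keys, right_keys):
    """Like inner_join_maps, but unmatched left rows get NO_MATCH on the right."""
    index = _build_index(right_keys)
    left_map, right_map = [], []
    for i, key in enumerate(left_keys):
        matches = index.get(key, [])
        if not matches:
            left_map.append(i)
            right_map.append(NO_MATCH)
        for j in matches:
            left_map.append(i)
            right_map.append(j)
    return left_map, right_map


def matched_pairs(left_map, right_map):
    """Sorted (left_row, right_row) pairs, skipping sentinel entries."""
    return sorted((i, j) for i, j in zip(left_map, right_map) if j != NO_MATCH)


# Toy struct keys as tuples (made-up data, not the attached Parquet files).
left = [(1, "a"), (2, "b"), (3, "c")]
right = [(3, "c"), (1, "a"), (7, "x")]

left_result = left_join_maps(left, right)
inner_result = inner_join_maps(left, right)

# The matched part of a left join must agree with the inner join.
assert matched_pairs(*left_result) == matched_pairs(*inner_result)
```

A right gather map consisting entirely of the sentinel while the inner join on the same inputs returns matches would violate this check, which is the symptom described in this issue.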