[BUG] explode_outer_position doesn't match to Spark's counterpart #7721

sperlingxx · 2021-03-25T10:36:18Z

Describe the bug
In cuDF, explode_outer_position sets the position value to 0 for rows whose list is null or empty. Spark sets the position to null for those rows.

Steps/Code to reproduce bug
For input data like:

[[5,null,15], 100]
[null, 200]
[[], 300]

cuDF returns

[0, 5, 100]
[1, null, 100]
[2, 15, 100]
[0, null, 200]
[0, null, 300]

But Spark returns

[0, 5, 100]
[1, null, 100]
[2, 15, 100]
[null, null, 200]
[null, null, 300]
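To make the difference concrete, here is a minimal pure-Python sketch of the two behaviors described above. It is not cuDF or Spark code: the function name `posexplode_outer` and the `null_position_for_empty` flag are hypothetical and exist only for this illustration. Python `None` stands in for null.

```python
from typing import List, Optional, Tuple

# (list column, other column); None models a null list.
Row = Tuple[Optional[List[Optional[int]]], int]
# (position, exploded value, other column)
OutRow = Tuple[Optional[int], Optional[int], int]


def posexplode_outer(rows: List[Row],
                     null_position_for_empty: bool = True) -> List[OutRow]:
    """Illustrative model of an outer explode with a position column.

    With null_position_for_empty=True, null or empty lists produce a
    null position (the Spark output in this report). With False, they
    produce position 0 (the cuDF output in this report).
    """
    out: List[OutRow] = []
    for values, other in rows:
        if not values:  # null list (None) or empty list ([])
            pos = None if null_position_for_empty else 0
            out.append((pos, None, other))
            continue
        for pos, value in enumerate(values):
            # Null elements inside a non-empty list keep a valid position.
            out.append((pos, value, other))
    return out


rows = [([5, None, 15], 100), (None, 200), ([], 300)]
spark_like = posexplode_outer(rows)
cudf_reported = posexplode_outer(rows, null_position_for_empty=False)
```

The two results differ only in the position of the last two rows, which is the mismatch this issue reports.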

hyperbolic2346 · 2021-03-25T22:50:23Z

I assume this holds for explode_position as well?

sperlingxx · 2021-03-26T02:05:58Z

> I assume this holds for explode_position as well?

I think the current explode_position implementation is identical to Spark's, since empty/null elements of an array are treated like any other value.

@hyperbolic2346

… of null rows (#7754)

`explode_outer` supports writing a position column, but if the row was null it would incorrectly set the position to 0 and mark the row as valid. Instead, it should null that position row as well. Luckily the null column matches 100% with the null column of the exploded column, so we can just copy it after it is created.

Fixes #7721

Authors:
- Mike Wilson (@hyperbolic2346)

Approvers:
- Conor Hoekstra (@codereport)
- Jake Hemstad (@jrhemstad)

URL: #7754

sperlingxx added the bug and Needs Triage labels Mar 25, 2021

sperlingxx assigned hyperbolic2346 Mar 25, 2021

kkraus14 added the libcudf label and removed the Needs Triage label Mar 26, 2021

hyperbolic2346 mentioned this issue Mar 29, 2021

Fixing issue with explode_outer position not nulling position entries of null rows #7754

Merged

rapids-bot bot closed this as completed in #7754 Mar 30, 2021
