Describe the bug
explode_outer and explode_outer_position both appear not to propagate the null value to child struct columns when an input row is an empty list or null.
Suppose the input is a LIST<STRUCT<KEY: INT32, VALUE: INT32>> column with the following two rows:
[]   // Empty list
null // Null value for the list
If I then run explode_outer_position (it shows the problem more clearly than explode_outer) and extract the key and value children from the resulting struct, I get something like this:
| pos | key | value |
|---|---|---|
| null | 12849 | 99 |
| null | 0 | 0 |
The key and value change on each run. It looks like the memory for those values is not initialized, and the validity of the parent struct is not pushed down into the children.
I thought that, for structs, the validity of the parent would always be pushed down into the children. That is not the case here, and from reading the code it appears to be an issue with gather, so the problem may be much more widespread than just explode_outer*.
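To make the suspected failure mode concrete, here is a toy Python model of a struct column being gathered with a map that contains "null" entries. It is only a sketch under assumptions: `IntColumn`, `StructColumn`, `naive_gather`, and the `GARBAGE` constant are hypothetical names for illustration, not libcudf types or functions, and the real gather implementation may differ in detail.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

GARBAGE = 12849  # stand-in for whatever happens to be in uninitialized memory


@dataclass
class IntColumn:
    values: List[int]
    valid: List[bool]


@dataclass
class StructColumn:
    valid: List[bool]                # validity of the struct rows themselves
    children: Dict[str, IntColumn]   # each child has its own validity


def naive_gather(col: StructColumn, gather_map: List[Optional[int]]) -> StructColumn:
    """Gather rows; a None index means 'emit a null row' (assumed convention).

    Only the parent validity is cleared for None entries. The children are
    gathered independently, so their slots hold garbage that still looks valid.
    """
    parent_valid = [i is not None and col.valid[i] for i in gather_map]
    children = {}
    for name, child in col.children.items():
        values = [child.values[i] if i is not None else GARBAGE for i in gather_map]
        valid = [child.valid[i] if i is not None else True for i in gather_map]
        children[name] = IntColumn(values, valid)
    return StructColumn(parent_valid, children)


source = StructColumn(
    valid=[True],
    children={"key": IntColumn([1], [True]), "value": IntColumn([2], [True])},
)
result = naive_gather(source, [None, 0])
```

In this model the first output row is null at the struct level, but reading the `key` child directly bypasses the parent validity and exposes the garbage value as if it were valid, which matches the symptom described above.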
Steps/Code to reproduce bug
See above
Expected behavior
I would expect the validity of the parent struct to be pushed down into the child for data that we computed ourselves.
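As a rough illustration of the expected behavior, the sketch below ANDs the parent validity into each child's validity, so a child row is valid only if both the struct row and the child value are valid. The helper names `push_down_validity` and `read_child` are hypothetical and the masks are plain Python lists; this is not how libcudf represents bitmasks.

```python
from typing import Dict, List, Optional


def push_down_validity(
    parent_valid: List[bool], child_valid: Dict[str, List[bool]]
) -> Dict[str, List[bool]]:
    """Return child masks where each row is valid only if the parent row is too."""
    return {
        name: [p and c for p, c in zip(parent_valid, mask)]
        for name, mask in child_valid.items()
    }


def read_child(values: List[int], valid: List[bool]) -> List[Optional[int]]:
    """Read a child column on its own, honoring only the child's validity."""
    return [v if ok else None for v, ok in zip(values, valid)]


# The two exploded rows from the report: both struct rows are null,
# but the children claim to be valid and hold arbitrary values.
parent_valid = [False, False]
child_valid = {"key": [True, True], "value": [True, True]}
child_values = {"key": [12849, 0], "value": [99, 0]}

fixed = push_down_validity(parent_valid, child_valid)
keys = read_child(child_values["key"], fixed["key"])
values = read_child(child_values["value"], fixed["value"])
```

After the pushdown, extracting `key` and `value` yields nulls for both rows regardless of what the underlying memory contains.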
> I thought that generally for structs the validity of the parent was going to always be pushed down into the children
From my recollection, that was only true of the struct column factories. In general, this is outside of the Arrow spec and so we don't currently guarantee this everywhere.
In short, explode_outer() on a List<Struct<Key,Value>> column whose second row is null produces a STRUCT column with the row corresponding to the null row also set to null. The nulls are pushed down to the children.