[Python] Can't provide Arrow array when filling nulls of a fixed size list array #35624
Comments
Thanks for the report! The code in question is: arrow/python/pyarrow/compute.py Lines 520 to 525 in 6d3d2fc
So if the types don't match, it tries to convert the fill_value to a scalar of the correct type. There seem to be multiple fishy things about this: 1) this should probably cast to the correct type instead of going through
>>> arr = pa.array([1, 2, None, 4, None])
>>> arr.fill_null(pa.array([10, 20, 30, 40, 50]))
<pyarrow.lib.Int64Array object at 0x7f41c56983a0>
[
1,
2,
30,
4,
50
]
>>> arr.fill_null(pa.array([10, 20, 30, 40, 50], type="int32"))
...
AttributeError: 'pyarrow.lib.Int32Array' object has no attribute 'as_py'
Now, for your original example using a list array, the situation is a bit more complicated. When you pass a pyarrow.Array, the types also don't match (list of float vs float), and automatically casting wouldn't work in that case either. So if you pass a pyarrow object, I think we will need to require that you pass a "correct" scalar instead (because when passing an array, we assume that it is meant for filling element-wise, not for filling with a scalar). If you start from a pyarrow array, you can convert this to a scalar of the correct type before passing it to
So that's also something we should fix.
Oh wow,
But it should say something like "Replace each null element in values with a corresponding element from fill_value." That's not a perfect phrasing, since it's a bit complicated and hard to fit into a short sentence. As I understand it, the behavior is:
This issue seems to fracture into three issues:
Does that seem right?
Yeah, that's indeed certainly not properly documented. I am not sure this was fully intentional (maybe when it was added, mostly the scalar fill_value was considered), but this is essentially an alias for
Your suggested rewording sounds good!
Yes, that's a good summary.
That's indeed a related issue. It's also related to #21761, for accepting pyarrow values in
…35813) Discussed in #35624, particularly at #35624 (comment). Doesn't close that issue, but it came up in discussion. Lead-authored-by: Spencer Nelson <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
Describe the bug, including details regarding any error messages, version, and platform.
Suppose you have a fixed-sized list array with null values:
How do you fill the nulls? Like, maybe I want it to look like [[null, null], [null, null], [null, null]]. Or even [[0, 0], [0, 0], [0, 0]].
You can't call fill_null with a pyarrow array value, because the arrays don't have an 'as_py' method:
But you can do it with a plain old Python list!
This seems like an oversight. If I give an array of the right shape, surely it should be permitted.
Component(s)
Python