[BUG] Constructing from an unbound sequence doesn't raise/hangs #13049
Comments
Is the title of this issue accurate? Unbound sequences also cause pandas to fail:

    import pandas as pd

    def count_forever():
        i = 1
        while True:
            yield i
            i += 1

    pd.Series(count_forever())  # Does not return.

I believe this is actually a special case where pandas seems to get stuck in the function The class |
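For anyone who only needs a finite prefix of a generator like this, one option is to bound it before it reaches any constructor. A minimal sketch using only the standard library; `bounded` is a hypothetical helper name for illustration, not something pandas or cuDF provides:

```python
from itertools import islice


def count_forever():
    """Same infinite generator as in the comment above."""
    i = 1
    while True:
        yield i
        i += 1


def bounded(iterable, limit):
    """Hypothetical helper: materialize at most `limit` items into a list."""
    return list(islice(iterable, limit))


values = bounded(count_forever(), 5)
print(values)  # [1, 2, 3, 4, 5]
```

The resulting list is finite, so it can be handed to a Series constructor without the call hanging.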
Thanks, @bdice. Happy to change the title to something else. I used "unbound sequence" loosely here to refer to an object that is effectively a |
I think the issue is a bit similar to #13056. I think this should be treated as an issue with constructing a Series from a scalar (here, the scalar is an instance of an object that defines `__getitem__`!) and not an issue with unbound data. Here, the `__getitem__` method should never be called, and we should error as soon as we see that instances of A have no supported GPU dtype. Getting into the question of “is this iterable unbound” feels like solving the halting problem. 😅 I don’t think we need to be any smarter than pandas at this task. If an infinite loop is provided, that is a user problem. |
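To illustrate why an object that only defines `__getitem__` can look like endless data, here is a small standard-library sketch. The class `A` below is an assumed shape (the original snippet is not reproduced in this thread), and `ensure_supported_input` is a hypothetical guard, not cuDF's actual check:

```python
from collections import abc
from itertools import islice


class A:
    # Assumed shape: __getitem__ accepts any key and never raises IndexError,
    # so Python's legacy sequence protocol keeps producing items forever.
    def __getitem__(self, key):
        return 1


a = A()

# iter() falls back to calling __getitem__(0), __getitem__(1), ... until
# IndexError is raised, which never happens here. islice keeps this finite.
print(list(islice(iter(a), 3)))     # [1, 1, 1]

# The object does not declare itself iterable or sequence-like.
print(isinstance(a, abc.Iterable))  # False (no __iter__)
print(isinstance(a, abc.Sequence))  # False (not registered or subclassed)


def ensure_supported_input(obj):
    """Hypothetical guard: reject objects that are neither Iterable nor Sequence."""
    if isinstance(obj, (abc.Iterable, abc.Sequence)):
        return obj
    raise TypeError(f"incompatible type: {type(obj).__name__}")
```

A check along these lines rejects `A()` up front as an unsupported type without ever calling `__getitem__`. A genuinely infinite generator still passes, which matches the position that detecting unbounded iterables is out of scope.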
This is the problem :( Pandas does return here. And since we cannot, we should probably raise. |
Yes. And it should be an “incompatible type” error in your example, not an “unbound sequence” error. |
I discussed offline with @shwina. We concluded we should check that the input passes We also found a bug in PyArrow which I reported upstream: apache/arrow#34944 |
Fixes: #13049
This PR allows errors from pyarrow to be propagated when an unbounded sequence is passed to the `pa.array` constructor.
Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)
Approvers:
- Bradley Dice (https://github.com/bdice)
URL: #13799
The following snippet never completes:
In Pandas, a Series of object data type is returned:
At the very least, I think cuDF should be able to detect and raise (`ValueError`) in this pathological case.