Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-21761: [Python] accept pyarrow values / scalars in constructor functions ? #34658

Closed
wants to merge 6 commits into from

Conversation

AlenkaF
Copy link
Member

@AlenkaF AlenkaF commented Mar 21, 2023

Rationale for this change

Currently, pyarrow.array doesn't accept list of pyarrow Scalars and this PR adds a check to allow that.

@AlenkaF AlenkaF added this to the 12.0.0 milestone Mar 21, 2023
@github-actions
Copy link

@jorisvandenbossche
Copy link
Member

This seems to work?
It won't work for generators of pyarrow scalars (because they will already be consumed the first time we call _sequence_to_array. I am not sure we have to worry about that case (since this PR just adding some extra convenience anyway). But we should maybe at least test those cases to ensure it errors (and eg not silently return an empty array)

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Mar 28, 2023
@AlenkaF
Copy link
Member Author

AlenkaF commented Mar 28, 2023

This seems to work?

Yes, it is working for sequences with pa.Scalars.

It won't work for generators of pyarrow scalars (because they will already be consumed the first time we call _sequence_to_array.

Oh yes, that is true (missed it).

I am not sure we have to worry about that case (since this PR just adding some extra convenience anyway). But we should maybe at least test those cases to ensure it errors (and eg not silently return an empty array)

Agree, will add a test 👍

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Mar 29, 2023
@AlenkaF AlenkaF removed this from the 12.0.0 milestone Apr 6, 2023
result = _ndarray_to_array(values, mask, type, c_from_pandas, safe,
pool)
except ArrowInvalid as err:
if "Scalar" in str(err):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relying on a substring within the error message seems a bit hacky. Is there any other way to check by chance?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, it's very hacky! =)
I have tried multiple ways of catching the specific ArrowInvalid error as others are expected here also, but all of them have some issue. Can try to remember them all and list them, but Joris advised me to go into the Python C++ code and use ConvertToSequenceAndInferSize so this will be changed. I should convert this PR into a draft to make this clear.

Thank you for reviewing!

@AlenkaF AlenkaF marked this pull request as draft April 20, 2023 16:18
@AlenkaF AlenkaF added this to the 13.0.0 milestone May 17, 2023
@AlenkaF
Copy link
Member Author

AlenkaF commented Jun 19, 2023

Closing this PR in favour of #36162.

@AlenkaF AlenkaF closed this Jun 19, 2023
@AlenkaF AlenkaF deleted the gh-21761-array-accepting-scalars branch June 19, 2023 10:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Python] accept pyarrow values / scalars in constructor functions ?
3 participants