ARROW-14495: [Python] Fix DictionaryArray.from_buffers, should not crash #13989

Merged
merged 4 commits into from
Sep 1, 2022

milesgranger
Contributor

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

Requires modifying signature to take dictionary
as well as the buffers.
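
To illustrate why the dictionary has to be passed alongside the buffers, here is a minimal sketch in plain Python. It is a toy model of dictionary encoding, not pyarrow's implementation: the helper name `decode_dictionary_indices` and its parameters are hypothetical, and the index buffer is assumed to hold native C `int` values. The index buffer only stores positions, so the values cannot be recovered without the dictionary.

```python
from array import array


def decode_dictionary_indices(index_buffer, dictionary, length, offset=0):
    """Toy decoder (hypothetical helper, not a pyarrow API).

    Interprets index_buffer as native C ints and looks each index up
    in dictionary, honouring a logical offset and length.
    """
    indices = array("i")
    indices.frombytes(index_buffer)
    window = indices[offset:offset + length]
    return [dictionary[i] for i in window]


# The dictionary holds the distinct values; the buffer holds only indices.
dictionary = ["one", "two", "three"]
index_buffer = array("i", [0, 1, 2, 1, 0]).tobytes()

decoded = decode_dictionary_indices(index_buffer, dictionary, length=5)
sliced = decode_dictionary_indices(index_buffer, dictionary, length=4, offset=1)
```

With the raw bytes alone there is nothing to map index `2` back to `"three"`, which is the gap the signature change appears to close.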
Member

@jorisvandenbossche jorisvandenbossche left a comment

Looking good!

Member

@jorisvandenbossche jorisvandenbossche left a comment

Thanks for the updates!

with nogil:
    c_data = CArrayData.Make(
        c_type, length, c_buffers, null_count, offset)
    c_data.get().dictionary = dictionary.sp_array.get().data()
Member

There is an ArrayData::Make variant that directly accepts a dictionary, but it might not be worth exposing for just this one line.

Contributor Author

I saw that, and it appears to do essentially the same thing: create the ArrayData, then set the dictionary member. Doing it this way, we don't need to pass (null?) child data. That said, if that variant is the preferred method, I have no problem switching. 👍
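
The trade-off discussed in this thread can be sketched in plain Python. This is only an illustration under stated assumptions: `ArrayDataSketch`, `make`, and `make_with_dictionary` are hypothetical stand-ins, not the Arrow C++ `ArrayData` API, and the field list is simplified. It shows that building the object and then assigning the dictionary gives the same result as a factory that takes the dictionary directly, while the latter also requires passing child data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ArrayDataSketch:
    """Simplified, hypothetical stand-in for an array data container."""
    type_name: str
    length: int
    buffers: list
    null_count: int = -1
    offset: int = 0
    child_data: list = field(default_factory=list)
    dictionary: Optional["ArrayDataSketch"] = None


def make(type_name, length, buffers, null_count=-1, offset=0):
    # Factory without a dictionary argument; the caller sets it afterwards.
    return ArrayDataSketch(type_name, length, buffers, null_count, offset)


def make_with_dictionary(type_name, length, buffers, child_data, dictionary,
                         null_count=-1, offset=0):
    # Factory that accepts the dictionary, but also needs child_data.
    return ArrayDataSketch(type_name, length, buffers, null_count, offset,
                           child_data, dictionary)


values = ArrayDataSketch("string", 3, [b"onetwothree"])

# Approach used in the snippet above: construct, then assign the member.
two_step = make("dictionary", 5, [b"indices"])
two_step.dictionary = values

# Alternative: one call, at the cost of passing empty child data.
one_step = make_with_dictionary("dictionary", 5, [b"indices"], [], values)
```

In this toy model both objects compare equal, so the choice comes down to which call site reads better.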

Member

@pitrou pitrou left a comment

Thanks for doing this. Just a minor suggestion.

milesgranger and others added 2 commits August 30, 2022 05:50
[skip ci]

Co-authored-by: Antoine Pitrou <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
@milesgranger
Contributor Author

@pitrou I think this is ready for another round. :)

Member

@pitrou pitrou left a comment

LGTM, thank you @milesgranger

@pitrou pitrou merged commit d3d6371 into apache:master Sep 1, 2022
@milesgranger milesgranger deleted the ARROW-14495_DictionaryArray-from_buffers-should-not-crash branch September 1, 2022 08:17
zagto pushed a commit to zagto/arrow that referenced this pull request Oct 7, 2022
fatemehp pushed a commit to fatemehp/arrow that referenced this pull request Oct 17, 2022