Make PyArrow Dependency Optional #354

Open
WillAyd opened this issue Oct 2, 2024 · 1 comment
Labels
good first issue Good for newcomers

Comments

@WillAyd
Collaborator

WillAyd commented Oct 2, 2024

This needs some investigation, but I think we have a feasible path to replacing pyarrow with arro3 internally. The only thing we use pyarrow for is to create a RecordBatchReader and convert that into the appropriate end dataframe library.

If we can replace that with arro3, it should noticeably reduce the installation size.

@WillAyd
Collaborator Author

WillAyd commented Oct 25, 2024

Actually, we don't need arro3 or pyarrow at all for cases where users opt for the capsule return type implemented in #378.

If we dropped pyarrow as a dependency, we might just have to add a check in the reader like:

if return_type != "stream":
    import pyarrow as pa
    ... # handle error if not installed
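
For anyone picking this up, here is a minimal sketch of what that guard could look like. The helper name `require_pyarrow_for` and the error message are hypothetical, not part of the existing codebase; the sketch only assumes, as in the snippet above, that `"stream"` is the return type that needs no Arrow Python library.

```python
import importlib


def require_pyarrow_for(return_type: str):
    """Return the pyarrow module if this return type needs it, else None.

    Hypothetical helper for illustration; the name and the error message
    are assumptions, not the project's actual API.
    """
    if return_type == "stream":
        # The capsule/stream path hands back the Arrow stream directly,
        # so no pyarrow import is needed.
        return None
    try:
        # Import lazily so pyarrow stays an optional dependency.
        return importlib.import_module("pyarrow")
    except ImportError as exc:
        raise ImportError(
            f"return_type={return_type!r} requires pyarrow; "
            "install pyarrow or use return_type='stream'"
        ) from exc
```

Deferring the import to call time means users who only ever request the stream return type never pay for the pyarrow install, while everyone else gets an actionable error instead of a bare `ModuleNotFoundError`.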

WillAyd changed the title from "Replace PyArrow with arro3" to "Make PyArrow Dependency Optional" on Oct 25, 2024
WillAyd added the "good first issue" label on Nov 7, 2024