[Python] Minimal pyarrow installation should not require libparquet #39006
Comments
If I just remove
If I do the same for the released version 14.0 installed with conda, however, I do get the same issue. Inspecting the [...] That is not the case of the [...]
Locally I didn't have Parquet encryption enabled; after enabling it and rebuilding libarrow and pyarrow, I now see the dependency on libparquet. The original encryption support was added in #10450 (pyarrow 8.0.0), and recently we also expanded it to support datasets in #34616 (pyarrow 14.0). Checking my different conda envs with various pyarrow versions, it seems this dependency already existed before 14.0, but not yet for 8.0. It seems to have been added in pyarrow 10.0.
I assume this is just because at the time of pyarrow 8.0, conda-forge hadn't yet enabled the Parquet encryption feature in its builds. With its latest rebuilds, it has the feature enabled, and then it's clear this dependency was already introduced with pyarrow 8.0, so likely with the original PR adding encryption (#10450). I think a likely culprit is the fact that the pyarrow C++ sources (which become libarrow_python.so) depend on libparquet, and pyarrow always depends on libarrow_python.so: Lines 335 to 339 in 6101d12
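One way to confirm which shared library pulls in libparquet is to look at the `NEEDED` entries that `readelf -d` prints for it. The following is only a rough sketch: the helper names `needed_libraries` and `links_against` are hypothetical, and the `SAMPLE` text (including the version suffixes) is made up for illustration rather than copied from a real build.

```python
from __future__ import annotations

import re

# Matches one NEEDED entry as printed by `readelf -d`, for example:
#  0x0000000000000001 (NEEDED)   Shared library: [libparquet.so.1400]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_output: str) -> list[str]:
    """Return the library names listed as NEEDED, in order of appearance."""
    return _NEEDED.findall(readelf_output)


def links_against(readelf_output: str, prefix: str) -> bool:
    """True if any NEEDED library name starts with `prefix`."""
    return any(name.startswith(prefix) for name in needed_libraries(readelf_output))


# Illustrative (made-up) output for a libarrow_python.so built with
# Parquet encryption enabled.
SAMPLE = """\
Dynamic section at offset 0x1d0 contains 4 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libarrow.so.1400]
 0x0000000000000001 (NEEDED)             Shared library: [libparquet.so.1400]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libarrow_python.so]
"""

if __name__ == "__main__":
    print(needed_libraries(SAMPLE))
    print(links_against(SAMPLE, "libparquet"))
```

On a real installation the input would come from running `readelf -d` on the library in question (Linux only); on other platforms a different tool would be needed.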
So if we want to fix this, we need to create a separate |
…row_python.so to new libarrow_python_parquet_encryption.so
…thon.so to new libarrow_python_parquet_encryption.so (#39316)

### Rationale for this change

If I build pyarrow with everything and then I remove some of the Arrow CPP .so in order to have a minimal build I can't import pyarrow because it requires libarrow and libparquet. This is relevant in order to have a minimal build for Conda. Please see the related issue for more information.

### What changes are included in this PR?

Move libarrow parquet encryption for pyarrow to its own shared object.

### Are these changes tested?

I will run extensive CI with extra python archery tests.

### Are there any user-facing changes?

No, and yes :) There will be a new .so on pyarrow but shouldn't be relevant in my opinion.

* Closes: #39006

Lead-authored-by: Raúl Cumplido
Co-authored-by: Antoine Pitrou
Signed-off-by: Sutou Kouhei
Describe the bug, including details regarding any error messages, version, and platform.
As part of the work being done on conda to split pyarrow into a minimal package (pyarrow-base), I've realised that importing pyarrow currently requires libparquet to be present; otherwise the import fails with an error.

We should be able to have a minimal installation and import pyarrow without parquet.
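The expected behaviour could be expressed as a small smoke test: the core package must import, while optional submodules are allowed to be missing. This is a minimal sketch under that assumption; `check_minimal_import` is a hypothetical helper, not part of pyarrow, and it relies only on `importlib` from the standard library.

```python
import importlib


def check_minimal_import(core, optional):
    """Import `core` (must succeed) and report which optional submodules load.

    Returns a dict mapping each optional submodule name to True or False.
    An ImportError from the core package itself is deliberately not caught.
    """
    importlib.import_module(core)
    available = {}
    for name in optional:
        try:
            importlib.import_module(f"{core}.{name}")
            available[name] = True
        except ImportError:
            # Covers both a missing Python module and an extension module
            # whose shared-library dependency cannot be loaded.
            available[name] = False
    return available


if __name__ == "__main__":
    # Demonstrated with a standard-library package so it runs anywhere.
    print(check_minimal_import("json", ["decoder", "no_such_submodule"]))
```

For the situation described here, the call would presumably look like `check_minimal_import("pyarrow", ["parquet"])`: on a minimal installation without libparquet it should return `{"parquet": False}` instead of failing on the first line.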
Component(s)
Python