-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-39006: [Python] Extract libparquet requirements out of libarrow_python.so to new libarrow_python_parquet_encryption.so #39316
Conversation
|
@github-actions crossbow submit -g python |
This comment was marked as outdated.
This comment was marked as outdated.
Thanks! This looks good to me, but I am not a CMake expert ;) |
Although we have the So why was this build not failing? (building the C++ doesn't seem to enable Parquet in eg https://github.com/apache/arrow/blob/main/python/examples/minimal_build/build_venv.sh) |
If I would move parquet encryption to a separate .so I would do the same so nothing to comment. Regarding the minimal build that wasn't failing: |
The Windows failure seems related |
Thanks, I am testing a little further locally to validate the use case I am trying to fix works. I'll investigate the Windows failure. @AlenkaF I am also confused, my expectation if Arrow CPP is built without parquet and there's no |
ok, the fix seems to work for the use case I am trying to solve. I've built pyarrow with "everything":
after I do the following:
I get:
as expected. Without the change once I remove libparquet I can't import
|
Ah, so the reason it is working in the CI build is because there Parquet is not built at all (also C++ is not built with it, pyarrow is not built with it), and that works fine. But for conda, we actually do build pyarrow with support for Parquet, but then just leave out the parquet so library for the minimal install. And that's the case that was not working. |
The sphinx error is also related to the change. I'll have to investigate further that too. |
I'm curious, is this all that's needed? How does |
Probably not and I am missing something :) I'll keep working on it but I am surprised that no |
Ok, I'm testing this PR and I get: $ python -c "import pyarrow.parquet.encryption"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/antoine/arrow/dev/python/pyarrow/parquet/encryption.py", line 19, in <module>
from pyarrow._parquet_encryption import (CryptoFactory, # noqa
ImportError: /home/antoine/arrow/dev/python/pyarrow/_parquet_encryption.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN5arrow2py7parquet10encryption11PyKmsClientC1EP7_objectNS2_17PyKmsClientVtableE |
@github-actions crossbow submit -g python -g wheel |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit wheel-windows-* |
This comment was marked as outdated.
This comment was marked as outdated.
The Windows wheel failures look unrelated... |
The Windows wheels failures seem to have been failing on the nightly builds for some days. |
Is an issue open for it? |
I don't think so. I was planning to go over all the nightly failures and open issues in the next days in preparation for the release but I haven't had the time yet |
#39333 for Windows wheel build failure. |
@github-actions crossbow submit -g python -g wheel |
…row_python.so to new libarrow_python_parquet_encryption.so
…rquet_encryption.so
…ARQUET_LINK_LIBS from target_link_libraries
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit -g python -g wheel |
I rebased to get the Windows wheel fix. |
Revision: 6ee40c9 Submitted crossbow builds: ursacomputing/crossbow @ actions-0f80c7792a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 51970e0. There was 1 benchmark result with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them. |
…row_python.so to new libarrow_python_parquet_encryption.so (apache#39316) ### Rationale for this change If I build pyarrow with everything and then I remove some of the Arrow CPP .so in order to have a minimal build I can't import pyarrow because it requires libarrow and libparquet. This is relevant in order to have a minimal build for Conda. Please see the related issue for more information. ### What changes are included in this PR? Move libarrow parquet encryption for pyarrow to its own shared object. ### Are these changes tested? I will run extensive CI with extra python archery tests. ### Are there any user-facing changes? No, and yes :) There will be a new .so on pyarrow but shouldn't be relevant in my opinion. * Closes: apache#39006 Lead-authored-by: Raúl Cumplido <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
…row_python.so to new libarrow_python_parquet_encryption.so (apache#39316) ### Rationale for this change If I build pyarrow with everything and then I remove some of the Arrow CPP .so in order to have a minimal build I can't import pyarrow because it requires libarrow and libparquet. This is relevant in order to have a minimal build for Conda. Please see the related issue for more information. ### What changes are included in this PR? Move libarrow parquet encryption for pyarrow to its own shared object. ### Are these changes tested? I will run extensive CI with extra python archery tests. ### Are there any user-facing changes? No, and yes :) There will be a new .so on pyarrow but shouldn't be relevant in my opinion. * Closes: apache#39006 Lead-authored-by: Raúl Cumplido <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
Rationale for this change
If I build pyarrow with everything and then I remove some of the Arrow CPP .so in order to have a minimal build I can't import pyarrow because it requires libarrow and libparquet. This is relevant in order to have a minimal build for Conda. Please see the related issue for more information.
What changes are included in this PR?
Move libarrow parquet encryption for pyarrow to its own shared object.
Are these changes tested?
I will run extensive CI with extra python archery tests.
Are there any user-facing changes?
No, and yes :) There will be a new .so on pyarrow but shouldn't be relevant in my opinion.