[C++] S3 Finalization functions need to exist even when S3 isn't enabled. #36974
Comments
I agree that a finalization API that can be called with or without S3 would make user code simpler. We can use the following code for now:
#include <arrow/util/config.h>
#ifdef ARROW_S3
# include <arrow/filesystem/s3fs.h>
#endif
// ...
#ifdef ARROW_S3
arrow::fs::EnsureS3Finalized();
#endif
You don't need to use
A compile-time check is insufficient since it binds our application to a given build of Arrow. The E.g. we build on machines without S3, but run on machines with S3 support.
Does this mean that you build your module's binary against an S3-disabled Apache Arrow C++, and your users run that binary with an S3-enabled Apache Arrow C++ (which may be built by the user, not by you)? Why do you build your module against an S3-disabled Apache Arrow C++?
Because it dramatically simplifies the build environment requirements. We should also remember that the
We don't support that build style. (We don't guarantee ABI compatibility between binaries built with different build options.)
Does "package config" mean Note: At the C++ level, we can use
Yes, I meant the pkg-config files. But I missed in your first explanation that The entire situation around the AWS SDK requiring an explicit end-of-program shutdown call is rather unfortunate. I wish we had a cleaner approach, but that is best discussed in another issue. Thank you for the help.
Describe the bug, including details regarding any error messages, version, and platform.
Before Arrow 12, C++ consumers of Arrow could conditionally use S3 without knowing whether the Arrow they link to has S3 support enabled.
But with Arrow 12, due to AWS SDK issues ( #33858 ), creating an S3 filesystem now requires an explicit call to fs::FinalizeS3() during shutdown of the application. Since the FinalizeS3 function lives in the optionally compiled cpp/src/arrow/filesystem/s3fs.cc, there is no easy way for a C++ user to determine whether they can call FinalizeS3.
I believe the correct solution is for Arrow to provide a generalized shutdown API that is always available. This API would know whether S3 support is enabled and call the function when needed. Currently consumers can't do this themselves, since they can't safely determine whether S3 support is present.
The current workaround is that every C++ consumer needs to use dlopen (or the equivalent on Windows) on either libArrow or, for static builds, the root executable (dlopen(NULL)), and check whether the FinalizeS3 function exists.
Component(s)
C++
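To make the proposed "always provided" shutdown API more concrete, here is a minimal sketch of one possible shape for it. Everything in it is hypothetical: the `sketch` namespace, `ShutdownRegistry`, `Register`, and `Finalize` are illustrative names and are not part of Arrow. The idea is that an optional component (such as S3 support) registers a shutdown hook only when it is compiled in, while the `Finalize()` entry point exists in every build.

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical registry, not an Arrow API. Optional components register a
// shutdown hook when they are compiled in; builds without them register
// nothing, so Finalize() is simply a no-op there.
class ShutdownRegistry {
 public:
  static ShutdownRegistry& Instance() {
    static ShutdownRegistry registry;
    return registry;
  }

  // Called by an optional component during its own initialization.
  void Register(std::function<void()> hook) {
    std::lock_guard<std::mutex> lock(mutex_);
    hooks_.push_back(std::move(hook));
  }

  // Always available to consumers. Runs the hooks in reverse registration
  // order and clears them, so a second call does nothing.
  // Returns the number of hooks that were run.
  int Finalize() {
    std::vector<std::function<void()>> hooks;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      hooks.swap(hooks_);
    }
    for (auto it = hooks.rbegin(); it != hooks.rend(); ++it) {
      (*it)();
    }
    return static_cast<int>(hooks.size());
  }

 private:
  std::mutex mutex_;
  std::vector<std::function<void()>> hooks_;
};

}  // namespace sketch
```

With a design along these lines, a consumer would call `sketch::ShutdownRegistry::Instance().Finalize()` unconditionally at shutdown, and only an S3-enabled build would have registered a hook that ends up calling the S3 finalizer.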
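The dlopen workaround described in the issue can be sketched roughly as follows, for POSIX systems only. `ProcessHasSymbol` is a hypothetical helper name. The mangled symbol name is an assumption: it is what the Itanium C++ ABI would produce for a no-argument `arrow::fs::FinalizeS3()`, and it should be checked against the actual library (for example with `nm -D`) before relying on it.

```cpp
#include <dlfcn.h>

// Assumption: Itanium C++ ABI mangling of a no-argument
// arrow::fs::FinalizeS3(). Verify against the real library before use.
constexpr const char* kFinalizeS3Symbol = "_ZN5arrow2fs10FinalizeS3Ev";

// Returns true if `symbol` can be resolved from the running process.
// dlopen(nullptr, ...) returns a handle for the main program; lookups
// through it also search the shared libraries loaded along with it.
bool ProcessHasSymbol(const char* symbol) {
  void* handle = dlopen(nullptr, RTLD_LAZY);
  if (handle == nullptr) {
    return false;
  }
  const bool found = dlsym(handle, symbol) != nullptr;
  dlclose(handle);
  return found;
}
```

A consumer could test `ProcessHasSymbol(kFinalizeS3Symbol)` at shutdown. This sketch only detects the symbol; actually calling it through the returned pointer would additionally require a function type that matches the real signature, and therefore the Arrow headers for its return type.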