[FEA] Support S3 writes in chunked writer - ParquetDatasetWriter #10522
Comments
Thanks @sauravdev for sharing this use case. Would you please post a code sample and the error message, if any?
Sure, below is the sample code,
and below is the error:
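(Not the original snippet: the following is a minimal, hypothetical sketch of the kind of call in question. The bucket name is a placeholder, and AWS credentials are assumed to be already configured for `s3fs`.)

```python
# Hypothetical reproduction sketch (not the original poster's code).
# Assumes a placeholder bucket name and AWS credentials that s3fs can pick up
# from the environment or ~/.aws/credentials.
import cudf
from cudf.io.parquet import ParquetDatasetWriter

df = cudf.DataFrame({"key": [1, 1, 2, 2], "value": [10.0, 20.0, 30.0, 40.0]})

# Before the fix, pointing the chunked writer at an s3:// path errored out.
cw = ParquetDatasetWriter("s3://my-bucket/my-dataset", partition_cols=["key"])
cw.write_table(df)
cw.close()
```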
Resolves: #10522

This PR:
- [x] Enables `s3` writing support in `ParquetDatasetWriter`
- [x] Adds a work-around for reading an `s3` directory in `cudf.read_parquet`. Issue here: https://issues.apache.org/jira/browse/ARROW-16438
- [x] Introduces all the required `s3` python library combinations that work together so that `test_s3.py` can be run locally on dev environments.
- [x] Improves the default `s3fs` error logs by changing the log level to `DEBUG` in pytests (`S3FS_LOGGING_LEVEL`).

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- AJ Schmidt (https://github.com/ajschmidt8)
- Richard (Rick) Zamora (https://github.com/rjzamora)
- Ayush Dattagupta (https://github.com/ayushdg)
- Bradley Dice (https://github.com/bdice)

URL: #10769
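As a rough illustration of what the change enables (continuing the hypothetical sketch above, with the same placeholder bucket; this is not taken from the PR's tests), the partitioned `s3` output can then be read back with `cudf.read_parquet`, and `s3fs` logging can be made more verbose while debugging:

```python
# Continuing the hypothetical sketch above: read the partitioned s3 "directory"
# back with cudf.read_parquet (the PR adds a work-around for ARROW-16438 here).
import os

# Optional: more verbose s3fs logs while debugging. s3fs reads this environment
# variable when it is first imported, so set it before any s3 access (or export
# it in the shell); the PR uses it for its pytests.
os.environ.setdefault("S3FS_LOGGING_LEVEL", "DEBUG")

import cudf

result = cudf.read_parquet("s3://my-bucket/my-dataset")
print(result)
```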
Parquet writes to external storage such as s3 should be possible using `ParquetDatasetWriter`; right now it errors out.