Add ability to select directly from S3 and GCS locations in datafusion-cli #9167

Closed
r3stl355 opened this issue Feb 8, 2024 · 1 comment · Fixed by #9199
Labels: enhancement (New feature or request)

r3stl355 commented Feb 8, 2024

Is your feature request related to a problem or challenge?

Once #9133 is merged, it should be fairly straightforward to implement similar functionality for the other supported types of remote location (S3 and GCS).

We would like to run queries like:

select * from 's3://my_bucket/my_data/foo.parquet';
select * from 'gs://<location of data in gcp>';

Describe the solution you'd like

Code can be added here to register the relevant object store for a URL. The store can be obtained by calling get_object_store in arrow-datafusion/datafusion-cli/src/exec.rs (after making it pub(crate)).

get_object_store requires an options parameter, which can be used to carry the authentication information. A select has no options parameter, so an empty HashMap can be passed instead, which means the relevant auth details will be taken from environment variables.
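
The idea above can be sketched with the standard library alone. This is only an illustration of the control flow, not the datafusion-cli code: `StoreKind`, `scheme_and_bucket`, `store_kind_for`, and `resolve_credential` are hypothetical names, the scheme strings and the option key are assumptions, and the real change would call get_object_store and register the result with the session instead.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the kinds of object store the CLI could register.
#[derive(Debug, PartialEq)]
enum StoreKind {
    S3,
    Gcs,
    Local,
}

/// Split a location such as "s3://bucket/path" into (scheme, bucket).
/// Returns None for plain local paths or when the bucket is missing.
fn scheme_and_bucket(location: &str) -> Option<(&str, &str)> {
    let (scheme, rest) = location.split_once("://")?;
    let bucket = rest.split('/').next().filter(|b| !b.is_empty())?;
    Some((scheme, bucket))
}

/// Pick the store kind from the URL scheme (scheme names are assumptions).
fn store_kind_for(location: &str) -> StoreKind {
    match scheme_and_bucket(location) {
        Some(("s3", _)) => StoreKind::S3,
        Some(("gs", _)) | Some(("gcs", _)) => StoreKind::Gcs,
        _ => StoreKind::Local,
    }
}

/// Resolve one credential: an explicit option wins, otherwise fall back to
/// the environment. With an empty options map, only the environment is used.
fn resolve_credential(
    options: &HashMap<String, String>,
    option_key: &str,
    env_lookup: impl Fn(&str) -> Option<String>,
    env_key: &str,
) -> Option<String> {
    options
        .get(option_key)
        .cloned()
        .or_else(|| env_lookup(env_key))
}
```

For a plain select, the caller would pass `&HashMap::new()` as `options` and something like `|k| std::env::var(k).ok()` as `env_lookup`, so every credential comes from the environment.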

Describe alternatives you've considered

No response

Additional context

No response
