Cannot query file from S3 #559
Comments
Hi @skamalj Are you building the cli locally? This should work if the |
Hello @thinkharderdev, no, I did not build the CLI. I just installed it using cargo. How do I enable the s3 feature? |
I think you should be able to |
Thanks @thinkharderdev. It works with ballista-cli now. I have now built the scheduler and executor as well with the flag ballista-core/s3 and connected to the instance with this ballista-cli. I am getting a missing region error. I have tried setting AWS_REGION and AWS_DEFAULT_REGION in both the scheduler and ballista-cli shells, but I get the same error. I have verified that it finds the S3 location, because the create command fails if I give a non-existent path. (base) kamal@Kamal:~/.aws$ ballista-cli --host localhost --port 50050 |
It is also necessary to register S3 related configuration in env when the
|
Yeah, both the scheduler and executor would need credentials for the S3 API. |
Thanks both, @thinkharderdev and @r4ntix. It works when credentials are set for both. |
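Since the fix in this thread came down to exporting the S3 settings in the shells that start the scheduler and the executor, a small pre-flight check can catch a missing variable before a query fails. This is only a sketch: `check_aws_env` is a hypothetical helper name, and the variable list assumes the processes read the standard AWS names (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_DEFAULT_REGION`); adjust it if your setup uses a profile or `AWS_REGION` instead.

```shell
# Hypothetical helper (not part of Ballista): reports which of the assumed
# AWS variables are not exported in the current shell.
check_aws_env() {
  missing=""
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION; do
    # printenv only sees exported variables, which is what a child
    # process such as the scheduler or executor would inherit.
    if [ -z "$(printenv "$name")" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}
```

Running `check_aws_env` in the scheduler shell and again in the executor shell, before starting either process, shows whether both would inherit the settings.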
Describe the bug
I am trying to query a CSV file from S3 using ballista-cli. I get a "No object store available" error.
Ballista CLI v0.10.0
❯ CREATE EXTERNAL TABLE foo2 (a INT, b INT) STORED AS CSV LOCATION 's3://skamalj-s3/data.csv';
0 rows in set. Query took 0.000 seconds.
❯ SELECT * FROM foo2;
[2022-12-09T10:46:59Z ERROR ballista_scheduler::scheduler_server::query_stage_scheduler] Error planning job QnD94VQ: DataFusionError(Execution("No object store available for s3://skamalj-s3/data.csv"))
[2022-12-09T10:46:59Z ERROR ballista_scheduler::scheduler_server::query_stage_scheduler] Job QnD94VQ failed: Error planning job QnD94VQ: DataFusionError(Execution("No object store available for s3://skamalj-s3/data.csv"))
[2022-12-09T10:46:59Z ERROR ballista_core::execution_plans::distributed_query] Job QnD94VQ failed: Error planning job QnD94VQ: DataFusionError(Execution("No object store available for s3://skamalj-s3/data.csv"))
DataFusionError(ArrowError(ExternalError(Execution("Job QnD94VQ failed: Error planning job QnD94VQ: DataFusionError(Execution("No object store available for s3://skamalj-s3/data.csv"))"))))
❯
To Reproduce
The steps are shown above. The data.csv file was created with this command:
$ echo "1,2" > data.csv
Expected behavior
Should return results similar to when queried from local csv.
Additional context
The scheduler and executor were installed with cargo install, version 0.10.
AWS credentials are set on the local machine, and the aws s3 ls command on the same machine returns the listing.