deps: Bump to OpenDAL v0.4 #4678

Merged
merged 6 commits into from
Apr 3, 2022

Conversation

Xuanwo
Member

@Xuanwo Xuanwo commented Apr 2, 2022

Signed-off-by: Xuanwo [email protected]

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

This PR bumps OpenDAL to v0.4; see the v0.4.0 release notes for details.

For databend, this PR will address the following problems:

Changelog

  • Bug Fix
  • Improvement

Test Plan

Unit Tests

Stateless Tests

@mergify
Contributor

mergify bot commented Apr 2, 2022

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

@mergify mergify bot added the pr-bugfix (this PR patches a bug in codebase) and pr-improvement labels Apr 2, 2022
Signed-off-by: Xuanwo <[email protected]>
Signed-off-by: Xuanwo <[email protected]>
@Xuanwo Xuanwo marked this pull request as ready for review April 2, 2022 14:32
@Xuanwo Xuanwo requested a review from BohuTANG as a code owner April 2, 2022 14:32
@Xuanwo Xuanwo requested review from dantengsky and sundy-li April 2, 2022 14:34
@Xuanwo
Member Author

Xuanwo commented Apr 3, 2022

In this PR, I also fixed some clippy warnings in the benches.

The most interesting one:

error: `assert!(false)` should probably be replaced
  --> query/benches/suites/mod.rs:41:9
   |
41 |         assert!(false)
   |         ^^^^^^^^^^^^^^
   |
   = note: `-D clippy::assertions-on-constants` implied by `-D warnings`
   = help: use `panic!()` or `unreachable!()`
   = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#assertions_on_constants

It seems our CI doesn't cover the benches code. Asking @everpcpc to verify.

Locally, I use cargo clippy --all-targets -- -D warnings
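
To illustrate the lint above, here is a minimal sketch of the kind of change clippy suggests. The helper `row_count` and its signature are hypothetical and are not the actual code in query/benches/suites/mod.rs; the point is only that an unconditional failure reads better as `panic!()` (or `unreachable!()`) than as `assert!(false)`.

```rust
// Hypothetical bench helper, used only to illustrate the lint.
// It is NOT the real code from query/benches/suites/mod.rs.
fn row_count(result: Result<usize, String>) -> usize {
    match result {
        Ok(n) => n,
        // Before: an `assert!(false)` on this path triggers
        // clippy::assertions_on_constants under `-D warnings`.
        // After: state the failure explicitly and keep the cause.
        Err(e) => panic!("benchmark query failed: {}", e),
    }
}
```

`panic!` also has type `!`, so it fits in any match arm, whereas `assert!(false)` evaluates to `()`.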

@everpcpc
Member

everpcpc commented Apr 3, 2022

> It seems our CI doesn't cover benches code. Ask @everpcpc for verification.
>
> Locally, I use cargo clippy --all-targets -- -D warnings

Yes, we only run cargo clippy --tests -- -D warnings in CI now: https://github.com/datafuselabs/databend/blob/main/.github/actions/check/action.yml#L35

It seems to have come from e264ce4#diff-99b87178e201388c0ebfd7d8584c9f6930100d335919c10b7399b39a88cb0c03R27

We'd better change it to --all-targets.

@BohuTANG
Member

BohuTANG commented Apr 3, 2022

One question about this PR:

ERROR 1105 (HY000): Code: 3003, displayText = object permission denied: (op: stat, path: fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321, source: response part: Parts { status: 403, version: HTTP/1.1, headers: {"x-amz-request-id": "H7JES5RX8545F3YH", "x-amz-id-2": "aSmaaQtvISxLeRmeb+4cepo3UUk1TbO0anxBdLF5wd9smLB7ANxO7ZcSa6xMe/4KPO94OamIjZQ=", "content-type": "application/xml", "date": "Sun, 03 Apr 2022 06:51:16 GMT", "server": "AmazonS3"} }, body: "").

The bucket paths:

fuse/1 -- exists
fuse/1/5 -- does not exist

Why is the response status 403 rather than 404?
https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html#RESTErrorResponses

But when I tried a COPY command, it returned 404 (which is expected):

mysql> copy into ontime from 's3://databend-external/t_ontime/1/y_ontime.csv'  FILE_FORMAT = (type = "CSV" field_delimiter = '\t'  record_delimiter = '\n' skip_header = 1) size_limit=10;
ERROR 1105 (HY000): Code: 3006, displayText = Object { kind: ObjectNotExist, op: "stat", path: "t_ontime/1/y_ontime.csv", source: response part: Parts { status: 404, version: HTTP/1.1, headers: {"x-amz-request-id": "8DAN843RB5CX2ACX", "x-amz-id-2": "HjhuHaAkktiXMrirjAdiMrDpUkgR9hiO4mle/PxEpdG1ixKIinQ3INGvtnSG5YtXAluf5ktJBFE=", "content-type": "application/xml", "date": "Sun, 03 Apr 2022 07:06:34 GMT", "server": "AmazonS3"} }, body: "" }.
s3://databend-external/t_ontime/ -- exists
s3://databend-external/t_ontime/1/ -- does not exist

@BohuTANG
Member

BohuTANG commented Apr 3, 2022

Client:

mysql> select version();
+----------------------------------------------------------------------------------------------------+
| version()                                                                                          |
+----------------------------------------------------------------------------------------------------+
| DatabendQuery v0.7.5-nightly-f972974-simd(rust-1.61.0-nightly-2022-04-03T06:15:12.697451893+00:00) |
+----------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
Read 1 rows, 1 B in 0.005 sec., 197.8 rows/sec., 197.8 B/sec.

mysql> select count() from ontime_not_null;                                                                                                                              
ERROR 1105 (HY000): Code: 3002, displayText = op: stat, path: fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321, source: response part: Parts { status: 403, version: HTTP/1.1, headers: {"x-amz-request-id": "8VPEPRCGVAXTQ69D", "x-amz-id-2": "Qk37WcBdiP8v7uNI+DYapaaFojsG//WJNTi/1u6p29qL8cWOTly1MV0sD929loZ3lyc1MpbETHs=", "content-type": "application/xml", "date": "Sun, 03 Apr 2022 07:16:46 GMT", "server": "AmazonS3"} }, body: "".

Server error log:

2022-04-03T07:16:46.569495Z ERROR databend_query::servers::mysql::writers::query_result_writer: OnQuery Error: Code: 3002, displayText = op: stat, path: fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321, source: response part: Parts { status: 403, version: HTTP/1.1, headers: {"x-amz-request-id": "8VPEPRCGVAXTQ69D", "x-amz-id-2": "Qk37WcBdiP8v7uNI+DYapaaFojsG//WJNTi/1u6p29qL8cWOTly1MV0sD929loZ3lyc1MpbETHs=", "content-type": "application/xml", "date": "Sun, 03 Apr 2022 07:16:46 GMT", "server": "AmazonS3"} }, body: "".

   0: common_exception::exception_code::<impl common_exception::exception::ErrorCode>::StoragePermissionDenied
             at /home/ubuntu/bohu/github/databend/common/exception/src/exception_code.rs:36:66
      common_exception::exception_into::<impl core::convert::From<std::io::error::Error> for common_exception::exception::ErrorCode>::from
             at /home/ubuntu/bohu/github/databend/common/exception/src/exception_into.rs:111:44
   1: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/result.rs:2064:27
      <&databend_query::sessions::query_ctx::QueryContext as databend_query::storages::fuse::io::read::meta_readers::BufReaderProvider>::buf_reader::{{closure}}
             at /home/ubuntu/bohu/github/databend/query/src/storages/fuse/io/read/meta_readers.rs:139:28
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/future/mod.rs:91:19
   2: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/future/future.rs:124:9
      databend_query::storages::fuse::io::read::meta_readers::<impl databend_query::storages::fuse::io::read::cached_reader::Loader<databend_query::storages::fuse::meta::v1::snapshot::TableSnapshot> for T>::load::{{closure}}
             at /home/ubuntu/bohu/github/databend/query/src/storages/fuse/io/read/meta_readers.rs:114:55
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/future/mod.rs:91:19
   3: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/future/future.rs:124:9
      databend_query::storages::fuse::io::read::cached_reader::CachedReader<T,L>::load::{{closure}}
             at /home/ubuntu/bohu/github/databend/query/src/storages/fuse/io/read/cached_reader.rs:98:59
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/future/mod.rs:91:19
   4: databend_query::storages::fuse::io::read::cached_reader::CachedReader<T,L>::read::{{closure}}
             at /home/ubuntu/bohu/github/databend/query/src/storages/fuse/io/read/cached_reader.rs:62:68
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/future/mod.rs:91:19
      databend_query::storages::fuse::fuse_table::FuseTable::read_table_snapshot::{{closure}}::{{closure}}
             at /home/ubuntu/bohu/github/databend/query/src/storages/fuse/fuse_table.rs:202:57
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/8769f4ef2fe1efddd1f072485f97f568e7328f79/library/core/src/future/mod.rs:91:19
   5: databend_query::storages::fuse::fuse_table::FuseTable::read_table_snapshot::{{closure}}

The line that raises the error:
https://github.com/datafuselabs/databend/blob/f972974a41638064ed7bef024e9af57b409c2b71/query/src/storages/fuse/io/read/meta_readers.rs#L139

@Xuanwo
Member Author

Xuanwo commented Apr 3, 2022

> ERROR 1105 (HY000): Code: 3002, displayText = op: stat, path: fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321, source: response part: Parts { status: 403, version: HTTP/1.1, headers: {"x-amz-request-id": "8VPEPRCGVAXTQ69D", "x-amz-id-2": "Qk37WcBdiP8v7uNI+DYapaaFojsG//WJNTi/1u6p29qL8cWOTly1MV0sD929loZ3lyc1MpbETHs=", "content-type": "application/xml", "date": "Sun, 03 Apr 2022 07:16:46 GMT", "server": "AmazonS3"} }, body: "".

Oh, the error message doesn't look nice. It should print a message like "object permission denied"; I will fix it.

Fixed in 6656f45
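
As a rough illustration of the friendlier message format (the one visible in the "object permission denied: (op: ..., path: ..., source: ...)" output earlier in this thread), here is a minimal sketch of a `Display` implementation. The `PermissionDenied` struct and its fields are hypothetical and do not reflect OpenDAL's actual error types or the change in 6656f45.

```rust
use std::fmt;

// Hypothetical error type for illustration only; OpenDAL's real
// error definitions may look quite different.
#[derive(Debug)]
struct PermissionDenied {
    op: &'static str,
    path: String,
    source: String,
}

impl fmt::Display for PermissionDenied {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Lead with a human-readable summary, then the raw details.
        write!(
            f,
            "object permission denied: (op: {}, path: {}, source: {})",
            self.op, self.path, self.source
        )
    }
}
```

Putting the summary first means the reader sees the cause before the raw HTTP response details.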


As for this error, can you try using the same credentials to send a HEAD request to the same object (taking the version/tenant prefix into consideration): fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321?

For example:

aws s3api head-object --bucket=bucket --key=prefix/to/fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321
  • If aws s3api returns 200 or 404, opendal is buggy, and I'm willing to fix it.
  • If aws s3api returns 403 too, there may be a bucket policy here that forbids the requests.
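
The two cases above can be written down as a small decision function. This is only a sketch of the reasoning in this comment: the `Diagnosis` enum and `diagnose` function are hypothetical names, and the input is assumed to be the HTTP status reported by `aws s3api head-object` for the same key and credentials.

```rust
/// Hypothetical outcome of re-running the request with the AWS CLI
/// after OpenDAL reported a 403 for the same object.
#[derive(Debug, PartialEq)]
enum Diagnosis {
    /// The CLI reports 200 or 404, so OpenDAL is likely at fault.
    LikelyOpendalBug,
    /// The CLI also reports 403, so a bucket policy may forbid the request.
    LikelyBucketPolicy,
    /// Any other status; this sketch makes no claim.
    Inconclusive,
}

fn diagnose(cli_status: u16) -> Diagnosis {
    match cli_status {
        200 | 404 => Diagnosis::LikelyOpendalBug,
        403 => Diagnosis::LikelyBucketPolicy,
        _ => Diagnosis::Inconclusive,
    }
}
```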

@BohuTANG
Member

BohuTANG commented Apr 3, 2022

> ERROR 1105 (HY000): Code: 3002, displayText = op: stat, path: fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321, source: response part: Parts { status: 403, version: HTTP/1.1, headers: {"x-amz-request-id": "8VPEPRCGVAXTQ69D", "x-amz-id-2": "Qk37WcBdiP8v7uNI+DYapaaFojsG//WJNTi/1u6p29qL8cWOTly1MV0sD929loZ3lyc1MpbETHs=", "content-type": "application/xml", "date": "Sun, 03 Apr 2022 07:16:46 GMT", "server": "AmazonS3"} }, body: "".
>
> Oh, the error message doesn't look nice, we need it print out message like "object permission denied", I will fix it.
>
> And to this error, can you try using the same credential to send a HEAD request to the same object (take version/tenant prefix into consideration) fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321?
>
> For example:
>
> aws s3api head-object --bucket=bucket --key=prefix/to/fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321
>   • If aws s3api returns 200 or 404, opendal is buggy, and I'm willing to fix it.
>   • If aws s3api returns 403 too, that means maybe we have a bucket policy here which forbid the requests.

Thank you for the useful information. I tried it, and the response is 403:

aws s3api head-object --bucket=databend-shared --key=fuse/1/5/_ss/8878969040bd4d7aa5d28297cff1f321

An error occurred (403) when calling the HeadObject operation: Forbidden

I also checked an existing key:

aws s3api head-object --bucket=databend-shared --key=fuse/1/1/_ss/92318ff55f694b51b59f119a7c39eda8_v1.json

Unknown output type: s3

Signed-off-by: Xuanwo <[email protected]>
@Xuanwo
Member Author

Xuanwo commented Apr 3, 2022

> Unknown output type: s3

Have you set an output type for aws? Please check your ~/.aws/config.

With the same command:

:( aws s3api head-object --bucket=databend-shared --key=fuse/1/1/_ss/92318ff55f694b51b59f119a7c39eda8_v1.json

{
    "AcceptRanges": "bytes",
    "LastModified": "Wed, 30 Mar 2022 07:36:55 GMT",
    "ContentLength": 19254,
    "ETag": "\"d1308468d48c4d8e5572e4f66f3665b8\"",
    "VersionId": "KtVbGRhs8apEOin2ZNsa5T44bciFfupD",
    "ContentType": "binary/octet-stream",
    "Metadata": {}
}
:) aws s3api head-object --bucket=databend-shared --key=fuse/1/1/_ss/92318ff55f694b51b59f119a7c39eda8_v1.jsonx


An error occurred (403) when calling the HeadObject operation: Forbidden

@Xuanwo
Member Author

Xuanwo commented Apr 3, 2022

As a quick workaround, specify the output format in your command:

aws s3api head-object --bucket=databend-shared --key=fuse/1/1/_ss/92318ff55f694b51b59f119a7c39eda8_v1.json --output=json

@BohuTANG
Member

BohuTANG commented Apr 3, 2022

Works:

aws s3api head-object --bucket=databend-shared --key=fuse/1/2/_ss/095e11cc31e34a818c36479099308d06_v1.json --output=json
{
    "AcceptRanges": "bytes",
    "LastModified": "Wed, 30 Mar 2022 12:55:56 GMT",
    "ContentLength": 49208,
    "ETag": "\"d424a5210da9a49419f308b2217236e8\"",
    "VersionId": "_4e2wY5ewTUNddKhh2WaKvHaonUyBxQh",
    "ContentType": "binary/octet-stream",
    "Metadata": {}
}

head-object on a nonexistent path returns 403:

aws s3api head-object --bucket=databend-shared --key=fuse/1/2non/_ss/095e11cc31e34a818c36479099308d06_v1.json --output=json

An error occurred (403) when calling the HeadObject operation: Forbidden

@Xuanwo
Member Author

Xuanwo commented Apr 3, 2022

This seems to be expected behavior of AWS S3: https://stackoverflow.com/questions/19037664/how-do-i-have-an-s3-bucket-return-404-instead-of-403-for-a-key-that-does-not-e

S3 returns a 403 instead of a 404 when the user doesn't have permission to list the bucket contents.

If you query for an object and receive a 404, then you know that object doesn't exist. This is information you shouldn't know if you don't have permission to list the bucket contents so instead of telling you it doesn't exist, S3 just tells you that you're trying to do something you're not allowed to do.
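
For callers, the practical consequence is that a 403 on a stat (HEAD) request cannot be read as "the object exists but is forbidden". A minimal sketch of a status classifier follows; `stat_error_kind` is a hypothetical helper and not OpenDAL's actual mapping, and it uses `std::io::ErrorKind` only because the backtrace above shows errors being converted from `std::io::Error`.

```rust
use std::io::ErrorKind;

/// Hypothetical helper: classify the HTTP status of an S3 HEAD (stat)
/// response. Returns `None` when the request succeeded.
fn stat_error_kind(status: u16) -> Option<ErrorKind> {
    match status {
        200..=299 => None,
        // Only reported when the caller is allowed to list the bucket.
        404 => Some(ErrorKind::NotFound),
        // May also hide a missing key when the caller lacks list
        // permission, so it must not be treated as "object exists".
        403 => Some(ErrorKind::PermissionDenied),
        _ => Some(ErrorKind::Other),
    }
}
```

Under this mapping, the `fuse/1/5/...` lookup in this thread surfaces as a permission error even though the key is absent, which matches what the AWS CLI reports.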

Member

@BohuTANG BohuTANG left a comment

Looks great to me.

@BohuTANG BohuTANG merged commit cf6da3e into databendlabs:main Apr 3, 2022
@Xuanwo Xuanwo deleted the bump-opendal branch April 3, 2022 12:27
Labels
need-review, pr-bugfix (this PR patches a bug in codebase)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

copy into don't support special filename
4 participants