Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for S3 Express One Zone #3465

Merged
merged 22 commits into from
Mar 11, 2024
Merged

Add support for S3 Express One Zone #3465

merged 22 commits into from
Mar 11, 2024

Conversation

ysaito1001
Copy link
Contributor

@ysaito1001 ysaito1001 commented Mar 6, 2024

Motivation and Context

Allows the Rust SDK to use S3 Express One Zone

Description

The PR adds the said S3-specific functionality to the Rust SDK. The code changes have already been reviewed by previous sub PRs, but it's worth going through them again as a whole:

In addition to the PRs above, commit eebe8af increases the canary lambda's memory size to 512MB from 128MB (also makes it configurable through a command line arg for canary-runner). By default, lambda's allowed memory size is 128MB but with the addition of canary-wasm in main, canary lambda's memory usage will be 152MB, causing the lambda to be killed by a signal during runtime. The commit addresses that issue.

Testing

Checklist

  • I have updated CHANGELOG.next.toml if I made changes to the AWS SDK, generated SDK code, or SDK runtime crates

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

ysaito1001 and others added 14 commits February 9, 2024 22:42
…directed for S3 Express use case (#3386)

## Motivation and Context
This PR is the first in the series to support the S3 Express feature in
the Rust SDK. The work will be done in the feature branch, and once it
is code complete, the branch will be merged to main.

## Description
This PR adds placeholder types for S3 Express and enables control flow
to be redirected for S3 Express use case. For instance, if we run the
following example code against a generated SDK from this PR:
```rust
let shared_config = aws_config::from_env().region(aws_sdk_s3::config::Region::new("us-east-1")).load().await;
let client = aws_sdk_s3::Client::new(&shared_config);
client.list_objects_v2().bucket("testbucket--use1-az4--x-s3";).send().await.unwrap();
```
it will end up
```
thread 's3_express' panicked at 'not yet implemented', /Users/awsaito/src/smithy-rs/aws/sdk/build/aws-sdk/sdk/s3/src/s3_express.rs:104:13
```
which points to
```
impl ProvideCredentials for DefaultS3ExpressIdentityProvider {
    fn provide_credentials<'a>(&'a self) -> aws_credential_types::provider::future::ProvideCredentials<'a>
    where
        Self: 'a,
    {
        todo!() <---
    }
}
```

### Implementation decisions
- `DefaultS3ExpressIdentityProvider` has an accompanying identity cache.
That identity cache cannot be configured by customers so it makes sense
for the provider itself to internally own it. In that case, we do NOT
want to use the identity cache stored in `RuntimeComponents`, since it
interferes with the S3 Express's caching policy. To that end, I added an
enum `CacheLocation` to `SharedIdentityResolver` (it already had the
`cache_partition` field so it was kind of aware of caching).
- Two reasons why `CacheLocation` is added to `SharedIdentityResolver`,
but not to individual, concrete `IdentityResolver`s. One is
`SharedIdentityResolver` was already cache aware, as mentioned above.
The other is that it is more flexible that way; The cache location is
not tied to a type of identity resolver, but we can select it when
creating a `SharedIdentityResolver`.
- I considered but did not add a field `cacheable` to `Identity` since I
wanted to keep `Identity` as plain data, keeping the concept of
"caching" somewhere outside.
- I've added a separate `Config` method,
`set_express_credentials_provider`, to override credentials provider for
S3 Express. There are other SDKs (e.g.
[Ruby](https://www.rubydoc.info/gems/aws-sdk-s3/Aws/S3/Client)) that
follow this style and it makes it clear to the customers that this is
the method to use when overriding the express credentials provider. The
existing `set_credentials_provider`, given its input type, cannot tell
whether a passed-in credentials provider is for a regular `sigv4` or for
S3 Express.

## Testing
Only verified that control flow could be altered for an S3 Express use
case, as shown above. Further testing will be added in subsequent PRs.

## Checklist
I am planning to include in `CHANGELOG.next.toml` a user guide for S3
Express once the feature branch `ysaito/s3express` is ready to be merged
to main.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._

---------

Co-authored-by: John DiSanti <[email protected]>
Co-authored-by: AWS SDK Rust Bot <[email protected]>
Co-authored-by: AWS SDK Rust Bot <[email protected]>
Co-authored-by: Zelda Hessler <[email protected]>
## Motivation and Context
Adds an implementation spike to allow `list-objects-v2` (possibly
others, haven't tested yet) to run against an S3 Express bucket.

## Description
This PR implements two ingredients, `S3ExpressIdentityProvider` and
`S3ExpressSigner`. `S3ExpressIdentityProvider` uses an internal S3
client to obtain an S3 Express session token that is passed to
`S3ExpressSigner`. `S3ExpressSigner` then signs a request with that
token, using effectively sigv4 but with session token omitted and an
extra header added instead, `x-amz-s3session-token`.

In addition, this PR supports presigning for S3 Express. Similarly to
signing headers, presigning for S3 Express excludes a query param
`X-Amz-Security-Token` and instead uses `X-Amz-S3session-Token` for the
signing query params. The following screeshot shows that a presigned URL
from `get_object` works for an S3 Express bucket:
<p align="center">
<img width="600" alt="chain-provider-ext-timeout-2"
src="https://github.com/smithy-lang/smithy-rs/assets/15333866/40d7bb53-d936-4d0d-8f95-0323725e2111">
</p>

Some implementation details:
- Since `S3ExpressIdentityProvider` passes an S3 Express bucket name for
S3's `create_session` API to obtain an S3 Express session token, it
needs to obtain the bucket name from somewhere.
`S3ExpressIdentityProvider::ProvideCredentials` I put previously did not
have enough arguments for us to figure this out, so I switched to
`S3ExpressIdentityProvider::ResolveIdentity` that takes enough
arguments.
- `SigV4Signer::sign_http_request` did not allow calling code to pass a
configured `SigningSettings`; The signer needs to exclude a header
`x-amz-security-token` and include `x-amz-s3session-token`. To make this
happen, I made `sigv4::extract_operation_config` and `sigv4::settings`
public APIs (previously private).
- One area I haven't quite figured out yet is how to configure the inner
S3 client to call `create_session`. The changes in this PR inherits
runtime components & config bag from the "outer" S3 client, but
customers may want to configure the inner S3 client in a more flexible
manner (e.g. operation timeout).

## Testing
To lock the behavior at this time, I added a connection recording test
for `list-objects-v2`.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._

---------

Co-authored-by: John DiSanti <[email protected]>
Co-authored-by: AWS SDK Rust Bot <[email protected]>
Co-authored-by: AWS SDK Rust Bot <[email protected]>
Co-authored-by: Zelda Hessler <[email protected]>
## Motivation and Context
Adds a default implementation for S3 Express identity cache.

## Description
This PR adds the said cache for S3 Express. This cache is not
configurable from outside and solely owned by the default S3 Express
identity provider. It is implemented in terms of an LRU cache keyed on a
string generated by `sha256hmac(random 64-byte key, access_key_id +
secret_key) + bucket_name` (note: `access_key_id` and `secret_key` are
for a customer's credentials but not for a retrieved `create_session`
API token).
Cache values are of type `ExpiringCache` that contains a session token
retrieved by S3's `create_session` API. When a customer is trying to use
a cached session token but if it has expired, `ExpiringCache` calls the
S3's `create_session` API, stores in it a new session token, and returns
it to the customer.

## Testing
Added unit tests for `S3IdentityCache` and a connection recording test
for `list-objects-v2` running against both express and regular buckets
to exercise a use case where a customer is switching between those
buckets.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._

---------

Co-authored-by: John DiSanti <[email protected]>
Co-authored-by: Russell Cohen <[email protected]>
Co-authored-by: AWS SDK Rust Bot <[email protected]>
Co-authored-by: AWS SDK Rust Bot <[email protected]>
Co-authored-by: Zelda Hessler <[email protected]>
## Motivation and Context
Changes the default request checksum to CRC32 for S3 Express.

## Description
This PR modifies the default request checksum to CRC32 for S3 Express.
Note that if a customer specifies a custom checksum algorithm via such
as
[checksum_algorithm](https://docs.rs/aws-sdk-s3/latest/aws_sdk_s3/operation/put_object/struct.PutObjectInput.html#structfield.checksum_algorithm),
then the PR does not affect anything and respects the customer-specified
checksum algorithm.

Otherwise, if an operation is annotated with a `requestChecksumRequired`
trait and set to true, the default checksum becomes relevant. The
default algorithm is currently `MD5` and S3 Express requires it to be
`CRC32`. The code changes in this PR reflect that requirement.

Implementation details include
- `RequestChecksumInterceptor` is a single-sourced place to handle
request checksum, but since it is a common place for all services we
cannot add S3-Express-specific logic in there to configure the default
checksum. Instead, we have introduced a slight more abstract concept of
overriding the default checksum via `DefaultRequestChecksumOverride`.
`RequestChecksumInterceptor` grabs it from a `ConfigBag` and uses it (if
found) to configure the default checksum.
- We deferred a point of adding a checksum to a request body to
`RequestChecksumInterceptor::modify_before_signing` from
`RequestChecksumInterceptor::modify_before_retry_loop`. This is because
we won't know whether S3 Express checksum default actually kicks until
an endpoint is fully resolved. `modify_before_retry_loop` occurs before
`client::orchestrator::endpoints::orchestrate_endpoint` so we need to
choose a different interceptor method that runs after it.
- I abandoned having a separate S3-specific `RequestChecksumInterceptor`
as it was more complicated than this approach. The wrinkle is that if we
have an S3-specific `RequestChecksumInterceptor`, then it needs to
either a) disable [the general
RequestChecksumInterceptor](https://github.com/smithy-lang/smithy-rs/blob/f7be5f801504e7edf5cb6b731f3400c37b298abc/aws/rust-runtime/aws-inlineable/src/http_request_checksum.rs#L79-L100)
entirely or b) only change the checksum default and delegate the rest to
the general `RequestChecksumInterceptor`. In both cases the general
`RequestChecksumInterceptor` must be disabled in runtime components,
otherwise a request checksum related header will incorrectly be added
twice. To disable the general `RequestChecksumInterceptor` using
[disable_interceptor](https://github.com/smithy-lang/smithy-rs/blob/main/rust-runtime/aws-smithy-runtime-api/src/client/interceptors.rs#L839),
we need to remove a generic parameter `AP` from
`RequestChecksumInterceptor` since we cannot enumerate all possible `AP`
when specifying it in `disable_interceptor`.
 
## Testing
Added integration tests for S3 Express default checksum to
`integration-tests/s3/tests/express.rs`.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
## Motivation and Context
Adds the ability to disable S3 Express session auth (causing it to use a
regular sigv4 session auth instead).

## Description
S3 Express One Zone is an opt out feature, and there are three ways to
disable it (with the order of precedence as listed):
- through the `disable_s3_express_session_auth` method on an S3 client
- through an environment variable `AWS_S3_DISABLE_EXPRESS_SESSION_AUTH`
- through a profile file with a key `s3_disable_express_session_auth`
(won't be supported until
awslabs/aws-sdk-rust#1073 is addressed)
 
If one of the places is set to true/false, then the subsequent places
with lower precedence will not be considered.

Something to be aware of when setting the disable option through the
environment variable. The environment variable is only checked during a
client construction, meaning that if a customer sets it after the client
has been created the SDK will not take the environment variable value
into account, i.e. the following snippet will NOT disable the S3 Express
session auth:
```
let config = aws_config::load_from_env().await;
let client = aws_sdk_s3::Client::new(&config);

// Set the env variable to true after an S3 client has been created
std::env::set_var("AWS_S3_DISABLE_EXPRESS_SESSION_AUTH", "true");

let _ = client
    .list_objects_v2()
    .bucket("s3express-test-bucket--usw2-az1--x-s3")
    .send()
    .await;
``` 

## Testing
Added unit tests for `S3ExpressRuntimePlugin` and integration tests for
verifying disabling S3 Express session auth.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
## Description
Add a test overriding S3 Express credentials provider.

When a customer passes a `CredentialsProvider` to
`.express_credentials_provider()` that produces a credential with a
session token, the request header should contain `x-amz-s3session-token`
with that session token.

## Testing
Added a new integration test
`support_customer_overriding_express_credentials_provider`.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
…ugin` (#3457)

## Motivation and Context
In response to
#3455 (comment),
it's clear that we need to iterate more on the potential signing API.

This PR takes a different approach where `S3ExpressSigner` is dropped
and a session token name override, if any, is placed into and retrieved
from `ConfigBag`, which is used within `SigV4Signer::sign_http_request`.

## Testing
Existing tests in CI

A semver failure in CI can be ignored; A public API
SigV4Signer::sign_http_request has been removed that only existed in the
branch.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
Copy link

github-actions bot commented Mar 6, 2024

A new generated diff is ready to view.

A new doc preview is ready to view.

Copy link

github-actions bot commented Mar 7, 2024

A new generated diff is ready to view.

A new doc preview is ready to view.

…ed) (#3462)

## Motivation and Context
Adds canary for S3 Express

## Description
This new canary tests the following APIs
- `ListDirectoryBuckets` (a new API for S3 Express)
- `ListObjectsV2` (converted from a connection recording test in
`integration-tests`)
- `Put/Get/DeleteObject` (the same set of APIs tested for regular S3
bucket and for MRAP in s3 canary)
- `CreateMultipartUpload/UploadPart/CompleteMultipartUpload` (operations
whose default checksum is `None` for S3 Express)
- `DeleteObjects` (an operation that requires request checksum and the
default checksum is CRC32)

## Testing
- verified s3 express canary passed in our release pipeline
- verified setting an environment variable `ENABLE_S3_EXPRESS_CANARY`
correctly skipped the s3 express canary in our release pipeline

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
@ysaito1001 ysaito1001 marked this pull request as ready for review March 8, 2024 03:55
@ysaito1001 ysaito1001 requested review from a team as code owners March 8, 2024 03:55
Copy link

github-actions bot commented Mar 8, 2024

A new generated diff is ready to view.

A new doc preview is ready to view.

Copy link

github-actions bot commented Mar 8, 2024

A new generated diff is ready to view.

A new doc preview is ready to view.

Copy link
Collaborator

@jdisanti jdisanti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the codegen diff, and most everything looks good. Mostly just comments on the checksum config piece.

Copy link

github-actions bot commented Mar 8, 2024

A new generated diff is ready to view.

A new doc preview is ready to view.

Copy link

github-actions bot commented Mar 8, 2024

A new generated diff is ready to view.

A new doc preview is ready to view.

Copy link
Collaborator

@jdisanti jdisanti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this! 🚀

#3474)

## Description
S3 express canary exposed a bug introduced in smithy-rs#3457 where the
code overwrote the regular SigV4 session token name with the S3
Expression session token name when it shouldn't.
```
    4: Error { code: "InvalidRequest", message: "CreateSession request should not include \"x-amz-s3session-token\"", aws_request_id: "01c0c8864e00018e20558a130509f37740b906e4", s3_extended_request_id: "DGTJhRqVMbZdAHQ" }
```
For APIs like `ListDirectoryBuckets` or `CreateSession`, we should not
overwrite `x-amz-security-token` with `x-amz-s3session-token` in the
request header.

In the said PR, `aws_sdk_s3::s3_express::utils::for_s3_express` did not
take into account auth schemes attached to a resolved endpoint, failing
to detect that it should not override the session token name for the
said APIs. This PR will resolve that issue.

## Testing
- Added an integration test verifying the fix (this test currently fails
when run in the destination branch, `ysaito/s3express`)
- Verified all canary (including wasm, s3express) passed

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
@ysaito1001 ysaito1001 enabled auto-merge March 11, 2024 15:39
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@ysaito1001 ysaito1001 added this pull request to the merge queue Mar 11, 2024
Merged via the queue into main with commit f3b332d Mar 11, 2024
41 checks passed
@ysaito1001 ysaito1001 deleted the ysaito/s3express branch March 11, 2024 16:27
@ysaito1001 ysaito1001 mentioned this pull request Apr 18, 2024
1 task
github-merge-queue bot pushed a commit that referenced this pull request Apr 18, 2024
## Motivation and Context
Running `cargo build --target wasm32-unknown-unknown
--no-default-features` on an S3 crate has stopped working since
#3465 with the following
error:
```
error: the wasm*-unknown-unknown targets are not supported by default, you may need to enable the "js" feature. For more information see: https://docs.rs/getrandom/#webassembly-support
   --> /Users/[REDACTED]/.cargo/registry/src/index.crates.io-6f17d22bba15001f/getrandom-0.2.14/src/lib.rs:352:9
    |
352 | /         compile_error!("the wasm*-unknown-unknown targets are not supported by \
353 | |                         default, you may need to enable the \"js\" feature. \
354 | |                         For more information see: \
355 | |                         https://docs.rs/getrandom/#webassembly-support");
    | |________________________________________________________________________^
```

To address the issue, this PR updates an S3's dependency on `ahash` in a
way that disables default features.

## Testing
Updated the existing test `integration-tests/webassembly` so that
`check-aws-sdk-standalone-integration-tests` will run `cargo check`
`aws-sdk-s3` against both `wasm32-wasi` and `wasm32-unknown-unknown`
(the updated check would break if we removed `default-features = false`
from the `ahash` dependency).

## Checklist
- [x] I have updated `CHANGELOG.next.toml` if I made changes to the AWS
SDK, generated SDK code, or SDK runtime crates

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
aws-sdk-rust-ci pushed a commit to awslabs/aws-sdk-rust that referenced this pull request Apr 22, 2024
## Motivation and Context
Running `cargo build --target wasm32-unknown-unknown
--no-default-features` on an S3 crate has stopped working since
smithy-lang/smithy-rs#3465 with the following
error:
```
error: the wasm*-unknown-unknown targets are not supported by default, you may need to enable the "js" feature. For more information see: https://docs.rs/getrandom/#webassembly-support
   --> /Users/[REDACTED]/.cargo/registry/src/index.crates.io-6f17d22bba15001f/getrandom-0.2.14/src/lib.rs:352:9
    |
352 | /         compile_error!("the wasm*-unknown-unknown targets are not supported by \
353 | |                         default, you may need to enable the \"js\" feature. \
354 | |                         For more information see: \
355 | |                         https://docs.rs/getrandom/#webassembly-support");
    | |________________________________________________________________________^
```

To address the issue, this PR updates an S3's dependency on `ahash` in a
way that disables default features.

## Testing
Updated the existing test `integration-tests/webassembly` so that
`check-aws-sdk-standalone-integration-tests` will run `cargo check`
`aws-sdk-s3` against both `wasm32-wasi` and `wasm32-unknown-unknown`
(the updated check would break if we removed `default-features = false`
from the `ahash` dependency).

## Checklist
- [x] I have updated `CHANGELOG.next.toml` if I made changes to the AWS
SDK, generated SDK code, or SDK runtime crates

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants