Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: add support for aws-chunked content encoding #1501

Merged
merged 15 commits into from
Jul 5, 2022

Conversation

Velfi
Copy link
Contributor

@Velfi Velfi commented Jun 27, 2022

Motivation and Context

Supports the flexible checksums feature. When sending streaming requests we need to include checksums as trailers and we need aws-chunked content encoding to do that.

Description

This provides functionality specified in the flexible checksums RFC

Testing

I wrote a test

Checklist

  • I have updated CHANGELOG.next.toml if I made changes to the smithy-rs codegen or runtime crates
  • I have updated CHANGELOG.next.toml if I made changes to the AWS SDK, generated SDK code, or SDK runtime crates

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@Velfi Velfi force-pushed the feature/aws-chunked-content-encoding branch from ccde60a to f803a9e Compare June 27, 2022 17:28
@Velfi Velfi force-pushed the feature/aws-chunked-content-encoding branch from f803a9e to 1c69093 Compare June 27, 2022 17:29
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

Copy link
Collaborator

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good so far! the state machine is very clean, just a couple spots that could be cleaned up / clarified

Comment on lines 225 to 230
let total_length_of_trailers_in_bytes =
this.options.trailer_lens.iter().sum();

Poll::Ready(Some(Ok(trailers_as_aws_chunked_bytes(
total_length_of_trailers_in_bytes,
trailers,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any check that these trailers actually match this.options.trailer_lens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added one

aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
Comment on lines 49 to 52
pub fn with_trailer_len(mut self, trailer_len: u64) -> Self {
self.trailer_lens.push(trailer_len);
self
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this API will be awkward to use if you have a vec of trail lengths already...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the API so you can input a vec when creating the Options

Comment on lines 24 to 34
#[derive(Debug, Default)]
#[non_exhaustive]
pub struct AwsChunkedBodyOptions {
/// The total size of the stream. Because we only support unsigned encoding
/// this implies that there will only be a single chunk containing the
/// underlying payload.
pub stream_length: u64,
/// The length of each trailer sent within an `AwsChunkedBody`. Necessary in
/// order to correctly calculate the total size of the body accurately.
pub trailer_lens: Vec<u64>,
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't personally love Default on this because Default on this is a sort of broken empty state which probably isn't right

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily broken, it just applies to a stream with no body or trailers. Do you have a better API in mind?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored the API, let me know what you think

Comment on lines 254 to 259
SizeHint::with_exact(
self.encoded_length()
.expect("Requests made with aws-chunked encoding must have known size")
as u64,
)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should also have a double check on this vs. the size_hint on the inner body

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only check that the encoded body is larger (due to chunk prefix, CRLF, trailers) than the size hint of the inner body but I can't think of how to test that other than by constructing a body that lies about its size.

Comment on lines 43 to 46
pub fn with_stream_length(mut self, stream_length: u64) -> Self {
self.stream_length = stream_length;
self
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one possible default could be to have stream_length delegate to the inner size_hint() if unset...guess that's kind of tricky with trailers though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah size hint doesn't account for trailers

@Velfi Velfi mentioned this pull request Jun 27, 2022
2 tasks
CHANGELOG.next.toml Outdated Show resolved Hide resolved
CHANGELOG.next.toml Outdated Show resolved Hide resolved
Velfi and others added 4 commits June 28, 2022 16:38
update: AwsChunkedBody to note we only support single chunks
remove: unnecessary `Option`s
remove: overly smart total_length_of_trailers_in_bytes in trailers_as_aws_chunked_bytes
update: use "where"-style declaration for `impl<Inner> Body for AwsChunkedBody<Inner>`
add: helpful data to trace logging
add: trailer len double check in AwsChunkedBody::poll_data
add: test for trailer len double check
add: assert to size_hint
fix: incorrect body emitted when body is empty
add: test for empty encoded body
Co-authored-by: John DiSanti <[email protected]>
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

Comment on lines 298 to 300
for value in values {
total += value.len() + ",".len();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this match the actual serialization behavior for multiple values? I thought that hyper would add multiple of the same header name with a different value, but I'm not 100% sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it matches what Hyper would do in this situation but it I did test it with S3 and it accepted it. I can test an upload the hyper-style headers if you like.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched this to work more similarly to how hyper serializes headers. The serialization is done manually.

@@ -276,6 +282,30 @@ where
len
}

fn total_rendered_length_of_trailers(header_map: Option<&HeaderMap>) -> u64 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we should have a unit test that serializes the header map with the http crate and compares its length against this function. Especially with multiple headers of the same name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test that ensures our trailer serializer and trailer length calculator stay in sync

Copy link
Collaborator

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall this looks great! I'd like to see more exhaustive testing (eg. an inner body that contains multiple chunks eg.)

aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
aws/rust-runtime/aws-http/src/content_encoding.rs Outdated Show resolved Hide resolved
update: when aws-chunked formatting a `HeaderMap`, header names with multiple values will be written out one value per line
remove: unnecessary AwsChunkedBodyOptions::stream_length method
add: trace fields
refactor: make inserting the final body CRLF more explicit
add: test to ensure trailer encoding and trailer len calculation stay in sync
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

/// not equal to the actual length of the trailers returned by the inner `http::Body`
/// implementor. These trailer lengths are necessary in order to correctly calculate the total
/// size of the body for setting the content length header.
ReportedTrailerLengthMismatch { actual: u64, expected: u64 },
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see @rcoh suggested making an error for this, but I think this is a case where we have a bug in our code, and there's little to no value for a customer in matching on this error. It's a case that simply should not happen if the code is correct, so an assertion was warranted.

On the flip side, with an assertion, it would silently allow the lengths to mismatch when compiled in release mode, which would likely result in a server-side error response. I think this is OK though, since a bug with this should not have been released in the first place if it was adequately tested.

We might want to proptest header names/values for the size prediction vs. the rendering to be absolutely certain they cannot mismatch. I think they can right now in the case where a header value has Unicode characters in it since the length predictor function is using string length instead of byte length.

Copy link
Collaborator

@jdisanti jdisanti Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing we could consider is just getting rid of the length prediction entirely. Just render the trailers to a BytesMut when the length is initially needed, and store that BytesMut in the AwsChunkedBody for later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdisanti only debug_assert doesn't fail in release mode

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new error for when stream_length is set incorrectly

fix: don't convert to str before getting len of HeaderValue
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

…unks

add: double check that stream_length used to create an `AwsChunkedBody` is correct.
@github-actions
Copy link

A new generated diff is ready to view.

A new doc preview is ready to view.

Copy link
Collaborator

@rcoh rcoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines +367 to +380
pin_project! {
struct SputteringBody {
parts: Vec<Option<Bytes>>,
cursor: usize,
delay_in_millis: u64,
}
}

impl SputteringBody {
fn len(&self) -> usize {
self.parts.iter().flat_map(|b| b).map(|b| b.len()).sum()
}
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is cool!

@github-actions
Copy link

github-actions bot commented Jul 5, 2022

A new generated diff is ready to view.

A new doc preview is ready to view.

@Velfi Velfi merged commit b77c66c into main Jul 5, 2022
@Velfi Velfi deleted the feature/aws-chunked-content-encoding branch July 5, 2022 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants