-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: add support for aws-chunked content encoding #1501
Conversation
ccde60a
to
f803a9e
Compare
f803a9e
to
1c69093
Compare
A new generated diff is ready to view.
A new doc preview is ready to view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking good so far! the state machine is very clean, just a couple spots that could be cleaned up / clarified
let total_length_of_trailers_in_bytes = | ||
this.options.trailer_lens.iter().sum(); | ||
|
||
Poll::Ready(Some(Ok(trailers_as_aws_chunked_bytes( | ||
total_length_of_trailers_in_bytes, | ||
trailers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any check that these trailers actually match this.options.trailer_lens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added one
pub fn with_trailer_len(mut self, trailer_len: u64) -> Self { | ||
self.trailer_lens.push(trailer_len); | ||
self | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this API will be awkward to use if you have a vec of trail lengths already...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the API so you can input a vec when creating the Options
#[derive(Debug, Default)] | ||
#[non_exhaustive] | ||
pub struct AwsChunkedBodyOptions { | ||
/// The total size of the stream. Because we only support unsigned encoding | ||
/// this implies that there will only be a single chunk containing the | ||
/// underlying payload. | ||
pub stream_length: u64, | ||
/// The length of each trailer sent within an `AwsChunkedBody`. Necessary in | ||
/// order to correctly calculate the total size of the body accurately. | ||
pub trailer_lens: Vec<u64>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't personally love Default
on this because Default
on this is a sort of broken empty state which probably isn't right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessarily broken, it just applies to a stream with no body or trailers. Do you have a better API in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I refactored the API, let me know what you think
SizeHint::with_exact( | ||
self.encoded_length() | ||
.expect("Requests made with aws-chunked encoding must have known size") | ||
as u64, | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should also have a double check on this vs. the size_hint on the inner body
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can only check that the encoded body is larger (due to chunk prefix, CRLF, trailers) than the size hint of the inner body but I can't think of how to test that other than by constructing a body that lies about its size.
pub fn with_stream_length(mut self, stream_length: u64) -> Self { | ||
self.stream_length = stream_length; | ||
self | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one possible default could be to have stream_length delegate to the inner size_hint()
if unset...guess that's kind of tricky with trailers though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah size hint doesn't account for trailers
update: AwsChunkedBody to note we only support single chunks remove: unnecessary `Option`s remove: overly smart total_length_of_trailers_in_bytes in trailers_as_aws_chunked_bytes update: use "where"-style declaration for `impl<Inner> Body for AwsChunkedBody<Inner>` add: helpful data to trace logging add: trailer len double check in AwsChunkedBody::poll_data add: test for trailer len double check add: assert to size_hint fix: incorrect body emitted when body is empty add: test for empty encoded body
Co-authored-by: John DiSanti <[email protected]>
A new generated diff is ready to view.
A new doc preview is ready to view. |
for value in values { | ||
total += value.len() + ",".len(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this match the actual serialization behavior for multiple values? I thought that hyper would add multiple of the same header name with a different value, but I'm not 100% sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it matches what Hyper would do in this situation but it I did test it with S3 and it accepted it. I can test an upload the hyper-style headers if you like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched this to work more similarly to how hyper
serializes headers. The serialization is done manually.
@@ -276,6 +282,30 @@ where | |||
len | |||
} | |||
|
|||
fn total_rendered_length_of_trailers(header_map: Option<&HeaderMap>) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we should have a unit test that serializes the header map with the http
crate and compares its length against this function. Especially with multiple headers of the same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test that ensures our trailer serializer and trailer length calculator stay in sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall this looks great! I'd like to see more exhaustive testing (eg. an inner body that contains multiple chunks eg.)
update: when aws-chunked formatting a `HeaderMap`, header names with multiple values will be written out one value per line remove: unnecessary AwsChunkedBodyOptions::stream_length method add: trace fields refactor: make inserting the final body CRLF more explicit add: test to ensure trailer encoding and trailer len calculation stay in sync
…in sync for empty HeaderMap
A new generated diff is ready to view.
A new doc preview is ready to view. |
/// not equal to the actual length of the trailers returned by the inner `http::Body` | ||
/// implementor. These trailer lengths are necessary in order to correctly calculate the total | ||
/// size of the body for setting the content length header. | ||
ReportedTrailerLengthMismatch { actual: u64, expected: u64 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see @rcoh suggested making an error for this, but I think this is a case where we have a bug in our code, and there's little to no value for a customer in matching on this error. It's a case that simply should not happen if the code is correct, so an assertion was warranted.
On the flip side, with an assertion, it would silently allow the lengths to mismatch when compiled in release mode, which would likely result in a server-side error response. I think this is OK though, since a bug with this should not have been released in the first place if it was adequately tested.
We might want to proptest header names/values for the size prediction vs. the rendering to be absolutely certain they cannot mismatch. I think they can right now in the case where a header value has Unicode characters in it since the length predictor function is using string length instead of byte length.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing we could consider is just getting rid of the length prediction entirely. Just render the trailers to a BytesMut
when the length is initially needed, and store that BytesMut
in the AwsChunkedBody
for later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jdisanti only debug_assert
doesn't fail in release mode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a new error for when stream_length is set incorrectly
fix: don't convert to str before getting len of HeaderValue
A new generated diff is ready to view.
A new doc preview is ready to view. |
A new generated diff is ready to view.
A new doc preview is ready to view. |
…unks add: double check that stream_length used to create an `AwsChunkedBody` is correct.
A new generated diff is ready to view.
A new doc preview is ready to view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
pin_project! { | ||
struct SputteringBody { | ||
parts: Vec<Option<Bytes>>, | ||
cursor: usize, | ||
delay_in_millis: u64, | ||
} | ||
} | ||
|
||
impl SputteringBody { | ||
fn len(&self) -> usize { | ||
self.parts.iter().flat_map(|b| b).map(|b| b.len()).sum() | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is cool!
A new generated diff is ready to view.
A new doc preview is ready to view. |
Motivation and Context
Supports the flexible checksums feature. When sending streaming requests we need to include checksums as trailers and we need
aws-chunked
content encoding to do that.Description
This provides functionality specified in the flexible checksums RFC
Testing
I wrote a test
Checklist
CHANGELOG.next.toml
if I made changes to the smithy-rs codegen or runtime cratesCHANGELOG.next.toml
if I made changes to the AWS SDK, generated SDK code, or SDK runtime cratesBy submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.