-
Notifications
You must be signed in to change notification settings - Fork 190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Flexible checksums #1482
Conversation
add: base64::encoded_length add: ChecksumBody add: ChecksumValidatedBody update: signer to support streaming payloads with trailers add: checksum inlineables add: AwsChunkedEncodingBody + options struct update: make append_merge_header_maps pub
add: test for response checksum validation add: ByteStream map method add: get_parts_mut method to operation::Output update: support formatting the string type in RustCodegenDecorator.kt update: S3 models
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't able to review this all in one sitting, but will revisit tomorrow. Incredible work on this!
.chunk_length | ||
.or(this.options.stream_length) | ||
.unwrap_or_default(); | ||
Poll::Ready(Some(Ok(prefix_with_chunk_size(data, total_chunk_size)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this adds a significant amount of copying—how hard would it be to refactor it to send the chunk size and the chunk separately?
I'm also a little confused about how this works—we never switch back into WritingChunkSize
AFAICT
update: ChecksumValidatedBody::poll_inner to log at the appropriate level rename: checksum_header_name_to_checksum_algorithm to header_name_to_algorithm rename: checksum_algorithm_to_checksum_header_name to algorithm_to_header_name fix: use new ClientCodegenContext instead of CodegenContext
fix: prefer static over const declaration for `HeaderName`s fix: add missing param to TestProtocolTraitImplGenerator.generateTraitImpls
add: tests for wrapped body retryability refactor: swap pin-project for pin-project-lite refactor: swap tempfile for temp-file in integration tests
A new generated diff is ready to view.
A new doc preview is ready to view. |
A new generated diff is ready to view.
A new doc preview is ready to view. |
*this.state = AwsChunkedBodyState::WritingChunk; | ||
let total_chunk_size = this | ||
.options | ||
.chunk_length | ||
.or(this.options.stream_length) | ||
.unwrap_or_default(); | ||
Poll::Ready(Some(Ok(prefix_with_chunk_size(data, total_chunk_size)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code seems to straddle two lines between supporting multiple chunks in some ways but not in others–it makes it pretty confusing to follow. I think we should maybe remove chunk_size
as a parameter entirely?
/// Test connection for the movies IT | ||
/// headers are signed with actual creds, at some point we could replace them with verifiable test | ||
/// credentials, but there are plenty of other tests that target signing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out of date comment
// static INIT_LOGGER: std::sync::Once = std::sync::Once::new(); | ||
// fn init_logger() { | ||
// INIT_LOGGER.call_once(|| { | ||
// tracing_subscriber::fmt::init(); | ||
// }); | ||
// } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this useful?
async fn collect_body_into_string(mut body: aws_smithy_http::body::SdkBody) -> String { | ||
use bytes::Buf; | ||
use bytes_utils::SegmentedBuf; | ||
use http_body::Body; | ||
use std::io::Read; | ||
|
||
let mut output = SegmentedBuf::new(); | ||
while let Some(buf) = body.data().await { | ||
output.push(buf.unwrap()); | ||
} | ||
|
||
let mut output_text = String::new(); | ||
output | ||
.reader() | ||
.read_to_string(&mut output_text) | ||
.expect("Doesn't cause IO errors"); | ||
|
||
output_text | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can use str::from_utf8(&body.collect().into_bytes())
fn poll_trailers( | ||
self: Pin<&mut Self>, | ||
cx: &mut Context<'_>, | ||
) -> Poll<Result<Option<HeaderMap<HeaderValue>>, Self::Error>> { | ||
let this = self.project(); | ||
match ( | ||
this.checksum.headers(), | ||
http_body::Body::poll_trailers(this.inner, cx), | ||
) { | ||
// If everything is ready, return trailers, merging them if we have more than one map | ||
(Ok(outer_trailers), Poll::Ready(Ok(inner_trailers))) => { | ||
let trailers = match (outer_trailers, inner_trailers) { | ||
// Values from the inner trailer map take precedent over values from the outer map | ||
(Some(outer), Some(inner)) => Some(append_merge_header_maps(inner, outer)), | ||
// If only one or neither produced trailers, just combine the `Option`s with `or` | ||
(outer, inner) => outer.or(inner), | ||
}; | ||
Poll::Ready(Ok(trailers)) | ||
} | ||
// If the inner poll is Ok but the outer body's checksum callback encountered an error, | ||
// return the error | ||
(Err(e), Poll::Ready(Ok(_))) => Poll::Ready(Err(e)), | ||
// Otherwise return the result of the inner poll. | ||
// It may be pending or it may be ready with an error. | ||
(_, inner_poll) => inner_poll, | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really cool! (how checksum body uses structured trailers then AWS chunked writes the trailers into the body)
pub fn checksum_mismatch(expected: Bytes, actual: Bytes) -> Self { | ||
Self::ChecksumMismatch { expected, actual } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given the Display impl, I wonder if we should have a max length on these fields
pub trait Checksum: Send + Sync { | ||
/// Given a slice of bytes, update this checksum's internal state. | ||
fn update(&mut self, bytes: &[u8]) -> Result<(), BoxError>; | ||
/// Either return this checksum as a `HeaderMap` containing one HTTP header, or return an error | ||
/// describing why checksum calculation failed. | ||
fn headers(&self) -> Result<Option<HeaderMap<HeaderValue>>, BoxError>; | ||
/// Return the `HeaderName` used to represent this checksum algorithm | ||
fn header_name(&self) -> HeaderName; | ||
/// "Finalize" this checksum, returning the calculated value as `Bytes` or an error that | ||
/// occurred during checksum calculation. To print this value in a human-readable hexadecimal | ||
/// format, you can print it using Rust's builtin [formatter]. | ||
/// | ||
/// _**NOTE:** typically, "finalizing" a checksum in Rust will take ownership of the checksum | ||
/// struct. In this method, we clone the checksum's state before finalizing because checksums | ||
/// may be used in a situation where taking ownership is not possible._ | ||
/// | ||
/// [formatter]: https://doc.rust-lang.org/std/fmt/trait.UpperHex.html | ||
fn finalize(&self) -> Result<Bytes, BoxError>; | ||
/// Return the size of this checksum algorithms resulting checksum, in bytes. For example, the | ||
/// CRC32 checksum algorithm calculates a 32 bit checksum, so a CRC32 checksum struct | ||
/// implementing this trait method would return 4. | ||
fn size(&self) -> u64; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, I like this design! Have we already gone back and forth on this? I might consider splitting out HttpChecksum
as `CheckSum + headers/header_name
@@ -6,7 +6,7 @@ | |||
use bytes::Bytes; | |||
use http::{HeaderMap, HeaderValue}; | |||
use http_body::{Body, SizeHint}; | |||
use pin_project::pin_project; | |||
use pin_project_lite::pin_project; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for doing this!!
pin_project! { | ||
#[doc = "Stream of binary data"] | ||
#[doc = ""] | ||
#[doc = "`ByteStream` wraps a stream of binary data for ease of use."] | ||
#[doc = ""] | ||
#[doc = "## Getting data out of a `ByteStream`"] | ||
#[doc = ""] | ||
#[doc = "`ByteStream` provides two primary mechanisms for accessing the data:"] | ||
#[doc = "1. With `.collect()`:"] | ||
#[doc = ""] | ||
#[doc = " [`.collect()`](crate::byte_stream::ByteStream::collect) reads the complete ByteStream into memory and stores it in `AggregatedBytes`,"] | ||
#[doc = " a non-contiguous ByteBuffer."] | ||
#[doc = " ```no_run"] | ||
#[doc = " use aws_smithy_http::byte_stream::{ByteStream, AggregatedBytes};"] | ||
#[doc = " use aws_smithy_http::body::SdkBody;"] | ||
#[doc = " use bytes::Buf;"] | ||
#[doc = " async fn example() {"] | ||
#[doc = " let stream = ByteStream::new(SdkBody::from(\"hello! This is some data\"));"] | ||
#[doc = " // Load data from the stream into memory:"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, are you sure this is required here? There are other places where we use pin_project_lite and doc comments:
I converted this into a draft and will split it into several smaller PRs. |
Closing this in favor of #1561 |
EDIT (2022/06/27)
At @rcoh 's suggestion, I've broken this PR up into smaller PRs to make review easier:
I'll open more PRs for remaining functionality once those get merged.
Motivation and Context
Supports the flexible checksums feature. Without this feature, we can't update our S3 models.
Description
See the flexible checksums RFC for details on what this PR accomplishes.
Testing
I wrote several tests to test the new functionality on multiple different levels
Checklist
CHANGELOG.next.toml
if I made changes to the smithy-rs codegen or runtime cratesCHANGELOG.next.toml
if I made changes to the AWS SDK, generated SDK code, or SDK runtime cratesBy submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.