Consider converting io.ReadSeeker arguments in S3 package to io.Reader #272
Comments
@DavidHuie I had the same concern while trying to upload some content to S3. If you look into the source code, the library uses a ReadSeeker to calculate the content MD5 (aws-sdk-go/service/s3/content_md5.go Line 25 in ea83c25).
This is not exactly true. That said, you're probably looking for the s3manager.Uploader abstraction. You can read more about it in our Getting Started Guide. We will be doing a better job of advertising these features as we bootstrap our documentation.
@marcosnils actually just a minor pedantic correction, but Content-MD5 is not calculated for PutObject. That said, we do calculate checksums in the payload signature logic: https://github.com/aws/aws-sdk-go/blob/master/internal/signer/v4/v4.go#L311 -- so, basically the same thing. We also need a seekable body so that we can rewind it in case of retries, so there are multiple reasons why a seekable reader is required, and you correctly touched on one of them.
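To illustrate why a checksum pass needs a seekable body, here is a minimal sketch, not the SDK's actual signer code. The helper names hashAndRewind and demo are hypothetical. It reads the body once to compute a SHA-256 digest, then seeks back to the start so the same bytes can be sent, or sent again on a retry.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// hashAndRewind is a hypothetical helper (not part of aws-sdk-go).
// It consumes body to compute its SHA-256 digest, then rewinds body
// so that it can be read again from the beginning.
func hashAndRewind(body io.ReadSeeker) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, body); err != nil {
		return "", err
	}
	// Without Seek, the body would now be exhausted and could not be sent.
	if _, err := body.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// demo hashes a small payload and then reads it again after the rewind.
func demo() (string, string, error) {
	body := strings.NewReader("hello")
	digest, err := hashAndRewind(body)
	if err != nil {
		return "", "", err
	}
	payload, err := io.ReadAll(body)
	if err != nil {
		return "", "", err
	}
	return digest, string(payload), nil
}

func main() {
	digest, payload, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("sha256:", digest)
	fmt.Println("payload after rewind:", payload)
}
```

A plain io.Reader has no Seek method, so after the hashing pass there would be nothing left to upload unless the data had been buffered somewhere.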
The |
@DavidHuie the Uploader performs multi-part uploads.
@lsegal totally right. I knew that some sort of hash was calculated because I looked in the code before. Just got confused between MD5 and Checksum. Thanks for the clarification.
Correct, but in my use case the server never has the entire payload -- it's on the client. The server, which would use this library, would be proxying for the multipart API.
@DavidHuie if you're calling UploadPart directly then you must provide a seekable stream. This is required for SHA-256 checksumming in the signature layer as well as for retry support, which is integral to the SDK's feature set. There are a couple of ways to go about this: if you are concerned about the memory footprint of each part, you may consider caching each part uploaded from the client on disk. If your application server is backed by some amount of SSD filesystem access, the I/O cost should be fairly low.
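The disk-caching suggestion could be sketched roughly as follows. This is an illustration under assumptions, not code from the SDK: spoolToDisk and demo are hypothetical names, and the "s3-part-*" temp file pattern is arbitrary. The part is copied from a plain io.Reader into a temporary file, which is then rewound and returned as an io.ReadSeeker.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// spoolToDisk is a hypothetical helper. It copies part into a temporary
// file and rewinds it, yielding an io.ReadSeeker without holding the
// whole part in memory. The returned cleanup function closes and
// deletes the temporary file.
func spoolToDisk(part io.Reader) (io.ReadSeeker, func(), error) {
	f, err := os.CreateTemp("", "s3-part-*")
	if err != nil {
		return nil, nil, err
	}
	cleanup := func() {
		f.Close()
		os.Remove(f.Name())
	}
	if _, err := io.Copy(f, part); err != nil {
		cleanup()
		return nil, nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		cleanup()
		return nil, nil, err
	}
	return f, cleanup, nil
}

// demo spools a small part and reads it twice, as a retry would.
func demo() (string, string, error) {
	body, cleanup, err := spoolToDisk(strings.NewReader("part data"))
	if err != nil {
		return "", "", err
	}
	defer cleanup()

	first, err := io.ReadAll(body)
	if err != nil {
		return "", "", err
	}
	// Rewind, as would be needed before re-sending the part.
	if _, err := body.Seek(0, io.SeekStart); err != nil {
		return "", "", err
	}
	second, err := io.ReadAll(body)
	if err != nil {
		return "", "", err
	}
	return string(first), string(second), nil
}

func main() {
	first, second, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("first read:", first)
	fmt.Println("second read:", second)
}
```

The returned value could then be passed wherever an io.ReadSeeker is expected. Memory use stays bounded by the copy buffer rather than the part size, at the cost of one write and one or more reads of the part on local disk.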
You might also want to check #142 which has some more information about this specific issue.
@lsegal The caching on disk solution is probably good enough for my purposes. Thanks for the quick response.
No problem!
Forcing users to use io.ReadSeeker is not a good idea because most reads in Go are done with the io.Reader interface. If someone is already using an io.Reader, they would have to read the entire buffer into memory in order to make an io.ReadSeeker. I noticed that other S3 libraries use io.Reader interfaces, such as https://github.com/mitchellh/goamz.
Please consider this because I cannot use this package in my own applications due to these memory concerns.