
Consider converting io.ReadSeeker arguments in S3 package to io.Reader #272

Closed
DavidHuie opened this issue Jun 7, 2015 · 11 comments
Labels
guidance Question that needs advice or information.

Comments

@DavidHuie

Forcing users to use io.ReadSeeker is not a good idea because most reads in Go are done through the io.Reader interface. If someone is already using an io.Reader, they would have to read the entire buffer into memory in order to make an io.ReadSeeker. I noticed that other S3 libraries, such as https://github.com/mitchellh/goamz, use io.Reader interfaces.

Please consider this because I cannot use this package in my own applications due to these memory concerns.

@marcosnils
Contributor

@DavidHuie I had the same concern while trying to upload some content to S3. If you look into the source code, the library uses a ReadSeeker to calculate the Content-MD5 of the body it is about to send (`_, err = r.Body.Seek(0, 0)`). It doesn't matter if the interface is changed to an io.Reader, because either way the library will still read the whole content to do this calculation. It's cleaner from an API perspective to use ReadSeeker, as resetting the buffer is provided by the Seeker API.

@lsegal
Contributor

lsegal commented Jun 7, 2015

If someone is already using an io.Reader, they would have to read the entire buffer into memory in order to make an io.ReadSeeker.

This is not exactly true. That said, you're probably looking for the s3manager.Uploader abstraction. You can read more about it in our Getting Started Guide. We will be doing a better job of advertising these features as we bootstrap our documentation.

@lsegal
Contributor

lsegal commented Jun 7, 2015

If you look into the source code the library uses a ReadSeeker to calculate the Content-MD5

@marcosnils actually just a minor pedantic correction, but Content-MD5 is not calculated for PutObject. That said, we do calculate checksums in the payload signature logic: https://github.com/aws/aws-sdk-go/blob/master/internal/signer/v4/v4.go#L311 -- so, basically the same thing. We also need a seekable body so that we can rewind it in case of retries, so there are multiple reasons why a seekable reader is required, and you correctly touched on one of them.

@DavidHuie
Author

The Uploader abstraction doesn't quite fit my use case because I'm managing my own uploads via the multipart API (the server is proxying a large upload for a client). Would it be possible to create a similar abstraction for uploading a multipart part?

@lsegal
Contributor

lsegal commented Jun 7, 2015

@DavidHuie the Uploader performs multi-part uploads.

@lsegal lsegal added the guidance Question that needs advice or information. label Jun 7, 2015
@marcosnils
Contributor

@lsegal totally right. I knew some sort of hash was calculated because I had looked at the code before; I just got confused between MD5 and the checksum.

Thanks for the clarification.

@DavidHuie
Author

Correct, but in my use case the server never has the entire payload -- it's on the client. The server, which would use this library, would be proxying for the multipart API.

@lsegal
Contributor

lsegal commented Jun 7, 2015

@DavidHuie if you're calling UploadPart directly then you must provide a seekable stream. This is required for SHA-256 checksumming in the signature layer as well as for retry support, which is integral to the SDK's feature set. There are a couple of ways to go about this: if you are concerned about the memory footprint of each part, you may consider caching each part uploaded from the client on disk. If your application server is backed by some amount of SSD filesystem access, the I/O cost should be fairly low.

@lsegal
Contributor

lsegal commented Jun 7, 2015

You might also want to check #142 which has some more information about this specific issue.

@DavidHuie
Author

@lsegal The caching-on-disk solution is probably good enough for my purposes. Thanks for the quick response.

@lsegal
Contributor

lsegal commented Jun 7, 2015

No problem!
