Chunker should be able to use temp files instead of memory #12

jypma · 2016-09-21T12:44:52Z

The chunker currently requires at least 5MB of memory for every ongoing upload stream. With 100 concurrent connections that is at least 500MB, which will easily consume a Java heap with nothing left over.

Buffering to temp files instead should not add considerable overhead as long as the data stays within the disk cache, and it would allow the overall system to scale much further, provided one can live with (max S3 upload rate) = (max disk read speed).
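To make the idea concrete, here is a minimal Java sketch of a chunk buffer backed by a temp file. It is not the chunker's actual code: the class `DiskChunkBuffer` and its methods are hypothetical names invented for illustration, and the `OutputStream` sink merely stands in for the body of an S3 part upload. Only a small copy buffer lives on the heap while the chunk itself sits on disk (and, ideally, in the OS disk cache).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: accumulates one chunk in a temp file instead of on the heap. */
final class DiskChunkBuffer implements AutoCloseable {
    private final Path file;
    private final OutputStream out;
    private long size = 0;

    private DiskChunkBuffer(Path file, OutputStream out) {
        this.file = file;
        this.out = out;
    }

    /** Creates an empty buffer backed by a fresh temp file. */
    static DiskChunkBuffer create() {
        try {
            Path file = Files.createTempFile("s3-chunk-", ".tmp");
            return new DiskChunkBuffer(file, Files.newOutputStream(file));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Appends incoming bytes to the temp file and tracks the chunk size. */
    void write(byte[] data) {
        try {
            out.write(data);
            size += data.length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long size() {
        return size;
    }

    /** Streams the buffered chunk from disk into the sink; returns the bytes copied. */
    long copyTo(OutputStream sink) {
        try {
            out.flush();
            return Files.copy(file, sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the file handle and deletes the temp file. */
    @Override
    public void close() {
        try {
            out.close();
            Files.deleteIfExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try (DiskChunkBuffer chunk = DiskChunkBuffer.create()) {
            chunk.write("hello ".getBytes(StandardCharsets.UTF_8));
            chunk.write("world".getBytes(StandardCharsets.UTF_8));
            // The sink stands in for the request body of an S3 part upload.
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            long sent = chunk.copyTo(sink);
            System.out.println("streamed " + sent + " of " + chunk.size() + " buffered bytes");
        }
    }
}
```

With something along these lines, heap usage per upload stream would no longer grow with the chunk size, at the cost of one extra disk write and read per chunk.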

filosganga · 2016-09-21T13:05:33Z

I think it would be good for this to be configurable, ideally with at least 3 options:

in memory (on heap)
in memory (off heap)
on disk
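As a rough illustration of how the three options could sit behind one setting, here is a Java sketch. All names in it (`BufferType`, `ChunkStore`, `HeapStore`, `OffHeapStore`, `DiskStore`) are hypothetical and not part of the library. It also assumes the off-heap variant is given a fixed maximum chunk size up front, and `toByteArray` exists only to keep the example short: a real uploader would stream the bytes instead of copying them.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical setting covering the three proposed options. */
enum BufferType { MEMORY, MEMORY_OFF_HEAP, DISK }

/** Hypothetical abstraction over where a chunk's bytes live until upload. */
interface ChunkStore {
    void append(byte[] data);

    byte[] toByteArray(); // for illustration only; a real uploader would stream

    static ChunkStore create(BufferType type, int maxChunkSize) {
        switch (type) {
            case MEMORY: return new HeapStore();
            case MEMORY_OFF_HEAP: return new OffHeapStore(maxChunkSize);
            case DISK: return new DiskStore();
            default: throw new IllegalArgumentException("unknown type: " + type);
        }
    }
}

/** Keeps the chunk in an ordinary heap-allocated byte array. */
final class HeapStore implements ChunkStore {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();

    public void append(byte[] data) { bytes.write(data, 0, data.length); }

    public byte[] toByteArray() { return bytes.toByteArray(); }
}

/** Keeps the chunk in a direct buffer outside the Java heap. */
final class OffHeapStore implements ChunkStore {
    private final ByteBuffer buffer;

    OffHeapStore(int capacity) { buffer = ByteBuffer.allocateDirect(capacity); }

    // Throws BufferOverflowException if the chunk exceeds the fixed capacity.
    public void append(byte[] data) { buffer.put(data); }

    public byte[] toByteArray() {
        ByteBuffer view = buffer.duplicate(); // independent position and limit
        view.flip();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }
}

/** Keeps the chunk in a temp file. */
final class DiskStore implements ChunkStore {
    private final Path file = newTempFile();

    private static Path newTempFile() {
        try {
            Path p = Files.createTempFile("chunk-", ".tmp");
            p.toFile().deleteOnExit();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public void append(byte[] data) {
        try {
            Files.write(file, data, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public byte[] toByteArray() {
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would pick the strategy once, for example `ChunkStore.create(BufferType.DISK, 5 * 1024 * 1024)`, and the rest of the chunker would not need to know where the bytes are held.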

joearasin · 2016-09-21T16:06:54Z

Interesting -- I hadn't thought about scaling to this extent. What sort of use case are we talking about? I'm picturing someone forking off a bunch of streams, leaving them open, and pushing data into them.

jypma · 2016-09-21T17:41:00Z

We are building a (huge) document storage system, potentially saving many documents concurrently. Some of them are small, some are up to several hundred MB. I expect the operations to be mostly I/O bound, so it makes sense to leave many upload streams to S3 open simultaneously, at least up to the point where we saturate our upload bandwidth from EC2.

This allows a (much) greater number of upload streams to run in parallel.

jypma added a commit to jypma/s3-stream that referenced this issue Sep 26, 2016

bluelabsio#12: Add ability to buffer chunks to disk instead of memory

052f569

This allows a (much) greater number of upload streams to run in parallel.

jypma added a commit to jypma/s3-stream that referenced this issue Sep 26, 2016

bluelabsio#12: Add ability to buffer chunks to disk instead of memory

72ffcea

This allows a (much) greater number of upload streams to run in parallel.

jypma added a commit to jypma/s3-stream that referenced this issue Sep 26, 2016

bluelabsio#12: Add ability to buffer chunks to disk instead of memory

37ba7e8

This allows a (much) greater number of upload streams to run in parallel.

joearasin mentioned this issue Oct 31, 2016

AWS S3 Integration akka/alpakka#1

Closed
