S3: getObject response body streaming? #38

Closed
TillaTheHun0 opened this issue Aug 25, 2022 · 5 comments · Fixed by #42

Comments

@TillaTheHun0
Contributor

I see this note on the getObject implementation for S3, and I know #24 exists, but it seems to be focused on uploading objects, not getting objects. Being able to stream objects down from S3 would be awesome.

My use case:

My S3 buckets are locked down and can't be publicly accessed, so I would like to stream an object from S3, through my server, to the client, without needing to buffer the entire object. A workaround would be to create a presigned url for retrieving the object from s3 and the client using that, instead of my server. I just don't like exposing the underlying cloud infra, if that makes sense.
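
To make the use case concrete, here is a minimal sketch of the pass-through, assuming the object body is available as a `ReadableStream<Uint8Array>`. The names `streamToClient` and `fakeObjectBody` are hypothetical and not part of the library; the fake body stands in for S3 so the sketch runs without AWS access.

```typescript
// Hypothetical helper: wraps an object's byte stream in an HTTP response
// without reading the whole object into memory first.
function streamToClient(
  body: ReadableStream<Uint8Array>,
  contentType = "application/octet-stream",
): Response {
  // The Response constructor accepts a ReadableStream, so chunks are
  // forwarded as the client reads instead of being buffered up front.
  return new Response(body, { headers: { "content-type": contentType } });
}

// Stand-in for an S3 object body (assumption: the real source would be
// whatever stream the S3 client hands back).
function fakeObjectBody(chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  let i = 0;
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (i < chunks.length) controller.enqueue(encoder.encode(chunks[i++]));
      else controller.close();
    },
  });
}
```

A server handler would return `streamToClient(...)` directly, so the client never sees a bucket name or an S3 URL.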

@danopia
Member

danopia commented Aug 25, 2022

Hey, yes #24 is about request bodies. Comparatively, response bodies have almost no blockers to support streaming. It's just a question of API design. .getObject() unconditionally returns the data as a Uint8Array buffer, as you saw:

Body: new Uint8Array(await resp.arrayBuffer()), // TODO: maybe allow proper body streaming,

So how should a streaming body be requested and then returned? Is changing Uint8Array to ReadableStream<Uint8Array> enough or should I return a whole-ass Response object so you can also do .text() or whatever? I haven't seen how the official AWS SDK gives streaming response bodies FWIW.

Input welcome on how to present streaming response bodies :)


> A workaround would be to create a presigned url for retrieving the object from s3 and the client using that, instead of my server

You can also make a pre-signed URL and then immediately fetch that URL from the same process! So you can contain the cloud layout within the server. Still a workaround of course.
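
A rough sketch of that workaround, assuming the pre-signed URL has already been produced by whatever signing code is in use (signing itself is not shown). `fetchPresigned` and the `FetchLike` type are hypothetical names for illustration; the injectable `fetchImpl` parameter exists only so the sketch can be exercised without network access.

```typescript
// Minimal fetch-shaped function type, so a fake can be injected in tests.
type FetchLike = (url: string) => Promise<Response>;

// Hypothetical helper: fetches a pre-signed URL from the server process
// and hands back the still-unread response body stream.
async function fetchPresigned(
  presignedUrl: string,
  fetchImpl: FetchLike = (url) => fetch(url),
): Promise<ReadableStream<Uint8Array>> {
  const resp = await fetchImpl(presignedUrl);
  if (!resp.ok) {
    // Release the connection before reporting the failure.
    await resp.body?.cancel();
    throw new Error(`S3 responded with HTTP ${resp.status}`);
  }
  if (!resp.body) throw new Error("response has no body");
  return resp.body;
}
```

Because the URL is fetched server-side, it never reaches the client, and the returned stream can be forwarded or buffered as needed.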

@danopia changed the title from "S3: getObject body streaming?" to "S3: getObject response body streaming?" on Aug 25, 2022
@TillaTheHun0
Contributor Author

> You can also make a pre-signed URL and then immediately fetch that URL from the same process! So you can contain the cloud layout within the server. Still a workaround of course.

Great point! Worth a try in the meantime.

My intuition says that a ReadableStream would suffice, would be more kosher with Deno, and is, I think, less opinionated and ergo more flexible. Having Body be an entire Response object may be confusing, since technically the whole object returned by getObject is the "response" from S3. The caller could always instantiate a Response themselves around the Body ReadableStream if they wanted the Response APIs. Just my initial thoughts.

Looking at the APIs from the Node world:

In SDK v3, S3's GetObjectCommand resolves with a Body that is a Node Readable stream

SDK v2 buffered the response into memory, which introduces the same challenges discussed in this issue
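
As an illustration of the flexibility point, a bare stream can be consumed chunk by chunk without ever wrapping it in a Response. This is only a sketch using the standard Streams API; `countBytes` is a hypothetical name, not something the library provides.

```typescript
// Hypothetical helper: walks a byte stream chunk by chunk and totals its
// size, holding only one chunk in memory at a time.
async function countBytes(body: ReadableStream<Uint8Array>): Promise<number> {
  const reader = body.getReader();
  let total = 0;
  try {
    while (true) {
      const result = await reader.read();
      if (result.done) break;
      total += result.value.byteLength;
    }
  } finally {
    // Always release the lock, even if reading fails part-way.
    reader.releaseLock();
  }
  return total;
}
```

The same loop shape works for hashing, progress reporting, or writing to another sink.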

@danopia
Member

danopia commented Feb 19, 2023

> My intuition says that a ReadableStream would suffice

This is looking like the most reasonable answer for a Deno-first library and has great synergy with Deno.writeFile() etc, but I'm still bothered about adding one of these lines whenever grabbing e.g. configuration files from S3:

// different ways of buffering a stream:
const bodyBytes = new Uint8Array(await new Response(resp.Body).arrayBuffer());
const bodyText = await new Response(resp.Body).text();
const bodyJson = await new Response(resp.Body).json();

I'm concerned about discoverability, because it's not obvious to use new Response. I tried searching "deno readablestream to string" etc on Google and didn't get useful results for this use-case.

I'm adding a tsdoc comment on the bodies (and a release note) and calling it a day, because ReadableStream<Uint8Array> is truthfully the correct thing:

  /** To get this stream as a buffer, use `new Response(...).arrayBuffer()` or related functions. */
  Body?: ReadableStream<Uint8Array> | null;
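
Since the field is both optional and nullable, the configuration-file case might end up in a small wrapper like the one below. This is a sketch only: `readJsonConfig` is a hypothetical name, and throwing on a missing body is an assumed policy rather than anything the library prescribes.

```typescript
// Matches the shape of the documented field: optional and nullable.
type StreamBody = ReadableStream<Uint8Array> | null | undefined;

// Hypothetical helper: buffers a streamed body and parses it as JSON,
// failing loudly when the response carried no body at all.
async function readJsonConfig<T = unknown>(body: StreamBody): Promise<T> {
  if (!body) throw new Error("object has no body to parse");
  // new Response(...) is the buffering trick from the tsdoc comment.
  return (await new Response(body).json()) as T;
}
```

A call site would then look like `const cfg = await readJsonConfig<{ port: number }>(resp.Body);`.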

@danopia
Member

danopia commented Feb 26, 2023

🚀 This just shipped in v0.8.0 as a breaking change

For specific result fields which are the entire contents of the response body, the returned structure will now contain a ReadableStream<Uint8Array> instead of a buffered Uint8Array.
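
For code that has to work on both sides of this breaking change, one possible shim is sketched below. `toBytes` is a hypothetical helper, not part of the library, and returning an empty array for a missing body is an assumption made for illustration.

```typescript
// Hypothetical migration shim: accepts either the old buffered shape
// (Uint8Array) or the new streamed shape and always returns bytes.
async function toBytes(
  body: Uint8Array | ReadableStream<Uint8Array> | null | undefined,
): Promise<Uint8Array> {
  if (!body) return new Uint8Array(0); // assumed policy for a missing body
  if (body instanceof Uint8Array) return body; // old shape: already buffered
  // New shape: drain the stream into a single buffer.
  return new Uint8Array(await new Response(body).arrayBuffer());
}
```

Note that a stream can only be consumed once, so the result should be kept if the bytes are needed again.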

@TillaTheHun0
Contributor Author

Hey @danopia. Sorry, I've been caught up with other things. This looks really cool though, I'm going to give it a try!

Thanks for your work on this!


2 participants