Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DownloadRequestBuilder for more complex download support #111

Closed

Conversation

tustvold
Copy link

Problems

  • I wish to stream chunks of bytes from object storage, but the streaming API is per-byte, which is unfortunate for performance
  • I wish to detect not found errors based on matching a status code, and not matching a string
  • download and download_streamed treat not found errors differently
  • I wish to specify range requests to fetch only a given slice of bytes

Proposal

This PR therefore implements a new ObjectClient::download_request method that returns a DownloadRequestBuilder which allows constructing a request with additional options, and then either collecting the results into a single Bytes or streaming the HTTP chunks as they are sent on the wire. This approach will provide an obvious extension point for future possibilities, like requesting a specific object generation, conditional matches, etc...

ObjectClient::download and ObjectClient::download_streamed are then updated to use this builder.

Let me know what you think 😄

This pattern could also be applied to other methods such as create() to allow setting metadata, ACLs, content-encoding, etc...

.headers(self.0.get_headers().await?)
.send()
.await?;
if resp.status() == StatusCode::NOT_FOUND {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the special case that forces a string match on not found errors, it is not replicated in download_streamed

@ViktorWb
Copy link
Contributor

ViktorWb commented Jun 5, 2022

Also just ran into the performance problem, and this solution improves it significantly. What do you say @ThouCheese ?

@ViktorWb
Copy link
Contributor

I want to use bytes_stream, but I need to know the content length before starting to read the stream. Is this possible? If not, perhaps the DownloadRequestBuilder.send method could be made public to make it possible to read the response content-length header like so:

let res = client.object().download_request(bucket_name, filename).send().await?;
let content_length = res.content_length()?;
let bytes_stream = res.bytes_stream().map_err(Into::into)

@tustvold
Copy link
Author

In the end I moved away from using this crate (influxdata/object_store_rs#42) and so am going to close this. I would be happy for someone else to take this code and run with it if they want, but I don't have time to help drive this.

@tustvold tustvold closed this Jul 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants