New Downloader for transient retries #553

Closed
wants to merge 16 commits into from

ericphanson
Member

Fixes #552 for the Downloads backend.

Based on #548.

It's a little awkward because we have retries at two levels: at the level of the HTTP request (via whichever backend is in use) and in submit_request. We don't have AWSExceptions until the second layer, where we actually read from the stream; the first layer doesn't read from the stream at all. So we need a way to tell _http_request that we are retrying after a transient error, and I added a new keyword argument for that.

Would really appreciate any thoughts.
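
To make the two retry layers concrete, here is a minimal sketch of the idea, not the PR's actual code. Every name in it (`TransientError`, `http_request_sketch`, `submit_request_sketch`, the `after_transient_error` keyword, `FRESH_CONNECTIONS`) is a hypothetical stand-in; only the overall shape follows the description above.

```julia
# Hypothetical stand-ins; none of these names are the package's actual API.
struct TransientError <: Exception
    code::String
end

# Counts how often a fresh connection was requested, for illustration only.
const FRESH_CONNECTIONS = Ref(0)

# Layer 1: performs the request. It never reads the response stream, so it
# cannot classify errors itself; the caller tells it via a keyword argument.
function http_request_sketch(attempt::Int; after_transient_error::Bool=false)
    if after_transient_error
        FRESH_CONNECTIONS[] += 1  # e.g. swap in a new downloader here
    end
    return attempt  # pretend this is an unread response stream
end

# Layer 2: reads the response, which is where service errors first show up.
# `fail_first` simulates that many leading attempts hitting a transient error.
function submit_request_sketch(; max_attempts::Int=3, fail_first::Int=1)
    after_transient_error = false
    for attempt in 1:max_attempts
        try
            response = http_request_sketch(attempt; after_transient_error)
            attempt <= fail_first && throw(TransientError("PriorRequestNotComplete"))
            return response
        catch err
            # Only retry transient errors, and only while attempts remain.
            (err isa TransientError && attempt < max_attempts) || rethrow()
            after_transient_error = true
        end
    end
end
```

With `fail_first=1`, the second attempt is made with `after_transient_error=true`, so the inner layer knows not to reuse the old connection.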

"LimitExceededException",
"RequestThrottled",
"PriorRequestNotComplete",
Member Author

This is a transient error, not a throttling one, according to https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-retries.html#cli-usage-retries-modes-standard.title

However, in this PR only the Downloads backend is getting special transient-error handling, so maybe I should add this back with a comment so that we don't change the HTTP backend's behavior.

Member Author

I added it back later on as a separate check.
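
A rough sketch of what a separate check could look like, assuming the three codes from the excerpt above. Treating `PriorRequestNotComplete` as transient follows the comment in this thread; grouping the other two codes as throttling is an assumption, and all names here are hypothetical rather than the package's own.

```julia
# Codes come from the diff excerpt in this thread; the grouping is an
# assumption made for illustration.
const THROTTLING_CODES_SKETCH = Set(["LimitExceededException", "RequestThrottled"])
const TRANSIENT_CODES_SKETCH = Set(["PriorRequestNotComplete"])

is_throttling_sketch(code::AbstractString) = code in THROTTLING_CODES_SKETCH
is_transient_sketch(code::AbstractString) = code in TRANSIENT_CODES_SKETCH

# Decide how to retry. The transient check is kept separate so that only the
# transient path asks for a fresh connection.
function retry_action_sketch(code::AbstractString)
    is_transient_sketch(code) && return :retry_with_fresh_connection
    is_throttling_sketch(code) && return :retry_with_backoff
    return :do_not_retry
end
```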

ericphanson and others added 2 commits May 13, 2022 15:45
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@ericphanson
Member Author

bors try

bors bot added a commit that referenced this pull request May 13, 2022
@bors
Contributor

bors bot commented May 13, 2022

try

Build failed:

@ericphanson
Member Author

bors try

bors bot added a commit that referenced this pull request May 13, 2022
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@bors
Contributor

bors bot commented May 13, 2022

try

Build failed:

@ericphanson
Member Author

bors try

bors bot added a commit that referenced this pull request May 13, 2022
ericphanson and others added 2 commits May 13, 2022 17:35
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@bors
Contributor

bors bot commented May 13, 2022

try

Build failed:

@ericphanson
Member Author

bors try

bors bot added a commit that referenced this pull request May 19, 2022
@bors
Contributor

bors bot commented May 19, 2022

try

Build failed:


Downloads.jl tends to perform better under concurrent operation than HTTP.jl,
particularly with `@async` / `asyncmap`. As of March 2022, threading (e.g. `@spawn` or `@threads`) with Downloads.jl is broken on all releases of Julia ([Downloads.jl#110](https://github.com/JuliaLang/Downloads.jl/issues/110)), and there are still reported issues on the upcoming
1.7.3 and 1.8 releases ([Downloads.jl#182](https://github.com/JuliaLang/Downloads.jl/issues/182)).
"""
struct DownloadsBackend <: AWS.AbstractBackend
downloader::Union{Nothing,Downloads.Downloader}
create_new_downloader::Any
Member

Suggested change
- create_new_downloader::Any
+ create_new_downloader::Base.Callable

Member Author

That isn't documented; is it really better than Any?
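
One concrete difference between the two annotations: `Base.Callable` is `Union{Function,Type}`, so a field typed that way rejects callable structs that a field typed `Any` would accept. The sketch below illustrates this with hypothetical types (`FactorySketch`, `AnyFieldSketch`, `CallableFieldSketch`); it is not taken from the PR.

```julia
# A callable struct (functor): it can be called like a function, but it is
# neither a Function nor a Type.
struct FactorySketch
    label::String
end
(f::FactorySketch)() = "downloader from $(f.label)"

struct AnyFieldSketch
    create_new_downloader::Any
end

struct CallableFieldSketch
    create_new_downloader::Base.Callable
end

factory = FactorySketch("pool")

# The Any-typed field accepts the functor.
from_any = AnyFieldSketch(factory).create_new_downloader()

# The Callable-typed field accepts closures, since closures are Functions...
from_closure = CallableFieldSketch(() -> "closure").create_new_downloader()

# ...but constructing it from the functor fails, because the functor cannot
# be converted to Union{Function,Type}.
functor_rejected = try
    CallableFieldSketch(factory)
    false
catch
    true
end
```

So the stricter annotation catches obviously wrong arguments earlier, at the cost of excluding functor-style factories.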

end

- DownloadsBackend() = DownloadsBackend(nothing)
+ DownloadsBackend() = DownloadsBackend(nothing, () -> get_downloader(; fresh=true))
+ DownloadsBackend(D::Downloader) = DownloadsBackend(D, () -> get_downloader(; fresh=true))

const AWS_DOWNLOADER = Ref{Union{Nothing,Downloader}}(nothing)
Member

Not to derail this PR, but leaving the Ref unassigned while the downloader is not yet defined seems preferable:

Suggested change
- const AWS_DOWNLOADER = Ref{Union{Nothing,Downloader}}(nothing)
+ const AWS_DOWNLOADER = Ref{Downloader}()

You'd just need to change some code in get_downloader to check isassigned(AWS_DOWNLOADER) before dereferencing.
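
A minimal sketch of that suggestion, under the assumption that get_downloader lazily creates the global downloader. `FakeDownloader` is a hypothetical stand-in for `Downloads.Downloader` so that the example has no dependencies, and `get_downloader_sketch` is not the package's function.

```julia
# Hypothetical stand-in for Downloads.Downloader. It must be a mutable
# (non-isbits) type for the Ref to be able to start out unassigned.
mutable struct FakeDownloader
    id::Int
end

const DOWNLOADER_SKETCH = Ref{FakeDownloader}()  # starts unassigned
const NEXT_ID_SKETCH = Ref(0)

function get_downloader_sketch(; fresh::Bool=false)
    # Check isassigned before dereferencing: reading an unassigned Ref throws.
    if fresh || !isassigned(DOWNLOADER_SKETCH)
        NEXT_ID_SKETCH[] += 1
        DOWNLOADER_SKETCH[] = FakeDownloader(NEXT_ID_SKETCH[])
    end
    return DOWNLOADER_SKETCH[]
end
```

The sketch does not cover resetting the global back to an "empty" state, which the `Union{Nothing,Downloader}` version handles by assigning `nothing`.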

Member Author

Let's do that separately if we are going to change it. I vaguely remember some problems with that approach when trying it in the original implementation. Maybe it was useful to be able to reset back to nothing, or something like that...

end

- DownloadsBackend() = DownloadsBackend(nothing)
+ DownloadsBackend() = DownloadsBackend(nothing, () -> get_downloader(; fresh=true))
+ DownloadsBackend(D::Downloader) = DownloadsBackend(D, () -> get_downloader(; fresh=true))
Member

Why not restrict the constructor to only accepting create_new_downloader and the downloader field can just be used internally?

Member Author

Downloaders are stateful, and the original API promise was that you could decide how to share them; for example, you could provision one downloader per thread. However, this issue shows that we also need the ability to create new ones. That makes me think the original API was a mistake and that we should probably only take create_new_downloader as input, like you say. But I think that would be breaking, since we started out by allowing you to pass in a Downloader. So my compromise here is that we'll use whatever downloader you pass in, but sometimes we will make a new one.

But... it's even more complicated, because if we make a new one because we think the old one might have a problem, we don't want to use the old one anymore. But DownloadsBackend is immutable, so I don't have a way to replace the old one. We can replace AWS's global downloader, which is probably what most people are using, but if you were using the ability to provision downloaders on a per-request basis then we don't have a way to replace those.

This API problem still has me stumped. The current implementation ONLY fixes things in a good way for users of the global downloader.

Member

I would suggest you have the following for backwards compatibility:

DownloadsBackend(D::Downloader) = DownloadsBackend(() -> D)

This doesn't work with your transient fixes; it effectively just keeps the old behaviour. If you want the fix, you need to pass in a function. I can't see another option for this, as we can't copy Downloader instances.

> But... it's even more complicated, because if we make a new one because we think the old one might have a problem, we don't want to use the old one anymore. But DownloadsBackend is immutable

Why not make DownloadsBackend mutable or use a Ref for the downloader field?
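
To illustrate the backwards-compatible constructor suggested in this comment, here is a small sketch. `FakeDownloader` is a hypothetical stand-in for `Downloads.Downloader`, and `FactoryBackendSketch` and `downloader_for_request` are made-up names; the point is only that wrapping an instance in `() -> D` keeps returning that same instance, while a real factory yields a new one on each call.

```julia
# Hypothetical stand-in for Downloads.Downloader.
mutable struct FakeDownloader
    id::Int
end

# A backend that only stores a zero-argument factory.
struct FactoryBackendSketch
    create_new_downloader::Any
end

# Backwards-compatible path: wrap an existing instance in a constant closure.
# "Creating a new downloader" then hands back the same instance every time,
# i.e. the old behaviour without the transient-error fix.
FactoryBackendSketch(d::FakeDownloader) = FactoryBackendSketch(() -> d)

downloader_for_request(b::FactoryBackendSketch) = b.create_new_downloader()
```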

Member Author

I am worried about concurrent access to the field: we can have multiple readers and at least one writer to the field. I suppose we can add a lock though.

Member Author

I've added a lock and updated the constructors.
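
For readers following the thread, this is roughly what guarding a replaceable downloader with a lock can look like. It is a sketch under assumptions, not the code that was pushed: `FakeDownloader` stands in for `Downloads.Downloader`, and `LockedBackendSketch`, `current_downloader`, and `replace_downloader!` are hypothetical names.

```julia
# Hypothetical stand-in for Downloads.Downloader.
mutable struct FakeDownloader
    id::Int
end

# The struct itself stays immutable; the Ref makes the downloader replaceable
# and the lock serializes access to it.
struct LockedBackendSketch
    downloader::Base.RefValue{FakeDownloader}
    downloader_lock::ReentrantLock
    create_new_downloader::Any  # any zero-argument callable
end

LockedBackendSketch(create) =
    LockedBackendSketch(Ref(create()), ReentrantLock(), create)

# Readers take the lock so that they never race with a replacement.
current_downloader(b::LockedBackendSketch) =
    lock(() -> b.downloader[], b.downloader_lock)

# Writer path: swap the suspect downloader for a fresh one.
function replace_downloader!(b::LockedBackendSketch)
    lock(b.downloader_lock) do
        b.downloader[] = b.create_new_downloader()
    end
end
```

Requests that already hold a reference to the old downloader keep using it until they finish; only later calls to `current_downloader` see the replacement.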

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Base automatically changed from eph/base-retry to master May 25, 2022 15:08
Successfully merging this pull request may close these issues.

Do not re-use HTTP connection when retrying a transient error
3 participants