Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Alternate Downloads.jl-based backend (alternative to HTTP.jl) #396
Alternate Downloads.jl-based backend (alternative to HTTP.jl) #396
Changes from all commits
d4c8f91
cae56c4
9b917fa
c8329a9
e66dca9
96a96ea
2fd84d4
8311723
31e126d
383eced
d979da4
bffb900
a549d87
a370fd1
93d4a29
be7adb1
3bf0a31
10d81ee
73c0bbf
5d50a61
9783836
9c2da08
1c6c1bf
8d63853
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may want to use the formulation from JuliaLang/Downloads.jl#136 instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can leave it until it gets merged upstream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was just reading through this, I don't understand
close(x); read(x)
— for normal filesread(x)
will return zero bytes after it's closed. (For IOBuffer it throws an exception).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a workaround specifically for
Base.BufferStream
, withoutclose
ing it beforehand,read
hangs.readavailable
works as well and that was my first choice, but @ericphanson pointed out this warning:I'm no expert when it comes to the various
IO
types and how to properly handle them all, so any feedback is good feedback!There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think perhaps we just need to use
read()
here and hope for the best? It turns out there's a lot of inconsistency in the blocking behavior of streams in Julia — see the issue here JuliaLang/julia#24526Perhaps for this case, the newly available — but strangely named —
Base.shutdown()
is what you want JuliaLang/julia#40783But even so, I'm not sure that AWS should be unilaterally closing or shutting down the write side of the the stream at all — to me that seems to be the business of the code which passes in the stream. For example, if reading from a pipe, presumably you want to keep reading until the writing side decides it's done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, seems like whoever was filling the buffer needs to call
shutdown
(or close) to mark it as being doneThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, we only do it if
return_stream
is false, in which case you are supposed to get back some bytes as far as I understand it, and it seems like we need to close it to get those bytes.I’m not sure who that is- Downloads.jl fills it but I don’t think they should be closing it (because of exactly what @c42f just said).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, somebody was wrong before, and now AWS.jl is now possibly also very bad at stream EOF handling here too. Whoever created the stream is generally responsible for marking when it is done, or passing it to another function which will take care of that. I suspect Downloads.jl absolutely should be copying the EOF marker from the underlying stream also. It is out-of-band data, but likely just as relevant as the in-band bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I've misunderstood what's going on here, but is the problem in Downloads.jl then? Shouldn't they close the write side when writing is done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I 100% don't understand what's going on. But let me try to summarize the current behavior (including this PR).
Summary
The HTTPBackend only sets a
response_stream
if both the user has not passed one and they ask forreturn_stream=true
, whereas with the DownloadsBackend we create one whenever the user has not passed one. This is because Downloads.jl wants you to pass anIO
object to get the response always, unlike HTTP.jl which will either populate a stream you pass or return a vector of bytes directly.response_stream
, setsreturn_stream=true
response_stream
, setsreturn_stream=false
response_stream
, setsreturn_stream=true
response_stream
, and setsreturn_stream=false
body
HTTP.jl returns ✔️If I'm understanding then, I think Case 2 is probematic for the DownloadsBackend, and we should just not try to return bytes in this case, matching the behavior of the HTTP backend. I.e. we should not be closing the stream in this case.
Case 3: I don't really know if we should close the stream or not. I think ideally this case wouldn't exist, because I don't think there really should be a
return_stream
option-- I think a simpler API is "if you want a stream you should pass one and it will get populated (and not closed); if you don't want a stream, don't pass one and you will get a vector of bytes back".So: if that analysis is right, Downloads.jl is fine, and we have at least one problematic case here. But I'll admit to definitely not fully understanding the semantics of how streams should work (and the manual isn't very helpful in this regard), though the discussion here helps me a bit.
Though possibly, HTTP.jl and Downloads.jl should be closing the streams whenever they are passed and the backend is done populating them? I think maybe that's what @vtjnash is saying in #396 (comment). In which case most/all of these cases are wrong for both backends.
Appendix: Following the path of
get_object
To see these cases, play yout, we let's say we are doing an s3
get_object
.AWS.jl/src/services/s3.jl
Lines 2239 to 2246 in 181c340
params
which is a dictionary with additional parametersparams
to buildRequest
, changing the name fromparams
toargs
:AWS.jl/src/AWS.jl
Lines 213 to 217 in 181c340
AWS.jl/src/utilities/utilities.jl
Lines 65 to 77 in 181c340
Request
object:AWS.jl/src/utilities/request.jl
Lines 26 to 42 in 181c340
submit_request
:AWS.jl/src/utilities/request.jl
Lines 59 to 61 in 181c340
Here, what happens depends on which backend we are using. This PR is about the new DownloadsBackend.
Using the
DownloadsBackend
submit_request
calls_http_request
and if we are using aDownloadsBackend
, we end up at the method in question,AWS.jl/src/utilities/downloads_backend.jl
Line 41 in 181c340
response_stream
, we create a default one:AWS.jl/src/utilities/downloads_backend.jl
Lines 58 to 60 in 181c340
return_stream==true
, we arrange to collect the bytes from the response stream, including possibly closing theresponse_stream
:AWS.jl/src/utilities/downloads_backend.jl
Lines 70 to 74 in 181c340
Using the
HTTPBackend
If the user is using the existant HTTP.jl backend, what happens?
submit_request
calls_http_request
with an HTTPBackend, leading us to this method:AWS.jl/src/utilities/request.jl
Lines 186 to 217 in 181c340
response_stream
if both the user has not passed one and they ask forreturn_stream=true
:AWS.jl/src/utilities/request.jl
Lines 192 to 194 in 181c340
nothing
request.response_stream
toHTTP.request
:AWS.jl/src/utilities/request.jl
Line 203 in 181c340
body
if they were not given aresponse_stream
, otherwise they put the bytes in the stream: https://github.com/JuliaWeb/HTTP.jl/blob/a2b467e24c9bbd45691f9d0f57b57ee7463bd15a/src/HTTP.jl#L77-L78There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've filed #433 so we don't forget about this