Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always close the response_stream after HTTP.request #467

Merged
merged 8 commits into from
Sep 28, 2021
Merged
2 changes: 2 additions & 0 deletions src/utilities/request.jl
Original file line number Diff line number Diff line change
Expand Up @@ -213,6 +213,8 @@ function _http_request(http_backend::HTTPBackend, request::Request)
isa(e, Base.IOError) ||
(isa(e, HTTP.StatusError) && _http_status(e) >= 500)
end
finally
request.response_stream isa IO && close(request.response_stream)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an end-user were to pass in their own response_stream, what would happen here if it closes? Is that data still available in it, or does it go away?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was gonna suggest something like

diff --git a/src/utilities/request.jl b/src/utilities/request.jl
index 12debd4..2ef8363 100644
--- a/src/utilities/request.jl
+++ b/src/utilities/request.jl
@@ -189,11 +189,13 @@ function _http_request(http_backend::HTTPBackend, request::Request)
     @repeat 4 try
         http_stack = HTTP.stack(; redirect=false, retry=false, aws_authorization=false)
 
+        should_close = false
         if request.return_stream && request.response_stream === nothing
             request.response_stream = Base.BufferStream()
+            should_close = true
         end
 
-        return @mock HTTP.request(
+        r = @mock HTTP.request(
             http_stack,
             request.request_method,
             HTTP.URI(request.url),
@@ -203,6 +205,10 @@ function _http_request(http_backend::HTTPBackend, request::Request)
             response_stream=request.response_stream,
             http_options...,
         )
+        if should_close
+            close(request.response_stream)
+        end
+        return r
     catch e
         # Base.IOError is needed because HTTP.jl can often have errors that aren't
         # caught and wrapped in an HTTP.IOError

earlier. E.g. only close if it was opened internally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the behaviour is exactly as before (i.e. as we had with older HTTP.jl versions). In older HTTP versions, close was already called inside the HTTP.request above https://github.com/JuliaWeb/HTTP.jl/pull/752/files#diff-2877f0a59db9c7dda95204097d3c2010021eb8f6c59c98d56d3f7d16e38b0acbL149

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the behaviour is exactly as before (i.e. as we had with older HTTP.jl versions)

I think the idea is that this maybe wasn't correct from HTTP.jl which is why it got changed there. #396 (comment) and #433 have lots of discussion on this (which I forgot completely about until just now)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only close if it was opened internally

wouldn't this be different behaviour to what we had when using older HTTP.jl versions?
i.e. didn't HTTP.request just call close on the response_stream (if it wasn't nothing)?

(i don't know, since i'm not too familiar with this stuff)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not been paying too much attention to the HTTP.jl changes. Are there use cases anyone can think of to keeping the streams open and having users manually close them?

Personally I think closing the HTTP requests automatically makes sense. It gives a loose contract of, this is the data which AWS has provided you from your request.

Maybe I'm missing something obvious though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, maybe it should be reverted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, i have no opinion on how best to fix this. But i am keen to fix it as fast as possible, as not being able get files from S3 is quite debilitating for some workflows at Invenia 😊

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would say we should merge this PR which basically brings back the behavior of HTTP 0.9.14 and then continue this discussion in an issue.

I would really like to see this merged today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

end
end

Expand Down
147 changes: 83 additions & 64 deletions test/issues.jl
Original file line number Diff line number Diff line change
@@ -1,83 +1,102 @@
@service S3

bucket_name = "aws-jl-test-issues---" * _now_formatted()
S3.create_bucket(bucket_name)

@testset "issue 223" begin
# https://github.com/JuliaCloud/AWS.jl/issues/223
body = "Hello World!"
file_name = "contains spaces"

try
S3.put_object(bucket_name, file_name, Dict("body" => body))
resp = S3.get_object(bucket_name, file_name)

@test String(resp) == body
finally
S3.delete_object(bucket_name, file_name)
S3.delete_bucket(bucket_name)
BUCKET_NAME = "aws-jl-test-issues---" * _now_formatted()

try
S3.create_bucket(BUCKET_NAME)

@testset "issue 223" begin
# https://github.com/JuliaCloud/AWS.jl/issues/223
body = "Hello World!"
file_name = "contains spaces"

try
S3.put_object(BUCKET_NAME, file_name, Dict("body" => body))
resp = S3.get_object(BUCKET_NAME, file_name)

@test String(resp) == body
finally
S3.delete_object(BUCKET_NAME, file_name)
end
end
end

@testset "issue 227" begin
@testset "s3 public bucket" begin
# https://github.com/JuliaCloud/AWS.jl/issues/227
config = AWSConfig(; creds=nothing)
resp = S3.get_object("www.invenia.ca", "index.html"; aws_config=config)
@testset "issue 227" begin
@testset "s3 public bucket" begin
# https://github.com/JuliaCloud/AWS.jl/issues/227
config = AWSConfig(; creds=nothing)
resp = S3.get_object("www.invenia.ca", "index.html"; aws_config=config)

@test !isempty(resp)
end

@testset "s3 private bucket" begin
bucket_name = "aws-jl-test-issues---" * _now_formatted()
file_name = "hello_world"

try
S3.create_bucket(bucket_name)
S3.put_object(bucket_name, file_name)

@test !isempty(resp)
@test_throws AWSException S3.get_object(
bucket_name, file_name; aws_config=AWSConfig(; creds=nothing)
)
finally
S3.delete_object(bucket_name, file_name)
S3.delete_bucket(bucket_name)
end
end

@testset "lambda" begin
@service Lambda

@test_throws NoCredentials Lambda.list_functions(;
aws_config=AWSConfig(; creds=nothing)
)
end
end

@testset "s3 private bucket" begin
bucket_name = "aws-jl-test-issues---" * _now_formatted()
file_name = "hello_world"
@testset "issue 324" begin
body = "Hello World!"
file_name = "streaming.bin"

try
S3.create_bucket(bucket_name)
S3.put_object(bucket_name, file_name)
S3.put_object(BUCKET_NAME, file_name, Dict("body" => body))
resp = S3.get_object(BUCKET_NAME, file_name)
@test String(resp) == body

@test_throws AWSException S3.get_object(
bucket_name, file_name; aws_config=AWSConfig(; creds=nothing)
# ERROR: MethodError: no method matching iterate(::Base.BufferStream)
# => BUG: header `response_stream` is pushed into the query...
io = Base.BufferStream()
S3.get_object(
BUCKET_NAME,
file_name,
Dict("response_stream" => io, "return_stream" => true),
)
if bytesavailable(io) > 0
@test String(readavailable(io)) == body
else
@test "no body data was available" == body
end

finally
S3.delete_object(bucket_name, file_name)
S3.delete_bucket(bucket_name)
S3.delete_object(BUCKET_NAME, file_name)
end
end

@testset "lambda" begin
@service Lambda

@test_throws NoCredentials Lambda.list_functions(;
aws_config=AWSConfig(; creds=nothing)
)
end
end
@testset "issue 466" begin
file_name = "hang.txt"

@testset "issue 324" begin
body = "Hello World!"
file_name = "streaming.bin"

try
S3.create_bucket(bucket_name)
S3.put_object(bucket_name, file_name, Dict("body" => body))
resp = S3.get_object(bucket_name, file_name)
@test String(resp) == body

# ERROR: MethodError: no method matching iterate(::Base.BufferStream)
# => BUG: header `response_stream` is pushed into the query...
io = Base.BufferStream()
S3.get_object(
bucket_name, file_name, Dict("response_stream" => io, "return_stream" => true)
)
if bytesavailable(io) > 0
@test String(readavailable(io)) == body
else
@test "no body data was available" == body
try
S3.put_object(BUCKET_NAME, file_name)
stream = S3.get_object(BUCKET_NAME, file_name, Dict("return_stream" => true))
println("test #466") # So we know if this is the reason for tests hanging.
@test eof(stream) # This will hang if #466 not fixed and using HTTP.jl v0.9.15+
println("#466 fixed")
finally
S3.delete_object(BUCKET_NAME, file_name)
end

finally
S3.delete_object(bucket_name, file_name)
S3.delete_bucket(bucket_name)
end

finally
S3.delete_bucket(BUCKET_NAME)
end