Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always close the response_stream after HTTP.request #467

Merged
merged 8 commits into from
Sep 28, 2021

Conversation

nickrobinson251
Copy link
Contributor

@nickrobinson251 nickrobinson251 commented Sep 28, 2021

TODO: add test... i just need to recreate the issue using AWS.jl explicitly (rather than AWSS3) done.

- We need to explicitly close the `response_stream`
  (`HTTP.request` no longer closes it for us)
  in HTTP.jl v0.9.15+ (see HTTP.jl #752)
- Since `close` is safe to call multiple times,
  this should be compatible with old HTTP.jl versions.
- Should fix issue JuliaCloud#466
@ericphanson
Copy link
Member

ericphanson commented Sep 28, 2021

I'm a bit worried about the finally, since I see no other uses in this package where we use Retry.jl's macros with finally, and there are no tests for it: https://github.com/JuliaWeb/Retry.jl/blob/master/test/runtests.jl. So careful tests making sure we are handling that right seem important to me.

@nickrobinson251
Copy link
Contributor Author

I'm a bit worried about the finally, since I see no other uses in this package where we use Retry.jl's macros with finally

i suppose we could just call close in both branches of the try, if we want to avoid using finally?

e.g.

-        return @mock HTTP.request(
+        resp = @mock HTTP.request(
             http_stack,
             request.request_method,
             HTTP.URI(request.url),
             HTTP.mkheaders(request.headers),
             request.content;
             require_ssl_verification=false,
             response_stream=request.response_stream,
             http_options...,
         )
+        request.response_stream === nothing || close(request.response_stream)
+        return resp
     catch e
+        request.response_stream === nothing || close(request.response_stream)
         # Base.IOError is needed because HTTP.jl can often have errors that aren't

@ericphanson
Copy link
Member

The semantics of these macros are pretty confusing; both the try and catch branches can be executed multiple times if it retries, and we don't want to close the stream if we're going to retry. So I kind of think we need finally... just need to be sure it actually works 😅.

@nickrobinson251
Copy link
Contributor Author

nickrobinson251 commented Sep 28, 2021

:oof: macros 😒

okay, well 🤞 they play nice with finally

suggestions for tests are very welcome 😊

for now i'm struglling to recreate the original issue with just AWS.jl (not AWSS3.jl) for a test 😅

@ericphanson
Copy link
Member

Let’s see what CI says already— we can check at least if tests pass here with the new HTTP.jl release.

bors try

bors bot added a commit that referenced this pull request Sep 28, 2021
@bors
Copy link
Contributor

bors bot commented Sep 28, 2021

@ericphanson
Copy link
Member

Tests pass and we are using HTTP v0.9.15: https://github.com/JuliaCloud/AWS.jl/runs/3733253932#step:10:48

so that’s promising

@nickrobinson251
Copy link
Contributor Author

What's a sensible way to test something that causes code to hang?

For now i've just put some code (which hangs on master) into a test, but i'm not sure if that's the best way to go (since we rely on CI timing out or, if locally, realising something is wrong): 4dcecc6

@ericphanson
Copy link
Member

We at least have timeouts now (https://github.com/JuliaCloud/AWS.jl/blob/master/.github/workflows/CI.yml#L16) due to hangs in the past, so I think the way you did it is OK.

Should we be creating buckets, or is there one we should use already @mattBrzezinski ?

@nickrobinson251
Copy link
Contributor Author

Should we be creating buckets

ah yeah, i meant to ask -- the testsets in test/issues.jl all set up and then teardown buckets, which is kinda nice but also kinda slow and make for a lot of boilerplate. For now i've just followed the same pattern, but (perhaps as a follow-up) maybe all these issues.jl tests on S3 could just use different files in the same bucket?

@@ -213,6 +213,8 @@ function _http_request(http_backend::HTTPBackend, request::Request)
isa(e, Base.IOError) ||
(isa(e, HTTP.StatusError) && _http_status(e) >= 500)
end
finally
request.response_stream isa IO && close(request.response_stream)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an end-user were to pass in their own response_stream, what would happen here if it closes? Is that data still available in it, or does it go away?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was gonna suggest something like

diff --git a/src/utilities/request.jl b/src/utilities/request.jl
index 12debd4..2ef8363 100644
--- a/src/utilities/request.jl
+++ b/src/utilities/request.jl
@@ -189,11 +189,13 @@ function _http_request(http_backend::HTTPBackend, request::Request)
     @repeat 4 try
         http_stack = HTTP.stack(; redirect=false, retry=false, aws_authorization=false)
 
+        should_close = false
         if request.return_stream && request.response_stream === nothing
             request.response_stream = Base.BufferStream()
+            should_close = true
         end
 
-        return @mock HTTP.request(
+        r = @mock HTTP.request(
             http_stack,
             request.request_method,
             HTTP.URI(request.url),
@@ -203,6 +205,10 @@ function _http_request(http_backend::HTTPBackend, request::Request)
             response_stream=request.response_stream,
             http_options...,
         )
+        if should_close
+            close(request.response_stream)
+        end
+        return r
     catch e
         # Base.IOError is needed because HTTP.jl can often have errors that aren't
         # caught and wrapped in an HTTP.IOError

earlier. E.g. only close if it was opened internally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the behaviour is exactly as before (i.e. as we had with older HTTP.jl versions). In older HTTP versions, close was already called inside the HTTP.request above https://github.com/JuliaWeb/HTTP.jl/pull/752/files#diff-2877f0a59db9c7dda95204097d3c2010021eb8f6c59c98d56d3f7d16e38b0acbL149

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the behaviour is exactly as before (i.e. as we had with older HTTP.jl versions)

I think the idea is that this maybe wasn't correct from HTTP.jl which is why it got changed there. #396 (comment) and #433 have lots of discussion on this (which I forgot completely about until just now)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only close if it was opened internally

wouldn't this be different behaviour to what we had when using older HTTP.jl versions?
i.e. didn't HTTP.request just call close on the response_stream (if it wasn't nothing)?

(i don't know, since i'm not too familiar with this stuff)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not been paying too much attention to the HTTP.jl changes. Are there use cases anyone can think of to keeping the streams open and having users manually close them?

Personally I think closing the HTTP requests automatically makes sense. It gives a loose contract of, this is the data which AWS has provided you from your request.

Maybe I'm missing something obvious though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, maybe it should be reverted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, i have no opinion on how best to fix this. But i am keen to fix it as fast as possible, as not being able get files from S3 is quite debilitating for some workflows at Invenia 😊

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would say we should merge this PR which basically brings back the behavior of HTTP 0.9.14 and then continue this discussion in an issue.

I would really like to see this merged today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@mattBrzezinski mattBrzezinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a couple comments.

test/issues.jl Outdated Show resolved Hide resolved
test/issues.jl Outdated Show resolved Hide resolved
Copy link
Member

@mattBrzezinski mattBrzezinski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment about the try/finally in test/issues.jl and it LGTM!

@mattBrzezinski
Copy link
Member

FWIW You should be able to just run JuliaFormatter w/ BlueStyle on the entire package and it'll make the CI step happy. I believe I already cleaned this repo out.

@ericphanson
Copy link
Member

I'm concerned about this change; I think #467 (comment) is probably the right way.

In #396 (comment) @vtjnash said

Whoever created the stream is generally responsible for marking when it is done, or passing it to another function which will take care of that.

which sounds like we should only be closing the ones we create here.

@mattBrzezinski
Copy link
Member

I'm concerned about this change; I think #467 (comment) is probably the right way.

In #396 (comment) @vtjnash said

Whoever created the stream is generally responsible for marking when it is done, or passing it to another function which will take care of that.

which sounds like we should only be closing the ones we create here.

I haven't had time to read through all these threads, but will soon(TM). I think we can push these changes through as they mimic the existing functionality, then open an issue to discuss?

@ericphanson
Copy link
Member

Ok, that sounds fine to me

@mattBrzezinski
Copy link
Member

bors r+

bors bot added a commit that referenced this pull request Sep 28, 2021
467: Always close the `response_stream` after `HTTP.request` r=mattBrzezinski a=nickrobinson251

- We need to explicitly close the `response_stream`
  (`HTTP.request` no longer closes it for us)
  in HTTP.jl v0.9.15+ (see JuliaWeb/HTTP.jl#752)
- Since `close` is safe to call multiple times,
  this should be compatible with old HTTP.jl versions. 
- Should fix issue #466 (and JuliaCloud/AWSS3.jl#215)
- Also since we're fixing this here, we can close JuliaWeb/HTTP.jl#772

~TODO: add test... i just need to recreate the issue using AWS.jl explicitly (rather than AWSS3)~ done.

Co-authored-by: Nick Robinson <[email protected]>
Co-authored-by: Nick Robinson <[email protected]>
@mattBrzezinski
Copy link
Member

CI failed, but also threw

WARNING: redefinition of constant BUCKET_NAME. This may fail, cause incorrect answers, or produce other errors.

test/issues.jl Outdated Show resolved Hide resolved
@nickrobinson251
Copy link
Contributor Author

i don't understand bors -- is CI running, or...?

@mattBrzezinski
Copy link
Member

bors r+

@bors
Copy link
Contributor

bors bot commented Sep 28, 2021

@bors bors bot merged commit b730f72 into JuliaCloud:master Sep 28, 2021
@nickrobinson251 nickrobinson251 deleted the npr/466-close-response-stream branch September 28, 2021 19:27
@omus
Copy link
Member

omus commented Sep 28, 2021

Personally, I think this should have been addressed by upper bounding HTTP.jl. That would have been a trivial change in comparison and would have given us more time to work out the details of this PR

@oxinabox
Copy link
Collaborator

oxinabox commented Sep 28, 2021

A fair point conceptually.
But in practice an incredibly annoying one, do to how julia's resolver/registry works.
since would have needed to go and retroactively capped the version of HTTP.jl in all past releases of AWS.jl.
Which would have required manual registry edits.
(I have suggested before that for post 1.0 packages, like AWS.jl the registry should refuse to install any but the lasted patch release for any minor which would have made tagging such a patch successfully block it (at least for AWS 2.x.y))

Another good option might have been to revert the change to HTTP.jl and tag a release.
(since it was supposedly a nonbreaking change but did infact break a pretty core package for a lot of people)
While we worked out whether or not it was breaking after all, or whether AWS.jl was violating HTTP.jl's public API (and if so fixed AWS.jl)

@nickrobinson251
Copy link
Contributor Author

Yeah, reverting the change to HTTP.jl would have been another option, but may have required more discussion.
Pinning HTTP is quite painful, as Lyndon says. This PR was a 2 line change to the src (the big diffs to the tests were various requests made in code review, mostly addressing issues in the existing tests, arguably out of scope for this PR tbh). And since this just maintains the existing (pre-HTTP.jl v0.9.15) behaviour in an internal function, i don't really see the harm. As far as this package is concerned, it just protects against the relevant change in HTTP. Especially as a change like #468 can be done, and maybe should be done, (just like it could have been before).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Requests to AWS S3 hang with HTTP.jl v0.9.15
6 participants