RFC: Disable pipelining to alleviate race conditions #732
Codecov Report
@@ Coverage Diff @@
## master #732 +/- ##
==========================================
- Coverage 76.88% 76.66% -0.22%
==========================================
Files 37 37
Lines 2444 2443 -1
==========================================
- Hits 1879 1873 -6
- Misses 565 570 +5
|
Seems ok to me to remove pipelining support given that others aren't doing it. |
Yeah, pipeline support has always seemed a bit superfluous to me, especially given the work to switch those kinds of workflows over to http2, which is better in several respects. I'm in favor of removing support for it in HTTP.jl. |
It would be amazing to fix this race condition! Anything we can do to help merge this? |
The main work to do here is to convince ourselves that this is correct. Unfortunately this really requires a deep dive into both the HTTP standard and the HTTP.jl code so I've been working on this in the last few days. Once I've got a more complete fix it would be amazing to have people help test or review it. The current PR is more of a hacky workaround than anything — looking into the surrounding code, I haven't been able to convince myself that the connection pool is correct for concurrent/parallel use without larger changes. I'm pursuing two separate lines of research/prototyping:
|
On this note, it may be helpful to see the code I split out a while ago and worked on packaging up stand-alone: https://github.com/JuliaServices/ConnectionPools.jl/blob/master/src/ConnectionPools.jl. |
Cool I'd just decided to do something similar. I think the minimal abstraction to extract here is the "Pod". What we currently call a "connection pool" is just a lock around a The hard part is the pod which I'd call ConcurrentResourcePool or some such (for HTTP.jl, the "resource" being the individual open socket which are pooled for a given pod). When you want a new resource you can get it from the pool (if available) or create it asynchronously. There's a limit on total resources which means resource recycling needs to be really watertight or you'll hang indefinitely at some point. |
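The pool behaviour described in this comment (take an idle resource if one exists, otherwise create one, all under a hard cap on the total) can be sketched as a toy model. `ToyPool`, `toy_acquire`, `toy_release` and `toy_discard!` are hypothetical names for illustration only, not the API of HTTP.jl or ConnectionPools.jl. For brevity, creation happens synchronously in the calling task rather than asynchronously.

```julia
# Toy bounded resource pool. All names here are hypothetical.
mutable struct ToyPool{T}
    cond::Threads.Condition  # lock plus condition variable guarding the fields below
    idle::Vector{T}          # open resources that are not currently in use
    total::Int               # resources in use or idle (i.e. not yet discarded)
    limit::Int               # hard cap on `total`
end

ToyPool{T}(limit::Integer) where {T} = ToyPool{T}(Threads.Condition(), T[], 0, limit)

# Give a slot back when a resource is closed or failed to open.
function toy_discard!(pool::ToyPool)
    lock(pool.cond)
    try
        pool.total -= 1
        notify(pool.cond; all=false)  # a waiter may now create a resource
    finally
        unlock(pool.cond)
    end
    return nothing
end

# Return a still-usable resource to the pool.
function toy_release(pool::ToyPool{T}, resource::T) where {T}
    lock(pool.cond)
    try
        push!(pool.idle, resource)
        notify(pool.cond; all=false)  # a waiter may now reuse it
    finally
        unlock(pool.cond)
    end
    return nothing
end

# Reuse an idle resource if available, otherwise call `create()`;
# block while the pool is at its limit and nothing is idle.
function toy_acquire(create, pool::ToyPool)
    lock(pool.cond)
    try
        while true
            if !isempty(pool.idle)
                return pop!(pool.idle)
            elseif pool.total < pool.limit
                pool.total += 1  # reserve a slot before creating
                break
            end
            wait(pool.cond)      # releases the lock while waiting
        end
    finally
        unlock(pool.cond)
    end
    try
        return create()          # created outside the lock
    catch
        toy_discard!(pool)       # creation failed: give the slot back
        rethrow()
    end
end

pool = ToyPool{String}(2)
a = toy_acquire(() -> "conn-1", pool)
b = toy_acquire(() -> "conn-2", pool)
toy_release(pool, a)
c = toy_acquire(() -> "conn-3", pool)  # an idle resource exists, so nothing new is created
```

With `limit = 2`, a third `toy_acquire` with no intervening `toy_release` or `toy_discard!` would block forever. This is why every code path, including error paths, has to recycle its slot.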
An update on this: I was persuaded to try using libcurl via Downloads.jl as a backend for HTTP.jl rather than fixing HTTP's connection and stream layers in pure Julia. This was only partly about addressing the immediate problem here; the other reason to go in this direction is to have access to a very mature HTTP client library. The new backend is currently implemented in Currently I've implemented this by adding a new default layer, effectively replacing the connection and stream layers in the stack. This isn't very satisfying though — it feels to me that we should clean up and generalize our HTTP stack to make this supported in a more obvious way. |
Could you expand on what you mean by "clean up and generalize our HTTP stack"? It's not clear to me what you think would make this cleaner/more obvious than just replacing those lower layers. |
Good question. A few thoughts:
|
Follows up on the discussion and work started in #732. The gist is this: supporting "pipelined requests" is not well-supported across the web these days and severely complicates the threadsafe client implementation in HTTP.jl. And as has been pointed out in various discussions, the attempt to support pipelining is affecting our ability to avoid thread-safety issues in general (see #517).
This commit has a few key pieces:
* Splits "connection pooling" logic into a new connectionpools.jl file
* Removes pipeline-related fields/concepts from the ConnectionPool.jl file and specifically the `Connection`/`Transaction` data structures
* Attempts to simplify a lot of the work/logic around managing a `Transaction`'s lifecycle
Things I haven't done yet:
* Actually deprecated the `pipeline_limit` keyword argument
* Got all tests passing; I think the websockets/server code isn't quite ironed out yet, but I'll dig into it some more tomorrow
Big thanks to @nickrobinson251 for help reviewing the connectionpools.jl logic. Pinging people to help review: @c42f, @vtjnash, @s2maki, @fredrikekre
* Remove client support for pipelined requests
* get more tests passing; remove Transaction
* Get all tests passing
* update Julia compat to 1.6
* fix 32-bit and docs
* fix 32-bit
* fix 32-bit
* Address review comments
* Added `GC.@preserve` annotations
* Removed calls to `hash` in `hashconn` and renamed to `connectionkey`
* Removed enforcement of `reuse_limit`
* Renamed some fields/variables for clarity
* Cleaned up comments for clarification in connectionpools.jl
* Added an additional `acquire` method that accepts an already created connection object for `Pod` insertion/tracking
Superseded by #783 |
Some problems
As noted in #517 we've encountered hangs using the HTTP.jl client in production. In #517 (comment) I've found several potential concurrency problems with the current ConnectionPool:
* It seems both `closeread(::Transaction)` and `closewrite(::Transaction)` call `release(::Connection)` which returns the connection to the pool. So the connection will be returned to the pool multiple times per request-response. In addition, `closewrite` ends up being called twice as part of `request(::Type{StreamLayer{<:}})` - both in the body of that function and also in `writebody()`. So a given connection will be returned to the pool's channel up to three times. This seems weird to me and potentially broken.
* `reuse_limit` is not respected because it checks `readcount()` rather than the sequence number. But `readcount` isn't incremented immediately on assigning a transaction to a connection, but only later when `closeread` is called.
* It seems that `isvalid()` is happy to close connection streams based on various conditions (timeout and reuse counts) when the stream isn't currently reading/writing. But depending on the details of locking, a connection could already have had pending transactions assigned to it which have not yet started reading or writing. It seems easy for race conditions to creep in here.
What to do?
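As a starting point, the double-release concern from the list above can be reproduced in a toy model. The sketch below is not HTTP.jl code: `ToyConnection` and `toy_release` are hypothetical stand-ins, and the "pool" is just a `Channel`. It only shows that putting the same object into a channel twice lets two later takers end up sharing it.

```julia
# Toy model only: a Channel stands in for the pool, a small struct for the connection.
mutable struct ToyConnection
    id::Int
end

toy_release(pool::Channel{ToyConnection}, conn::ToyConnection) = put!(pool, conn)

pool = Channel{ToyConnection}(10)
conn = ToyConnection(1)

# One request-response cycle that releases the connection twice,
# e.g. once when writing finishes and once when reading finishes.
toy_release(pool, conn)
toy_release(pool, conn)

# Two later requests each take "a" connection from the pool...
first_taken = take!(pool)
second_taken = take!(pool)

# ...and both now hold the very same object.
shared = first_taken === second_taken
```

Whether such sharing is harmful in the real client depends on what else guards the connection; with pipelining it is partly intended, and that is what makes the invariants hard to reason about.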
Removing `release()` from `closeread()` disables HTTP/1.1 pipelining and seems to fix the issues in #517. This PR includes that trivial change as a strawman (and also a couple of other related concurrency fixes).
Rather than trying to fix pipelining properly, perhaps we should consider removing it. It seems that major HTTP implementations have deemed it not worthwhile in practice due to:
For example,