-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Http 1.1's read-ahead task management #103257
Conversation
Tagging subscribers to this area: @dotnet/ncl |
Not my area but I want to recognize that the change description here is really great 😀. |
/azp run runtime-libraries-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
bf4ef8f
to
8c21819
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Is there chance this can also be related to #31254?
It seems like there may be mix of problems but some looked similar enough to me.
@@ -133,7 +137,9 @@ public bool PrepareForReuse(bool async) | |||
// If the read-ahead task is completed, then we've received either EOF or erroneous data the connection, so it's not usable. | |||
if (ReadAheadTaskHasStarted) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could still this be race condition? e.g. It did not start it but it will before we get to the next statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the time you get to PrepareForReuse
, the caller is the only thing with a reference to this connection.
It's possible that the read was already started and is transitioning to Completed
right under you, but it can't transition from NotStarted
=> something else, or from something to => NotStarted
.
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnection.cs
Outdated
Show resolved
Hide resolved
* Fix Http 1.1's read-ahead task management * Fix comment typo
Fixes #100616
Every HTTP/1.1 connection has a
ValueTask<int> _readAheadTask
field that gets set either:PrepareForReuse
right before the call toSendAsync
, orAs a result of the latter, if a connection is added to the connection pool at any point (the shared
_http11Connections
stack), we must assume that the read-ahead may have been set.When "consuming" connections, we carefully manage who may consume the read-ahead task to ensure that we don't leak unobserved task exceptions.
To achieve this, we're using an
int _readAheadTaskStatus
flag to signal which thread will observe the result / whether the exception should be suppressed.The problem with the current state (before this PR) is that a connection may be briefly added to the connection pool and taken out right after to handle a pending request from the request queue.
Because we know that the connection was freshly added and therefore still good, we don't bother calling into
PrepareForReuse
, which means we weren't updating the_readAheadTaskStatus
as expected. This results in hitting a debug assert inHttpConnection.SendAsync
that checks that someone claimed ownership of the read-ahead completion before sending the request (#100616).Why don't we always call
PrepareForReuse
then?PrepareForReuse
expects its caller to immediately call intoSendAsync
if the check is successful. This allows it to store thestream.ReadAsync
into_readAheadTask
directly, without any extra wrapping logic.runtime/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnection.cs
Line 166 in 0686ce6
But at this point in the connection management, we're not 100% certain that we will have a request for the connection yet - we have waiters that may get canceled at any point.
If we did call
PrepareForReuse
and then had no waiter to signal, we'd have to throw away the connection.PrepareForReuse
also uses different logic based on abool async
argument which we don't yet have at this point.Why don't we give the connection to the waiter and have the consuming code call
PrepareForReuse
?While not super likely, if the connection was already bad, it would mean that we now took a request from the head of the queue and pushed it back to the tail to retry.
It would also potentially mean needing an extra flag to remember if this is the first request on the connection to match how we currently handle new connections that are immediately bad. Without that, we could end up doing more connect retries to a faulty server than we currently do (maybe that'd be fine, but it's not an obvious consequence).
It should be a possible approach, but I found the following simpler to follow:
This PR:
I changed how we handle read-ahead completion ownership, allowing a thread to return its completion ownership:
If
TryOwnScavengingTaskCompletion
fails, the read-ahead already completed and any potential exceptions got ignored => the connection is not usable and we try again.If
TryReturnScavengingTaskCompletionOwnership
fails, the read-ahead completed between us taking ownership and trying to return it => the connection is not usable and we try again.