Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HTTP/3] NullReferenceException on cancellation #48624

Closed
JamesNK opened this issue Feb 22, 2021 · 7 comments · Fixed by #55724
Closed

[HTTP/3] NullReferenceException on cancellation #48624

JamesNK opened this issue Feb 22, 2021 · 7 comments · Fixed by #55724
Assignees
Milestone

Comments

@JamesNK
Copy link
Member

JamesNK commented Feb 22, 2021

There is a gRPC functional test called ServerStreaming_CancellationOnClientWhileMoveNext_CancellationSentToServer.

The test:

  1. Starts a HTTP/3 request
  2. Reads from the request body
  3. Disposes HttpResponseMessage (which cancels the request)
  4. Reads from the request body

Instead of throwing the same exception as HTTP/2 (I think is an OCE), a NullReferenceException is thrown.

Stack trace:

    System.NullReferenceException: Object reference not set to an instance of an object.
       at System.Net.Http.Http3RequestStream.HandleReadResponseContentException(Exception ex, CancellationToken cancellationToken)
       at System.Net.Http.Http3RequestStream.ReadResponseContentAsync(HttpResponseMessage response, Memory`1 buffer, CancellationToken cancellationToken)
       at Grpc.Net.Client.StreamExtensions.ReadMessageAsync[TResponse](Stream responseStream, GrpcCall call, Func`2 deserializer, String grpcEncoding, Boolean singleMessage, CancellationToken cancellationToken) in C:\Development\Source\grpc-dotnet\src\Grpc.Net.Client\Internal\StreamExtensions.cs:line 78
       at Grpc.Net.Client.Internal.HttpContentClientStreamReader`2.MoveNextCore(CancellationToken cancellationToken) in C:\Development\Source\grpc-dotnet\src\Grpc.Net.Client\Internal\HttpContentClientStreamReader.cs:line 161

I think the problem is a null _connection is called here:

case Http3ConnectionException _:
// A connection-level protocol error has occurred on our stream.
_connection.Abort(ex);
throw new IOException(SR.net_http_client_execution_error, new HttpRequestException(SR.net_http_client_execution_error, ex));

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Feb 22, 2021
@ghost
Copy link

ghost commented Feb 22, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

There is a gRPC functional test called ServerStreaming_CancellationOnClientWhileMoveNext_CancellationSentToServer.

The test:

  1. Starts a HTTP/3 request
  2. Reads from the request body
  3. Disposes HttpResponseMessage (which cancels the request)
  4. Reads from the request body

Instead of throwing the same exception as HTTP/2 (I think is an OCE), a NullReferenceException is thrown.

Stack trace:

    System.NullReferenceException: Object reference not set to an instance of an object.
       at System.Net.Http.Http3RequestStream.HandleReadResponseContentException(Exception ex, CancellationToken cancellationToken)
       at System.Net.Http.Http3RequestStream.ReadResponseContentAsync(HttpResponseMessage response, Memory`1 buffer, CancellationToken cancellationToken)
       at Grpc.Net.Client.StreamExtensions.ReadMessageAsync[TResponse](Stream responseStream, GrpcCall call, Func`2 deserializer, String grpcEncoding, Boolean singleMessage, CancellationToken cancellationToken) in C:\Development\Source\grpc-dotnet\src\Grpc.Net.Client\Internal\StreamExtensions.cs:line 78
       at Grpc.Net.Client.Internal.HttpContentClientStreamReader`2.MoveNextCore(CancellationToken cancellationToken) in C:\Development\Source\grpc-dotnet\src\Grpc.Net.Client\Internal\HttpContentClientStreamReader.cs:line 161

I think the problem is a null _connection is called here:

case Http3ConnectionException _:
// A connection-level protocol error has occurred on our stream.
_connection.Abort(ex);
throw new IOException(SR.net_http_client_execution_error, new HttpRequestException(SR.net_http_client_execution_error, ex));

Author: JamesNK
Assignees: -
Labels:

area-System.Net.Http

Milestone: -

@ManickaP
Copy link
Member

ManickaP commented Mar 4, 2021

Triage: the support for cancelation in QUIC is lacking atm, we should fix this.

@ManickaP ManickaP removed the untriaged New issue has not been triaged by the area owner label Mar 4, 2021
@ManickaP ManickaP added this to the 6.0.0 milestone Mar 4, 2021
@karelz karelz added the bug label Apr 15, 2021
@CarnaViire CarnaViire self-assigned this Jun 4, 2021
@CarnaViire
Copy link
Member

@JamesNK could you please share what exactly is done on the server and what on the client?

@JamesNK
Copy link
Member Author

JamesNK commented Jun 8, 2021

Client and server code: https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/test/FunctionalTests/Client/CancellationTests.cs#L352-L410

I think the error came from line 396.

I know you'd prefer a reproduction that doesn't involve gRPC but once I begin testing HTTP/3 with the gRPC functional tests again then I'll likely be raising a lot of issues from the tests. Recreating each issue without gRPC will take too long.

I have a branch that adds HTTP/3 support to the grpc-dotnet repository but it is many months out of date. I will need to update it before you can test HTTP/3 on it - https://github.com/JamesNK/grpc-dotnet/commits/jamesnk/http3

@CarnaViire
Copy link
Member

@JamesNK thanks for sharing the code!
With it, I managed to create a repro for the issue in pure HTTP/3. I will start investigating the bug now.

@CarnaViire
Copy link
Member

I've tried to create an explanation about what happens with links to the code, as it ended up hard to explain only with words.

What happens is: client sends a request. For the response, server sends a couple of messages and then pauses without closing the connection. Client reads these two messages and on the next read task (namely, call.ResponseStream.MoveNext) it will patiently wait for new data to appear on the wire until cancellation occurs.

Cancellation token source gets created here
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/test/FunctionalTests/Client/CancellationTests.cs#L379

It is passed to gRPC internals in Options.CancellationToken
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/test/FunctionalTests/Client/CancellationTests.cs#L384

And then in the middle of client waiting on read call, this token gets cancelled
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/test/FunctionalTests/Client/CancellationTests.cs#L392-L393


Returning to gRPC internals, inside GrpcCall there is another internal Cancellation token source _callCts
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/GrpcCall.cs#L75

The external token Options.CancellationToken is registered with a callback
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/GrpcCall.cs#L707-L713

Which will cancel the internal _callCts upon its cancellation
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/GrpcCall.cs#L381-L384

And then dispose HttpResponseMessage
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/GrpcCall.cs#L386-L388


Returning to the reading task in the test call.ResponseStream.MoveNext
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/test/FunctionalTests/Client/CancellationTests.cs#L396

CancellationToken.None is passed to it, but if the token would be cancellable, then inside gRPC would register the same callback on it that was registered on external Options.CancellationToken: the one that cancels the internal _callCts and disposes HttpResponseMessage
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/HttpContentClientStreamReader.cs#L125-L129

After that, the token that is passed to our stream for reading is not user's token, but internal _callCts token
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/HttpContentClientStreamReader.cs#L165-L169
(_call.CancellationToken retuns _callCts.Token)
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/GrpcCall.cs#L98-L101

JFYI the method for actual reading from the steam is here
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/StreamExtensions.cs#L78

And the stream is HttpResponseMessage.Content stream
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/src/Grpc.Net.Client/Internal/HttpContentClientStreamReader.cs#L149

What we are passing as Content stream is Http3ReadStream

Which on read will be calling for Http3RequestStream.ReadResponseContentAsync

return _stream.ReadResponseContentAsync(_response, buffer, cancellationToken);

The cancellation token is passed forward there to reading on QuicStream

int bytesRead = await _stream.ReadAsync(buffer.Slice(0, copyLen), cancellationToken).ConfigureAwait(false);

And if the cancellation occurs, this QuicStream will be aborted with a specific RequestCancelled error code

case OperationCanceledException oce when oce.CancellationToken == cancellationToken:
_stream.AbortWrite((long)Http3ErrorCode.RequestCancelled);

But it will already be null (most of the times I guess, as it's essentially a race condition), because HttpResponseMessage got disposed, which disposed Content


which disposed stream inside (that is Http3ReadStream as I've mentioned before)

which disposed Http3RequestStream

which disposed QuicStream

and set it to null


So what I understand happens on cancellation here
https://github.com/grpc/grpc-dotnet/blob/9f72e70c6a47ae823bb3ddcc719f1c63bf721fc7/test/FunctionalTests/Client/CancellationTests.cs#L392-L393 is:

      Options.CancellationToken is canceled
      |                                   |
      V                                   V
_callCts is canceled               HttpResponseMessage is disposed
      |                                   |
      V                                   V
OperationCanceledException        QuicStream is disposed and set to null 
is thrown and catched inside
Http3RequestStream
      |
      V
QuicStream is attempted to
be aborted, but it is null
already

While getting rid of the exception is easy (check for null before aborting stream) I want to understand the whole scenario because it seems a bit strange to me. Why HttpResponseMessage is disposed inside cancellation callback? That itself wouldn't lead to OperationCanceledException, but to ObjectDisposedException. Would it be more logical for the disposal to happen after catching OperationCanceledException from the stream?

If this disposal pattern in gRPC will be left as it is, we won't send a correct error code to the peer (i.e. it will be some generic error code, not about the request being canceled). I understand that in general we won't be able to prevent a user from calling Dispose at any, even unexpected, time, but how much do we care about this specific cancellation situation? I also don't know, whether disposing HttpResponseMessage for cancelling the request is some kind of known practice?

cc @ManickaP @scalablecory @geoffkizer @wfurt

@JamesNK
Copy link
Member Author

JamesNK commented Jun 11, 2021

Why HttpResponseMessage is disposed inside cancellation callback?

It is how you abort an HTTP request that has already returned a response. If the stream is in progress an abort is sent to the server.

Scenarios to cancel HTTP request:

  • HttpMessageHandler.SendAsync(request, cancellationToken) -> Client hasn't receive response headers. Trigger cancellation token to abort request.
  • HttpResponseMessage.Dispose() -> Client has received response headers and is downloading body. Call dispose on response message to abort request.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Jun 17, 2021
@karelz karelz changed the title HTTP/3: NullReferenceException on cancellation [HTTP/3] NullReferenceException on cancellation Jul 1, 2021
CarnaViire added a commit that referenced this issue Jul 20, 2021
Fix abort on cancellation for HTTP/3 content stream, fix dispose when read was aborted by cancellation token.
Fixes #48624
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Jul 20, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Aug 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.