Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QuicConnection.Dispose should not wait for streams to complete #55156

Closed
geoffkizer opened this issue Jul 4, 2021 · 24 comments
Closed

QuicConnection.Dispose should not wait for streams to complete #55156

geoffkizer opened this issue Jul 4, 2021 · 24 comments

Comments

@geoffkizer
Copy link
Contributor

Currently, there is logic in QuicConnection.Dispose and elsewhere to wait for active streams to complete before closing the msquic handle.

In general I don't think we should do this. It's up to the application to decide when a Quic connection is no longer needed and to close it. We cannot and should not try to figure out when the connection is "done".

Note we don't do this for CloseAsync, which is what we generally expect a user to do before calling Dispose. So in the expected usage pattern this logic never kicks in.

Also, even when the logic does kick in, we are doing an immediate connection close to MsQuic instead of a graceful one. So data in flight will likely end up lost regardless.

@geoffkizer geoffkizer added this to the 6.0.0 milestone Jul 4, 2021
@ghost
Copy link

ghost commented Jul 4, 2021

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Currently, there is logic in QuicConnection.Dispose and elsewhere to wait for active streams to complete before closing the msquic handle.

In general I don't think we should do this. It's up to the application to decide when a Quic connection is no longer needed and to close it. We cannot and should not try to figure out when the connection is "done".

Note we don't do this for CloseAsync, which is what we generally expect a user to do before calling Dispose. So in the expected usage pattern this logic never kicks in.

Also, even when the logic does kick in, we are doing an immediate connection close to MsQuic instead of a graceful one. So data in flight will likely end up lost regardless.

Author: geoffkizer
Assignees: -
Labels:

area-System.Net.Quic

Milestone: 6.0.0

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Jul 4, 2021
@wfurt
Copy link
Member

wfurt commented Jul 5, 2021

do you expect msquic to track this? We saw crashed before when close the handle too early.
We can probably hook to SHUTDOWN_COMPLETED event as we discussed before.

@ManickaP ManickaP removed the untriaged New issue has not been triaged by the area owner label Jul 6, 2021
@ManickaP
Copy link
Member

ManickaP commented Jul 6, 2021

We saw crashed before when close the handle too early.

Yes, we shouldn't crash in any case and for that we need to keep the connection alive until all it's associated streams are closed.
If we kept a list of all streams we could forcibly close them upon Dispose of connection though.

cc @nibanks

@geoffkizer
Copy link
Contributor Author

geoffkizer commented Jul 6, 2021

Yes, we shouldn't crash in any case and for that we need to keep the connection alive until all it's associated streams are closed.

Why?

My understanding is that closing the connection will kill all active streams anyway. Right?

Edit: If we were/are crashing, we should understand why. My understanding is that msquic handles this properly, so I assume we are doing something wrong on our side. Do we know what it is? Is it the previously identified GCHandle issue?

@wfurt
Copy link
Member

wfurt commented Jul 6, 2021

I think we need to delay the closing until last stream is gone. I originally did the counting of streams before understanding the msquic internal mechanic. While I think Dispose can free all the managed resources and throw ODE afterwards we sill need to preserve the native handles.

@geoffkizer
Copy link
Contributor Author

I think we need to delay the closing until last stream is gone.

I still don't understand why we need to do that. Are you saying this is an msquic requirement? If so, what's the specific requirement?

Or is it something about how we are using msquic? If so, seems like it would be better to fix how we use msquic.

@wfurt
Copy link
Member

wfurt commented Jul 6, 2021

Let's @nibanks to comment. I think we should hook to SHUTDOWN_COMPLETED event to free MsQuic resources. AFAIK that can be before or after Dispose is called.

@geoffkizer
Copy link
Contributor Author

I think we should hook to SHUTDOWN_COMPLETED event to free MsQuic resources.

Agreed... #55044

@nibanks
Copy link

nibanks commented Jul 6, 2021

I'm not sure I follow the issue completely here. Is this related to the fact that streams that haven't been started don't get shutdown automatically? FYI, they don't get shutdown complete events because we cannot deliver a notification before you call StreamStart. Perhaps a quick call might help here.

@geoffkizer
Copy link
Contributor Author

@wfurt Can you describe the issue you've seen? You understand this better than I do.

@wfurt
Copy link
Member

wfurt commented Jul 6, 2021

This is something we discussed as well @nibanks. The ultimate question is when it is safe to free all the connection related resources e.g. configuration and the connection itself. Perhaps you can describe the optimal/safe sequence.

Prior #52800 we would try to free everything in Dispose and we saw crashes when there were still active streams. #52800 added counting to prevent that. But perhaps there is better way.

This is somewhat similar for the streams. Note that c# application may call some functions like Shutdown/Reset even after the final event - I saw tests trying to reset/close on failure and I'm not sure what we want to do there @geoffkizer. We may relay on the SafeHandle and throw ODE but that would be perhaps weird if real .NET Dispose was not called.

@geoffkizer
Copy link
Contributor Author

@nibanks Basically, the question here is:

(1) Can we call ConnectionShutdown when there are active streams in use. @wfurt is concerned that when we tried to do this in the past, we saw crashes.
(2) If this is okay to do, then what behavior should we expect when we call ConnectionShutdown. E.g. what events will this generate on the open streams and the connection itself, and in what order. Also, if we try to use a stream after ConnectionShutdown is called, what will happen?

@wfurt Please add/correct if I missed anything.

@geoffkizer
Copy link
Contributor Author

We may relay on the SafeHandle and throw ODE but that would be perhaps weird if real .NET Dispose was not called.

Not sure what you mean here...

We should always be relying on the SafeHandle to manage lifetime of the handle. That's what it's for. Used properly, it should prevent use-after-free issues with the handle.

@nibanks
Copy link

nibanks commented Jul 7, 2021

You can most definitely call ConnectionShutdown at any time on a connection (so long as you haven't called ConnectionClose on it) and as many times as you want (calling it again will be silently ignored). All active (not already shutdown) and started (successfully called StreamStart for local streams or all remote streams) streams will receive a shutdown complete event. Further/racing calls to those streams (so long as haven't closed them) will start failing with invalid state.

@geoffkizer
Copy link
Contributor Author

@nibanks Thanks, that's what I expected.

@wfurt Based on this I don't think we need to delay closing the connection. If we are seeing crashes, it's probably related to #55044.

@CarnaViire
Copy link
Member

@nibanks however, is that true that all StreamClose should be called before ConnectionClose? I believe StreamClose called after ConnectionClose was the cause of the crashes (or at least one of them).

@nibanks
Copy link

nibanks commented Jul 7, 2021

@nibanks however, is that true that all StreamClose should be called before ConnectionClose? I believe StreamClose called after ConnectionClose was the cause of the crashes (or at least one of them).

That's correct.

@CarnaViire
Copy link
Member

That means that at least SafeMsQuicConnectionHandle.Dispose should wait for all streams SafeMsQuicStreamHandle.Dispose @geoffkizer. That is currently enforced by stream counting @wfurt did. It might be rewritten to DangerousAddRef counting but I don't see much value to do it now. Also, to my understanding, just hooking onto connection's final event wouldn't be enough here... final connection event would mean all streams are shutdown/killed, but not necessarily that all StreamClose are called.

@geoffkizer
Copy link
Contributor Author

Okay, I think I understand the issue now -- thanks all.

In the finalization case, we cannot guarantee the ordering of calls to StreamClose and ConnectionClose -- at least not without a bunch of hassle in our finalization logic.

@nibanks Is it possible for msquic to allow StreamClose after ConnectionClose?

@nibanks
Copy link

nibanks commented Jul 7, 2021

It's not currently possible and would not be easy to accomplish. Our entire execution model requires the connection as what manages and drives the queue of work for it and all child streams. StreamClose queues work on the connection; without the connection you can't do the work.

@geoffkizer
Copy link
Contributor Author

I'm not saying you need to do any work during StreamClose. You can keep the way that connection teardown works today. It's fine if StreamClose after ConnectionClose is basically a no-op.

@nibanks
Copy link

nibanks commented Jul 7, 2021

StreamClose has to clean up stuff, and must execute on our worker thread.

@wfurt
Copy link
Member

wfurt commented Jul 7, 2021

We call ConnectionClose when the SafeHandle is closed @geoffkizer. So we can call ConnectionShutdown to initiate the shutdown but we still need to keep the underlying MsQuic object alive. I guess we should clarify what close mean on UDP. There may be just terminology disconnect. The work in #52800 was meant to prevent premature ConnectionClose.

@geoffkizer
Copy link
Contributor Author

I'm closing this issue. The original description is incorrect and based on bad assumptions.

I believe what we should do is:
(1) Change QuicConnection.Dispose to do an abortive shutdown of the connection -- tracked here: #55492
(2) Eventually, change the handle refcounting logic so it is done by the SafeHandles for stream and connection, instead of by the MsQuicConnection and MsQuicStream themselves. Tracked here: #55103

@ghost ghost locked as resolved and limited conversation to collaborators Aug 11, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants