-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add real text on error handling #335
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More special cases than I like, but a good direction.
draft-ietf-quic-transport.md
Outdated
|
||
An error on stream 1 MUST be handled as a connection error. Stream 1 is | ||
critical to the functioning of the entire connection. Furthermore, some | ||
application protocols place constraints on how streams are used that makes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SVA: "constraints ... that make"
draft-ietf-quic-transport.md
Outdated
affected stream. | ||
|
||
An error on stream 1 MUST be handled as a connection error. Stream 1 is | ||
critical to the functioning of the entire connection. Furthermore, some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to tease this apart and say that if Stream 1 is closed in any way, that's a connection error? You don't necessarily need to special-case elevation of the error that impacted stream 1 to connection-level then, but it's still permitted.
draft-ietf-quic-transport.md
Outdated
certain streams critical to the functioning of the protocol. Termination of a | ||
critical stream could result in a fatal error in an application protocol. | ||
Errors in such critical streams MUST be changed into a error that closes the | ||
entire connection. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be a transport requirement. If an application has such constraints, it needs to terminate the connection upon seeing the stream close. Otherwise, the application needs an API to "bless" certain streams. This parallels the above comment about stream 1 -- permit stream errors to be stream errors, but if you depend on a certain stream surviving and it doesn't, it's your job to kill the connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To this point, I thought I had text in HTTP saying that if the Connection Control stream closes, you MUST close the connection, but I don't see it in master. Maybe it's buried in a PR?
@@ -2022,7 +2071,58 @@ send a WINDOW_UPDATE frame at least two roundtrips before it expects the sender | |||
to get blocked. | |||
|
|||
|
|||
# Error Codes {#error-handling} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You removed this anchor, which is referenced elsewhere in the text.
draft-ietf-quic-transport.md
Outdated
It is sent by a middlebox to notify an endpoint that the path is no longer | ||
functional. Such a Public Reset does not include a Public Reset Proof, it | ||
includes only the connection ID, which provides an assurance that the middlebox | ||
received packets on the flow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would handle this with a structured proof instead. I.e.,
struct {
ProofType type;
opaque proof[len];
}
Then we can require the proof to contain the conn id or whatever. As I think I mentioned, proofs from non-participants should probably instead contain something derived from the packet, which includes a MAC and so is guaranteed to be high entropy.
Also, it's not clear to me we want to require conn ids to be high enough entropy to make receipt
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to put something in the field. And while I agree that it's not necessarily true that a connection ID sufficient, it is something that a recipient of a PR will have and can use. The same goes for packet numbers, which I anticipate adding to this.
I hope that there is only ever a need for one type of proof per version. That suggests that switching on type isn't necessary. Can you justify having multiple types of proof within the same version?
Note that this all anticipates merging something like Jana's most recent header proposal to some extent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, it's quite likely that servers will not be sending the connection ID at all for many connections because the host/port is sufficient, so it's not clear that you have to put something in this field.
As far as having multiple kinds of proof, this PR in fact anticipates that, with one for an endpoint and one for middleboxes, it just labels the endpoint one proof and the middlebox one connection ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jana's proposal has a strategy for that. If you don't like it, then we can continue to work on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I think I indicated in my email, I don't like it....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's an issue in the way this is stated. There's no way for a client to distinguish between a PR generated by a server from that generated by a middlebox. I think all we want to say that a valid PR MUST include proof that the PR was generated in response to a client packet. There's no way to say who's generating this because both middleboxes and "new" servers will know exactly the same things about the packet. (A middlebox may know more about the connection.)
draft-ietf-quic-transport.md
Outdated
After receiving a Public Reset that omits the Public Reset Proof, an endpoint | ||
SHOULD stop sending on the path on which the Public Reset was received. Packets | ||
that are received on the path MAY be discarded. Without support for multipath, | ||
this means that the connection could become unusable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/could/will/ right?
I'm not sure getting clever here helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could recognizes the problem that the next paragraph covers
draft-ietf-quic-transport.md
Outdated
packets continue to arrive after a Public Reset, this could indicate an attack. | ||
Delays could mean that a few packets arrive after a Public Reset, but if this | ||
continues well beyond the observed bounds on jitter, an endpoint MAY consider | ||
the Public Reset to have been spurious. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like more guidance may be needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to accept that guidance if you have it. I was going to say something about round trips or packets, but it's not especially easy to pick an answer or even general guidance.
draft-ietf-quic-transport.md
Outdated
@@ -2165,6 +2169,9 @@ QUIC_MULTIPLE_TERMINATION_OFFSETS (0x80000005): | |||
QUIC_STREAM_CANCELLED (0x80000006): | |||
: The stream was cancelled | |||
|
|||
QUIC_CLOSED_CRITICAL_STREAM (0x80000007): | |||
: A stream that is critical to the protocol was closed or reset. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor, but I think you can just say "closed," as "reset" is a type of closure.
On Wed, Feb 22, 2017 at 7:06 PM, Martin Thomson ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In draft-ietf-quic-transport.md
<#335 (comment)>:
> +include proof that the sender participated in the initial connection handshake.
+It is sent by a middlebox to notify an endpoint that the path is no longer
+functional. Such a Public Reset does not include a Public Reset Proof, it
+includes only the connection ID, which provides an assurance that the middlebox
+received packets on the flow.
+
+After receiving a Public Reset that omits the Public Reset Proof, an endpoint
+SHOULD stop sending on the path on which the Public Reset was received. Packets
+that are received on the path MAY be discarded. Without support for multipath,
+this means that the connection could become unusable.
+
+A middlebox might send a packet in an attempt to attack a connection. If
+packets continue to arrive after a Public Reset, this could indicate an attack.
+Delays could mean that a few packets arrive after a Public Reset, but if this
+continues well beyond the observed bounds on jitter, an endpoint MAY consider
+the Public Reset to have been spurious.
Happy to accept that guidance if you have it. I was going to say something
about round trips or packets, but it's not especially easy to pick an
answer or even general guidance.
If that's true, then we just shouldn't standardize this feature at all. I
note that we don't seem to have anything like WG consensus so far that
middleboxes should be able to do this.
… —
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#335 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABD1oQ2s3H3YiuYbmaXVM-5H8kuesWXNks5rfPeagaJpZM4MJYSu>
.
|
I've added some changes that are stricter on unauthenticated public resets. Firstly, any unauthenticated public reset can be ignored. Secondly, I've used |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should require a proof that's always possible to generate for a server/on-path-middlebox, and treat them exactly the same at the client. I don't think there's a reason for the client to do anything with a PR that doesn't include a proof if the proof MUST be generated by the server.
depends on the entity that generates the packet. | ||
|
||
A Public Reset packet sent by an endpoint indicates that it does not have the | ||
state necessary to continue with a connection. In this case, the endpoint will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: "... it does not have the state necessary to respond otherwise to the received packet."
draft-ietf-quic-transport.md
Outdated
It is sent by a middlebox to notify an endpoint that the path is no longer | ||
functional. Such a Public Reset does not include a Public Reset Proof, it | ||
includes only the connection ID, which provides an assurance that the middlebox | ||
received packets on the flow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's an issue in the way this is stated. There's no way for a client to distinguish between a PR generated by a server from that generated by a middlebox. I think all we want to say that a valid PR MUST include proof that the PR was generated in response to a client packet. There's no way to say who's generating this because both middleboxes and "new" servers will know exactly the same things about the packet. (A middlebox may know more about the connection.)
draft-ietf-quic-transport.md
Outdated
received packets on the flow. | ||
|
||
An endpoint MAY ignore a Public Reset that does not include a proof. A | ||
middlebox might send a packet in an attempt to attack a connection. If packets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier text suggests that a middlebox may send a PR, and this suggests that such sending may be an attack. How's a client to know?
draft-ietf-quic-transport.md
Outdated
middlebox might send a packet in an attempt to attack a connection. If packets | ||
continue to arrive after a Public Reset, this could indicate an attack. Delays | ||
could mean that a few packets arrive after a Public Reset. However, if valid | ||
packets continue to arrive more than twice the RTT variance (see |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this recommendation is appropriate. I'd suggest that we discuss exactly what should happen here in a separate issue. Basically my concern is (i) there's no reason to believe that the server has any packets coming to the client, so waiting for some time may not make any sense, (ii) you could have the client force something by sending a PING frame to the server and then wait an RTO before closing the connection. At this point I think saying something like "a client MAY wait to see if the PR is legitimate" seems adequate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, I don't see why a client needs to handle both a PR with proof and without proof. I'd lean towards making a proof mandatory (proof TBD) and then ignore PRs without proof. In my header proposal, the proof is simply the first 8 bytes of the packet.
Crazy idea alert: what is the value of having a separate CONNECTION_CLOSE frame at all, rather than adding an error code to Public Reset and doing all abrupt termination that way? If the Reset is lost, the next packet from the peer will be for an unknown connection ID and trigger a Public Reset anyway. So reliablility isn't a big issue. Pros: - It's simpler Cons:
|
So the idea would be more like:
* GOAWAY => no new streams created – connection is terminal
* Service all existing streams
* When the last stream closes and all data is ACK’d, discard your state and close silently
For abrupt departures, use a Public Reset instead?
Or are you suggesting that rather than closing silently, endpoints should emit a Public Reset packet once they’re totally done, and that’s the middleboxes’ signal to tear down state?
From: martinduke [mailto:[email protected]]
Sent: Thursday, February 23, 2017 12:23 PM
To: quicwg/base-drafts <[email protected]>
Cc: Mike Bishop <[email protected]>; Comment <[email protected]>
Subject: Re: [quicwg/base-drafts] Add real text on error handling (#335)
Crazy idea alert: what is the value of having a separate CONNECTION_CLOSE frame at all, rather than adding an error code to Public Reset and doing all abrupt termination that way?
If the Reset is lost, the next packet from the peer will be for an unknown connection ID and trigger a Public Reset anyway. So reliablility isn't a big issue.
Pros: - It's simpler
-solves the problem of telling middleboxes when to delete their state.
Cons:
* If the peer is idle for a long time after a lost Public Reset, it could maintain connection state for quite a while, whereas a retransmitted CONNECTION_CLOSE would have closed everything out. But with PING et al this is probably rare.
* You need some kind of limiter to avoid a window of packets coming from the peer resulting in a storm of Public Resets. But we already have to deal with this in TCP (and will have to in other QUIC scenarios) and it seems to work OK.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub<#335 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AEE2hcnLQgQegTGDGsURs67kZCLVAtFxks5rfeqtgaJpZM4MJYSu>.
|
Well this suggestion was focused on the abrupt termination case that Martin
T. outlines in the PR.
I am not sure at all where we have landed on what a clean close looks like,
but it seems like the messages prior to CONNECTION_CLOSE are doing all the
work. Sending a public reset when all that stuff is done would certainly
help.
…On Thu, Feb 23, 2017 at 12:28 PM, Mike Bishop ***@***.***> wrote:
So the idea would be more like:
* GOAWAY => no new streams created – connection is terminal
* Service all existing streams
* When the last stream closes and all data is ACK’d, discard your state
and close silently
For abrupt departures, use a Public Reset instead?
Or are you suggesting that rather than closing silently, endpoints should
emit a Public Reset packet once they’re totally done, and that’s the
middleboxes’ signal to tear down state?
From: martinduke ***@***.***
Sent: Thursday, February 23, 2017 12:23 PM
To: quicwg/base-drafts ***@***.***>
Cc: Mike Bishop ***@***.***>; Comment <
***@***.***>
Subject: Re: [quicwg/base-drafts] Add real text on error handling (#335)
Crazy idea alert: what is the value of having a separate CONNECTION_CLOSE
frame at all, rather than adding an error code to Public Reset and doing
all abrupt termination that way?
If the Reset is lost, the next packet from the peer will be for an unknown
connection ID and trigger a Public Reset anyway. So reliablility isn't a
big issue.
Pros: - It's simpler
-solves the problem of telling middleboxes when to delete their state.
Cons:
* If the peer is idle for a long time after a lost Public Reset, it could
maintain connection state for quite a while, whereas a retransmitted
CONNECTION_CLOSE would have closed everything out. But with PING et al this
is probably rare.
* You need some kind of limiter to avoid a window of packets coming from
the peer resulting in a storm of Public Resets. But we already have to deal
with this in TCP (and will have to in other QUIC scenarios) and it seems to
work OK.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub<https://github.com/
quicwg/base-drafts#335#issuecomment-282109796>, or mute the thread<
https://github.com/notifications/unsubscribe-auth/
AEE2hcnLQgQegTGDGsURs67kZCLVAtFxks5rfeqtgaJpZM4MJYSu>.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#335 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AXRMEZsTc80As1HRVRkzJFY7pbaoAdBIks5rfevwgaJpZM4MJYSu>
.
|
To expand on the clean-connection case a bit more, I am not sure we have designed a mechanism that assures both sides that both sides have delivered all desired data. (we are hurting for a clean-close connection state machine). I believe that if we repurposed CONNECTION_CLOSE to say "You have acknowledged everything I intended to send", and made both sides send it, that would solve the problem. This would also require not overloading CONNECTION_CLOSE with abrupt termination purposes, as I describe above. |
Needs to be rebased, but LGTM. We still need to describe that data needs to be delivered when CONNECTION_CLOSE is received, but that can be added later. |
Is there actually a difference between a peer that has lost all state and a middlebox that can't deliver packets to its next hop? I'm having trouble envisioning how a PR could be authenticated as coming from an endpoint that used to know you, unless you do something with long-lived key-pairs for which the endpoint persists the keypair and the peers learn about the public key during connection setup. Short of that, I think it's going to be hard to exclude middleboxes while permitting rebooted peers. (And even harder if your peer is an Anycast IP address and you've suddenly wound up talking to a different instance.) |
Mike you can do this without public keys by using a single long term key and hash preimages as in PR #20. |
Ah, so it doesn't have to be an asymmetric keypair, but you do need a persistent key. Which implies some way to do key rollover semi-cleanly. (Generate a new key every boot, use it for new connections, use the previous key for any incoming connections you don't recognize.) I would expect the window after boot that you'd need to be rejecting old packets to be reasonably small -- basically whatever connection idle time you were advertising before. |
Yep, you got it @MikeBishop, though it isn't that simple for a cluster of machines. |
…eceiving packets more than rttvar after the Public Reset
e80ae82
to
b8ab6b7
Compare
This covers two things:
the general operation of Public Reset packets and their role in the protocol, including two separate admonishments not to use Public Reset unless you don't have connection state
how to generate errors for the connection and for streams
On the former, I want to add ekr's proposed public reset design in a separate PR.
On the latter, I've cribbed from HTTP/2 a little, but the advice here turns out to be subtly different.
Closes #289, #47.