consider encoding a duration to not acknowledge every CE-marked packet #244

Closed
marten-seemann opened this issue Oct 31, 2023 · 12 comments

Comments

@marten-seemann
Contributor

marten-seemann commented Oct 31, 2023

We decided to remove "Ignore CE" in #118, and I'm suggesting to consider reversing that decision.

With new congestion control algorithms like Prague / L4S that send ECN marks much more frequently than classical ECN, it seems possible that the "Ignore CE" field might become more useful than we currently anticipate. For the calculation of its alpha value, Prague currently averages the fraction of ECN-marked packets over one RTT before acting on them. That's not ideal (see my recent email thread on the ICCRG list), and while in this case it's still useful to receive the ECN signal in a timely manner, an immediate acknowledgement of CE-marked packets might not be that useful in all cases. To be clear, I don't have a specific use case for it at this point, but I'm worried that we're about to publish an RFC and might soon regret not having added this field.

Implementation-wise, this should be trivial to support on both the sender and the receiver side. For the draft, we need to be careful about how we recommend usage of this bit (see #107 for example). Maybe we can make our lives easy by recommending that implementations never set this flag, except in experiments. Future documents could then define how to use it safely.

@marten-seemann
Contributor Author

An alternative would be to replace this boolean flag with some richer signaling, e.g. to only send an ACK for the first CE-marked packet, and then suppress the triggering of immediate ACKs for subsequent CE-marked packets for a while.

@moeller0

moeller0 commented Nov 1, 2023

Nit-pick: it is not that L4S necessarily sends ECN marks more frequently than rfc3168 does... The AQM will send CEs based on the congestion level, and hence, depending on load, L4S and rfc3168 might elicit a similar amount of CE marks. An rfc3168 receiver will then reflect that signal as a TCP ECE flag until the sender acknowledges this with a CWR, so the ECE will stay "up" for at least a full RTT on all ACK packets... What L4S does is to put more value on the CE/ECT(1) and ECT(1)/CE transitions, but the actual marking frequency can be quite similar.
L4S needs to average simply because it needs to reconstruct the congestion signal, which was encoded as a marking probability (equivalent to marking frequency), and extracting that from a 1-bit/packet stream simply takes time, as you certainly know. Averaging/low-pass filtering, however, adds delay to the congestion response. But is the fact the congestion signal is already delayed a good justification for delaying it any further?
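
As a rough illustration of why this averaging takes time, here is a minimal Go sketch of a DCTCP/Prague-style exponentially weighted moving average over the per-RTT marking fraction. The type `alphaEstimator`, its fields, and the gain of 1/16 are assumptions made for illustration; they are not taken from any implementation discussed in this thread.

```go
package main

import "fmt"

// alphaEstimator is a hypothetical helper that low-pass filters the
// per-RTT fraction of CE-marked packets into a smoothed value alpha.
type alphaEstimator struct {
	alpha float64 // smoothed marking fraction, in [0, 1]
	gain  float64 // EWMA gain g (assumed to be 1/16 in main below)
}

// update folds one RTT's worth of feedback into alpha.
// acked is the number of packets newly acknowledged in that RTT,
// ce is the number of those packets that were CE-marked.
func (e *alphaEstimator) update(acked, ce uint64) float64 {
	if acked == 0 {
		return e.alpha // no new information this round
	}
	frac := float64(ce) / float64(acked)
	e.alpha = (1-e.gain)*e.alpha + e.gain*frac
	return e.alpha
}

func main() {
	e := &alphaEstimator{gain: 1.0 / 16}
	// Ten RTTs with a constant 50% marking rate: alpha creeps towards 0.5.
	for rtt := 1; rtt <= 10; rtt++ {
		fmt.Printf("RTT %2d: alpha = %.4f\n", rtt, e.update(100, 50))
	}
}
```

With this assumed gain, ten RTTs at a constant 50% marking rate still leave alpha less than halfway to 0.5, which is the filtering delay referred to above.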

@marten-seemann
Contributor Author

> But is the fact the congestion signal is already delayed a good justification for delaying it any further?

It depends. If you know that the signal won't be acted upon for another RTT, it makes little sense to send an extra packet, potentially only containing that one ACK frame, just for that. It might make more sense to wait a few more ms / for a few more packets to arrive before sending that ACK. That said, if I understand section 2.4.2 of the Prague draft correctly, an ECN signal would be acted upon immediately, just with an outdated alpha value, so there seems to be value in immediately acknowledging that packet.

If L4S leads to a number of consecutive packets being ECN-marked though (more so than RFC 3168), as was claimed on the ICCRG list, this might be an important use case for more fine-grained control over when an ECN marking should elicit an ACK. For example, I could see how immediately acknowledging the first CE after a while would be valuable, while delaying acknowledgements for subsequent CE marks for a certain time. This could be achieved by encoding a duration (similar to the Max Ack Delay field) in the ACK_FREQUENCY frame.

@moeller0

moeller0 commented Nov 1, 2023

In that case, however, one can simply aggregate stretches of CE: an ACK containing only CE-marked packets should not be ambiguous at the receiver side (even if delayed a bit). But that really just moves the problematic pattern to 50% CE marking, with every other packet marked (which, however, is not a likely outcome for a true probability encoder; we can expect some more lumping, and a 50% marking rate should probably be rare...).

@IngJohEricsson

IngJohEricsson commented Nov 1, 2023

My understanding is that it is important that the first CE after a stretch of non-CE (define number and/or time) should be ACKed immediately. After that, subsequent ACKs can be delayed, regardless of whether packets are CE-marked or not.

@marten-seemann
Contributor Author

marten-seemann commented Nov 1, 2023

@moeller0 This discussion is not about how to encode CE markings. That's defined by the ACK frame in RFC 9000, and it does contain cumulative counts. The problem is that every CE-marked packet would be acknowledged separately, since section 13.2.1 says:

> Similarly, packets marked with the ECN Congestion Experienced (CE) codepoint in the IP header SHOULD be acknowledged immediately, to reduce the peer's response time to congestion events.

Now this might or might not be the input that your congestion controller needs. The purpose of the ACK_FREQUENCY frame is to give the sender a way to tune how ACKs are sent, such that you can get the right inputs for your congestion controller.

> My understanding is that it is important that the first CE after a stretch of non-CE (define number and/or time) should be ACKed immediately. After that, subsequent ACKs can be delayed, regardless packets are CE marked or not.

@IngJohEricsson Then encoding a time as I suggested in #244 (comment) would work. I'll update the title of this issue.

@marten-seemann marten-seemann changed the title from 'consider bringing back "Ignore CE"' to 'consider encoding a duration to not acknowledge every CE-marked packet' Nov 1, 2023
@IngJohEricsson

Yes. Perhaps a time is a good idea. After all, the ECN feedback fields in the ACK frames can differ quite a lot numerically without wrap-around issues.

@mirjak
Contributor

mirjak commented Nov 3, 2023

There is text in this draft saying:

> An endpoint SHOULD send an immediate acknowledgement when a packet marked with the ECN Congestion Experienced (CE) {{?RFC3168}} codepoint in the IP header is received and the previously received packet was not marked CE.

@moeller0

moeller0 commented Nov 3, 2023

For rfc3168 (barring a drop of that ACK) that will be enough; for L4S, however, maybe not, given that the steady-state rate seems to require 2 CE marks per RTT, so getting two CEs in sequence is likely a magnitude signal worth transmitting ASAP, no?

In L4S, for better or worse, the "congestion magnitude" that the control loop responds to proportionally is a rate-coded signal, so the sender needs access to the CE marking frequency, ideally in a timely fashion. If the response to each individual CE is small, then delaying the CE marks seems not to be the best approach.

@ianswett
Collaborator

ianswett commented Nov 8, 2023

I discussed this with @marten-seemann, and my preference is to save this for a potential v2 of this draft. We have little experience with ECN deployment in QUIC today, L4S-style feedback is also relatively new, and congestion controllers are still evolving in how best to incorporate it.

If a v2 were created, a client could send two transport parameters and the server would respond with whichever it supported/preferred. There is a bit of extra code complexity, but it gives us time to figure out whether a varint representing a number of packets, a time, or something else is the right solution for ECN.

@nibanks
Member

nibanks commented Nov 8, 2023

I support saving this for a possible future v2.

@mirjak
Contributor

mirjak commented Feb 5, 2024

As I indicated above in the discussion, the current draft (in line with AccECN) only requires sending an immediate ACK for the first CE mark in a row. I tried to clarify this in PR #277. I think this is what @marten-seemann asked for in his second comment in this discussion above.

I further agree that if we need anything else in the future, that can simply be a new extension once we actually know what we need. Specifying a field that can only be used for "experimentation" seems really weird to me. Extensions are cheap, so just specify a separate extension for experimentation in a separate draft and reserve a provisional code point.

I'm closing this issue now. In case there is still any clarification needed in the draft, please open a new issue!

@mirjak mirjak closed this as completed Feb 5, 2024
6 participants