High latency on linux with delayed ack enabled #480
Comments
Hi, first thanks for your report. Since TLS does no buffering (the |
Thinking a bit more about that, my question is: what is in 36? My intuition is that the TLS handshake is already established, now in 34 the client requested a resource. The server may do whatever computation is needed, and only when that is finished (webmachine, database fetching, serializing data, ...), a write is called. May it be that the server logic takes 40ms? Again, having log messages with precise timestamps in the tls layer whenever a write is executed (and on the underlying tcp layer, when a write is executed) may be useful to dig deeper. |
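A minimal sketch of the kind of timestamped write logging suggested above, written in Python for brevity rather than against the real ocaml-tls or TCP code. The `TimestampedWriter` class and the "tls"/"tcp" labels are hypothetical names used only for illustration; the idea is simply to record a monotonic timestamp at every layer each time a write is issued, so that a 40 ms gap can be attributed to a layer.

```python
import io
import time


class TimestampedWriter:
    """Wrap any object with a write() method and log when each write happens."""

    def __init__(self, lower, label):
        self.lower = lower
        self.label = label
        self.events = []  # (label, monotonic timestamp in seconds, byte count)

    def write(self, data):
        # Record the timestamp before handing the data to the lower layer.
        self.events.append((self.label, time.monotonic(), len(data)))
        return self.lower.write(data)


# Hypothetical layering: a "tls" writer on top of a "tcp" writer on top of a buffer.
sink = io.BytesIO()
tcp = TimestampedWriter(sink, "tcp")
tls = TimestampedWriter(tcp, "tls")

tls.write(b"HTTP/1.1 200 OK\r\n")
tls.write(b"content-length: 2\r\n\r\nok")

# Merge both layers' events into one timeline relative to the first write.
events = sorted(tls.events + tcp.events, key=lambda event: event[1])
origin = events[0][1]
for label, stamp, size in events:
    print(f"{(stamp - origin) * 1000:9.3f} ms  {label}  write of {size} bytes")
```

If the gap shows up between two writes at the top layer, the delay is in the application logic; if the writes are issued back to back and the delay only appears on the wire, the TCP layer is holding the data.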
Hi @hannesm, thanks for looking at this. I can certainly produce more logs, but I will first point out some things that I already know: Yes, the TLS handshake is already established; it's a persistent TLS connection in which several HTTP requests are sent sequentially. Frame 36 is the first of three TLS segments which contains In the meantime I checked Cohttp, and the Oh, I have an idea! Does the mirage TCP/IP stack also implement Nagle's algorithm? That would explain why I see it with both the mirage and the Linux TCP/IP stack. It is known that Nagle's algorithm (disabled with the TCP_NODELAY socket option) and delayed ACKs don't work well together, because Nagle's algorithm holds back further small segments until the outstanding data has been ACKed. |
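For reference, disabling Nagle's algorithm on an ordinary BSD-style socket is a single socket option. The sketch below uses Python's standard `socket` module rather than the OCaml or mirage stack discussed here, and the helper name `disable_nagle` is made up for the example.

```python
import socket


def disable_nagle(sock):
    """Set TCP_NODELAY on the socket and report whether it is now enabled."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


# The option can be set before the socket is connected.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    nodelay_enabled = disable_nagle(sock)
finally:
    sock.close()

print("TCP_NODELAY enabled:", nodelay_enabled)
```

An alternative that keeps Nagle enabled is to assemble the whole response in one buffer and hand it to the stack in a single write, so that no small trailing segment is left waiting for an ACK.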
Thanks for your comment.
You may want to ask on the mirage/mirage-tcpip repository, or on the MirageOS mailing list about the design and implementation of the mirage-tcpip stack. |
To summarize:
Is this a valid summary of the scenarios you tested? |
Yes, 1, 2 and 4 are correct, sorry for my confusing prose style. 🙈 I now agree that this is probably not related to the TLS stack. It is rather an issue of Nagle's algorithm not being disabled in either TCP stack. Shall we close this issue while I find a better place for it? |
Here are some logs:
|
Since, as you mentioned, this has something to do with "cohttp" and "TCP", I'll close this issue here. |
Here is what I'd propose, considering that TCP_NODELAY is clearly a useful socket option (as your issue has shown). Using different functions (write / write_nodelay) does not compose well (i.e. the TLS stack won't offer a multitude of functions that are mostly the same and differ only in which lower-level function they call). Thus, maybe the TCP interface needs a way to set socket options (or flow options) on a flow. Then you could set TCP_NODELAY before passing the flow to TLS, and the TCP stack could decide to use write / write_nodelay depending on the configured socket options. So, to me the solution would be (I have not looked into cohttp at all):
(take it or leave it, it is only what I'd propose -- feel free to find and propose other solutions) |
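The proposal above can be sketched roughly as follows. This is only an illustration in Python, not the mirage-tcpip or ocaml-tls API: `FlowOptions`, `TcpFlow`, `TlsFlow` and `set_nodelay` are hypothetical names, and the "TLS" layer here does no encryption. The point is that the upper layer keeps a single `write`, while the TCP flow chooses between its two write paths based on an option set beforehand.

```python
from dataclasses import dataclass


@dataclass
class FlowOptions:
    """Hypothetical per-flow options, analogous to socket options."""
    nodelay: bool = False


class TcpFlow:
    """Hypothetical TCP flow that picks its write path from its options."""

    def __init__(self):
        self.options = FlowOptions()
        self.calls = []  # records which low-level path handled each write

    def set_nodelay(self, enabled):
        self.options.nodelay = enabled

    def write(self, data):
        # The single public write dispatches on the configured option.
        path = "write_nodelay" if self.options.nodelay else "write"
        self.calls.append((path, data))


class TlsFlow:
    """Hypothetical upper layer: one write function, unaware of TCP options."""

    def __init__(self, lower):
        self.lower = lower

    def write(self, data):
        # A real TLS layer would encrypt here; this stand-in only tags the data.
        self.lower.write(b"record:" + data)


tcp = TcpFlow()
tcp.set_nodelay(True)  # configure the flow before handing it to the TLS layer
tls = TlsFlow(tcp)
tls.write(b"HTTP/1.1 200 OK\r\n")
print(tcp.calls)
```

With this shape, neither the TLS layer nor the code above it needs a second family of `*_nodelay` functions.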
@hannesm thanks for the guidelines. Sounds good. |
I'm not really sure the issue is ocaml-tls, but it's my best guess for now.
I'm running a Cohttp server with TLS, and I was investigating a suspiciously high latency (~40ms roundtrip) over a sub-ms network when sending requests in a persistent TLS connection from an arbitrary https client on Linux. The same latency did not appear when sending the requests from macOS. Looking at the network traces I saw the following issue:
The delay between frame 36 and 37 comes from the delayed ACK, which is enabled by default on Linux. But the question is: why is the server sending only the first fragment of the response and then waiting for the ACK? After the ACK arrives at the server, it immediately sends the rest of the response (TLS segments 2 and 3) in frame 38. (That there are 3 segments is probably a Cohttp issue, and maybe not even a real issue.)
It does not seem to be the mirage TCP/IP stack, because I can also reproduce it when I build it with the unix socket interface.
If I enable TCP_QUICKACK on the client socket, the issue almost disappears, but the request-response still needs two roundtrips instead of one, because the server side is waiting for the ACK of the first part of the response before sending the rest.
The first TLS segment contains the HTTP response code line:
Some versions: