Per-request RTT latency metrics in HTTP filters #12305
I think exposing this is fine, though I'd probably prefer adding a function to extract the RTT over exposing the FD itself. I'm not sure exactly what the best way to measure the RTT for a given request is, as I imagine it might change over the duration of the request? Sounded like @alyssawilk has some experience here, so I'll let her chime in.
Yeah, I think we're going to want the callbacks to expose a socket-like API, which has the rtt function as its first function, and make sure we do our best w.r.t. non-Linux builds (cc @wrowe @zuercher for Windows and Mac respectively, @florincoras for API discussion). RTT (and measurement accuracy) definitely changes over time, but I think we can just comment that it's a current measure in the API and leave it up to the callers to get and/or latch when appropriate.
My take on this is that we shouldn't be exposing a socket-like API at all, as RTT is very much dependent on the underlying upstream protocol, which is fully abstracted. I would recommend instead just exposing an …
Hm, I guess codec vs. transport depends on whether you're doing this via kernel checks or ping frame turnaround. We can do either, but Frank suggested checking the kernel, which I think is more about transport than codec.
I guess my larger point is that I'm not sure what "per-request RTT" means when talking about multiplexed streams, so we need to clearly specify that?
To @mattklein123's point, this measurement might not be relevant if the sockets are actually implemented in user space on top of TCP/UDP (i.e., TLS and QUIC) or some additional multiplexing happens (unfortunately not familiar with this yet). It could start making sense if TLS/QUIC internally track per-"stream" (for lack of a better term) RTT and are willing to expose that info via the …
RTT here is the TCP round-trip time between the two endpoints. It should be the same at any point in time for all streams sharing that TCP connection. You can also get an application-level variant, which includes the time to read the data from the kernel and respond, but that's entirely different, and also not a per-stream construct, as it's done with ping frames.
Right, understood, but we need to clearly specify what this means on a per-request basis. For long-lived streams, is this an RTT averaged over time? At the beginning? End? Etc.
I'm not sure who this is addressed to. If me, no, I'm thinking about RTT, but I'm trying to clarify what this will mean in the context of a request/response/stream.
I think this is just returning the kernel RTT as a snapshot when the call is made (though kernel RTT does smooth a bit). I think it's more a property of the transport than the stream.
I think having some way of exposing the RTT of the underlying connection would be sufficient. This way it isn't connected to the request or stream at all. That is, we could add a method like …. This would always report the current value of …. While we could pull up this function to be a property of the HTTP stream and delegate the implementation separately via config to be either from HTTP/2 …
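As a rough illustration of the kind of accessor being proposed, here is a minimal sketch assuming a hypothetical `lastRoundTripTime()` method returning an optional duration (the name, return type, and placement are illustrative, not an existing Envoy API):

```cpp
// Hypothetical accessor sketch -- names and types are illustrative only.
// The implementation behind it could read the kernel's smoothed RTT estimate
// (e.g. TCP_INFO on Linux) or be backed by something like HTTP/2 PING
// turnaround, selected via configuration.
#include <chrono>

#include "absl/types/optional.h"

class Connection {
public:
  virtual ~Connection() = default;

  // Returns the most recent RTT estimate for the underlying transport, or
  // absl::nullopt if the platform/transport cannot provide one.
  virtual absl::optional<std::chrono::microseconds> lastRoundTripTime() const = 0;
};
```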
@fcfort yup sgtm
Given that this implementation is going to be platform-specific, what should I do for the other platforms like Win32/macOS? @wrowe, @zuercher, do you think there is an equivalent implementation for Windows and macOS respectively? Otherwise I could just return some default value and emit a warning log saying this feature is unimplemented on that platform.
I would just return an optional and allow the caller to determine what to do if the value is not available.
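For example, a hypothetical metrics path (building on the accessor sketched above; names are illustrative) would simply skip recording when no value is available:

```cpp
// Hypothetical caller-side handling of the optional RTT (illustrative only).
#include <chrono>
#include <cstdint>

#include "absl/types/optional.h"

// Returns true and fills `out_us` only when an RTT estimate is available;
// otherwise the caller skips emitting the metric.
bool rttToMicros(const absl::optional<std::chrono::microseconds>& rtt, uint64_t& out_us) {
  if (!rtt.has_value()) {
    return false; // Platform or transport does not expose an RTT estimate.
  }
  out_us = static_cast<uint64_t>(rtt->count());
  return true;
}
```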
I haven't done much collecting of this data, but I know it's available on Windows.
On Windows, RTT statistics can be gathered via SIO_TCP_INFO or GetPerTcpConnectionEStats in either the TCP_ESTATS_PATH_ROD or TCP_ESTATS_FINE_RTT_ROD structure.
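A rough sketch of the SIO_TCP_INFO path (assuming a connected TCP SOCKET and a Windows version that supports the ioctl; error handling abbreviated):

```cpp
// Sketch: query the Windows TCP stack's RTT estimate via SIO_TCP_INFO.
// Assumes `sock` is a connected TCP socket; returns nullopt if the ioctl
// is unsupported (e.g. older Windows releases) or fails.
#include <winsock2.h>
#include <mstcpip.h>

#include <chrono>

#include "absl/types/optional.h"

absl::optional<std::chrono::microseconds> windowsTcpRtt(SOCKET sock) {
  DWORD version = 0; // Request the TCP_INFO_v0 layout.
  TCP_INFO_v0 info{};
  DWORD bytes_returned = 0;
  const int rc = WSAIoctl(sock, SIO_TCP_INFO, &version, sizeof(version), &info,
                          sizeof(info), &bytes_returned, nullptr, nullptr);
  if (rc != 0) {
    return absl::nullopt;
  }
  return std::chrono::microseconds(info.RttUs); // Current RTT estimate, microseconds.
}
```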
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.
We should use SIO_TCP_INFO on Windows. It is designed to be the equivalent of TCP_INFO.
@mattklein123 I am reopening the issue under area\windows since the platform offers support. We can/should add support for Windows for this one as well.
Assigned it to myself because I have a good idea of how to implement it and I have some time to pick it up this week. That being said, if anyone from the community wants to take a stab at it, I'm more than happy to guide them.
The use-case I’d like to address is the ability to measure the RTT latency to upstreams so that this per-request RTT latency can be reported in detailed metrics.
Within the Linux TCP/IP stack, the TCP-level RTT is computed and available (like this) via file descriptors and retrieved via getsockopt calls using the file descriptor value. While the Listener filters can access this value, our desire is to record this value on a per-request basis.

What do you think about having the socket file descriptor accessible via the HTTP filter callbacks?
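For reference, a minimal sketch of that retrieval on Linux via getsockopt with TCP_INFO (assuming a connected TCP socket fd; tcpi_rtt is the kernel's smoothed RTT estimate in microseconds):

```cpp
// Sketch: read the Linux TCP stack's smoothed RTT estimate for a connected socket.
#include <netinet/in.h>
#include <netinet/tcp.h> // struct tcp_info, TCP_INFO
#include <sys/socket.h>

#include <chrono>

#include "absl/types/optional.h"

absl::optional<std::chrono::microseconds> linuxTcpRtt(int fd) {
  tcp_info info{};
  socklen_t len = sizeof(info);
  if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0) {
    return absl::nullopt; // Not a TCP socket, or TCP_INFO unavailable.
  }
  return std::chrono::microseconds(info.tcpi_rtt); // Smoothed RTT, microseconds.
}
```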
While one possibility would be to store this information in the Listener filter state and access it via the HTTP filter connection’s StreamInfo data, there isn’t a way of invoking a function in the context of the Listener filter to retrieve the TCP-level info from the underlying socket’s file descriptor at the time of the request.
Another possible option is to use HTTP/2 PING frames to measure RTT, but this would only apply to clients that speak HTTP/2, and in our use-case the upstream clients control the protocol being used.

This use-case was also discussed a bit in #1464.