
Per-request rtt latency metrics in HTTP filters #12305

Closed
fcfort opened this issue Jul 27, 2020 · 19 comments · Fixed by #12944 or #13430
Labels: area/http, area/windows, design proposal (Needs design doc/proposal before implementation), help wanted (Needs help!)
@fcfort (Contributor) commented Jul 27, 2020

The use case I’d like to address is the ability to measure the RTT to upstreams so that this per-request RTT can be reported in detailed metrics.

Within the Linux TCP/IP stack, the TCP-level RTT is computed by the kernel and can be retrieved via a getsockopt call on the connection’s file descriptor. While the Listener filters can access this value, our desire is to record it on a per-request basis.

What do you think about having the socket file descriptor accessible via the HTTP filter callbacks?

One possibility would be to store this information in the Listener filter state and access it via the HTTP filter connection’s StreamInfo data, but there is no way to invoke a function in the Listener filter’s context to retrieve the TCP-level info from the underlying socket’s file descriptor at the time of the request.

Another possible option is to use HTTP/2 PING frames to measure RTT, but this would only apply to clients that speak HTTP/2, and in our use case the upstream clients control the protocol being used.

This use-case was also discussed a bit in #1464.

@snowp (Contributor) commented Jul 27, 2020

I think exposing this is fine, though I'd probably prefer adding a function to extract the RTT over exposing the FD itself.

I'm not sure what the best way is to measure the RTT for a given request, as I imagine it might change over the duration of the request. Sounded like @alyssawilk has some experience here, so I'll let her chime in.

@snowp added the area/http and design proposal labels Jul 27, 2020
@alyssawilk (Contributor) commented:
Yeah, I think we're going to want the callbacks to expose a socket-like API, which has the rtt function as its first function, and make sure we do our best w.r.t. non-Linux builds (cc @wrowe @zuercher for Windows and Mac respectively, @florincoras for API discussion).

RTT (and measurement accuracy) definitely changes over time, but I think we can just comment that it's a current measure in the API and leave it up to the callers to get and/or latch when appropriate.

@mattklein123 (Member) commented:
My take on this is that we shouldn't be exposing a socket like API at all, as RTT is very much dependent on the underlying upstream protocol which is fully abstracted. I would recommend instead just exposing an rtt() function somewhere on the upstream request/pool request/etc. and then the implementation can be codec/protocol specific? Either way I think we will need a short design doc on this one. Thank you!

@alyssawilk (Contributor) commented:
Hm, I guess codec vs. transport depends on whether you're doing this via kernel checks or PING-frame turnaround. We can do either, but Frank suggested checking the kernel, which I think is more about transport than codec.

@mattklein123 (Member) commented:
I guess my larger point is that I'm not sure what "per-request RTT" means when talking about multiplexed streams, so we need to clearly specify that?

@florincoras (Member) commented:
Both Socket and IoHandle now have a getSocketOption function which, in the case of the default SocketInterface implementations, maps to a getsockopt syscall. There is no need to go to the fd; in fact, I would recommend never working with the raw fd.

To @mattklein123's point, this measurement might not be relevant if the sockets are actually implemented in user space on top of TCP/UDP (i.e., TLS and QUIC) or some additional multiplexing happens (unfortunately I'm not familiar with this yet). It could start making sense if TLS/QUIC internally track per-"stream" (for lack of a better term) RTT and are willing to expose that info via the getSocketOption API.

@alyssawilk (Contributor) commented:
RTT here is the TCP round trip time between the two endpoints. It should be the same at any point in time for all streams sharing that TCP connection. You can also get an application level variant which includes time to read the data from the kernel and respond but that's entirely different, and also not a per-stream construct as it's done with ping frames.
Are you thinking of RT (response time) rather than RTT (round trip time)?

@mattklein123 (Member) commented:
> RTT here is the TCP round trip time between the two endpoints. It should be the same at any point in time for all streams sharing that TCP connection.

Right, understood, but we need to clearly specify what this means on a per-request basis. For long lived streams, is this an RTT averaged over time? At the beginning? End? Etc.

> Are you thinking of RT (response time) rather than RTT (round trip time)?

I'm not sure who this is addressed to. If me, no, I'm thinking about RTT but I'm trying to clarify what this will mean in the context of a request/response/stream.

@alyssawilk (Contributor) commented:
I think this is just returning the kernel rtt as snapshot when the call is made (though kernel rtt does smooth a bit). I think it's more a property of the transport than the stream.

@fcfort (Contributor, Author) commented Jul 29, 2020

I think having some way of exposing the RTT of the underlying connection would be sufficient. This way it isn't tied to the request or stream at all. That is, we could add a method like int Connection::lastRoundTripTime() const to src/include/envoy/network/connection.h.

This would always report the current value of tcpi_rtt on Linux from the kernel's TCP/IP stack. It would be up to the caller to know when to sample this value. Callers in HTTP filters would access it via StreamFilterCallbacks::connection.

While we could pull this function up to be a property of the HTTP stream and delegate the implementation separately via config to be either from HTTP/2 PING frames, TCP/IP RTT, etc., this seems like overkill for now. Even in this case, I think it's fine to make this a "snapshot" of the RTT at whatever time it is called, as long as we document it as such.

@mattklein123 (Member) commented:
@fcfort yup sgtm

@fcfort (Contributor, Author) commented Jul 30, 2020

Given that this implementation is going to be platform-specific, what should I do for the other platforms like Win32/MacOS?

@wrowe, @zuercher, do you think there is an equivalent implementation for Windows and MacOS respectively?

Otherwise I could just return some default value and emit a warning log saying this feature is unimplemented on that platform.

@mattklein123 (Member) commented:
I would just return an optional and allow the caller to determine what to do if the value is not available.

@wrowe (Contributor) commented Aug 18, 2020

I haven't done much collecting of this data, but I know it's available on Windows.

cc: @nigriMSFT @davinci26 @sunjayBhatia

@nigriMSFT (Contributor) commented:
On Windows, RTT statistics can be gathered via SIO_TCP_INFO or GetPerTcpConnectionEStats in either the TCP_ESTATS_PATH_ROD or TCP_ESTATS_FINE_RTT_ROD structure.

stale bot commented Sep 20, 2020

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 20, 2020
@pravb commented Oct 2, 2020

We should use SIO_TCP_INFO on Windows. It is designed to be the equivalent of TCP_INFO.

@davinci26 (Member) commented Oct 2, 2020

@mattklein123 I am reopening the issue under area/windows since the platform offers support. We can/should add support for Windows for this one as well.

@davinci26 davinci26 reopened this Oct 2, 2020
@stale stale bot removed the stale label Oct 2, 2020
@mattklein123 added the help wanted label Oct 2, 2020
@davinci26 davinci26 self-assigned this Oct 4, 2020
@davinci26 (Member) commented:
Assigned it to myself because I have a good idea of how to implement it and some time to pick it up this week.

That being said, if anyone from the community wants to take a stab at it, I'm more than happy to guide them.

9 participants