Configuring nighthawk-client not to reuse sessions makes results 10 times worse (and utilization of the CPU/RAM assigned to the nighthawk server is very low, less than 10%) #853
Comments
Thank you for the bug report @lzielezinska. Nighthawk reuses significant portions of the Envoy code which includes code that establishes TLS sessions. It is possible that you have discovered a hot spot in the Envoy code. To confirm this, would you be able to provide a few more details?
We can look to see if there is anything obvious in the outputs and counters. Once we have this, we can try to reproduce this problem and try to profile the execution.
Hello, let me share some info.
Thank you for providing the details @pawelrxyz. The lowered CPU usage suggests some thread/lock contention. I will try to reproduce this problem and profile it to identify the contended path.
I reran the experiment and collected CPU and contention profiles. I wasn't able to reproduce the problem and did not see any unexpected behavior when I executed the test. nighthawk_output_standard.txt is an experiment that runs without the custom TLS socket, i.e. session resumption was enabled. Confirmed by counters:
nighthawk_output_custom_socket.txt is an experiment that runs with the custom TLS socket, i.e. session resumption was disabled. The counter Each test took about 5 min of wall time. I observed the following:
Since contentions went down, I no longer suspect lock contention. The increased CPU usage in the NH test server seems to be directly related to the need to negotiate TLS sessions, which is expected. See the diff below which compares the two runs:
The increased CPU usage in the NH client seems to be related to more calls to
Since I wasn't able to reproduce this, I could use your help here. For one, you could review this experiment and point out if I diverged from yours anywhere. Even better, if you can still reproduce this, could you provide both CPU and contention profiles from your run, so that we can see what the NH client and test server are doing while you are observing the decreased CPU usage?
Title: Results for a nighthawk client configured not to reuse the session are 10 times lower (and utilization of the CPU/RAM assigned to the nighthawk server is very low, less than 10%)
Description:
Command used to run nighthawk:
nighthawk_client -p http2 --max-requests-per-connection 1 --address-family v4 "https://<IP>:<PORT>" --concurrency auto --rps "1000" --connections "$10" --duration "60" --request-body-size "400" --transport-socket '{"name": "envoy.transport_sockets.tls", "typed_config": { "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext","max_session_keys":"0"}}'
When the parameter --transport-socket is not passed, we see 10 times better results and almost 100% utilization of the CPU assigned to nighthawk_server. We would like to run nighthawk so that the client does not reuse the session; the setting "max_session_keys":"0" configures that.
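To make the comparison between the two runs easier to repeat, the two command lines can be generated from one place so that they differ only in the --transport-socket argument. The following is a minimal Python sketch, not part of Nighthawk: the helper names transport_socket_json and client_args are hypothetical, the flags and values are copied from the command above, the target stays the placeholder from the report, and --connections is left out because its value in the report is unclear.

```python
import json
import shlex


def transport_socket_json(max_session_keys):
    """Build the --transport-socket JSON used in the report (hypothetical helper)."""
    config = {
        "name": "envoy.transport_sockets.tls",
        "typed_config": {
            "@type": "type.googleapis.com/envoy.extensions.transport_sockets"
                     ".tls.v3.UpstreamTlsContext",
            # Written as a string, matching the form used in the report.
            "max_session_keys": str(max_session_keys),
        },
    }
    return json.dumps(config)


def client_args(target, disable_session_reuse):
    """Return the nighthawk_client argument list for one run (hypothetical helper)."""
    args = [
        "nighthawk_client", "-p", "http2",
        "--max-requests-per-connection", "1",
        "--address-family", "v4",
        target,
        "--concurrency", "auto",
        "--rps", "1000",
        "--duration", "60",
        "--request-body-size", "400",
    ]
    if disable_session_reuse:
        # The only difference between the two runs.
        args += ["--transport-socket", transport_socket_json(0)]
    return args


if __name__ == "__main__":
    target = "https://<IP>:<PORT>"  # placeholder, as in the report
    for disable in (False, True):
        # shlex.join quotes the JSON safely for pasting into a shell.
        print(shlex.join(client_args(target, disable)))
```

Running the script prints both command lines; generating them this way rules out an accidental difference in the other flags when the two results are compared.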
Reproduction steps:
Run nighthawk with parameter:
nighthawk_client ... --transport-socket '{"name": "envoy.transport_sockets.tls", "typed_config": { "@type":"type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext","max_session_keys":"0"}}' ...
This forces the Nighthawk client not to reuse the session, but the results are very low and nighthawk_server does almost no work.