TLS-Anvil server tests sporadically fail #3694
Comments
I'll look into this!
Well, the connection in this test sometimes runs into a timeout. By the way, another test also failed once: selectsSameCipherSuite in https://github.com/randombit/botan/actions/runs/5915148567/job/16041123636. I tried both tests locally and had no problems at all, even when configuring a very small timeout of 50 ms. Since the CI action runners are sometimes relatively slow, I configured a timeout of 5 seconds per TLS session, which I thought was enough. My only guess is that the GitHub runner takes a nap for whatever reason and needs longer than these 5 seconds to send the respective TLS packets. Since HelloRetryRequest tests take longer due to the additional round trip, the problem only becomes apparent in these tests. I can't think of any other reason. What do you think? Can GitHub runners really be this slow?
Lags of five seconds sound quite unlikely to me, tbh. It could also be some sort of race condition in the CLI's TLS code. Some very narrow races tend to show up much more frequently on GitHub Actions than on local hosts, because the runners are slower. 🤡 It would be amazing to get a stack trace of the server process when it fails with a timeout. Is that something the TLS-Anvil toolchain supports?
Might really be that slow, might be a race in our code - either seems like a plausible explanation to me. Might be worth trying to run this test while the server process is executed under
I admit I have some trouble understanding the helgrind output. I ran the tests with the tls_http_server and helgrind: For experimentation, I even ran the tests against the normal tls_server
This means asynchronous RSA is only used when the There are also a bunch of races within the ASIO stuff of the tls_http_server, but it's hard for me to interpret these results...
Let's look into it together. Maybe we can ultimately combine that with #3659.
I misinterpreted the TLS-Anvil log. The tests did not fail due to a timeout, as I thought, but for another reason. It seems the failure is caused by TLS-Anvil, not by Botan. To address this issue, I opened a PR for TLS-Anvil. I'm optimistic about fixing the problem soon.
The test
server.tls13.rfc8446.HelloRetryRequest.selectsSameCipherSuiteAllAtOnce
seems to fail every so often:
https://github.com/randombit/botan/actions/runs/6079851755/job/16492923998
https://github.com/randombit/botan/actions/runs/6167596027/job/16738779015
(cc @FAlbertDev)