"Empty reply from server" for long haul connections with Varnish 6 #330

Closed
jpastuszek opened this issue Mar 26, 2020 · 1 comment

Comments

@jpastuszek

I am not sure where this report belongs, but the problem started after I upgraded hitch from version 1.4.8 to 1.5.2.
As part of the upgrade I turned off TLS 1.0 and TLS 1.1, turned on TLS 1.3 and TCP Fast Open, and selected the recommended TLS 1.2 ciphers.

I am using hitch as a TLS terminator for Varnish 6, connecting over TCP (localhost) with the PROXY v2 protocol (write-proxy-v2 = on).

Just after the upgrade I noticed that long-haul HTTPS requests (Dublin to Singapore or California) started failing with "Empty reply from server" errors. These requests are made with curl to test HTTPS connectivity, and wget shows the same issue. The same requests over lower-latency paths (within the same AWS region) finish successfully. I can also see these errors in real live traffic on the upgraded hosts, but not on the host that has not been upgraded yet.

After some investigation I found that Varnish closes these connections with SessClose OVERLOAD, and the reported time value is always exactly 0.049:

*   << Session  >> 21
-   Begin          sess 0 PROXY
-   SessOpen       127.0.0.1 40986 a1 127.0.0.1 2443 1585236427.002574 44
-   Proxy          2 10.3.0.237 46144 10.1.1.48 443
-   SessClose      OVERLOAD 0.049
-   End
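
To spot these sessions in a larger capture, the log above can be scanned mechanically. The following is a minimal sketch, not a Varnish tool: it assumes the plain-text layout shown in the excerpt (a `*   << Session  >>` header followed by `-   Tag   fields` lines), and the function name `overloaded_sessions` is made up for illustration.

```python
import re

# Log excerpt copied from the report above.
LOG = """\
*   << Session  >> 21
-   Begin          sess 0 PROXY
-   SessOpen       127.0.0.1 40986 a1 127.0.0.1 2443 1585236427.002574 44
-   Proxy          2 10.3.0.237 46144 10.1.1.48 443
-   SessClose      OVERLOAD 0.049
-   End
"""

def overloaded_sessions(text):
    """Return (session id, duration, client IP) for sessions closed with OVERLOAD."""
    results = []
    session_id = None
    client = None
    for line in text.splitlines():
        header = re.match(r"\*\s+<<\s+Session\s+>>\s+(\d+)", line)
        if header:
            # A new session record starts; forget the previous client address.
            session_id, client = int(header.group(1)), None
            continue
        fields = line.split()
        if len(fields) < 3 or fields[0] != "-":
            continue
        tag, args = fields[1], fields[2:]
        if tag == "Proxy" and len(args) >= 2:
            # Second field of the Proxy record is the client address
            # announced in the PROXY preamble.
            client = args[1]
        elif tag == "SessClose" and len(args) >= 2 and args[0] == "OVERLOAD":
            results.append((session_id, float(args[1]), client))
    return results

print(overloaded_sessions(LOG))
```

Run against the excerpt, this reports session 21 with a duration of 0.049 and the client address 10.3.0.237 taken from the Proxy record.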

It looks like this session close is issued in the ses_handle function in bin/varnishd/cache/cache_session.c. The 0.049 seems to be related to the default value of timeout_linger, which is 0.05.
After I increased timeout_linger, the reported value changed accordingly (again the value of timeout_linger - 1, as with 0.049 for 0.05). Once I raised it high enough (0.4 in my case), the issue went away.
So it looks like the hitch upgrade (or possibly something else, since other system packages such as the kernel, libc and openssl (patch version only) were upgraded too) causes timeout_linger to trigger before Varnish has processed the first request. The session is then dropped without any data being sent back to the client, and no trace is left in the access logs.

@dridi
Member

dridi commented Mar 27, 2020

You need to increase workspace_session. Hitch now passes the SNI entry in PROXY v2 preambles and, as the OVERLOAD close reason obscurely indicates, you might "overload" the session workspace.
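
To get a feel for how much an SNI entry adds to the preamble mentioned in the reply above, here is a rough sketch that builds a PROXY v2 preamble for TCP over IPv4 with and without a host name TLV. It assumes the SNI is carried as a `PP2_TYPE_AUTHORITY` (0x02) TLV as described in the PROXY protocol specification; the host name `www.example.com` and the function name `proxy_v2_preamble` are illustrative, a real Hitch preamble may carry additional TLVs, and how Varnish accounts for this data in the session workspace is not modelled here.

```python
import socket
import struct

PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"  # fixed 12-byte PROXY v2 signature
PP2_TYPE_AUTHORITY = 0x02                   # TLV type for the host name (assumed to hold the SNI)

def proxy_v2_preamble(src_ip, src_port, dst_ip, dst_port, authority=None):
    """Build a PROXY v2 preamble for TCP over IPv4, optionally with an authority TLV."""
    # Address block: source address, destination address, source port, destination port.
    addresses = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
                 + struct.pack("!HH", src_port, dst_port))
    tlvs = b""
    if authority is not None:
        value = authority.encode("utf-8")
        # Each TLV is: 1-byte type, 2-byte big-endian length, then the value.
        tlvs = struct.pack("!BH", PP2_TYPE_AUTHORITY, len(value)) + value
    payload = addresses + tlvs
    # 0x21 = protocol version 2 + PROXY command; 0x11 = AF_INET + STREAM.
    header = PP2_SIGNATURE + struct.pack("!BBH", 0x21, 0x11, len(payload))
    return header + payload

# Addresses taken from the Proxy log record in the report; the host name is made up.
plain = proxy_v2_preamble("10.3.0.237", 46144, "10.1.1.48", 443)
with_sni = proxy_v2_preamble("10.3.0.237", 46144, "10.1.1.48", 443, "www.example.com")
print(len(plain), len(with_sni))
```

The TLV costs three bytes of framing plus the length of the host name, so longer host names leave less room in a small session workspace.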

See varnishcache/varnish-cache@0f3bb35: the new default is 0.75k. If that's still not enough, try adding 0.25k more.
