Caddy prematurely closes connections to reverse proxies when using HTTP/1 #5431
Comments
Hi, thanks for the issue. In order to reproduce this, can you please provide the backend code that produces the response that Caddy has trouble proxying? |
I think golang/go@457fd1d might eventually fix that when we have access to it. I don't have time to test it though. |
You can find a minimal example here: https://github.com/supriyo-biswas/caddy-http1-proxy-issue I run the app server on Ubuntu 20.04 as well, though I believe the issue should also be reproducible on other distributions. Please let me know if you have trouble with getting it up and running. |
I'm seeing very similar behaviour. It was very confusing because the reported "content size" does NOT match the size actually transferred. However, when I "curl" the actual endpoint that's behind the proxy, the correct file content is returned. This pointed towards Caddy as the source of the issue. EDIT: I'm seeing this on Caddy v2.3.0-alpine (docker image) |
If someone wants to push this along, you could try building Caddy with Go's master branch (see https://go.dev/doc/install/source) and changing Caddy to use The problem is that this feature will only land in Go 1.21, and we need to wait for Caddy's minimum version to be 1.21 to safely use this new functionality. That means at the earliest, it'll still be another year until we can do so (due to Go's releases being every 6 months). |
@AnharMiah Does your Python code require a domain name? When I tried it, it always returned 404 when I |
@WeidiDeng I think you intended to mention me instead. If so, please look at the curl command in script.sh within the repo to see how you can make an example request. If you’re running the server directly (as opposed to proxying through Caddy), you’ll need to use /run/xyz instead of /api/run/xyz. |
Roughly one in 50 requests, though it could be that this occurs only on Linux; I use it for both development and production. Next steps for me:
|
I updated the repository with instructions so that it can be run on any Linux distribution that has Docker and git installed, without having to deal with the mess that is Python packaging. I was able to reproduce the issue over both HTTP and HTTPS on Debian 10 (5.10.0-21, x86_64) and CentOS 9 Stream (5.14.0-285, x86_64), so it seems to affect all Linux distributions. How often it occurs seems to depend on network latency: when testing over the loopback (localhost) interface I needed ~400-500 requests to see the disconnect, whereas when testing the server over the internet it could be reproduced in under 50 requests, and sometimes even with 2 requests! When the connection is closed by Caddy, an error showing a bad attempt to read from a closed socket shows up in the logs:
|
I'm guessing this is an issue with stdlib, but I'll do some digging to determine which part is wrong. |
In a rather confusing twist, the issue can't be reproduced on CentOS 8 Stream (4.18.0-448, x86_64) for up to ~3k requests, so I wonder if there's a kernel bug that is surfaced only by how Go's stdlib (or perhaps Caddy) makes network syscalls. In any case, this is a very strange issue and I'm out of ideas about how to dig deeper into it. |
Update: Found a terrible workaround 🤣: I used Nginx as a sidecar. Nginx is able to handle proxying the backend, and Caddy sits in front of Nginx. It's terrible but it works. This means Nginx is more forgiving/graceful with this connection (I also tried Apache and can confirm Apache is also able to handle this type of connection). Also, if it helps, it only seems to happen with a file of a very specific size, about 140kb; smaller files transfer fine. But at 140kb the file gets cut to about 109kb. The back end is Rust Canteen. This leads me to conclude that both Nginx and Apache handle this more gracefully than Caddy, so I would say Caddy is, in my view, the issue here. Edit: I'm not entirely sure if my issue is related to this issue, as in my case it happens consistently every single time: that single 140kb JavaScript file is truncated without fail by Caddy (but not by Nginx or Apache). |
@AnharMiah it would be helpful if you could try to assemble a minimally reproducible example using that stack. Ideally in Docker if you can, so we don't need to install the toolchains on our machines. |
@francislavoie I'll see what I can do when I get some time this weekend, and yes I'll containerise the entire thing. The environment is Linux, but I'll include all the relevant version numbers 👍 |
Is there any way I can help move this issue forward? I'm more than happy to help, it's just that I'm out of ideas as to how to proceed, especially given my observation about the kernel versions being a factor. |
@supriyo-biswas See my comment above: #5431 (comment) |
Last night, I decided to see how far I could get playing with My progress on that can be found in the https://github.com/caddyserver/caddy/compare/response-controller branch (the first commit could land in when we bump Caddy's minimum to Go 1.20, but the Problem is, it's impossible to build this branch right now on the latest unreleased Go version (i.e. See quic-go/quic-go#3757 where I asked for guidance about building against But that does mean there's no way to push this issue forwards at all, we're hard-blocked on waiting for those things to get resolved upstream before we can test |
If anyone here would like to try out #5654, this should have the fix. You'll need to build with Go 1.21. You can install the latest commit of Go with https://pkg.go.dev/golang.org/dl/gotip, or use a Go 1.21 release candidate build from https://go.dev/dl/#go1.21rc3. Checkout the Caddy repo on that branch, and run
|
Details
Description
I had initially reported this issue with v2.6.2 here (https://caddy.community/t/caddy-prematurely-closes-connections-to-reverse-proxies-when-using-http-1/17974). It still persists despite the bugfixes to reverse_proxy in 2.6.4, so I'm creating this GitHub issue.
I have the following Caddyfile that reverse proxies connections to a locally running HTTP/1 API server at port 2000. Whenever the server receives a POST request, it sends a 200 OK with Transfer-Encoding: chunked, and streams its response bit by bit, with delays of 2-3 seconds between chunks.
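For readers who want to picture the backend behaviour described above, here is a minimal sketch of such a server using only the Python standard library. It is not the code from the linked repository: the names `StreamingHandler`, `make_server`, and `CHUNKS`, the payload, and the default delay are all assumptions chosen for illustration.

```python
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical payload; the real API server's output is not shown in the issue.
CHUNKS = [b"first chunk\n", b"second chunk\n", b"third chunk\n"]


class StreamingHandler(BaseHTTPRequestHandler):
    # Chunked transfer encoding only exists in HTTP/1.1.
    protocol_version = "HTTP/1.1"

    def do_POST(self):
        # Drain the request body so the connection stays in a clean state.
        length = int(self.headers.get("Content-Length", 0))
        if length:
            self.rfile.read(length)

        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()

        for index, chunk in enumerate(CHUNKS):
            if index:
                # Pause between chunks (the issue mentions 2-3 seconds).
                time.sleep(self.server.chunk_delay)
            # Each chunk is framed as: hex length, CRLF, data, CRLF.
            self.wfile.write(b"%x\r\n%s\r\n" % (len(chunk), chunk))
            self.wfile.flush()

        # A zero-length chunk marks the end of the body.
        self.wfile.write(b"0\r\n\r\n")
        self.wfile.flush()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=2000, chunk_delay=2.0):
    """Build (but do not start) the streaming server on 127.0.0.1."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StreamingHandler)
    server.chunk_delay = chunk_delay  # read by the handler between chunks
    return server
```

Calling `make_server().serve_forever()` would start it on port 2000, the port used in this report; the handler ignores the request path, so any POST gets the streamed response.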
When I use HTTP/1 to make requests, I’ll sometimes get a truncated response that only contains the first chunk produced by the API server, but not the rest. However, this issue does not occur on HTTP/2.
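One way to tell a truncated chunked body from a complete one is to look for the terminating zero-length chunk that HTTP/1.1 chunked encoding requires. The sketch below is a lenient, illustrative decoder, not part of the original report; the function name `decode_chunked` is hypothetical, and it assumes you have the raw body bytes (after the headers) as captured from the wire. Whether the truncated responses seen here actually lack the terminator is not stated in the issue.

```python
def decode_chunked(raw: bytes):
    """Decode an HTTP/1.1 chunked body.

    Returns (payload, complete). complete is False when the data ends
    before the terminating zero-length chunk has been seen.
    """
    payload = bytearray()
    pos = 0
    while True:
        line_end = raw.find(b"\r\n", pos)
        if line_end == -1:
            return bytes(payload), False  # chunk-size line is cut off
        # Ignore optional chunk extensions after ';'.
        size_field = raw[pos:line_end].split(b";", 1)[0].strip()
        try:
            size = int(size_field, 16)
        except ValueError:
            return bytes(payload), False  # malformed or empty size line
        pos = line_end + 2
        if size == 0:
            # Last chunk reached; trailers, if any, are not parsed here.
            return bytes(payload), True
        chunk = raw[pos:pos + size]
        payload += chunk
        if len(chunk) < size:
            return bytes(payload), False  # chunk data is cut off
        pos += size + 2  # skip the data and its trailing CRLF
```

A body that stops after the first chunk comes back with `complete` set to `False`, which is the symptom described above for HTTP/1 requests.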
A script like the one below can be used to reproduce the issue:
Looking at a packet capture taken on lo on TCP port 2000 (sudo tcpdump -i lo -w my.pcap -s 0 'tcp and src or dst port 2000'), I see that the API server had just sent the first chunk of data. Caddy acknowledged the received data, which is fine, but then it also requested closure of the TCP connection. The FIN, ACK is reciprocated by the app server, and the connection is closed.