podman 5.2.0 does not close connections properly on MacOS #23616
Comments
Can you replace only the gvproxy binary with the one included in the 5.1.2 installer? It should be gvproxy 0.7.3; we updated to 0.7.4 in the 5.2 installer (https://github.com/containers/gvisor-tap-vsock/releases). I wonder if there is a regression there or on the VM side somehow.
Hey @Luap99! It works properly with gvproxy 0.7.3:
Machine internal:
@maciej-szlosarczyk Do you know how to use git bisect? It would be great if you could build gvproxy from source ( |
Bisect points at this commit: containers/gvisor-tap-vsock@600910c. I managed to narrow it down to these 15 or so lines:
Thanks
Hey! I did some additional investigation there and discovered two things:
@maciej-szlosarczyk Can you file a PR for it to the https://github.com/containers/gvisor-tap-vsock repo?
Original issue reported in podman here: containers/podman#23616 Signed-off-by: Maciej Szlosarczyk <[email protected]>
The lines you pointed to as causing the regression are related to this inetaf/tcpproxy commit inetaf/tcpproxy@2862066 I think what happens with this commit is that before this commit After commit 2862066fc2a9405880 however, the code now waits for both |
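To make the hypothesis in this comment concrete, here is a minimal Go sketch of the two closing strategies. It is not the tcpproxy or gvproxy code; the names `proxyCloseOnFirst`, `proxyWaitForBoth`, and `returnsBeforeBackendCloses` are made up for illustration, and the backend is a stand-in that deliberately keeps its side open after the client hangs up.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sync"
	"time"
)

// proxyCloseOnFirst copies in both directions and closes both connections
// as soon as either direction finishes.
func proxyCloseOnFirst(a, b net.Conn) {
	done := make(chan struct{}, 2)
	cp := func(dst, src net.Conn) {
		io.Copy(dst, src)
		done <- struct{}{}
	}
	go cp(a, b)
	go cp(b, a)
	<-done // first direction finished
	a.Close()
	b.Close()
	<-done // closing unblocks the other direction
}

// proxyWaitForBoth half-closes the destination when a source reaches EOF
// and only closes both connections once both directions have finished.
func proxyWaitForBoth(a, b net.Conn) {
	var wg sync.WaitGroup
	cp := func(dst, src net.Conn) {
		defer wg.Done()
		io.Copy(dst, src)
		// Half-close when the connection type supports it (TCP does).
		if cw, ok := dst.(interface{ CloseWrite() error }); ok {
			cw.CloseWrite()
		}
	}
	wg.Add(2)
	go cp(a, b)
	go cp(b, a)
	wg.Wait()
	a.Close()
	b.Close()
}

// returnsBeforeBackendCloses reports whether proxy returns after the client
// hangs up while the (hypothetical) backend still holds its side open.
func returnsBeforeBackendCloses(proxy func(a, b net.Conn)) bool {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	release := make(chan struct{})
	go func() { // backend: keep the connection open until released
		c, err := ln.Accept()
		if err != nil {
			return
		}
		<-release
		c.Close()
	}()

	backend, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	client, proxySide := net.Pipe() // in-memory client connection

	finished := make(chan struct{})
	go func() { proxy(proxySide, backend); close(finished) }()

	client.Close() // the client hangs up
	early := false
	select {
	case <-finished:
		early = true
	case <-time.After(300 * time.Millisecond):
	}
	close(release) // let the backend close so the proxy can always finish
	<-finished
	return early
}

func main() {
	fmt.Println("close-on-first returned early:", returnsBeforeBackendCloses(proxyCloseOnFirst))
	fmt.Println("wait-for-both returned early:", returnsBeforeBackendCloses(proxyWaitForBoth))
}
```

In this sketch the wait-for-both variant only finishes once the backend closes its side, which is the kind of lingering connection the hypothesis would predict if a close is never propagated. Whether that is what actually happens inside gvproxy is not established here.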
This causes a regression in gvproxy when it's used by podman: containers/podman#23616

Thanks to Maciej Szlosarczyk <[email protected]> for investigating and finding the faulty commit!

Reverting inetaf/tcpproxy commit 2862066 is a bit convoluted, as we need to first undo the module name change (inet.af/tcpproxy -> github.com/inetaf/tcpproxy) done in commit 600910c and then add a go module `replace` directive to redirect the no-longer existing inet.af/tcpproxy to the commit we want in github.com/inetaf/tcpproxy/

This way, the module name in gvisor-tap-vsock go.mod and in github.com/inetaf/tcpproxy go.mod are the same (inet.af/tcpproxy), and we can use older commits in this repository.

It's unclear what's causing the regression, as the commit log/PR description/associated issue don't provide useful details: inetaf/tcpproxy@2862066

The best I could find is: tailscale/tailscale#10070
> The close in the handler sometimes occurs before the buffered data is forwarded. The proxy could be improved to perform a half-close dance, such that it will only mutually close once both halves are closed or both halves error.

and inetaf/tcpproxy#21 which seems to be the same issue as inetaf/tcpproxy#38 which is the issue fixed by the commit triggering the regression.

What could be happening is that before inetaf/tcpproxy commit 2862066, as soon as one side of the connection was closed, the other half was also closed, while after commit 2862066, the tcpproxy code waits for both halves of the connection to be closed. So maybe we are missing a connection close somewhere in gvproxy's code :-/

Signed-off-by: Christophe Fergeau <[email protected]>
Tested-by: Maciej Szlosarczyk <[email protected]>
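For readers unfamiliar with the mechanism, the `replace` approach described in this commit message might look roughly like the go.mod fragment below. This is only a sketch: the version strings are placeholders, not the values used in the actual fix, and the real go.mod in gvisor-tap-vsock may be laid out differently.

```
// go.mod fragment (illustrative only; <pseudo-version> is a placeholder)
require inet.af/tcpproxy <pseudo-version>

// Resolve the old module path from the GitHub repository, pinned to a commit
// whose own go.mod still declares "module inet.af/tcpproxy".
replace inet.af/tcpproxy => github.com/inetaf/tcpproxy <pseudo-version>
```

The pinned commit has to predate the upstream rename because Go requires the replacement module's `module` directive to match the path being replaced, which is why the commit message says the module name change must be undone first.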
containers/gvisor-tap-vsock#386 is merged and should address this issue. I'll make a gvisor-tap-vsock release soon.
Sorry I was ooto, we did a new release yesterday.
This should fix the regression reported in containers#23616 Signed-off-by: Christophe Fergeau <[email protected]>
I'll try to see if this can be assigned on our end; @evidolob WDYT? => containers/gvisor-tap-vsock#387
This was fixed in 5.2.3.
Issue Description
After upgrading to 5.2.0, gvproxy keeps connections open for a long time. I noticed it while running tests against PostgreSQL 11 running in a container. After a few runs, postgres would accumulate enough connections and memory usage that it would either get killed due to memory limits or report that too many connections are open.
This is the output of lsof from both the host and inside the podman machine ten minutes after I ran the test:
Looks like a regression between 5.1.2 and 5.2.0. Downgrading back to 5.1.2 fixes this issue.
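One way to exercise the same open/close path without a full test suite is a small client that opens and immediately closes a batch of TCP connections. The Go sketch below is an illustration, not the reporter's test setup: `churn`, `countEOFs`, and `runLocal` are hypothetical names, and the loopback listener stands in for a service published from the podman machine. It also counts how many client closes the server side actually observes as EOF.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// churn opens n TCP connections to addr one after another and closes each
// immediately. It returns how many were opened and closed successfully.
func churn(addr string, n int) (int, error) {
	for i := 0; i < n; i++ {
		c, err := net.Dial("tcp", addr)
		if err != nil {
			return i, err
		}
		if err := c.Close(); err != nil {
			return i, err
		}
	}
	return n, nil
}

// countEOFs accepts n connections on ln and reports how many of them reached
// a clean EOF, i.e. how many client closes the server side observed.
func countEOFs(ln net.Listener, n int) int {
	seen := 0
	for i := 0; i < n; i++ {
		c, err := ln.Accept()
		if err != nil {
			return seen
		}
		if _, err := io.Copy(io.Discard, c); err == nil {
			seen++
		}
		c.Close()
	}
	return seen
}

// runLocal runs churn against a loopback listener (a stand-in for a real
// published container port) and returns both counts.
func runLocal(n int) (opened, eofs int, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, 0, err
	}
	defer ln.Close()

	result := make(chan int, 1)
	go func() { result <- countEOFs(ln, n) }()

	opened, err = churn(ln.Addr().String(), n)
	if err != nil {
		return opened, 0, err
	}
	return opened, <-result, nil
}

func main() {
	opened, eofs, err := runLocal(20)
	if err != nil {
		panic(err)
	}
	fmt.Println("opened and closed:", opened, "server saw EOF on:", eofs)
}
```

To try it against a real setup, `churn` could be pointed at the host address and port published by the container, and the lingering connections then checked with lsof on the host and inside the machine, as described above.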
Steps to reproduce the issue
docker.io/library/postgres:11
Describe the results you received
Connections/File descriptors should be closed properly once they're closed by the client.
Describe the results you expected
podman info output
Podman in a container
No
Privileged Or Rootless
Rootless
Upstream Latest Release
Yes
Additional environment details
podman version:
Additional information