yamux: keepalive failed: i/o deadline reached #231
Comments
Ah, I think I have seen this recently, but have not had time to track down why. Currently, if you 'ignore' a container, it stops responding and then dies. Good find! +1, I think this is a real, current issue. |
I've added a P1 label as well... |
Dumb question, can we simply disable this timeout ? |
@sboeuf ehmm, which timeout? is there a timeout in yamux? |
Take a look at this test on the yamux repo: https://github.com/hashicorp/yamux/blob/master/session_test.go#L797-L838 |
yep, probably maybe we need to set the EnableKeepAlive? https://github.com/hashicorp/yamux/blob/d1caa6c97c9fc1cc9e83bbe34d0603f9ff0ce8bd/mux.go#L16 |
Yes we need to make sure we understand the potential cases but a smart usage of those three flags: https://github.com/hashicorp/yamux/blob/d1caa6c97c9fc1cc9e83bbe34d0603f9ff0ce8bd/mux.go#L16-L27 should do the trick ;) |
Interesting find 👍 |
@WeiZhang555 @devimc -- since this is marked P1... seems we should probably do this with priority. Any takers? |
This is interesting... Line 149 in ea0e6ae
So yamux will use default config, which already EnableKeepAlive : agent/vendor/github.com/hashicorp/yamux/mux.go Lines 38 to 47 in ea0e6ae
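A rough sketch of why the default configuration matters here. The struct below is a local stand-in, not the real `yamux.Config`, and the values (keepalive on, 30s interval, 10s connection write timeout) are what the upstream defaults are believed to be; check them against the vendored `mux.go`. It also assumes a keepalive ping fails once the connection write timeout elapses. `muxConfig`, `assumedDefaults` and `worstCaseDetection` are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// muxConfig is a local stand-in for the keepalive-related fields of
// yamux.Config. It is NOT the library type.
type muxConfig struct {
	EnableKeepAlive        bool
	KeepAliveInterval      time.Duration
	ConnectionWriteTimeout time.Duration
}

// assumedDefaults mirrors what yamux.DefaultConfig() is believed to return.
// These values are an assumption; verify them in the vendored package.
func assumedDefaults() muxConfig {
	return muxConfig{
		EnableKeepAlive:        true,
		KeepAliveInterval:      30 * time.Second,
		ConnectionWriteTimeout: 10 * time.Second,
	}
}

// worstCaseDetection estimates how long a silent peer survives before the
// keepalive gives up: one full interval until the next ping, plus the wait
// for the reply. The boolean is false when keepalive is disabled, meaning
// the session is never closed for this reason.
func worstCaseDetection(c muxConfig) (time.Duration, bool) {
	if !c.EnableKeepAlive {
		return 0, false
	}
	return c.KeepAliveInterval + c.ConnectionWriteTimeout, true
}

func main() {
	cfg := assumedDefaults()
	d, ok := worstCaseDetection(cfg)
	fmt.Println("default config:", d, ok)

	// One of the options discussed in this thread: turn keepalive off.
	cfg.EnableKeepAlive = false
	d, ok = worstCaseDetection(cfg)
	fmt.Println("keepalive disabled:", d, ok)
}
```

Under these assumptions a peer that stops answering is dropped after roughly 40 seconds, so with the defaults a pause of under a minute would be enough to trigger the error.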
Seems this is because we should update the vendor package Now in While in current agent/protocols/client/client.go Lines 122 to 150 in ea0e6ae
I might give it a try on Monday... |
Sorry, the yamux client here is In my TE, I can't reproduce the same phenomenon by leaving it for a long time.
@jingxiaolu thanks for debugging it, let me try to find another way to reproduce it |
Hi, I found an easy way to reproduce it.
Terminal 1
Terminal 2
Terminal 1 gets stuck |
actually 45 seconds is enough to reproduce this issue. |
Good job for both of you~~~ |
We don't know how long a container can be paused, hence the connection write timeout should be large enough not to close the connection while the container is paused. fixes kata-containers/agent#231 fixes kata-containers#70 Signed-off-by: Julio Montes <[email protected]>
please take a look kata-containers/proxy#71 |
We don't know how long a container can be paused, hence the connection write timeout should be disabled so that the connection is not closed while the container is paused. fixes kata-containers/agent#231 fixes kata-containers#70 Signed-off-by: Julio Montes <[email protected]>
Making a mistake. Nowadays, we just support |
Disable yamux keep alive in channel and client. yamux keep alive feature closes the connection with proxy and agent when it's unable to ping them. fixes kata-containers#231 Signed-off-by: Julio Montes <[email protected]>
Disable yamux keep alive in channel and client. yamux keep alive feature closes the connection with proxy and agent when it's unable to ping them. fixes kata-containers/proxy#70 fixes kata-containers#231 Signed-off-by: Julio Montes <[email protected]>
The yamux client runs on the proxy side. Sometimes the client is handling other requests and is not able to respond to the ping sent by the server, so the communication is closed. To avoid I/O timeouts in the communication between agent and proxy, keepalive should be disabled. fixes kata-containers/proxy#70 fixes kata-containers#231 Signed-off-by: Julio Montes <[email protected]>
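The failure mode this commit message describes can be illustrated with a small self-contained simulation. It does not use yamux; `peer`, `runSession` and the durations are hypothetical, chosen only to show the shape of the problem: a keepalive loop pings on an interval, and a peer that is paused (or too busy to answer) causes the session to be torn down, while a session with keepalive disabled survives the same pause.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeout reuses the wording of the error reported in this issue.
var errTimeout = errors.New("i/o deadline reached")

// peer models the remote end of the connection. A paused peer never
// answers pings, like a paused container or a busy client.
type peer struct{ paused bool }

// ping returns a channel that yields a value only if the peer replies.
func (p *peer) ping() <-chan struct{} {
	ch := make(chan struct{}, 1)
	if !p.paused {
		ch <- struct{}{}
	}
	return ch
}

// runSession simulates a session over a number of keepalive intervals.
// With keepAlive enabled, an unanswered ping closes the session with an
// error. With it disabled, the session stays open regardless of the peer.
func runSession(p *peer, keepAlive bool, interval, timeout time.Duration, rounds int) error {
	for i := 0; i < rounds; i++ {
		time.Sleep(interval)
		if !keepAlive {
			continue
		}
		select {
		case <-p.ping():
			// Peer answered in time; keep going.
		case <-time.After(timeout):
			return fmt.Errorf("keepalive failed: %w", errTimeout)
		}
	}
	return nil
}

func main() {
	paused := &peer{paused: true}
	// Millisecond durations keep the demo fast; real values are seconds.
	fmt.Println("keepalive on: ", runSession(paused, true, 10*time.Millisecond, 5*time.Millisecond, 3))
	fmt.Println("keepalive off:", runSession(paused, false, 10*time.Millisecond, 5*time.Millisecond, 3))
}
```

The trade-off is that with keepalive off, a peer that is really dead is no longer detected by the multiplexer, so that detection has to come from elsewhere if it is needed.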
I got the following warning after upgrading the dep tool:

Warning: the following project(s) have [[constraint]] stanzas in Gopkg.toml:
  ✗ github.com/hashicorp/yamux
However, these projects are not direct dependencies of the current project: they are not imported in any .go files, nor are they in the 'required' list in Gopkg.toml. Dep only applies [[constraint]] rules to direct dependencies, so these rules will have no effect. Either import/require packages from these projects so that they become direct dependencies, or convert each [[constraint]] to an [[override]] to enforce rules on these projects, if they happen to be transitive dependencies,

So let's convert constraint to override over yamux. In the meanwhile, update the yamux vendor.

Full commit list:
4c2fe0d (origin/b-consul-3040) Dont output keepalive error when the session is closed
f21aae5 Make sure to drain the timer channel on defer, and a clarifying comment
601ccd8 Make receive window update logic a bit cleaner
02d320c Uses timer pool in sendNoWait, like in waitForSendErr
cf433c5 window update unit test for partial read; benchmark large buffer
ca8dfd0 improve memory utilization in receive buffer, fix flow control
683f491 Fix race around read and write deadlines in Stream (#52)
40b86b2 Add public session CloseChan method (#44)

Note that commit 4c2fe0d might also help kata-containers/agent/issues/231.

Signed-off-by: Peng Tao <[email protected]>
We don't know how long a sandbox/container can be paused, hence the connection write timeout should be disabled so that the connection is not closed while the sandbox/container is paused. The same issue has been fixed in kata-proxy; the katabuiltin proxy also needs this fix. fixes kata-containers#231 fixes kata-containers#70 Signed-off-by: fupan <[email protected]>
The yamux client runs on the proxy side. Sometimes the client is handling other requests and is not able to respond to the ping sent by the server, so the communication is closed. To avoid I/O timeouts in the communication between agent and proxy, keepalive should be disabled. Depends-on: github.com/kata-containers/proxy#91 fixes kata-containers/proxy#70 fixes kata-containers#231 Signed-off-by: Julio Montes <[email protected]>
We don't know how long a container can be paused, hence the connection write timeout should be disabled so that the connection is not closed while the container is paused. fixes kata-containers/agent#231 fixes #70 Signed-off-by: Julio Montes <[email protected]>
The yamux client runs on the proxy side. Sometimes the client is handling other requests and is not able to respond to the ping sent by the server, so the communication is closed. To avoid I/O timeouts in the communication between agent and proxy, keepalive should be disabled. Depends-on: github.com/kata-containers/proxy#91 fixes kata-containers/proxy#70 fixes #231 Signed-off-by: Julio Montes <[email protected]>
Description of problem
Run a container with -ti and pause it for 45 seconds
Terminal 1
Terminal 2
Terminal 1 gets stuck after 45 seconds and the following message is logged in the journal