-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
consider removing / disabling keep-alives #44
Comments
if our release isn't doing anything with them, I'd probably be in favor of a major version update that removes them entirely. |
Yeah, I'd love to remove this from the stream multiplexer (especially given that mplex doesn't support it). However, we need keepalives in all transports before we should consider removing them here.
|
On the protocol level, Websockets should have support for keep-alive: https://tools.ietf.org/html/rfc6455#section-5.5.2. That said, isn't it guaranteed that we always run Websockets over TCP or QUIC, both of which already support keep-alives?
I'm not sure there's a problem here. The relay will also run the TCP / QUIC transport, so they'll do keep-alives to both sides of the relayed connection. Even if they disable that (don't really see a reason why they'd do that, but let's just assume they do), receiving and acknowledging a keep-alive packet from both connection endpoints should be sufficient to keep any NAT binding alive. |
Yes, we just need to enable them on the websocket transport. That is, the websocket transport doesn't, IIRC, use the TCP transport under the covers. It uses a shared "reuseport transport". |
The problem is the relay. The relay can let the connection start, then DoS both sides by hanging (not necessarily due to malice). Without keepalives, we won't notice that the connection simply doesn't work. With keepalives, the relay will need to let some traffic through sometimes. (obviously, we should try to upgrade anyways; but we still want to detect hung relayed connections) |
A malicious relay can always stall the connection, so that's nothing we can defend against. The only thing we can and should do is detect it and time out in a timely manner. |
Yep, that's my point. We need some form of keepalive to detect stalled connections. |
Maybe I'm misunderstanding you. Dont we have that, because the two peers send keepalives to the relay? |
No. We have keepalives _to_ the relay but that assumes that the relay is behaving correctly. We need keepalives that go end-to-end. The relay might:
* Be malicious. We can't fully protect against this, but we can at least ensure that the relay allows _some_ traffic through.
* Have a stuck/frozen libp2p node. TCP keepalives will continue to work because the kernel is fine.
|
NAT mappings are kept from expiring if the the NAT occasionally sees a connection being used.
Architecturally, sending keep-alives is a responsibility of the underlying transport, not of the stream multiplexer. For TCP, we can set the SO_KEEPALIVE socket option in go-tcp-transport, and the kernel will take care of keeping the connection alive. When running yamux on top of any other transport, we can't make any assumptions about the necessity and the frequency of keep-alive intervals anyway.
I suggest adding SO_KEEPALIVE support to go-tcp-transport and removing the respective code from go-yamux. Open question: Do we only deprecate
Config.EnableKeepAlive
andConfig.KeepAliveInterval
, or do we remove them?WDYT, @Stebalien and @willscott?
The text was updated successfully, but these errors were encountered: