swarm: deprecate `KeepAlive::Until` #3844

thomaseizinger · 2023-04-27T13:31:42Z

Description

Deprecate the KeepAlive::Until variant. Users can achieve the same thing by having a timer internally.
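As a rough illustration of "having a timer internally", the sketch below keeps the instant at which a handler last went idle and derives the keep-alive value from it. Everything here is hypothetical: `KeepAlive` is a local stand-in with only the two remaining variants, `GraceHandler` and its methods are invented for the example, and `now` is passed in explicitly so the logic can be exercised without sleeping. A real handler would presumably also need a timer future polled from `poll` so that the task is woken when the grace period ends.

```rust
use std::time::{Duration, Instant};

/// Local stand-in for the two remaining variants; this is not the libp2p type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeepAlive {
    Yes,
    No,
}

/// Hypothetical handler that wants a grace period after its last activity.
struct GraceHandler {
    /// Number of requests currently in flight (illustrative state).
    in_flight: usize,
    /// When the handler last became idle; `None` while it is busy.
    idle_since: Option<Instant>,
    /// How long to keep the connection alive after going idle.
    grace: Duration,
}

impl GraceHandler {
    /// A fresh handler starts out idle, so the grace period starts at `now`.
    fn new(grace: Duration, now: Instant) -> Self {
        Self { in_flight: 0, idle_since: Some(now), grace }
    }

    fn start_request(&mut self) {
        self.in_flight += 1;
        self.idle_since = None;
    }

    fn finish_request(&mut self, now: Instant) {
        self.in_flight = self.in_flight.saturating_sub(1);
        if self.in_flight == 0 {
            self.idle_since = Some(now);
        }
    }

    /// Computed from state only: busy, or idle for less than `grace`.
    fn connection_keep_alive(&self, now: Instant) -> KeepAlive {
        match self.idle_since {
            None => KeepAlive::Yes,
            Some(since) if now.saturating_duration_since(since) < self.grace => KeepAlive::Yes,
            Some(_) => KeepAlive::No,
        }
    }
}
```

Because the deadline is derived from `idle_since` rather than stored as a `KeepAlive` value, there is no field that can be refreshed with a new timestamp by accident.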

Motivation

Understanding how to use KeepAlive::Until properly is difficult. It requires keeping KeepAlive around as a field in the ConnectionHandler; otherwise it is updated forever with a new timestamp. Having KeepAlive as a field, in turn, makes it more difficult to compute accurately: it could be set from No to Yes and back, etc.

Ideally, a ConnectionHandler can just compute its KeepAlive value from its internal state. There should be no reason to keep an idle connection around. If a ConnectionHandler has nothing to do, it should immediately return KeepAlive::No.

To prevent a connection from being shut down directly after it is established, NetworkBehaviours should pass initial state in handle_established_outbound_connection and handle_established_inbound_connection. This way, the connection_keep_alive function can correctly return KeepAlive::Yes.

Whether completely idle connections should be shut down is an application-wide decision and cannot reasonably be computed by a single ConnectionHandler. Users should use swarm::Config::with_idle_connection_timeout to configure that. See #4121.
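The approach described in this section can be sketched roughly as follows. This is a minimal model under stated assumptions, not the libp2p API: `KeepAlive` is a local two-variant stand-in, and `Handler`, `Request`, `with_initial_work`, `send_next` and `on_response` are hypothetical names. The constructor plays the role of a behaviour handing initial state to the handler when the connection is established.

```rust
use std::collections::VecDeque;

/// Local stand-in; this is not the libp2p type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeepAlive {
    Yes,
    No,
}

/// Hypothetical unit of work handed to the handler.
#[derive(Debug)]
struct Request(u32);

/// Hypothetical handler whose keep-alive is derived purely from its state.
struct Handler {
    /// Work that has not been sent yet.
    pending: VecDeque<Request>,
    /// Requests sent but not yet answered.
    in_flight: usize,
}

impl Handler {
    /// Stand-in for a behaviour passing initial state when the connection
    /// is established, so the handler is not idle on its first poll.
    fn with_initial_work(initial: impl IntoIterator<Item = Request>) -> Self {
        Self { pending: initial.into_iter().collect(), in_flight: 0 }
    }

    /// Take the next pending request and count it as in flight.
    fn send_next(&mut self) -> Option<Request> {
        let next = self.pending.pop_front();
        if next.is_some() {
            self.in_flight += 1;
        }
        next
    }

    /// Record that one in-flight request has been answered.
    fn on_response(&mut self) {
        self.in_flight = self.in_flight.saturating_sub(1);
    }

    /// No stored `KeepAlive` field: the value is recomputed on every call.
    fn connection_keep_alive(&self) -> KeepAlive {
        if self.pending.is_empty() && self.in_flight == 0 {
            KeepAlive::No
        } else {
            KeepAlive::Yes
        }
    }
}
```

A handler created with initial work reports `KeepAlive::Yes` from the start, and falls back to `KeepAlive::No` once the last response arrives.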

Tasks

feat(identify): keep connection alive while we are using it #3876 [send-it]
feat(dcutr): keep connection alive while we are using it #3960 [send-it]
feat(request-response): deprecate Config::set_connection_keep_alive #4029 [internal-change, send-it]
feat(gossipsub): deprecate Config::idle_timeout #4648 [send-it]
feat(swarm): deprecate KeepAlive::Until #4656 [send-it]
feat(kad): deprecate Config::set_connection_idle_timeout #4675 [send-it]
feat(gossipsub): remove KeepAlive::Until #4642 [send-it]
refactor(relay): remove KeepAlive::Until #4676 [send-it]
feat(swarm): remove KeepAlive::Until from OneshotHandler #4677 [send-it]
feat(request-response): remove Config::set_connection_keep_alive #4679 [internal-change, send-it]
feat(kad): remove deprecated Config::set_connection_idle_timeout #4659 [send-it]
refactor(relay): keep connection alive while we are using it #4696 [send-it, trivial]

thomaseizinger · 2023-04-27T13:32:25Z

cc @nathanielc

thomaseizinger · 2023-05-05T06:10:00Z

Identify doesn't use Until but #3876 still shows quite well how I think protocols should implement the keep-alive mechanism.

thomaseizinger · 2023-05-08T09:00:13Z

#3876 is queued for merging now. As a first step towards deprecating KeepAlive::Until, I'd like to move all protocols to this new approach of handling KeepAlive. We should do it as one PR per protocol.

Once all protocols have been ported, we can deprecate the enum variant itself and have our users do the same.

Currently, the `connection_keep_alive` function of identify does not compute anything but its return value is set through the handler state machine. This is hard to understand and causes surprising behaviour because at the moment, we set `KeepAlive::No` as soon as the remote has answered our identify request. Depending on what else is happening on the connection, this might close the connection before we have successfully answered the remote's identify request. To fix this, we now compute `connection_keep_alive` based on whether we are still using the connection. Related: #3844. Pull-Request: #3876.

nathanielc · 2023-05-09T14:43:28Z

> To prevent a connection from being shut down directly after it is established, NetworkBehaviours should pass initial state in handle_established_outbound_connection and handle_established_inbound_connection. This way, the connection_keep_alive function can correctly return KeepAlive::Yes.

Has this part already been implemented? I had assumed this was a blocker to making many changes here. Was the identify protocol a simpler case?

thomaseizinger · 2023-05-09T18:28:27Z

> > To prevent a connection from being shut down directly after it is established, NetworkBehaviours should pass initial state in handle_established_outbound_connection and handle_established_inbound_connection. This way, the connection_keep_alive function can correctly return KeepAlive::Yes.
>
> Has this part already been implemented? I had assumed this was a blocker to making many changes here. Was the identify protocol a simpler case?

Yes, the identify protocol was simple in this regard because no messages are generated in the behaviour.

Help in migrating other protocols away from KeepAlive::Until would be much appreciated. Off the top of my head, kademlia might require some changes and perhaps gossipsub. I don't think any other protocol actively establishes connections?

drHuangMHT · 2023-05-10T03:02:13Z

Once Until is removed, KeepAlive can be Copy or be replaced by a boolean. Any interest?

thomaseizinger · 2023-05-10T10:41:34Z

It is already Copy. Not sure there is any practical benefit from it being a boolean. Having a separate type gives us a place to put documentation on it for example.

Similar to #3876, we now compute `connection_keep_alive` based on whether we are still using the connection, applied to the `dcutr` protocol. Related: #3844. Pull-Request: #3960.

thomaseizinger · 2023-06-05T11:27:49Z

@mxinden Do you think it makes sense to add a general idle_connection_timeout to Swarm? It would trigger as soon as all ConnectionHandlers report KeepAlive::No. We can default it to 0 seconds.

When trying to transition libp2p-request-response away from KeepAlive::Until, I ran into the issue that the perf protocol currently relies on setting the connection timeout. Tests would also benefit from such a config option.

One could even ask whether we shouldn't deprecate keep_alive::Behaviour completely in favor of an idle connection timeout setting on Swarm.
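To make the proposed semantics concrete, here is a small model of an idle connection timeout that starts counting once every handler reports `KeepAlive::No`. It is only a sketch: `IdleTracker` and its `poll` method are hypothetical and do not reflect how `Swarm` is implemented, `KeepAlive` is a local stand-in, and time is passed in explicitly to keep the example deterministic.

```rust
use std::time::{Duration, Instant};

/// Local stand-in; this is not the libp2p type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeepAlive {
    Yes,
    No,
}

/// Hypothetical per-connection bookkeeping for an idle timeout.
struct IdleTracker {
    idle_timeout: Duration,
    /// When all handlers first reported `No`; `None` while any says `Yes`.
    idle_since: Option<Instant>,
}

impl IdleTracker {
    fn new(idle_timeout: Duration) -> Self {
        Self { idle_timeout, idle_since: None }
    }

    /// Takes the current votes of all handlers and returns `true` if the
    /// connection should be closed at `now`.
    fn poll(&mut self, votes: &[KeepAlive], now: Instant) -> bool {
        let all_idle = votes.iter().all(|v| *v == KeepAlive::No);
        if !all_idle {
            // Any activity resets the idle timer.
            self.idle_since = None;
            return false;
        }
        let since = *self.idle_since.get_or_insert(now);
        now.saturating_duration_since(since) >= self.idle_timeout
    }
}
```

With a timeout of zero the tracker closes the connection on the first poll in which all handlers report `No`, which matches the behaviour without such a setting. A non-zero timeout gives tests and client-server style applications a window in which the connection can be reused.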

tcoratger · 2023-06-09T14:42:43Z

@thomaseizinger In the relay protocol, I saw that at the end of the poll function in the behaviour handler there is the following part:

// Check keep alive status.
if self.reservation_request_future.is_none()
    && self.circuit_accept_futures.is_empty()
    && self.circuit_deny_futures.is_empty()
    && self.alive_lend_out_substreams.is_empty()
    && self.circuits.is_empty()
    && self.active_reservation.is_none()
{
    match self.keep_alive {
        KeepAlive::Yes => {
            self.keep_alive = KeepAlive::Until(Instant::now() + Duration::from_secs(10));
        }
        KeepAlive::Until(_) => {}
        KeepAlive::No => panic!("Handler never sets KeepAlive::No."),
    }
} else {
    self.keep_alive = KeepAlive::Yes;
}

If I'm correct, the goal of this snippet is to keep the connection alive for a few more seconds, even if the condition is false. If we remove self.keep_alive in the context of this issue, this snippet goes away and we then have to find a way to mimic it so that the connection_keep_alive function returns KeepAlive::Yes in such circumstances, correct?

thomaseizinger · 2023-06-11T13:42:45Z

Yes, although in general, I'd like to move away from implicitly keeping connections open for longer than they are needed.

For the relay protocol, what happens if we just remove the Until bit and just determine KeepAlive based on the above conditions?
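One way to picture that suggestion is shown below: the same idle conditions as in the quoted snippet, but evaluated directly inside `connection_keep_alive` with no stored `keep_alive` field and no `Until` grace period. This is a sketch under assumptions, not the relay handler itself: the field names are copied from the snippet, while `RelayHandlerState`, the placeholder `()` element types and the local `KeepAlive` enum are invented for illustration.

```rust
/// Local stand-in; this is not the libp2p type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeepAlive {
    Yes,
    No,
}

/// Field names mirror the quoted snippet; the types are simplified
/// placeholders rather than the real futures and substreams.
#[derive(Default)]
struct RelayHandlerState {
    reservation_request_future: Option<()>,
    circuit_accept_futures: Vec<()>,
    circuit_deny_futures: Vec<()>,
    alive_lend_out_substreams: Vec<()>,
    circuits: Vec<()>,
    active_reservation: Option<()>,
}

impl RelayHandlerState {
    /// The condition from the snippet: nothing is pending or active.
    fn is_idle(&self) -> bool {
        self.reservation_request_future.is_none()
            && self.circuit_accept_futures.is_empty()
            && self.circuit_deny_futures.is_empty()
            && self.alive_lend_out_substreams.is_empty()
            && self.circuits.is_empty()
            && self.active_reservation.is_none()
    }

    /// Keep-alive follows the condition directly, without a grace period.
    fn connection_keep_alive(&self) -> KeepAlive {
        if self.is_idle() {
            KeepAlive::No
        } else {
            KeepAlive::Yes
        }
    }
}
```

The observable difference from the original snippet is that the connection is reported as idle as soon as the last circuit or reservation is gone, instead of ten seconds later.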

mxinden · 2023-06-26T11:40:52Z

The proposal above, namely to add an idle_connection_timeout to Swarm, makes sense to me.

At first I wondered whether an idle_at_start_connection_timeout would make more sense. Thinking about it some more, idle_connection_timeout, i.e. a timeout applicable not only at the start of the connection but throughout its lifecycle, might be easier to reason about.

thomaseizinger · 2023-06-26T17:19:48Z

> The proposal above, namely to add an idle_connection_timeout to Swarm, makes sense to me.
>
> At first I wondered whether an idle_at_start_connection_timeout would make more sense. Thinking about it some more, idle_connection_timeout, i.e. a timeout applicable not only at the start of the connection but throughout its lifecycle, might be easier to reason about.

Yeah I think that is very hard to teach and is a bit too specialized for my taste. Also, with the strategy of initialising ConnectionHandlers with the work they should do, the "a connection that might be useful later is idle initially" is not a problem.

On the other hand, having connections not close immediately when they go idle after doing some work is definitely a problem:

in our tests
in examples
when running the benchmarks

The benchmarks in particular are a special-case of applications that aren't strictly p2p but more client-server style, i.e. you want to connect to very particular nodes.

I opened an issue here: #4121

Previously, a connection would be shut down immediately as soon as its `ConnectionHandler` reports `KeepAlive::No`. As we have gained experience with libp2p, it turned out that this isn't ideal. For one, tests often need to keep connections alive longer than the configured protocols require. Plus, some use cases require connections to be kept alive in general. Both of these needs are currently served by the `keep_alive::Behaviour`. That one does essentially nothing other than statically returning `KeepAlive::Yes` from its `ConnectionHandler`. It makes much more sense to deprecate `keep_alive::Behaviour` and instead allow users to globally configure an `idle_connection_timeout` on the `Swarm`. This timeout comes into effect once a `ConnectionHandler` reports `KeepAlive::No`. To start with, this timeout is 0. Together with #3844, this will allow us to move towards a much more aggressive closing of idle connections, together with a more ergonomic way of opting out of this behaviour. Fixes #4121. Pull-Request: #4161.

thomaseizinger · 2023-10-15T08:59:54Z

@mxinden Shall we put a #[deprecated] on KeepAlive::Until already and currently #[allow] them within our repository? That would give us a head-start and allow (pun intended) us to remove KeepAlive::Until with the next breaking release already!
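The mechanics of that suggestion look roughly like this in plain Rust. The enum below is a local stand-in and not the libp2p definition, the deprecation note text is made up, and `is_time_limited` and `legacy_until` are hypothetical helpers standing in for code inside the repository that still needs the variant.

```rust
use std::time::Instant;

/// Local stand-in for the enum under discussion; not the libp2p definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeepAlive {
    /// Marking a single variant as deprecated warns at every use site.
    #[deprecated(note = "Compute keep-alive from handler state instead; see issue #3844.")]
    Until(Instant),
    Yes,
    No,
}

/// Internal code that still matches on the variant opts out of the lint locally.
#[allow(deprecated)]
fn is_time_limited(keep_alive: KeepAlive) -> bool {
    matches!(keep_alive, KeepAlive::Until(_))
}

/// Internal code that still constructs the variant does the same.
#[allow(deprecated)]
fn legacy_until(deadline: Instant) -> KeepAlive {
    KeepAlive::Until(deadline)
}
```

Downstream users who construct or match on `KeepAlive::Until` get a compiler warning with the note, while the annotated internal functions keep compiling quietly until they are migrated.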

Deprecate the `Config::idle_timeout` function in preparation for removing `KeepAlive::Until` entirely. Related: #3844. Pull-Request: #4648.

mxinden · 2023-10-16T14:37:23Z

> @mxinden Shall we put a #[deprecated] on KeepAlive::Until already and currently #[allow] them within our repository? That would give us a head-start and allow (pun intended) us to remove KeepAlive::Until with the next breaking release already!

Works for me @thomaseizinger.

In preparation for removing this variant, we are issuing a deprecation notice. The linked issue describes why and guides users in migrating away from it. This deprecation is backwards-compatible and allows us to remove the variant in the next breaking release. We haven't migrated all internal protocols yet but that isn't strictly necessary to issue the deprecation. Related: #3844. Pull-Request: #4656.

`KeepAlive::Until` is being deprecated. We emulate the functionality within the `ConnectionHandler` to ensure relayed connections don't get immediately closed when a client establishes one. Related: #3844. Related: #4656. Pull-Request: #4676.

Previously, the relay client was applying a 10 second timeout to idle connections. Instead, we now compute the connection keep alive based on whether we are still using the connection. This makes all tests pass apart from 1: `reuse_connection`. This makes sense as that connection is immediately idle once dialed. To make that test pass, we configure a global idle connection timeout of 1 second. In a real-world scenario, reusing a connection only applies if the connection is still alive due to other protocols being active. Related: #3844. Pull-Request: #4696.

thomaseizinger · 2023-10-20T11:16:10Z

This is all done! Many thanks to @leonzchang for all the help!

leonzchang · 2023-10-20T15:51:46Z

> This is all done! Many thanks to @leonzchang for all the help!

@thomaseizinger Thanks for your guidance too!

drHuangMHT mentioned this issue May 5, 2023

feat: deprecate 'KeepAlive::Until' globally #3880

Closed

thomaseizinger mentioned this issue May 5, 2023

swarm: Remove ConnectionHandler::Error #3591

Closed

thomaseizinger added priority:nicetohave difficulty:moderate help wanted labels May 8, 2023

thomaseizinger mentioned this issue May 8, 2023

feat(identify): keep connection alive while we are using it #3876

Merged

tcoratger mentioned this issue May 17, 2023

feat(dcutr): keep connection alive while we are using it #3960

Merged

thomaseizinger mentioned this issue Jun 5, 2023

feat(request-response): deprecate Config::set_connection_keep_alive #4029

Merged

thomaseizinger mentioned this issue Jun 26, 2023

swarm: Introduce an idle_connection_timeout config option #4121

Closed

thomaseizinger mentioned this issue Jul 31, 2023

feat(swarm): allow configuration to idle connection timeout #4161

Merged

thomaseizinger mentioned this issue Aug 9, 2023

Simplified handling of idle connections #4306

Closed

thomaseizinger added this to the Simplify idle connection management milestone Sep 17, 2023

thomaseizinger removed this from the Simplify idle connection management milestone Sep 19, 2023

thomaseizinger mentioned this issue Oct 13, 2023

fix(swarm): keep connections alive while active streams exist #4595

Merged

This was referenced Oct 13, 2023

feat(gossipsub): remove KeepAlive::Until #4642

Merged

feat(gossipsub): deprecate Config::idle_timeout #4648

Merged

mergify bot pushed a commit that referenced this issue Oct 15, 2023

feat(gossipsub): deprecate Config::idle_timeout

15ad4ea

Deprecate the `Config::idle_timeout` function in preparation for removing the `KeepAlive::Until` entirely. Related: #3844. Pull-Request: #4648.

leonzchang mentioned this issue Oct 16, 2023

chore(perf): remove unused module #4655

Merged

thomaseizinger mentioned this issue Oct 16, 2023

feat(swarm): deprecate KeepAlive::Until #4656

Merged

leonzchang mentioned this issue Oct 16, 2023

feat(kad): remove deprecated Config::set_connection_idle_timeout #4659

Merged

mergify bot pushed a commit that referenced this issue Oct 16, 2023

chore(perf): remove unused module

e1c1f03

Related: #3844. Pull-Request: #4655.

leonzchang mentioned this issue Oct 17, 2023

feat(kad): deprecate Config::set_connection_idle_timeout #4675

Merged

mergify bot pushed a commit that referenced this issue Oct 17, 2023

feat(kad): deprecate Config::set_connection_idle_timeout

411824a

Related: #3844. Related: #4659. Pull-Request: #4675.

mergify bot pushed a commit that referenced this issue Oct 17, 2023

feat(swarm): deprecated keep_alive_timeout from OneShotHandlerConfig

203e11c

Related: #3844. Related: #4677. Pull-Request: #4680.

mergify bot pushed a commit that referenced this issue Oct 20, 2023

feat(swarm): remove KeepAlive::Until from OneshotHandler

b1b46d1

Related: #3844. Related: #4656. Pull-Request: #4677.

mergify bot pushed a commit that referenced this issue Oct 20, 2023

feat(gossipsub): remove KeepAlive::Until

fafa6bc

Related: #3844. Pull-Request: #4642.

mergify bot pushed a commit that referenced this issue Oct 20, 2023

feat(request-response): remove Config::set_connection_keep_alive

79f961f

This function has been deprecated and can now be removed. Related: #3844. Related: #4678. Pull-Request: #4679.

thomaseizinger mentioned this issue Oct 20, 2023

refactor(relay): keep connection alive while we are using it #4696

Merged

mergify bot pushed a commit that referenced this issue Oct 20, 2023

feat(kad): remove deprecated Config::set_connection_idle_timeout

1e3ffeb

Related: #3844. Related: #4656. Pull-Request: #4659.

thomaseizinger closed this as completed Oct 20, 2023

binarybaron added a commit to UnstoppableSwap/core that referenced this issue Oct 23, 2024

fix(libp2p): Use Instant instead of KeepAlive in swap/src/network/swa…

01e8810

…p_setup/alice.rs See: libp2p/rust-libp2p#3844
