Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an AddressBook to Swarm #2133

Closed
wants to merge 3 commits into from

Conversation

thomaseizinger
Copy link
Contributor

@thomaseizinger thomaseizinger commented Jul 11, 2021

Couple of things to discuss:

  1. I chose to put the AddressBook within the Swarm to minimize changes of the public API.
  2. To make my test more reliable, I added a variant SwarmEvent::AddressBookUpdated. This could in theory be removed again if we want to keep changes to the public API as small as possible.
  3. I only updated the Identify protocol to use this. I don't know enough about Kademlia to make changes there and I think for the time being (i.e. as long as NetworkBehaviour::addresses_of_peer exists), letting Kademlia manage its own internal address book is actually better.

Open questions

  • Should we add a method Swarm::add_address? Doing so would allow us to simplify the public API of at least libp2p-request-response but it is also a commitment to Swarm knowing about the address book. With the current API changes (NetworkBehaviourAction::ReportPeerAddr and SwarmEvent::AddressBookUpdated) we could still move into a different direction and f.e. not embed an AddressBook inside Swarm.
  • Should libp2p-kad emit NetworkBehaviourAction::ReportPeerAddr? This could lead to a very large address book if we are connected to something like the IPFS DHT.

Open tasks (once we have consensus on the design)

  • Update Mdns to not use addresses_of_peer but report the discovered addresses via NetworkBehaviourAction::ReportPeerAddr
  • Normalize addresses in address book to always contain /p2p (Is this actually necessary? I think with [libp2p-dns] Implement /dnsaddr resolution. #1931, we already always append /p2p to a multiaddr before dialling it.)

For now, the `Swarm` doesn't do anything with this, we are just putting
infrastructure in place.
@dvc94ch
Copy link
Contributor

dvc94ch commented Jul 12, 2021

I'd like to point to https://github.com/ipfs-rust/ipfs-embed/blob/master/src/net/peers.rs as a reference of previous work.

  • the lifecycle methods should be handled. When a client opens a connection to a server for example the server should have the client in it's address book. When an address is unreachable it should be removed from the address book.

  • the swarm should get new events. In particular a Discovered(PeerId) and Unreachable(PeerId) are very useful. This allows dialing on the Discovered event and when a permanent connection to a server is needed, after a delay readd the address and redialing.

  • adding events for when a new address was added are less useful, the swarm should allow querying addresses of a PeerId and other information like rtt, if we are connected, when the connection started, etc.

  • mdns should also cause peers to be added to the address book.

  • kad should probably also cause peers to be added to the address book, although how it should work in detail is not clear to me yet as I don't have much experience with kad atm. we can probably get away without it since we have this addresses_of_peers thing for the time being.

  • addresses of peers should be normalized. my preference is to make sure that only addresses with a /p2p/ suffix are added. dialing unknown peers is not supported in ipfs-embed and will cause an error. See [1] for the details of how to do it.

  • [1] https://github.com/ipfs-rust/ipfs-embed/blob/master/src/net/p2p_wrapper.rs

@thomaseizinger
Copy link
Contributor Author

I'd like to point to https://github.com/ipfs-rust/ipfs-embed/blob/master/src/net/peers.rs as a reference of previous work.

Thanks! I will definitely take a look to see how this PR can be improved.

Please have a look at this discussion as well for some more context on what I am trying to achieve here.

the lifecycle methods should be handled. When a client opens a connection to a server for example the server should have the client in it's address book. When an address is unreachable it should be removed from the address book.

That makes sense although not all addresses of incoming connections will be dialable (TCP's outgoing ports are an issue here f.e.) but it may be worth to try anyway.

the swarm should get new events. In particular a Discovered(PeerId) and Unreachable(PeerId) are very useful. This allows dialing on the Discovered event and when a permanent connection to a server is needed, after a delay readd the address and redialing.

I agree that such events would generally be useful. At the moment, I am trying to land an MVP version of address book that allows NetworkBehaviours to report discovered peers in a generalized fashion. That should serve as a minimal but useful building block for a feature-rich address book.

adding events for when a new address was added are less useful, the swarm should allow querying addresses of a PeerId and other information like rtt, if we are connected, when the connection started, etc.

As stated in the PR description, I primarily added this for testing purposes. If we find a good way of testing this reliably, I am happy to remove it.

mdns should also cause peers to be added to the address book.

mdns is already reporting its discovered peers via addresses_of_peer. They would be duplicates if we would also add them to the AddressBook. Once we have consensus that we want to move forward with this design, we can update mdns to not use addresses_of_peers and use NetworkBehaviourAction::ReportPeerAddr instead.

kad should probably also cause peers to be added to the address book, although how it should work in detail is not clear to me yet as I don't have much experience with kad atm. we can probably get away without it since we have this addresses_of_peers thing for the time being.

See (3) in the PR description.

addresses of peers should be normalized. my preference is to make sure that only addresses with a /p2p/ suffix are added. dialing unknown peers is not supported in ipfs-embed and will cause an error. See [1] for the details of how to do it.

That makes sense, I'll add it to the PR description as a task. Will only proceed if we can get an ACK on the design though! :)

@dvc94ch
Copy link
Contributor

dvc94ch commented Jul 13, 2021

That makes sense although not all addresses of incoming connections will be dialable (TCP's outgoing ports are an issue here f.e.) but it may be worth to try anyway.

Ah without port reuse enabled there will certainly be a lot of garbage. But not handling this correctly was actually a bug in actyx at some point. After sim-open lands I think making port-reuse the default may be sensible. Not enabling it just causes a larger test matrix of behaviours, most of which will be untested.

@thomaseizinger
Copy link
Contributor Author

I am having 2nd thoughts on the design of this. Instead of introducing ReportPeerAddr on NetworkBehaviourAction, we could improve other discovery protocols.

  1. Identify could build up a list of addresses through the returned listenAddr and return them via addresses_of_peer.
  2. Mdns could do the same thing.

If we are consistent in the use of addresses_of_peer for protocols with discovery-like functionality, then we can narrow down the usecase of the AddressBook more. In particular, we could adopt the following mindset:

  • Discovery protocols MUST report discovered addresses via addresses_of_peer. Currently, mdns and identify do not adhere to this.
  • Discovery protocols MAY purge their cache of discovered addresses for all kinds of reasons (i.e. libp2p-kad does this).
  • The AddressBook would serve as a persistent store of peer ID <> multiaddress mappings that is always user-defined and never automatically updated. This would allow users to remember connection details for "important" peers, the same way people save phone numbers to their address book.

The reason I am proposing this behaviour is because having all NetworkBehaviours report addresses to the address book means we would need some kind of mechanism to cap the number of addresses we are storing. If you connect to something like the IPFS-DHT and keep doing random-walks, you will discover A LOT of peers. The whole point of a DHT is that you don't need to store all connection details. Having some NetworkBehaviours use ReportPeerAddr and others not is weirdly inconsistent and doesn't make rust-libp2p easier to understand.

On the upside, if we manage to design an AddressBook that can be the single source of truth for addresses then we could remove addresses_of_peer entirely which would simplify the implementation of NetworkBehaviours.

@thomaseizinger
Copy link
Contributor Author

My conclusion from the above post is that we should actually fix Identify first and potentially add an AddressBook that implements NetworkBehaviour.

Not only is this less invasive but I think it will also already improve how our NetworkBehaviours interact with each other.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants