Skip to content

Commit

Permalink
Documentation of p2p layer in Tendermint v0.34 (#9348)
Browse files Browse the repository at this point in the history
* spec: overview of p2p in v0.34 (#9120)

* Iniital comments on v0.34 p2p

* Added conf, updated text

* Moved everything to spec

* Update README.md

* spec: overview of the p2p implementation in v0.34 (#9126)

* Spec: p2p v0.34 doc, switch initial documentation

* Spec: p2p v0.34 doc, list of source files

* Spec: p2p v0.34 doc, transport documentation

* Spec: p2p v0.34 doc, transport error handling

* Spec: p2p v0.34 doc, PEX initial documentation

   * PEX protocol documentation is a separated file
   * PEX reactor documentation with a general documentation, including
     the address book and its role as (outbound) peer manager.

* Spec: p2p v0.34 doc, PEX protocol documentation

* Spec: p2p v0.34 doc, PEX protocol on seed nodes

* Spec: p2p v0.34 doc, address book

* Spec: p2p v0.34 doc, address book, more details

* Spec: p2p v0.34 doc, address book persistence

* Spec: p2p v0.34 doc, address book random samples

* Spec: p2p v0.34 doc, status of this documentation

* Spec: p2p v0.34 doc, pex reactor documentation

* Spec: p2p v0.34 doc, addressing PR #9126 comments

Co-authored-by: Jasmina Malicevic <[email protected]>

* Spec: p2p v0.34 doc, peer manager, outbound peers

Co-authored-by: Daniel Cason <cason@gandria>
Co-authored-by: Jasmina Malicevic <[email protected]>

* spec:p2p v0.34 introduction (#9319)

* restructure README.md initial

* Fix typos

* Reorganization

* spec: overview of p2p in v0.34 (#9120)

* Iniital comments on v0.34 p2p

* Added conf, updated text

* Moved everything to spec

* Update README.md

* spec: overview of the p2p implementation in v0.34 (#9126)

* Spec: p2p v0.34 doc, switch initial documentation

* Spec: p2p v0.34 doc, list of source files

* Spec: p2p v0.34 doc, transport documentation

* Spec: p2p v0.34 doc, transport error handling

* Spec: p2p v0.34 doc, PEX initial documentation

   * PEX protocol documentation is a separated file
   * PEX reactor documentation with a general documentation, including
     the address book and its role as (outbound) peer manager.

* Spec: p2p v0.34 doc, PEX protocol documentation

* Spec: p2p v0.34 doc, PEX protocol on seed nodes

* Spec: p2p v0.34 doc, address book

* Spec: p2p v0.34 doc, address book, more details

* Spec: p2p v0.34 doc, address book persistence

* Spec: p2p v0.34 doc, address book random samples

* Spec: p2p v0.34 doc, status of this documentation

* Spec: p2p v0.34 doc, pex reactor documentation

* Spec: p2p v0.34 doc, addressing PR #9126 comments

Co-authored-by: Jasmina Malicevic <[email protected]>

* Spec: p2p v0.34 doc, peer manager, outbound peers

Co-authored-by: Daniel Cason <cason@gandria>
Co-authored-by: Jasmina Malicevic <[email protected]>

* spec:p2p v0.34 introduction (#9319)

* restructure README.md initial

* Fix typos

* Reorganization

* spec: p2p v0.34, addressbook review

* spec: p2p v0.34, peer manager review

* spec: p2p v0.34, peer manager review

* spec: p2p v0.34, peer manager review

* spec: p2p v0.34, peer manager review

* spec: p2p v0.34, peer manager review

* spec: p2p v0.34, peer manager review

* spec: p2p v0.34, peer manager review

* Filled config description

* spec: p2p v0.34, transport review

* spec: p2p v0.34, switch review

* spec: p2p v0.34, overview, first version

* spec: p2p v0.34, peer manager review

* spec: p2p v0.34, shorter readme

* Configuration update

* Configuration update

* Shortened README

* spec: p2p v0.34, readme intro

* spec: p2p v0.34, readme contents

* spec: p2p v0.34, readme references

* spec: p2p readme points to v0.34

* spec: p2p, v0.34, fixing brokend markdown links

* Makrdown fix

* Apply suggestions from code review

Co-authored-by: Adi Seredinschi <[email protected]>
Co-authored-by: Zarko Milosevic <[email protected]>

* spec: p2p v0.34, address book new intro

* spec: p2p v0.34, address book buckets summary

* spec: p2p v0.34, peer manager, issue link

* spec: p2p v0.34, fixing links

* spec: p2p v0.34, addressing comments from reviews

* spec: p2p v0.34, addressing comments from reviews

* Apply suggestions from Jasmina's code review

Co-authored-by: Jasmina Malicevic <[email protected]>

* spec: p2p v0.34, addressing comments from reviews

* Apply suggestions from code review

Co-authored-by: Sergio Mena <[email protected]>

* Apply suggestions from code review

Co-authored-by: Jasmina Malicevic <[email protected]>
Co-authored-by: Sergio Mena <[email protected]>

* spec: p2p, v0.34, address book section reorganized

* spec: p2p, v0.34, addressing review comments

* Typos

* Typo

* spec: p2p, v0.34, address book markbad

Co-authored-by: Jasmina Malicevic <[email protected]>
Co-authored-by: Daniel Cason <cason@gandria>
Co-authored-by: Adi Seredinschi <[email protected]>
Co-authored-by: Zarko Milosevic <[email protected]>
Co-authored-by: Sergio Mena <[email protected]>
  • Loading branch information
6 people authored Nov 3, 2022
1 parent d704c0a commit 6bde634
Show file tree
Hide file tree
Showing 11 changed files with 1,691 additions and 0 deletions.
7 changes: 7 additions & 0 deletions spec/p2p/readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,10 @@ parent:
title: P2P
order: 6
---

# Peer-to-Peer Communication

The operation of the p2p adopted in production Tendermint networks is [HERE](./v0.34/).

> This is part of an ongoing [effort](https://github.com/tendermint/tendermint/issues/9089)
> to produce a high-level specification of the operation of the p2p layer.
70 changes: 70 additions & 0 deletions spec/p2p/v0.34/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
# Peer-to-Peer Communication

This document describes the implementation of the peer-to-peer (p2p)
communication layer in Tendermint.

It is part of an [effort](https://github.com/tendermint/tendermint/issues/9089)
to produce a high-level specification of the operation of the p2p layer adopted
in production Tendermint networks.

This documentation, therefore, considers the releases `0.34.*` of Tendermint, more
specifically, the branch [`v0.34.x`](https://github.com/tendermint/tendermint/tree/v0.34.x)
of this repository.

## Overview

A Tendermint network is composed of multiple Tendermint instances, hereafter
called **nodes**, that interact by exchanging messages.

Tendermint assumes a partially-connected network model.
This means that a node is not assumed to be directly connected to every other
node in the network.
Instead, each node is directly connected to a subset of other nodes in the
network, hereafter called its **peers**.

The peer-to-peer (p2p) communication layer is responsible for establishing
connections between nodes in a Tendermint network,
for managing the communication between a node and its peers,
and for intermediating the exchange of messages between peers in Tendermint protocols.

## Contents

The documentation follows the organization of the `p2p` package of Tendermint,
which implements the following abstractions:

- [Transport](./transport.md): establishes secure and authenticated
connections with peers;
- [Switch](./switch.md): responsible for dialing peers and accepting
connections from peers, for managing established connections, and for
routing messages between the reactors and peers,
that is, between local and remote instances of the Tendermint protocols;
- [PEX Reactor](./pex.md): a reactor is the implementation of a protocol which
exchanges messages through the p2p layer. The PEX reactor manages the [Address Book](./addressbook.md) and implements both the [PEX protocol](./pex-protocol.md) and the [Peer Manager](./peer_manager.md) role.
- [Peer Exchange protocol](./pex-protocol.md): enables nodes to exchange peer addresses, thus implementing a peer discovery service;
- [Address Book](./addressbook.md): stores discovered peer addresses and
quality metrics associated to peers with which the node has interacted;
- [Peer Manager](./peer_manager.md): defines when and to which peers a node
should dial, in order to establish outbound connections;
- Finally, [Types](./types.md) and [Configuration](./configuration.md) provide
a list of existing types and configuration parameters used by the p2p layer implementation.

## Further References

Existing documentation referring to the p2p layer:

- https://github.com/tendermint/tendermint/tree/main/spec/p2p: p2p-related
configuration flags; overview of connections, peer instances, and reactors;
overview of peer discovery and node types; peer identity, secure connections
and peer authentication handshake.
- https://github.com/tendermint/tendermint/tree/main/spec/p2p/messages: message
types and channel IDs of Block Sync, Mempool, Evidence, State Sync, PEX, and
Consensus reactors.
- https://docs.tendermint.com/v0.34/tendermint-core: the p2p layer
configuration and operation is documented in several pages.
This content is not necessarily up-to-date, some settings and concepts may
refer to the release `v0.35`, that was [discontinued][v35postmorten].
- https://github.com/tendermint/tendermint/tree/master/docs/tendermint-core/pex:
peer types, peer discovery, peer management overview, address book and peer
ranking. This documentation refers to the release `v0.35`, that was [discontinued][v35postmorten].

[v35postmorten]: https://interchain-io.medium.com/discontinuing-tendermint-v0-35-a-postmortem-on-the-new-networking-layer-3696c811dabc
367 changes: 367 additions & 0 deletions spec/p2p/v0.34/addressbook.md

Large diffs are not rendered by default.

51 changes: 51 additions & 0 deletions spec/p2p/v0.34/configuration.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# Tendermint p2p configuration

This document contains configurable parameters a node operator can use to tune the p2p behaviour.

| Parameter| Default| Description |
| --- | --- | ---|
| ListenAddress | "tcp://0.0.0.0:26656" | Address to listen for incoming connections (0.0.0.0:0 means any interface, any port) |
| ExternalAddress | "" | Address to advertise to peers for them to dial |
| [Seeds](pex-protocol.md#seed-nodes) | empty | Comma separated list of seed nodes to connect to (ID@host:port )|
| [Persistent peers](peer_manager.md#persistent-peers) | empty | Comma separated list of nodes to keep persistent connections to (ID@host:port ) |
| UPNP | false | UPNP port forwarding enabled |
| [AddrBook](addressbook.md) | defaultAddrBookPath | Path do address book |
| AddrBookStrict | true | Set true for strict address routability rules and false for private or local networks |
| [MaxNumInboundPeers](switch.md#accepting-peers) | 40 | Maximum number of inbound peers |
| [MaxNumOutboundPeers](peer_manager.md#ensure-peers) | 10 | Maximum number of outbound peers to connect to, excluding persistent peers |
| [UnconditionalPeers](switch.md#accepting-peers) | empty | These are IDs of the peers which are allowed to be (re)connected as both inbound or outbound regardless of whether the node reached `max_num_inbound_peers` or `max_num_outbound_peers` or not. |
| PersistentPeersMaxDialPeriod| 0 * time.Second | Maximum pause when redialing a persistent peer (if zero, exponential backoff is used) |
| FlushThrottleTimeout |100 * time.Millisecond| Time to wait before flushing messages out on the connection |
| MaxPacketMsgPayloadSize | 1024 | Maximum size of a message packet payload, in bytes |
| SendRate | 5120000 (5 mB/s) | Rate at which packets can be sent, in bytes/second |
| RecvRate | 5120000 (5 mB/s) | Rate at which packets can be received, in bytes/second|
| [PexReactor](pex.md) | true | Set true to enable the peer-exchange reactor |
| SeedMode | false | Seed mode, in which node constantly crawls the network and looks for. Does not work if the peer-exchange reactor is disabled. |
| PrivatePeerIDs | empty | Comma separated list of peer IDsthat we do not add to the address book or gossip to other peers. They stay private to us. |
| AllowDuplicateIP | false | Toggle to disable guard against peers connecting from the same ip.|
| [HandshakeTimeout](transport.md#connection-upgrade) | 20 * time.Second | Timeout for handshake completion between peers |
| [DialTimeout](switch.md#dialing-peers) | 3 * time.Second | Timeout for dialing a peer |


These parameters can be set using the `$TMHOME/config/config.toml` file. A subset of them can also be changed via command line using the following command line flags:

| Parameter | Flag| Example|
| --- | --- | ---|
| Listen address| `p2p.laddr` | "tcp://0.0.0.0:26656" |
| Seed nodes | `p2p.seeds` | `--p2p.seeds “[email protected]:26656,[email protected]:4444”` |
| Persistent peers | `p2p.persistent_peers` | `--p2p.persistent_peers “[email protected]:26656,[email protected]:26656”` |
| Unconditional peers | `p2p.unconditional_peer_ids` | `--p2p.unconditional_peer_ids “id100000000000000000000000000000000,id200000000000000000000000000000000”` |
| UPNP | `p2p.upnp` | `--p2p.upnp` |
| PexReactor | `p2p.pex` | `--p2p.pex` |
| Seed mode | `p2p.seed_mode` | `--p2p.seed_mode` |
| Private peer ids | `p2p.private_peer_ids` | `--p2p.private_peer_ids “id100000000000000000000000000000000,id200000000000000000000000000000000”` |

**Note on persistent peers**

If `persistent_peers_max_dial_period` is set greater than zero, the
pause between each dial to each persistent peer will not exceed `persistent_peers_max_dial_period`
during exponential backoff and we keep trying again without giving up.

If `seeds` and `persistent_peers` intersect,
the user will be warned that seeds may auto-close connections
and that the node may not be able to keep the connection persistent.
Binary file added spec/p2p/v0.34/img/p2p_state.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
147 changes: 147 additions & 0 deletions spec/p2p/v0.34/peer_manager.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
# Peer Manager

The peer manager is responsible for establishing connections with peers.
It defines when a node should dial peers and which peers it should dial.
The peer manager is not an implementation abstraction of the p2p layer,
but a role that is played by the [PEX reactor](./pex.md).

## Outbound peers

The `ensurePeersRoutine` is a persistent routine intended to ensure that a node
is connected to `MaxNumOutboundPeers` outbound peers.
This routine is continuously executed by regular nodes, i.e. nodes not
operating in seed mode, as part of the PEX reactor implementation.

The logic defining when the node should dial peers, for selecting peers to dial
and for actually dialing them is implemented in the `ensurePeers` method.
This method is periodically invoked -- every `ensurePeersPeriod`, with default
value to 30 seconds -- by the `ensurePeersRoutine`.

A node is expected to dial peers whenever the number of outbound peers is lower
than the configured `MaxNumOutboundPeers` parameter.
The current number of outbound peers is retrieved from the switch, using the
`NumPeers` method, which also reports the number of nodes to which the switch
is currently dialing.
If the number of outbound peers plus the number of dialing routines equals to
`MaxNumOutboundPeers`, nothing is done.
Otherwise, the `ensurePeers` method will attempt to dial node addresses in
order to reach the target number of outbound peers.

Once defined that the node needs additional outbound peers, the node queries
the address book for candidate addresses.
This is done using the [`PickAddress`](./addressbook.md#pick-address) method,
which returns an address selected at random on the address book, with some bias
towards new or old addresses.
When the node has up to 3 outbound peers, the adopted bias is towards old
addresses, i.e., addresses of peers that are believed to be "good".
When the node has from 5 outbound peers, the adopted bias is towards new
addresses, i.e., addresses of peers about which the node has not yet collected
much information.
So, the more outbound peers a node has, the less conservative it will be when
selecting new peers.

The selected peer addresses are then dialed in parallel, by starting a dialing
routine per peer address.
Dialing a peer address can fail for multiple reasons.
The node might have attempted to dial the peer too many times.
In this case, the peer address is marked as bad and removed from the address book.
The node might have attempted and failed to dial the peer recently
and the exponential `backoffDuration` has not yet passed.
Or the current connection attempt might fail, which is registered in the address book.
None of these errors are explicitly handled by the `ensurePeers` method, which
also does not wait until the connections are established.

The third step of the `ensurePeers` method is to ensure that the address book
has enough addresses.
This is done, first, by [reinstating banned peers](./addressbook.md#Reinstating-addresses)
whose ban period has expired.
Then, the node randomly selects a connected peer, which can be either an
inbound or outbound peer, to [requests addresses](./pex-protocol.md#Requesting-Addresses)
using the PEX protocol.
Last, and this action is only performed if the node could not retrieve any new
address to dial from the address book, the node dials the configured seed nodes
in order to establish a connection to at least one of them.

### Fast dialing

As above described, seed nodes are actually the last source of peer addresses
for regular nodes.
They are contacted by a node when, after an invocation of the `ensurePeers`
method, no suitable peer address to dial is retrieved from the address book
(e.g., because it is empty).

Once a connection with a seed node is established, the node immediately
[sends a PEX request](./pex-protocol.md#Requesting-Addresses) to it, as it is
added as an outbound peer.
When the corresponding PEX response is received, the addresses provided by the
seed node are added to the address book.
As a result, in the next invocation of the `ensurePeers` method, the node
should be able to dial some of the peer addresses provided by the seed node.

However, as observed in this [issue](https://github.com/tendermint/tendermint/issues/2093),
it can take some time, up to `ensurePeersPeriod` or 30 seconds, from when the
node receives new peer addresses and when it dials the received addresses.
To avoid this delay, which can be particularly relevant when the node has no
peers, a node immediately attempts to dial peer addresses when they are
received from a peer that is locally configured as a seed node.

> FIXME: The current logic was introduced in [#3762](https://github.com/tendermint/tendermint/pull/3762).
> Although it fix the issue, the delay between receiving an address and dialing
> the peer, it does not impose and limit on how many addresses are dialed in this
> scenario.
> So, all addresses received from a seed node are dialed, regardless of the
> current number of outbound peers, the number of dialing routines, or the
> `MaxNumOutboundPeers` parameter.
>
> Issue [#9548](https://github.com/tendermint/tendermint/issues/9548) was
> created to handle this situation.
### First round

When the PEX reactor is started, the `ensurePeersRoutine` is created and it
runs thorough the operation of a node, periodically invoking the `ensurePeers`
method.
However, if when the persistent routine is started the node already has some
peers, either inbound or outbound peers, or is dialing some addresses, the
first invocation of `ensurePeers` is delayed by a random amount of time from 0
to `ensurePeersPeriod`.

### Persistent peers

The node configuration can contain a list of *persistent peers*.
Those peers have preferential treatment compared to regular peers and the node
is always trying to connect to them.
Moreover, these peers are not removed from the address book in the case of
multiple failed dial attempts.

On startup, the node immediately tries to dial the configured persistent peers
by calling the switch's [`DialPeersAsync`](./switch.md#manual-operation) method.
This is not done in the p2p package, but it is part of the procedure to set up a node.

> TODO: the handling of persistent peers should be described in more detail.
### Life cycle

The picture below is a first attempt of illustrating the life cycle of an outbound peer:

<img src="img/p2p_state.png" width="50%" title="Outgoing peers lifecycle">

A peer can be in the following states:

- Candidate peers: peer addresses stored in the address boook, that can be
retrieved via the [`PickAddress`](./addressbook.md#pick-address) method
- [Dialing](switch.md#dialing-peers): peer addresses that are currently being
dialed. This state exists to ensure that a single dialing routine exist per peer.
- [Reconnecting](switch.md#reconnect-to-peer): persistent peers to which a node
is currently reconnecting, as a previous connection attempt has failed.
- Connected peers: peers that a node has successfully dialed, added as outbound peers.
- [Bad peers](addressbook.md#bad-peers): peers marked as bad in the address
book due to exhibited [misbehavior](pex-protocol.md#misbehavior).
Peers can be reinstated after being marked as bad.

## Pending of documentation

The `dialSeeds` method of the PEX reactor.

The `dialPeer` method of the PEX reactor.
This includes `dialAttemptsInfo`, `maxBackoffDurationForPeer` methods.
Loading

0 comments on commit 6bde634

Please sign in to comment.