diff --git a/content/concepts/fundamentals/protocols.md b/content/concepts/fundamentals/protocols.md index 4079d38e..4d355af5 100644 --- a/content/concepts/fundamentals/protocols.md +++ b/content/concepts/fundamentals/protocols.md @@ -9,225 +9,118 @@ summary: There are protocols everywhere you look when you're writing network app # Protocols -There are protocols everywhere you look when you're writing network applications, and libp2p is -especially thick with them. - -The kind of protocols this article is concerned with are the ones built with libp2p itself, -using the core libp2p abstractions like [transport](/concepts/transport), [peer identity](/concepts/peers#peer-id/), [addressing](/concepts/addressing/), and so on. - -Throughout this article, we'll call this kind of protocol that is built with libp2p -a **libp2p protocol**, but you may also see them referred to as "wire protocols" or "application protocols". - -These are the protocols that define your application and provide its core functionality. - -This article will walk through some of the key [defining features of a libp2p protocol](#what-is-a-libp2p-protocol), give an overview of the [protocol negotiation process](#protocol-negotiation), and outline some of the [core libp2p protocols](#core-libp2p-protocols) that are included with libp2p and provide key functionality. - +This article is concerned with protocols that compose libp2p +using core libp2p abstractions like the [transport](/concepts/transport), +[peer identity](/concepts/peers#peer-id/), [addressing](/concepts/addressing/) +abstractions to name a few. ## What is a libp2p protocol? -A libp2p protocol has these key features: +A libp2p protocol is a network protocol that includes +the key features mentioned below. -#### Protocol Ids +### Protocol IDs -libp2p protocols have unique string identifiers, which are used in the [protocol negotiation](#protocol-negotiation) process when connections are first opened. +libp2p protocols have unique string identifiers, which are used in the +[protocol negotiation](#protocol-negotiation) process when streams are first opened. -By convention, protocol ids have a path-like structure, with a version number as the final component: +Historically, the IDs of many protocols in use by libp2p have a path-like structure, +with a version number as the final component: ``` /my-app/amazing-protocol/1.0.1 ``` -Breaking changes to your protocol's wire format or semantics should result in a new version -number. See the [protocol negotiation section](#protocol-neotiation) for more information about -how version selection works during the dialing and listening process. - -{{% notice "info" %}} +Breaking changes to a protocol's wire format or semantics should result in a new +version number. See the [protocol negotiation section](#protocol-neotiation) for more +information about how version selection works during the dialing and listening. -While libp2p will technically accept any string as a valid protocol id, -using the recommended path structure with a version component is both -developer-friendly and enables [easier matching by version](#match-using-semver). - -{{% /notice %}} +### Handler functions -#### Handler functions +A libp2p application will define a stream handler that takes over the +stream after protocol negotiation. Everything sent and received after the +negotiation phase is then defined by the application protocol. -To accept connections, a libp2p application will register handler functions for protocols using their protocol id with the -[switch][definition_switch] (aka "swarm"), or a higher level interface such as [go's Host interface](https://github.com/libp2p/go-libp2p-core/blob/master/host/host.go). +The handler function is invoked when an incoming stream is received with +the registered protocol ID. If you register a handler with a +[match function](#matching-protocol-ids-and-versions), you can choose whether +to accept non-exact string matches for protocol ids, for example, to match +on [semantic major versions](#match-using-semver). -The handler function will be invoked when an incoming stream is tagged with the registered protocol id. -If you register your handler with a [match function](#using-a-match-function), you can choose whether -to accept non-exact string matches for protocol ids, for example, to match on [semantic major versions](#match-using-semver). +### Binary streams - -#### Binary streams - -The "medium" over which a libp2p protocol transpires is a bi-directional binary stream with the following -properties: +The "medium" over which a libp2p protocol transpires is a bi-directional binary stream +with the following properties: - Bidirectional, reliable delivery of binary data - Each side can read and write from the stream at any time - Data is read in the same order as it was written - - Can be "half-closed", that is, closed for writing and open for reading, or closed for reading and open for writing + - Can be "half-closed", that is, closed for writing and open for reading, or closed + for reading and open for writing - Supports backpressure - - Readers can't be flooded by eager writers - -Behind the scenes, libp2p will also ensure that the stream is [secure](/concepts/secure-comms/) and efficiently -[multiplexed](/concepts/stream-multiplexing/). This is transparent to the protocol handler, which reads and writes -unencrypted binary data over the stream. - -The format of the binary data and the mechanics of what to send when and by whom are all up to the protocol to determine. For inspiration, some [common patterns](#common-patterns) that are used in libp2p's internal protocols are outlined below. - - -## Protocol Negotiation - -When dialing out to initiate a new stream, libp2p will send the protocol id of the protocol you want to use. -The listening peer on the other end will check the incoming protocol id against the registered protocol handlers. + - Eager writers can't flood readers -If the listening peer does not support the requested protocol, it will end the stream, and the dialing peer can -try again with a different protocol, or possibly a fallback version of the initially requested protocol. +Behind the scenes, libp2p will also ensure that the stream is +[secure](/concepts/secure-comms/) and efficiently +[multiplexed](/concepts/stream-multiplexing/). This is transparent to the protocol +handler, which reads and writes unencrypted binary data over the stream. -If the protocol is supported, the listening peer will echo back the protocol id as a signal that future data -sent over the stream will -use the agreed protocol semantics. +The protocol determines the binary data format and the transport mechanics. +For inspiration, some [common patterns](#common-patterns) that are used in libp2p's +internal protocols are outlined below. -This process of reaching agreement about what protocol to use for a given stream or connection is called -**protocol negotiation**. +## Life cycle of a stream +Libp2p streams are an abstraction +Libp2p protocols can have sub-protocols or protocol-suites. It isn't easy +to select and route protocols, or even know which protocols are available. -### Matching protocol ids and versions +[Multistream-select](https://github.com/multiformats/multistream-select) +is a protocol multiplexer that negotiates the application-layer protocol. When +a new stream is created, it is run through `multistream-select`, which routes the +stream to the appropriate protocol handler. -When you register a protocol handler, there are two methods you can use. +When dialing out to initiate a new stream, libp2p sends the protocol ID of the +protocol you want to use. The listening peer on the other end checks the incoming +protocol ID against the registered protocol handlers. -The first takes two arguments: a protocol id, and a handler function. If an incoming stream request sends an exact -match for the protocol id, the handler function will be invoked with the new stream as an argument. +If the listening peer does not support the requested protocol, it will send "na" +on the stream, and the dialing peer can try again with a different protocol or possibly +a fallback version of the initially requested protocol. -#### Using a match function +If the protocol is supported, the listening peer will echo back the protocol ID as +a signal that future data sent over the stream will use the agreed protocol semantics. -The second kind of protocol registration takes three arguments: the protocol id, a protocol match function, and the handler function. +This process of reaching an agreement about what protocol to use for a given stream +or connection is called **protocol negotiation**. -When a stream request comes in whose protocol id doesn't have any exact matches, the protocol id will be passed through -all of the registered match functions. If any returns `true`, the associated handler function will be invoked. - -This gives you a lot of flexibility to do your own "fuzzy matching" and define whatever rules for protocol matching -make sense for your application. - -### Dialing a specific protocol - -When dialing a remote peer to open a new stream, the initiating peer sends the protocol id that they'd like to use. The remote peer will use -the matching logic described above to accept or reject the protocol. If the protocol is rejected, the dialing peer can try again. - -When dialing, you can optionally provide a list of protocol ids instead of a single id. When you provide multiple protocol ids, they will -each be tried in succession, and the first successful match will be used if at least one of the protocols is supported by the remote peer. -This can be useful if you support a range of protocol versions, since you can propose the most recent version and fallback to older versions -if the remote hasn't adopted the latest version yet. +{{% notice "note" %}} +Not to be mistaken with [stream multiplexers](/concepts/stream-multiplexing), +which are solely responsible for the internal streams on a connection and aren't +concerned with underlying protocol being used. +{{% /notice %}} ## Core libp2p protocols -In addition to the protocols that you write when developing a libp2p application, libp2p itself defines several foundational protocols that are used for core features. - -### Common patterns - -The protocols described below all use [protocol buffers](https://developers.google.com/protocol-buffers/) (aka protobuf) to define message schemas. - -Messages are exchanged over the wire using a very simple convention which prefixes binary -message payloads with an integer that represents the length of the payload in bytes. The -length is encoded as a [protobuf varint](https://developers.google.com/protocol-buffers/docs/encoding#varints) (variable-length integer). - - - -### Ping - -| **Protocol id** | spec | | | implementations | -|--------------------|------|---------------|---------------|-------------------| -| `/ipfs/ping/1.0.0` | N/A | [go][ping_go] | [js][ping_js] | [rust][ping_rust] | - - -[ping_go]: https://github.com/libp2p/go-libp2p/tree/master/p2p/protocol/ping -[ping_js]: https://github.com/libp2p/js-libp2p-ping -[ping_rust]: https://github.com/libp2p/rust-libp2p/blob/master/protocols/ping/src/lib.rs +In addition to the protocols written when developing a libp2p application, libp2p defines +several foundational protocols used for core features. +{{% notice "note" %}} +Check out the [libp2p implementations page](https://libp2p.io/implementations/) for +updates on all the libp2p implementations. +{{% /notice %}} -The ping protocol is a simple liveness check that peers can use to quickly see if another peer is online. - -After the initial protocol negotiation, the dialing peer sends 32 bytes of random binary data. The listening -peer echoes the data back, and the dialing peer will verify the response and measure -the latency between request and response. - -### Identify - -| **Protocol id** | spec | | | implementations | -|------------------|--------------------------------|-------------------|-------------------|-----------------------| -| `/ipfs/id/1.0.0` | [identify spec][spec_identify] | [go][identify_go] | [js][identify_js] | [rust][identify_rust] | - - -[spec_identify]: https://github.com/libp2p/specs/pull/97/files -[identify_go]: https://github.com/libp2p/go-libp2p/tree/master/p2p/protocol/identify -[identify_js]: https://github.com/libp2p/js-libp2p-identify -[identify_rust]: https://github.com/libp2p/rust-libp2p/tree/master/protocols/identify/src - -The `identify` protocol allows peers to exchange information about each other, most notably their public keys -and known network addresses. - - -The basic identify protocol works by establishing a new stream to a peer using the identify protocol id -shown in the table above. - -When the remote peer opens the new stream, they will fill out an [`Identify` protobuf message][identify_proto] containing -information about themselves, such as their public key, which is used to derive their [`PeerId`](/concepts/peers/). - -Importantly, the `Identify` message includes an `observedAddr` field that contains the [multiaddr][definition_multiaddr] that -the peer observed the request coming in on. This helps peers determine their NAT status, since it allows them to -see what other peers observe as their public address and compare it to their own view of the network. - -[identify_proto]: https://github.com/libp2p/go-libp2p/blob/master/p2p/protocol/identify/pb/identify.proto - -#### identify/push - -| **Protocol id** | spec & implementations | -|-----------------------|-------------------------------------| -| `/ipfs/id/push/1.0.0` | same as [identify above](#identify) | - -A slight variation on `identify`, the `identify/push` protocol sends the same `Identify` message, but it does so proactively -instead of in response to a request. - -This is useful if a peer starts listening on a new address, establishes a new [relay circuit](/concepts/circuit-relay/), or -learns of its public address from other peers using the standard `identify` protocol. Upon creating or learning of a new address, -the peer can push the new address to all peers it's currently aware of. This keeps everyone's routing tables up to date and -makes it more likely that other peers will discover the new address. - -### kad-dht - -`kad-dht` is a [Distributed Hash Table][wiki_dht] based on the [Kademlia][wiki_kad] routing algorithm, with some modifications. - -libp2p uses the DHT as the foundation of its [peer routing](/concepts/peer-routing/) and [content routing](/concepts/content-routing/) functionality. To learn more about DHT and the Kademlia algorithm, -check out the [Distributed Hash Tables guide][dht] on the IPFS documentation site. In addition, check out the [libp2p implementations page](https://libp2p.io/implementations/) for updates on all the kad-libp2p implementations. - - - -[spec_kad]: https://github.com/libp2p/specs/pull/108 - -[wiki_dht]: https://en.wikipedia.org/wiki/Distributed_hash_table -[wiki_kad]: https://en.wikipedia.org/wiki/Kademlia -[dht]: https://docs.ipfs.tech/concepts/dht/ - -### Circuit Relay - -| **Protocol id** | spec | | implementations | -|-------------------------------|----------------------------------|----------------|-----------------| -| `/libp2p/circuit/relay/0.1.0` | [circuit relay spec][spec_relay] | [go][relay_go] | [js][relay_js] | - -[spec_relay]: https://github.com/libp2p/specs/tree/master/relay -[relay_js]: https://github.com/libp2p/js-libp2p-circuit -[relay_go]: https://github.com/libp2p/go-libp2p-circuit - -As described in the [Circuit Relay article](/concepts/circuit-relay/), libp2p provides a protocol -for tunneling traffic through relay peers when two peers are unable to connect to each other -directly. See the article for more information on working with relays, including notes on relay -addresses and how to enable automatic relay connection when behind an intractable NAT. - - - -[definition_switch]: /reference/glossary/#switch -[definition_multiaddr]: /reference/glossary/#multiaddr - -[repo_multistream-select]: https://github.com/multiformats/multistream-select +| **Specification** | **Protocol ID** | +|--------------------------------------------------------------------------------------------|------------------------------------| +| [AutoNAT](https://github.com/libp2p/specs/blob/master/autonat/README.md#autonat-protocol) | `/libp2p/autonat/1.0.0` | +| [Circuit Relay v2 (hop) ](https://github.com/libp2p/specs/blob/master/relay/circuit-v2.md) | `/libp2p/circuit/relay/0.2.0/hop` | +| [Circuit Relay v2 (stop)](https://github.com/libp2p/specs/blob/master/relay/circuit-v2.md) | `/libp2p/circuit/relay/0.2.0/stop` | +| [DCUtR](https://github.com/libp2p/specs/blob/master/relay/DCUtR.md) | `/libp2p/dcutr/1.0.0` | +| [Fetch](https://github.com/libp2p/specs/tree/master/fetch) | `/libp2p/fetch/0.0.1` | +| [GossipSub v1.0](https://github.com/libp2p/specs/tree/master/pubsub/gossipsub) | `/libp2p/gossipsub/1.0.0` | +| [GossipSub v1.1](https://github.com/libp2p/specs/tree/master/pubsub/gossipsub) | `/libp2p/gossipsub/1.1.0` | +| [Identify](https://github.com/libp2p/specs/blob/master/identify/README.md) | `/ipfs/id/1.0.0` | +| [Identify (push)](https://github.com/libp2p/specs/blob/master/identify/README.md) | `/ipfs/id/push/1.0.0` | +| [Kademlia DHT](https://github.com/libp2p/specs/blob/master/kad-dht/README.md) | `/ipfs/kad/1.0.0` | +| Ping | `/ipfs/ping/1.0.0` | +| [Rendezvous](https://github.com/libp2p/specs/blob/master/rendezvous/README.md) | `/libp2p/rendezvous/1.0.0` |