-
Notifications
You must be signed in to change notification settings - Fork 14
add(RFC): sharding + index alloc #566
Changes from all commits
6835bb6
c0c99e5
b134df0
4e4bc56
df6c44a
a8923ec
99c71f2
f1a745f
00af4aa
c3ef8ef
a62dffc
123ea35
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,250 @@ | ||
--- | ||
slug: 51 | ||
title: 51/WAKU2-RELAY-SHARDING | ||
name: Waku v2 Relay Sharding | ||
status: raw | ||
category: Standards Track | ||
tags: | ||
editor: Daniel Kaiser <[email protected]> | ||
contributors: | ||
--- | ||
|
||
# Abstract | ||
|
||
This document describes ways of sharding the [Waku relay](/spec/11/) topic, | ||
allowing Waku networks to scale in the number of content topics. | ||
|
||
> *Note*: Scaling in the size of a single content topic is out of scope for this document. | ||
|
||
# Background and Motivation | ||
|
||
[Unstructured P2P networks](https://en.wikipedia.org/wiki/Peer-to-peer#Unstructured_networks) | ||
are more robust and resilient against DoS attacks compared to | ||
[structured P2P networks](https://en.wikipedia.org/wiki/Peer-to-peer#Structured_networks)). | ||
However, they do not scale to large traffic loads. | ||
A single [libp2p gossipsub mesh](https://github.com/libp2p/specs/blob/master/pubsub/gossipsub/gossipsub-v1.0.md#gossipsub-the-gossiping-mesh-router), | ||
which carries messages associated with a single pubsub topic, can be seen as a separate unstructured P2P network | ||
(gossip and control messages go beyond these boundaries, but at its core, it is a separate P2P network). | ||
With this, the number of [Waku relay](/spec/11/) content topics that can be carried over a pubsub topic is limited. | ||
This prevents app protocols that aim to span many multicast groups (realized by content topics) from scaling. | ||
|
||
This document specifies three pubsub topic sharding methods (with varying degrees of automation), | ||
which allow application protocols to scale in the number of content topics. | ||
This document also covers discovery of topic shards. | ||
|
||
# Named Sharding | ||
|
||
*Named sharding* offers apps to freely choose pubsub topic names. | ||
App protocols SHOULD follow the naming structure detailed in [23/WAKU2-TOPICS](/spec/23/). | ||
With named sharding, managing discovery falls into the responsibility of apps. | ||
|
||
The default Waku pubsub topic `/waku/2/default-waku/proto` can be seen as a named shard available to all app protocols. | ||
|
||
> *Note*: Future versions of this document are planned to give more guidance with respect to discovery via | ||
[33/WAKU2-DISCV5](/spec/33/), | ||
[DNS discovery](https://eips.ethereum.org/EIPS/eip-1459), | ||
and inter-mesh discovery via gossipsub control messages (also using circuit relay). | ||
It might make sense to deprecate [23/WAKU2-TOPICS](/spec/23/) as a separate spec and merge it here. | ||
|
||
From an app protocol point of view, a subscription to a content topic `waku2/xxx` on a shard named /mesh/v1.1.1/xxx would look like: | ||
|
||
`subscribe("/waku2/xxx", "/mesh/v1.1.1/xxx")` | ||
|
||
# Static Sharding | ||
|
||
*Static sharding* offers a set of shards with fixed names. | ||
Assigning content topics to specific shards is up to app protocols, | ||
but the discovery of these shards is managed by Waku. | ||
|
||
These shards are managed in an array of $2^16$ shard clusters. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Pull up these constants and define them above perhaps? |
||
A shard cluster, in turn, contains 64 shards. | ||
A shard cluster is either globally available to all apps (like the default pubsub topic), | ||
specific for an app protocol, | ||
or reserved for automatic sharding (see next section). | ||
In total, there are $2^16 * 64 = 4194304$ shards for which Waku manages discovery. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wondering if there is any rationale behind this numbers? How is the mapping to pubsub topics done? Reading whats below, its 1 shard per topic? Will we have 4194304 gossip sub topics? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 64 shards per shard cluster is chosen to match the Eth ENR shard representation. If there are strong arguments for other numbers, we can of course adjust while in the raw phase.
For static sharding: up to the app layer. The document states this.
One shard per pubsub topic yes.
yes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Can gossipsub scale to this amount of topics? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (For a long time at least,) most topics/shards would be not be used. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This would be good to check with libp2p team in terms of how this would work in practice. I can't find it now, but long time I ago I tried to have multiple pubsub topics (basically using content topics as pubsub topics) and there were problems with creating meshes for pubsub topics. If a client is listening to these topics ahead of time it is probably fine, but it is something worth checking with network testing too. Maybe Nimbus knows of some potential gotchas here? cc @Menduist re libp2p and @jm-clius re network testing (not sure who to ping re this) |
||
|
||
App protocols can either choose to use global shards, or app specific shards. | ||
(In future versions of this document, automatic sharding, described in the next section, will become the default.) | ||
|
||
Like the [IANA ports](https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml), | ||
shard clusters are divided into ranges: | ||
|
||
| index (range) | usage | | ||
| --- | --- | | ||
| 0 | global | | ||
| 1 - 15 | reserved | | ||
| 16 - 1023 | specific app protocols | | ||
| 1024 - 49125 | all app protocols | | ||
| 49152 - 65535 | automatic sharding | | ||
|
||
The informational RFC [52/WAKU2-RELAY-STATIC-SHARD-ALLOC](/spec/52) lists the current index allocations. | ||
|
||
The global shard with index 0 and the "all app protocols" range are treated in the same way, | ||
but choosing shards in the global cluster has a higher probability of sharing the shard with other apps. | ||
This offers k-anonymity and better connectivity, but comes at a higher bandwidth cost. | ||
|
||
The name of the pubsub topic corresponding to a given static shard is specified as | ||
|
||
`/waku/2/static-rshard/<shard_cluster_index>/<shard_number>`, | ||
|
||
an example for the 2nd shard in the global shard cluster: | ||
|
||
`/waku/2/static-rshard/0/2`. | ||
|
||
> *Note*: Because *all* shards distribute payload defined in [14/WAKU2-MESSAGE](spec/14/) via [protocol buffers](https://developers.google.com/protocol-buffers/), | ||
the pubsub topic name does not explicitly add `/proto` to indicate protocol buffer encoding. | ||
We use `rshard` to indicate it is a relay shards; further shard types might follow in the future. | ||
|
||
From an app point of view, a subscription to a content topic `waku2/xxx` on a static shard would look like: | ||
|
||
`subscribe("/waku2/xxx", 43)` | ||
|
||
for global shard 43. | ||
And for shard 43 of the Status app (which has allocated index 16): | ||
|
||
`subscribe("/waku2/xxx", 16, 43)` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What would the actual pubsub string would like? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added in df6c44a |
||
|
||
## Discovery | ||
|
||
Waku v2 supports the discovery of peers within static shards, | ||
so app protocols do not have to implement their own discovery method. | ||
To enable discovery of static shards, | ||
the array of shard clusters is added to [31/WAKU2-ENR](https://rfc.vac.dev/spec/31/). | ||
The representation is specified as follows. | ||
|
||
The array index is a 2 bytes field. | ||
As the array is expected to be sparse (and because ENRs do not feature an array/map type), | ||
the ENR contains a list of occupied array slots. | ||
Each shard cluster is represented by a bit vector, | ||
which indicates which shards of the respective shard cluster the node is part of | ||
(see Ethereum ENR sharding bit vector [here](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/p2p-interface.md#metadata) | ||
and [here](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/validator.md#sync-committee-subnet-stability)). | ||
The right-most bit in the bit vector represents shard `0`, the left-most bit represents shard `63`. | ||
|
||
> *Note*: We will update the [31/WAKU2-ENR](https://rfc.vac.dev/spec/31/) accordingly, once this RFC moves forward.) | ||
|
||
|
||
Having a static shard participation indication as part of the ENR allows nodes | ||
to discover peers that are part of shards via [33/WAKU2-DISCV5](/spec/33/) as well as via DNS. | ||
|
||
In its current raw version, this document proposes two representations in the ENR. | ||
(Which one to choose is open for discussion in the raw phase of the document. | ||
Future versions will only specify a single representation.) | ||
|
||
### One key per Shard Cluster | ||
|
||
For each shard cluster a node is part of, the node adds a separate key to its ENR. | ||
The representation corresponds to [Ethereum shard ENRs](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/validator.md#sync-committee-subnet-stability)). | ||
|
||
Example | ||
|
||
| key | value | | ||
|--- |--- | | ||
| `rshard-0` | `0x0000100000000000` | | ||
| `rshard-16` | `0x0000100000003000` | | ||
|
||
This example node is part of shard `45` in the global shard cluster, | ||
and part shards `13`, `14`, and `45` in the Status main-net shard cluster. | ||
|
||
This method is easier to read. | ||
It is feasible, assuming nodes are only part of a few apps using specific shard clusters. | ||
|
||
### Single Key | ||
|
||
Example | ||
|
||
| key | value | | ||
|--- |--- | | ||
| `rshards` | `num_shards` | 0u16 | `0x0000100000000000` | 16u16 | `0x0000100000003000` | | ||
|
||
The two-byte index uses network byte order. | ||
|
||
# Automatic Sharding | ||
|
||
> *Note:* Automatic sharding is not yet part of this specification. | ||
This section merely serves as an outlook. | ||
A specification of automatic sharding will be added to this document in a future version. | ||
|
||
Automatic sharding is a method for scaling Waku relay in the number of (smaller) content topics. | ||
It automatically maps Waku content topics to pubsub topics. | ||
Clients and protocols building on Waku relay only see content topics, while Waku relay internally manages the mapping. | ||
This provides both scaling as well as removes confusion about content and pubsub topics on the consumer side. | ||
|
||
From an app point of view, a subscription to a content topic `waku2/xxx` using automatic sharding would look like: | ||
|
||
`subscribe("/waku2/xxx", auto=true)` | ||
|
||
The app is oblivious to the pubsub topic layer. | ||
(Future versions could deprecate the default pubsub topic and remove the necessity for `auto=true`.) | ||
|
||
*The basic idea behind automatic sharding*: | ||
Content topics are mapped using [consistent hashing](https://en.wikipedia.org/wiki/Consistent_hashing). | ||
Like with DHTs, the hash space is split into parts, | ||
each covered by a Pubsub topic (mesh network) that carries content topics which are mapped into the respective part of the hash space. | ||
|
||
There are (at least) two issues that have to be solved: *Hot spots* and *Discovery* (see next subsection). | ||
|
||
Hot spots occur (similar to DHTs), when a specific mesh network becomes responsible for (several) large multicast groups (content topics). | ||
The opposite problem occurs when a mesh only carries multicast groups with very few participants: this might cause bad connectivity within the mesh. | ||
Our research goal here is finding efficient ways of distribution. | ||
We could get inspired by the DHT literature. | ||
We also have to consider: | ||
If a node is part of many content topics which are all spread over different shards, | ||
the node will potentially be exposed to a lot of network traffic. | ||
|
||
## Discovery | ||
|
||
For the discovery of automatic shards this document specifies two methods (the second method will be detailed in a future version of this document). | ||
|
||
The first method uses the discovery introduced above in the context of static shards. | ||
The index range `49152 - 65535` is reserved for automatic sharding. | ||
Each index can be seen as a hash bucket. | ||
Consistent hashing maps content topics in one of these buckets. | ||
|
||
The second discovery method will be a successor to the first method, | ||
but is planned to preserve the index range allocation. | ||
Instead of adding the data to the ENR, it will treat each array index as a capability, | ||
which can be hierarchical, having each shard in the indexed shard cluster as a sub-capability. | ||
When scaling to a very large number of shards, this will avoid blowing up the ENR size, and allows efficient discovery. | ||
We currently use [33/WAKU2-DISCV5](https://rfc.vac.dev/spec/33/) for discovery, | ||
which is based on Ethereum's [discv5](https://github.com/ethereum/devp2p/blob/master/discv5/discv5.md). | ||
While this allows to sample nodes from a distributed set of nodes efficiently and offers good resilience, | ||
it does not allow to efficiently discover nodes with specific capabilities within this node set. | ||
Our [research log post](https://vac.dev/wakuv2-apd) explains this in more detail. | ||
Adding efficient (but still preserving resilience) capability discovery to discv5 is ongoing research. | ||
[A paper on this](https://github.com/harnen/service-discovery-paper) has been completed, | ||
but the [Ethereum discv5 specification](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-theory.md) | ||
has yet to be updated. | ||
When the new capability discovery is available, | ||
this document will be updated with a specification of the second discovery method. | ||
The transition to the second method will be seamless and fully backwards compatible because nodes can still advertise and discover shard memberships in ENRs. | ||
|
||
# Security/Privacy Considerations | ||
|
||
See [45/WAKU2-ADVERSARIAL-MODELS](/spec/45), especially the parts on k-anonymity. | ||
We will add more on security considerations in future versions of this document. | ||
|
||
## Receiver Anonymity | ||
|
||
The strength of receiver anonymity, i.e. topic receiver unlinkablity, | ||
depends on the number of content topics (`k`) that get mapped onto a single pubsub topic (shard). | ||
For *named* and *static* sharding this responsibility is at the app protocol layer. | ||
|
||
# Copyright | ||
|
||
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). | ||
|
||
# References | ||
|
||
* [11/WAKU2-RELAY](/spec/11/) | ||
* [Unstructured P2P network](https://en.wikipedia.org/wiki/Peer-to-peer#Unstructured_networks) | ||
* [33/WAKU2-DISCV5](/spec/33/) | ||
* [31/WAKU2-ENR](https://rfc.vac.dev/spec/31/) | ||
* [23/WAKU2-TOPICS](/spec/23/) | ||
* [51/WAKU2-RELAY-SHARDING](/spec/51/) | ||
* [Ethereum ENR sharding bit vector](https://github.com/ethereum/consensus-specs/blob/dev/specs/altair/p2p-interface.md#metadata) | ||
* [Ethereum discv5 specification](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-theory.md) | ||
* [consistent hashing](https://en.wikipedia.org/wiki/Consistent_hashing). | ||
* [Research log: Waku Discovery](https://vac.dev/wakuv2-apd) | ||
* [45/WAKU2-ADVERSARIAL-MODELS](/spec/45) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,65 @@ | ||
--- | ||
slug: 52 | ||
title: 52/WAKU2-RELAY-STATIC-SHARD-ALLOC | ||
name: Waku v2 Relay Static Shard Allocation | ||
status: raw | ||
category: Informational | ||
tags: | ||
editor: Daniel Kaiser <[email protected]> | ||
contributors: | ||
--- | ||
|
||
# Abstract | ||
|
||
This document lists static shard flag index assignments (see [51/WAKU2-RELAY-SHARDING](/spec/51/). | ||
|
||
# Background | ||
|
||
Similar to the [IANA port allocation](https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml), | ||
this document lists static shard index assignments (see [51/WAKU2-RELAY-SHARDING](/spec/51/). | ||
|
||
The index ranges are as follows: | ||
index `0` represents the global shards, | ||
indices `1` to `15` are reserved, | ||
indices `16` to `1023` are reservable for apps, | ||
indices `1024` to `49151` are for open use by apps, | ||
indices `49152` to `65535` are reserved. | ||
|
||
# Assingment Process | ||
|
||
> *Note*: Future versions of this document will specify the assignment process. | ||
|
||
# List of Static Shard Indices | ||
|
||
| index | Protocol/App | Description | | ||
| --- | --- | | | ||
| 0 | global | global use | | ||
| 1 | reserved | | | ||
| 2 | reserved | | | ||
| 3 | reserved | | | ||
| 4 | reserved | | | ||
| 5 | reserved | | | ||
| 6 | reserved | | | ||
| 7 | reserved | | | ||
| 8 | reserved | | | ||
| 9 | reserved | | | ||
| 10 | reserved | | | ||
| 11 | reserved | | | ||
| 12 | reserved | | | ||
| 13 | reserved | | | ||
| 14 | reserved | | | ||
| 15 | reserved | | | ||
| 16 | Status | Status main net | | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since waku is permissionless how do you enforce this? I mean, this could be an internal recommendation but any app can send messages to Status mainnet shard. So wondering about the impact it will have if people don't respect this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Similar to the IANA process, there would be no enforcement. |
||
| 17 | Status | | | ||
| 18 | Status | | | ||
|
||
|
||
# Copyright | ||
|
||
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). | ||
|
||
# References | ||
|
||
* [51/WAKU2-RELAY-SHARDING](/spec/51/) | ||
* [IANA port allocation](https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml) | ||
|
||
rymnc marked this conversation as resolved.
Show resolved
Hide resolved
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if "named sharding" is already covered here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The document mentiones his (in this section), along with the option (in the note) to merge RFC 23 here.
I put this in to this RFC to consolidate sharding strategies into one RFC, and to categorize this approach as "named sharding", distinguishing it from the other strategies.
We could leave RFC 23 as an informational RFC discussing naming strategies, or merge it here and deprecate 23.