Write out stubs for most backing and availability subsystems #1199
Conversation
One common pattern is that all gossip services send out and receive neighbor packets, which are just N chain heads. This is wasteful in bandwidth as well as in code. We could instead create a separate network protocol strictly for these notifications, which would inform the other networking subsystems of new neighbor packets from peers and of when we have broadcast a neighbor packet ourselves. On balance, this seems like a good idea. The only annoying thing is that we don't have any broadcast in the overseer, so we would have to manually specify sending each message to every other networking service, and account for this whenever a new network protocol is added. Alternatively, we could have a message type like `NeighborPacketProtocol::RegisterNeighborPacketUpdateGenerator(Box<Fn(NeighborPacketUpdate) -> AllMessages>)`, and any subsystem interested in notifications would register an update generator on startup. This seems like the best of both worlds.
Implemented as a gossip protocol. Neighbor packets are used to inform peers which chain heads we are interested in data for. Track equivocating validators and stop accepting information from them. Forward double-vote proofs to the double-vote reporting system. Establish a data-dependency order:
- In order to receive a `Seconded` message we must be working on the corresponding chain head.
- In order to receive an `Invalid` or `Valid` message we must have received the corresponding `Seconded` message.
Say Peer A has seconded a particular candidate. However, due to the network configuration and the vagaries of gossip, we haven't yet seen that A has seconded it. Peer B, meanwhile, has seen that; it sends us a `Valid` message. What happens then? Do we just drop B's approving vote?

This seems like an approach that could lead to a hung quorum, where everyone agrees that a particular candidate should be included in the next relay block, but nobody has retained enough vote messages to record it as approved.
First I'll nitpick a bit: the model separates peers from validators. Peer B doesn't issue an approving vote, Validator X does. Whether that vote arrives via Peer A or B is irrelevant, as is whether Validator X controls Peer B or is on another peer X'.

So I will clarify your question statement: we are talking about two validators X and Y. X has issued a `Seconded` statement for a candidate, and Y has issued a corresponding `Valid` statement.

Yes, we drop the `Valid` vote if we haven't seen the `Seconded` message. But we don't blacklist it forever. If we get the `Seconded` message later, we will then accept the `Valid` message. The goal is to ensure that messages propagate through the network in the correct data-dependency order and that we don't risk dealing with unpinned data generated by a rogue validator.

But the situation outlined in your example cannot happen. Peer B can have an indication that we have seen the `Seconded` message in one of two ways:

- We have sent them the `Seconded` message.
- They have sent us the `Seconded` message.

If Peer B has not seen a `Seconded` message from us, they should send the `Seconded` message before sending the `Valid` message, with the understanding that the `Valid` message will then be kept. If Peer B has seen a `Seconded` message from us, they can just send the `Valid` message with the same understanding.

Now apply the transitive property: it is not possible for Peer B to have the `Valid` message without a corresponding `Seconded` message. This can be traced back all the way to the originator of the `Valid` message, the peer Y', controlled by validator Y.
Implemented as a gossip system, where `PoV`s are not accepted unless we know of a corresponding `Seconded` message.
[TODO: this requires a lot of cross-contamination with statement distribution even if we don't implement this as a gossip system. In a point-to-point implementation, we still have to know _who to ask_, which means tracking who has submitted `Seconded`, `Valid`, or `Invalid` statements - by validator and by peer. One approach is to have the statement gossip system just send us this information, so we can separate the systems from the beginning instead of combining them.]
👍 for separating the systems from the start. PoV Distribution should only request a PoV from a peer after we know it's been `Seconded`.
For each of these pieces of data, we have pruning rules that determine how long we need to keep that data available.
`PoV`s hypothetically only need to be kept around until the block where the data was made fully available is finalized. However, disputes can revert finality, so we need to be a bit more conservative. We should keep the `PoV` until a block that finalized availability of it has been finalized for 1 day (TODO: arbitrary, but extracting `acceptance_period` is kind of hard here...).
> disputes can revert finality

I don't understand this. The only other mention of disputes I see is:

> ...with enough positive statements, the block can be noted on the relay-chain. Negative statements are not a veto but will lead to a dispute, with those on the wrong side being slashed.

My understanding is that no parachain block is considered finalized until its relay parent is finalized, and that statement collection etc. happens before finalization. How can disputes revert finality?
#1176 should have more details. Disputes happen after inclusion, during the acceptance period. The initial statements give us only the guarantee that validators will be slashed, but the validator groups are sampled out of the overall validator set and thus can be wholly compromised with non-negligible probability.
Although, as seen in paritytech/substrate#6224, there is still some contention as to whether to allow reverting finality or not.
clean up language Co-authored-by: Peter Goodspeed-Niklaus <[email protected]>
roadmap/implementors-guide/guide.md (Outdated)
### Block Authorship (Provisioning)

#### Description

This subsystem is not actually responsible for authoring blocks, but instead is responsible for providing data to an external block authorship service beyond the scope of the overseer.
A subsystem which is not responsible for block authoring should not be called "Block Authorship". I think this is the same subsystem that in other PRs we've been calling "Candidate Backing".
Suggested change (replacing the quoted section above):

### Candidate Backing (Provisioning)

#### Description

This subsystem is responsible for providing data to an external block authorship service beyond the scope of the overseer.
No, this is different from Candidate Backing. Candidate Backing backs candidates, whereas this subsystem provides backed candidates, bitfields, and maybe some other data to block authorship.
OK. Regardless, it should have a less confusing name.
Renamed to just `Provisioner`. We can improve that in a follow-up.
These are not meant to be complete descriptions, but should capture the goals of each subsystem and how they fit together.
It should be clear what `StartWork` and `StopWork` are doing for each subsystem and how this fuels interaction between subsystems. Follow-up PRs will clarify the mechanics of specific subsystems and clean up the work in this PR.
I perceive this to be the set of subsystems needed for a basic, functioning relay-chain that does not do validity slashing, but does guarantee availability. Collator subsystems are not specified, but I created stubs for them.