-
Notifications
You must be signed in to change notification settings - Fork 378
Aura alterations for Asynchronous Backing compatibility #2476
Comments
Re: this bullet point "Ensure only the first V + 1 blocks from a given author at a slot are accepted by the runtime" What I'm imagining is an extension to the logic of
Along with a struct FixedVelocityConsensusHook<const N: u32>;
|
paritytech/polkadot-sdk#83 <- this is necessary for node-side fork choice rules to inspect the relay-parent of parachain blocks which haven't been backed yet. And that unfortunately depends on paritytech/polkadot-sdk#645 . i.e. what we'd like is to have something that fits this description that we can use for selecting blocks optimally. /// Find potential parent blocks for a new block to be authored against the given relay-parent.
///
/// This accepts a relay-chain block and a maximum search depth along with some arguments for
/// filtering parachain blocks and performs a recursive search for parachain blocks.
///
/// The search begins at the last included parachain block and returns a set of block hashes
/// which could be potential parents of a new block with this relay-parent.
///
/// A parachain block is a potential parent if it is either the last included parachain block or
/// all of the following hold:
/// * its parent is a potential parent
/// * its relay-parent is within `ancestry_lookback` of the targeted relay-parent.
/// * the block number is within `max_depth` of the included block
fn find_potential_parents(relay_parent, ancestry_lookback, max_depth) { ... } However, the implementation of that function requires paritytech/polkadot-sdk#83 or some other way of tagging Cumulus blocks securely with their relay-parent. That, in turn, requires making Cumulus runtimes aware of the relay-parent of the executing block in the PVF. We can work around this for now, by focusing on a simpler fork-choice rule:
|
The bulk of the changes here should also apply to SASSAFRAS.
High Level
(feel free to skip ahead to the proposed modifications section)
As a recap (more details #2301), we need to prepare Cumulus collators for more advanced parachains protocol features such as Asynchronous Backing, Elastic Scaling, and Blockspace Regions. These changes impact the collator logic and protocols in similar ways: they enable more blockspace to be utilized, and they enable this blockspace to be consumed optimistically, based on predictions of what will happen on the relay chain.
Collator-selection algorithms are not responsible for consensus safety (i.e. security) but just consensus liveness (not halting). Liveness is a spectrum, though. From a formal standpoint, consensus algorithms usually focus on guaranteeing some minimal level of liveness, whereas in practice we care about maximizing throughput, minimizing latency, and reducing waste.
To be more specific, we can clarify two desirable properties for a collator-selection algorithm:
The phrase "available blockspace" here means something very specific. It refers to the number of cores and the proportion of each of those cores allocated to the parachain. This deliberately ignores the Pay-as-you-Go model, as Aura and similar collator-selection mechanisms are focused on dividing up authorship rights within already allocated blockspace regions, as opposed to Pay-as-you-Go, which is focused on purchasing and consuming blockspace in small increments.
The available blockspace can be roughly condensed into two variables: the maximum capacity C and the expected velocity V. Capacity refers to the amount of blocks that can be pending at any given time. e.g. for 1 core this might be 1 + 4 (the 4 comes from some lookahead performed by Polkadot's validators). The velocity of blockspace refers to the rate of block processing by the relay chain. Roughly speaking, with one core, V = 1, with two cores V = 2 and so on. Velocity calculations are more subtle when we take into account parachains sharing cores with strided blockspace regions.
The term "unincluded segment" (cc #1866) refers to the currently used capacity of the chain. Parachains control the maximum size of the unincluded segment and set C themselves, but the C variable is in practice capped by the relay chain's configuration. The remaining capacity of the unincluded segment is c = C - len(unincluded).
New relay-chain blocks increase c at velocity v <= V (the actual velocity is always less than or equal to the maximum), via the advancement of the backing and availability protocols. That is, the advancement of the relay-chain opens up new capacity to be fulfilled by parachain block authors. New parachain blocks extend the unincluded segment and utilize capacity.
This "capacity"/"velocity" framing may seem overwrought given the current state of blockspace in Polkadot: parachains have 1 core and they are scheduled at every moment on that core. However, with elastic scaling and divisible blockspace regions, it starts to become more reasonable to talk about capacity and velocity values that are much greater than 1 and much smaller than 1.
Proposed Aura modifications
Aura slots in Cumulus are the same as the relay-parent's slot.
All of these checks can be guaranteed by runtime pallets. The velocity can be deduced from the relay-chain state and the maximum unincluded segment length is determined by the parachain's own runtime.
In practice, we need a few changes to support this.
pallet-aura
or create drop-in replacement that allows multiple blocks per slot (pallet-aura: Allow multiple blocks per slot substrate#14024)Liveness Analysis
There are 2 states a (parachain parent, relay-parent) pair can be in, and we'll analyze the outcomes at each of them.
c = 0
By the definition of c, this means there is no remaining capacity for new blocks at this relay-parent.
The slot author for any relay-parent where the unincluded segment is totally full cannot author any blocks.
c > 0
The unincluded segment has some remaining capacity c. The slot owner as-of R can create min(V + 1, c) blocks with R as a relay-parent. This increases the utilization of the maximum capacity.
Implications
The goal is to build as long of a backlog of blocks as the parachain will allow (latency/throughput tradeoff). When the backlog is empty, each collator adds 1 extra parachain block per relay-chain block until the backlog is full. When the backlog is full, each collator simply replaces the blocks that have now been included into the relay chain.
This does imply that it takes N relay-chain blocks to fill an empty backlog, but backlogs are expected to be short, e.g. 2-4 blocks, so it will only take 12-24 seconds to fill the backlog completely. This fulfills Goal L1.
Goal L2 is mostly fulfilled by limiting the contention over which collators get to fill up which parts of the backlog. By limiting collators to at most v+1 blocks, the overlap between collators is limited. There are still issues where collators can decide to build on entirely competing forks, but these need to be addressed at the fork-choice and incentive levels.
The text was updated successfully, but these errors were encountered: