Agile Coretime #1
Initial review with minor comments and one question for clarity. I think the mechanism is simple and a much better model than the current slots. I will take another pass once we’ve discussed this a bit more.
RFC-0001-Agile Coretime.md
Outdated
Notably, if a region is split or transferred, then the `price` is reset to `None`.
I think it would be easier to understand if you explicitly say that splitting a region removes the ability to renew at the same price which I think you are implicitly saying by specifying splitting sets the price to “none”. This also implies though that “split cores” are not eligible for priority renewal, correct? You also don’t seem to mention anything of the fact that current owners of cores should be able to get priority renewal. Is that correct?
This is written above:
Notably, if a region is split or transferred, then the `price` is reset to `None`.
Not enough?
Also, this is about split regions, i.e. taking the month-long piece and splitting it into smaller pieces.
I don’t understand if there is a specific reason that the price needs to be set to none. Does that mean a split region can’t be renewed the same way a “full core” region would?
Yeah - because it would then have two owners - which one could "renew"? We generally want to minimise renewals since they bias the market. I think it's ok when the core would be used consistently by the same paras in the same way from month to month, but it doesn't make sense when they're being carved up and, presumably, traded.
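To make the rule under discussion concrete, here is a minimal sketch of how a record could lose its renewal eligibility on split or transfer. The `Region` struct, its fields, and the `transfer`, `split`, and `can_renew` helpers are all hypothetical names chosen for illustration; only the "`price` is reset to `None`" behaviour comes from the RFC text quoted above.

```rust
// Illustrative types only; the RFC's real records differ.
type Balance = u128;
type AccountId = u32;
type Timeslice = u32;

#[derive(Debug, Clone, PartialEq)]
struct Region {
    owner: AccountId,
    begin: Timeslice,
    end: Timeslice,
    /// Price paid in the last sale; `Some` only while the region is
    /// still eligible for renewal (assumption for this sketch).
    price: Option<Balance>,
}

/// Transferring a region drops the price, and with it renewal rights.
fn transfer(mut region: Region, new_owner: AccountId) -> Region {
    region.owner = new_owner;
    region.price = None;
    region
}

/// Splitting at `pivot` yields two regions, neither of which keeps the price.
/// Returns `None` if `pivot` does not fall strictly inside the region.
fn split(region: Region, pivot: Timeslice) -> Option<(Region, Region)> {
    if pivot <= region.begin || pivot >= region.end {
        return None;
    }
    let first = Region { owner: region.owner, begin: region.begin, end: pivot, price: None };
    let second = Region { owner: region.owner, begin: pivot, end: region.end, price: None };
    Some((first, second))
}

/// In this sketch a region is renewable only if it still carries a price.
fn can_renew(region: &Region) -> bool {
    region.price.is_some()
}
```

Under these assumptions, neither half of a split region nor a transferred region would pass `can_renew`, which matches the reading that only an untouched, full region keeps its renewal price.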
My comment initially was to clarify that split cores shouldn’t be able to get priority renewal. Carefully reading the spec does imply that but I think making this more explicit would be helpful for people to understand.
I think it's ok when the core would be used consistently by the same paras in the same way from month to month, but it doesn't make sense when they're being carved up and, presumably, traded.
I think this does not go into the initial spec. It should be possible to offer to buy split cores at some point, at which point this can be added. I doubt the demand for them will be particularly high today and it shouldn’t delay a first implementation.
I believe this RFC makes transfer of regions lose the priority on core renewal but changing the allocation of the slot from one parachain to another does not. This would mean the system can quite easily be gamed to transfer ownership of a region without it being prevented from being renewed (for example by holding the region in a pure proxy and simply transferring ownership). Wouldn’t we want to tie the ability to renew to the paraid that the core is running and not which account controls the core?
Regions Architecture
The regions here have a number of similarities with the regions I've proposed in #3 but lack a couple significant details which will be important for implementing elastic scaling and other future mechanisms.
My proposal was focused on relay-chain coretime scheduling and is actually a replacement for all existing scheduling logic in the relay-chain. In my opinion, the mechanisms on the broker-chain should match the mechanisms on the relay-chain closely in order to avoid friction between those components.
Scheduling on the relay-chain, looking forward, needs to solve a few key problems:
1. Gracefully handle tens of thousands of parachains without significant runtime scheduling overhead.
2. In the case that a parachain has scheduled assignments on several cores, the mapping between upcoming blocks on the parachain and specific cores must be unambiguous in the near-term future to the validators working on those cores.
3. When cores are highly shared, the time it takes to make up for a missed opportunity should be minimally dependent on the number of other applications also scheduled on the core.
4. Accommodate a variety of different scheduling frequencies and overlapping durations, all on the same core.
(more detailed description of all of the above and more in #3)
The regions RFC is my attempt to solve all these problems. When we discussed offline, @gavofyork raised the point that handling splitting/transferring on the relay-chain is likely to incur too much load, which is fair. However, the relay-chain `Region` primitive itself is still important. The broker-chain should be able to use its own primitives, but they ought to be compatible with the direction of relay-chain scheduling.
To solve (1), the proposal focuses on having parachain candidates tagged with a deterministic and immutable region identifier which is submitted along with the candidate by the relay-chain block author. The relay-chain logic needs only lazily check that the region is in surplus and the parachain assignment is correct, which is a single load, modify, store operation per core per relay-chain block. This way, all scheduling overhead is pushed to the node-side.
To solve (2), we can build upon the solution to (1). In the future, this region identifier for a particular candidate may be included inside the `CandidateReceipt` by the collator, as collator-selection algorithms working across many regions must already figure out how to utilize their regions and asking validators to re-run this allocation logic with less information is a redundancy that can be avoided. This solves the elastic scaling candidate-group problem, as validators will know unequivocally which backing group is intended to work on the candidate.
To solve (3) and (4), the regions proposal schedules core-time somewhat probabilistically. P2P network systems experience variance in practice, and the time it takes to back a block or make its data available does not always fall within prescribed bounds. Stated otherwise - if Polkadot were to set time limits on backing and availability timeouts such that they were always met, or were met 99.9% of the time, those bounds would likely be too conservative and we'd be leaving significant performance on the table. By giving regions a single assignee, the probabilistic scheduling allows for cores and parachains to "make up" recently missed opportunities by accepting more than one candidate at a time, up to configured per-core and per-relay-chain-block maximums to avoid massive per-block loads. Chains can never access more core-time than they've been allocated in total. With this solution, even with 1000 chains scheduled on a single core with varying frequencies, they cause minimal friction on each other's timing and system load is both predictable and capped when averaged out over any period longer than a minute or two.
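The "make up missed opportunities" idea can be pictured as a simple surplus counter. The sketch below is not taken from the regions proposal; `RegionBudget`, its fields, and the thousandths-based rate are assumptions made purely for illustration. It shows only the load, modify, store shape of the check and how a per-block maximum caps catch-up.

```rust
/// Hypothetical per-region bookkeeping; not the data structure from the
/// regions RFC.
struct RegionBudget {
    /// Core-time earned per relay-chain block, in thousandths of a candidate.
    rate_milli: u64,
    /// Earned but unspent core-time, in thousandths of a candidate.
    surplus_milli: u64,
}

impl RegionBudget {
    fn new(rate_milli: u64) -> Self {
        Self { rate_milli, surplus_milli: 0 }
    }

    /// Called once per relay-chain block: accrue the region's share, then
    /// accept as many of the `offered` candidates as the surplus and the
    /// per-block maximum allow. Returns the number accepted.
    fn on_relay_block(&mut self, offered: u32, per_block_max: u32) -> u32 {
        self.surplus_milli = self.surplus_milli.saturating_add(self.rate_milli);
        let affordable = (self.surplus_milli / 1000).min(u64::from(u32::MAX)) as u32;
        let accepted = offered.min(per_block_max).min(affordable);
        self.surplus_milli -= u64::from(accepted) * 1000;
        accepted
    }
}
```

With a rate of one candidate per block and a per-block maximum of two, a chain that misses one block can submit two candidates in the next, but a third offer in the block after is cut back to what has accrued. Because spending never exceeds accrual, total usage stays within the allocation.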
Applying Regions to this RFC
I suggest that my regions proposal be altered, removing the ability to split, transfer, and reassign regions in an unpermissioned way on the Polkadot relay-chain.
Instead, these actions would become permissioned, with the permission being held by the broker-chain. The broker-chain would then 'blit' the data structures it manages onto regions in the relay-chain to manage scheduling. This can be done with a single XCM to create the regions on the relay-chain, and the scheduling logic there would handle the rest.
The region records described in this RFC will be compatible, in that they could be transformed into single-assignee, frequency-based regions when blitting them up to the relay-chain, but I suggest we outline that intention in this RFC to commit to that as a plan. The regions in this RFC also have a few properties I'll comment on:
- The `RegionId` is not unique - every `BULK_PERIOD`, `RegionId`s will be reused, as they are dependent on only the core index and the timeslice within the `BULK_PERIOD`. That may make it harder for logic living in other chains to do bookkeeping about which regions they own.
- The `allocation` containing a `Vec<ParaId>` rather than having multiple regions, each with their own single `ParaId`, may discourage region-sharing among chains, as either they all have to renew together or none of them can. In my opinion, this is likely to take away from the value proposition of sharing regions altogether, as all parachains will want to stay on the `RENEWAL_PRICE_CAP` curve but some will be chained to sinking partners. That said, it does also give parachains very strong incentives to help their region companions survive. Let's discuss this property & alternatives.
Tight integration with the regions RFC would make updating the broker parachain to elastic scaling, or adding other mechanisms for accessing core-time technically trivial, economic design notwithstanding.
Since full implementation of the regions RFC will likely take a while, it'd probably make most sense to include a new call on the relay-chain that the broker can invoke via `XCM::Transact`: `BlitBrokerRegion`. This would take the Broker region format as an argument and transform it into whatever scheduling mechanism the relay chain currently uses, whether that's the existing scheduling infrastructure quickly adapted for the purpose, or the new regions architecture when that lands.
RFC-0001-Agile Coretime.md
Outdated
5. The design MUST work with a limited set of resources (cores on the Polkadot UC) whose properties and number may evolve over time.
6. The design MUST avoid creating additional dependency on functionality which the Relay-chain need not strictly provide for the delivery of the Polkadot UC. This includes any dependency on the Relay-chain hosting a DOT token.

Furthermore, the design SHOULD be implementable and deployable in a timely fashion; three months from the acceptance of this RFC would seem reasonable.
This seems optimistic, especially on the "deployable" part given the Root track takes one month and would eat into a third of this. Perhaps deployable to testnet with concrete migration path proposed for existing parachains?
Wouldn't the fellowship be able to whitelist this upgrade?
Yes but the track is still 28 days. And even though it is less restrictive to pass earlier on that track, a lot of parachain teams have expressed that they prefer a set block number (i.e. `At` over `After`) for runtime upgrades so that they can prepare for any breaking changes.
We'll see. As long as the Relay-chain support for core-assignment exists, then this really shouldn't be a big change. Three months should be notionally possible, even if it ends up being missed due to factors outside of the scope of this RFC.
From the Renewals section:
Note containing the same [set of parachains]. This prevents transfer using proxies. Transfer of regions would indeed lose renewal rights since the price information would be dropped. This is intentional. The point of renewals isn't to attempt to give as many entities as possible a discount for the next month: it's to ensure that committed teams get some guarantees about price for predicting future costs.
RFC-0001-Agile Coretime.md
Outdated
The present system of allocating time for parachains on the cores of the Polkadot Ubiquitous Computer (aka "Polkadot") is through a process known as *slot auctions*. These are on-chain candle auctions which proceed for several days and result in a core being assigned to a single parachain for six months at a time up to 18 months in advance. Practically speaking, we only see two-year periods being bid upon and leased.

Funds behind the bids made in the slot auctions are merely locked, not consumed or paid and become unlocked and returned to the bidder on expiry of the lease period. A means of sharing the deposit trustlessly known as a *crowdloan* is available allowing token holders to contribute to the overall deposit of a chain without any counterparty risk.
Funds behind the bids made in the slot auctions are merely locked, not consumed or paid
This wording and multiple other references to sales/purchase suggest that Bulk coretime will be paid for in DOT, in contrast to the current system, where DOT is simply locked and the "cost" is the opportunity cost of not staking. It should be explicit if this is the case.
It would also be useful to understand - if this is the case - where those DOT are sent, i.e. who is paid? Is it validators? Is it burned? If validators, would high reward rates through demand for blockspace impact inflation that currently forms the vast majority of their revenue?
Bulk coretime will be paid for in DOT
That's right.
Any DOT recuperated for sales of system resources (Coretime, in this case) would by default be placed in the treasury. Governance would be able to determine what to do with it.
Maybe it's better to burn the DOT instead of diverting it to the Treasury. There are a few reasons for that:
- In the absence of 2y locks on DOT, the system might benefit from a permanent sink for DOT. We also might consider increasing the ideal staking rate. Non-interactive staking might serve well here, too.
- Inflow from coretime usage might, especially in the short-term, be very volatile. The Treasury conceptually benefits from predictable inflow, allowing for long-term budgeting. Inflation is the best way to do that. We'd counter that with burning for coretime.
- This mechanism would lead to high inflow in times with high coretime usage and low inflow in times of low usage. It seems to me that, if anything, it should be the opposite. With a steady inflow the Treasury would always have enough funds to respond to demand shocks in coretime when necessary (by funding good projects / initiatives).
I totally agree with @jonasW3F: DOT from Coretime sales should be instantly burnt in order to slow down inflation.
I have no objection, though I would prefer to leave the specific economics (including this and the price adaptation) for other RFCs so that we can document the motivations properly. The implementation of RFC-1 can (and indeed does) just provide an `OnUnbalanced<Credit>` endpoint which can just as easily burn as send to the treasury.
@jonasW3F perhaps you can write a short RFC expanding out your points above.
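As a rough picture of why a single handler endpoint leaves the burn-versus-treasury question open, consider the sketch below. `RevenueHandler` is a simplified stand-in written for this example and is not the real `OnUnbalanced` trait; `Burn`, `ToTreasury`, and `settle_sale` are likewise hypothetical names.

```rust
type Balance = u128;

/// Simplified stand-in for an imbalance handler. This is NOT the
/// actual `OnUnbalanced` trait, only an illustration of the pattern.
trait RevenueHandler {
    fn on_revenue(&mut self, amount: Balance);
}

/// Handler that destroys the revenue and only records how much was burned.
#[derive(Default)]
struct Burn {
    total_burned: Balance,
}

impl RevenueHandler for Burn {
    fn on_revenue(&mut self, amount: Balance) {
        self.total_burned += amount;
    }
}

/// Handler that credits the revenue to a treasury balance.
#[derive(Default)]
struct ToTreasury {
    treasury_balance: Balance,
}

impl RevenueHandler for ToTreasury {
    fn on_revenue(&mut self, amount: Balance) {
        self.treasury_balance += amount;
    }
}

/// The sale logic hands the proceeds to whichever handler is configured
/// and does not need to know what happens to them afterwards.
fn settle_sale<H: RevenueHandler>(handler: &mut H, price: Balance) {
    handler.on_revenue(price);
}
```

In this shape, switching from treasury funding to burning is a configuration choice rather than a change to the sale logic, which is why the economics can be settled in a separate RFC.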
@gavofyork you write:
I think this point perhaps wasn't clear enough. Thanks for clarifying that this is implemented by having |
Co-authored-by: Keith Yeung
Co-authored-by: joe petrowski
Co-authored-by: asynchronous rob
…ows/RFCs into gav-agile-coretime
Co-authored-by: Bastian Köcher
Very good read!
@@ -167,7 +167,7 @@ The Sale Price varies during an initial portion of the Purchasing Period called

At any time when there are remaining Regions of Bulk Coretime to be sold, *including during the Interlude Period*, then certain Bulk Coretime assignments may be *Renewed*. This is similar to a purchase in that funds must be paid and it consumes one of the Regions of Bulk Coretime which would otherwise be placed for purchase. However there are two key differences.

-Firstly, the price paid is exactly `RENEWAL_PRICE_CAP` more than what the purchase/renewal price was in the previous sale.
+Firstly, the price paid is the minimum of `RENEWAL_PRICE_CAP` more than what the purchase/renewal price was in the previous renewal and the current (or initial, if yet to begin) regular Sale Price.
Thank you for incorporating this change! Simplifies the purchase strategy that Centrifuge would choose significantly and increases certainty over core availability.
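The revised rule reduces to a single `min`. The sketch below assumes, for illustration only, that `RENEWAL_PRICE_CAP` is a proportional increase expressed in parts per million and set to 2%; the constant name `RENEWAL_PRICE_CAP_PPM`, its value, and the `renewal_price` helper are not taken from the RFC.

```rust
/// Balance type used for prices in this sketch (an assumption).
type Balance = u128;

/// Hypothetical cap on the per-renewal price increase: 2%, in parts per million.
const RENEWAL_PRICE_CAP_PPM: Balance = 20_000;
const PPM: Balance = 1_000_000;

/// Renewal price under the revised wording: the lesser of the previous
/// price raised by the cap and the current regular Sale Price.
fn renewal_price(previous_price: Balance, current_sale_price: Balance) -> Balance {
    let increase = previous_price.saturating_mul(RENEWAL_PRICE_CAP_PPM) / PPM;
    let capped = previous_price.saturating_add(increase);
    capped.min(current_sale_price)
}
```

Under these assumptions, a renewer who paid 1000 last period pays 1020 when the market price has risen well above that, and simply pays the market price when it has fallen below the capped figure, so renewing is never more expensive than buying afresh.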
type CoreMask = [u8; 10]; // 80-bit bitmap.

// 128-bit (16 bytes)
struct RegionId {
as an implementation note, out-of-scope for the RFC, but these datatypes depend on using a packed representation as opposed to standard alignments if they're meant to have these exact sizes in memory.
in SCALE encoding they should be packed automatically.
I did make a test to check the size (at least on my 64-bit M1 architecture) and it was indeed 128-bit as expected without any explicit packing.
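A check along these lines might look like the sketch below. The body of `RegionId` is not shown in the excerpt above, so the field names and the `Timeslice = u32` and `CoreIndex = u16` aliases are assumptions made for this example. With that layout the fields sum to exactly 16 bytes with no padding, which would explain why no explicit packing is needed.

```rust
use std::mem::{align_of, size_of};

// Assumed aliases; only `CoreMask` appears in the excerpt above.
type Timeslice = u32;
type CoreIndex = u16;
type CoreMask = [u8; 10]; // 80-bit bitmap.

/// Assumed field layout: 4 + 2 + 10 = 16 bytes.
/// `repr(C)` fixes the field order so the offsets are predictable;
/// it does not pack the struct.
#[repr(C)]
#[allow(dead_code)]
struct RegionId {
    begin: Timeslice,
    core: CoreIndex,
    mask: CoreMask,
}

/// In-memory size of `RegionId` in bytes.
fn region_id_size() -> usize {
    size_of::<RegionId>()
}

/// Alignment of `RegionId` in bytes (driven by the `u32` field).
fn region_id_align() -> usize {
    align_of::<RegionId>()
}
```

If a later field change pushed the total past a multiple of the alignment, padding would appear and the in-memory size would grow, so a size assertion in the test suite is a cheap guard. The SCALE-encoded size is a separate matter from the in-memory layout.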
This proposes a periodic, sale-based method for assigning Polkadot Coretime. The method takes into account the need for long-term capital expenditure planning for teams building on Polkadot, yet also provides a means to allow Polkadot to capture long-term value in the resource which it sells. It supports the possibility of building secondary markets to make resource allocation more efficient and largely avoids the need for parameterisation.
Implementation: paritytech/substrate#14568