Add epoch number to Tendermint light client, alter block height representation #439
Comments
Hmm this sounds good to me. There's one case where I'd like clarification. Suppose a chain B will upgrade from epoch Then a proof of absence at height Is the above understanding correct? |
Yes, this is possible.
Yes, because |
I am quite against continuing to support this "chop block height to zero and start new 'successor' chain" approach in the IBC era. This made sense for early software, but given that 0.38 includes a live upgrade module tested on various testnets, and I believe there is also an option to "dump state and restart" without resetting the height to zero (changing state, but leaving the block history intact), I see no reason to support "reset height to zero and assume IBC will keep working". My opinion is that anyone who wants to run an IBC-connected blockchain should provide some basic guarantees to other chains, including not rolling back state. I also know of no other blockchains besides Cosmos SDK chains where such upgrade paths are considered normal, so I really don't think this would cause too much inconvenience. |
I also think we want to discourage this, but I would still rather support it cleanly than make it unnecessarily difficult.
Yes, however, several (non-Tendermint) consensus algorithms do use epoch numbers as part of normal operations, and it would be helpful to be able to incorporate them more easily into the IBC framework. |
It would be good to add some references to those algorithms (and the epoch/height issues), to ensure this change solves their issues as well. I agree, this should support as many BFT algorithms as possible. |
A quick survey of alternative well specified consensus algorithms.
|
@zmanian Thank you for the summary. Can you link to the description of those algorithms? It would be good to look at the details. Also curious about Ava / Avalanche consensus which seems to be well-specified BFT |
Is it really I agree in general with having the Height be a more abstract thing (since it might be needed for certain other chains), but I wonder if we should really change the Tendermint client, especially if in the future for most Tendermint chains the epoch will never change ... |
During the planned test upgrade of Stargate testnet, all Tendermint light clients will change epoch. So we will be doing this at least once in a high profile setting. |
The particular problem is that force-closing channels may lead to isolated state, e.g. stuck tokens in ICS 20, which cannot be recovered without governance intervention (maybe that needs to happen as part of the upgrade). We could decide to accept this cost if we wanted to, though. |
I think it would be awesome if you did the "live" / "in place" upgrade inside the Stargate testnet (once you are on Stargate you can upgrade to another post-Stargate point without the reset). Since @aaronc is leading up the SDK for Stargate and was the lead architect of this live upgrade, you should have plenty of support, plus the Regen team (and Chorus One and VitWit) have run this many times in testnets. I am not saying we shouldn't add such epochs, but asking for the use of the best possible design in testnets. If there are backwards-incompatible breaking changes in Tendermint post-Stargate, yes, the chains will be forced to use "dump state and restart" or "enter a new epoch". |
At the moment, Tendermint does not support non-zero-height restarts, and that support is not expected in the next release, so we will need epoch numbers to pull off such an upgrade. (as far as I know, maybe @marbar3778 can confirm) |
this is correct. It is slated as a followup once 0.34 is released. There is refactoring that would need to happen at the same time. Here is the issue with the outline: tendermint/tendermint#4767 |
@cwgoes the upgrade I discussed with Regen was done with backports on the 0.34 line... and definitely should work on Stargate. This never shuts down the chain, just coordinates changing the binary (and running a migration) at a specific block. But good to know that "dump state and restart" still cannot handle non-zero heights (I thought this was working already) |
Motivation
At present, upgrades to the Tendermint light client which reset the height to zero will screw up timeouts unless either:
Neither of these options is ideal.
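To make the failure mode concrete, here is a minimal Python sketch. It assumes a timeout is a plain integer block height and that a packet counts as timed out once the chain's reported height reaches it. The function name `has_timed_out` and all the numbers are hypothetical, chosen only for illustration.

```python
def has_timed_out(current_height: int, timeout_height: int) -> bool:
    """Hypothetical check: timed out once the chain reaches timeout_height."""
    return current_height >= timeout_height


# A packet is sent while the chain is at height 9_000,
# with a timeout 100 blocks in the future.
timeout_height = 9_100

# The chain upgrades at height 9_050 and resets its height to zero.
# 150 blocks later it reports height 150, so 200 blocks have passed
# since the packet was sent, twice the intended timeout window.
height_after_reset = 150

# The plain integer comparison still says the packet has not timed out,
# and it will keep saying so until the new chain itself reaches 9_100.
still_pending = not has_timed_out(height_after_reset, timeout_height)
```

Under these assumptions `still_pending` is true even though the intended timeout window has long passed, which is the problem the proposal below addresses.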
Proposal
Change the "block height" used throughout the specification from a concrete `uint64` (unsigned integer) to an abstract ordered set:
Add an epoch number to the Tendermint light client height, and represent it as:
Upgrades which reset the block height in Tendermint, represented as `epochHeight` here, can then increment the `epochNumber`, and timeouts will continue to work. If the upgrade is scheduled in advance, users can set timeouts with a later epoch number than the current epoch if they want the expected time of the timeout to be after the upgrade.
Timestamp-based timeouts should continue to work unaffected as long as the timestamp never decreases between upgrades.
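As a rough illustration of how an epoch-aware height could behave, the Python sketch below assumes heights are compared lexicographically: first by epoch number, then by height within the epoch. That ordering is one reading of "abstract ordered set", not something this issue spells out. The `Height` class, its snake_case field names (mirroring `epochNumber` and `epochHeight`), and `has_timed_out` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Height:
    # Field order matters: order=True compares epoch_number first,
    # then epoch_height, which gives the assumed lexicographic ordering.
    epoch_number: int
    epoch_height: int


def has_timed_out(current: Height, timeout: Height) -> bool:
    """Hypothetical check: timed out once the chain reaches the timeout height."""
    return current >= timeout


before_upgrade = Height(epoch_number=0, epoch_height=9_000)
# The upgrade resets the in-epoch height and increments the epoch number.
after_upgrade = Height(epoch_number=1, epoch_height=150)

# A timeout set in the old epoch has passed as soon as the epoch changes.
timeout_same_epoch = Height(epoch_number=0, epoch_height=9_100)

# A user who knows about the scheduled upgrade can target the next epoch.
timeout_next_epoch = Height(epoch_number=1, epoch_height=500)
```

With this ordering, any height in epoch 1 is later than every height in epoch 0, so a timeout set before the upgrade is not left pending after the reset, while a timeout that names the next epoch only fires once the upgraded chain reaches it.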
This will require changes wherever a concrete `uint64` height is used throughout the spec, but conceptually it's pretty simple.