Upgrade Refactors #999

AdityaSripal · 2023-07-06T15:23:16Z

This PR makes the following changes:

Remove the notion of upgrade states and instead only add FLUSHING and FLUSHCOMPLETE to the state enum
Make INITUpgrade authority gated
Start timers on TRY and ACK messages
Add ChanUpgradeConfirm to ensure both sides are aware of counterparty upgrade timeout before moving to FLUSHCOMPLETE
Add packet processing logic
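
As a rough illustration of the first item, the channel state set after this PR might look like the sketch below. The names FLUSHING and FLUSHCOMPLETE follow the diffs quoted later in this thread; INIT, TRYOPEN, OPEN and CLOSED are assumed to be the pre-existing ICS-04 states, and `isUpgrading` is a hypothetical helper that is not part of the spec.

```typescript
// Sketch only, not the spec's definition. FLUSHING and FLUSHCOMPLETE are the
// two states this PR adds; the other four are assumed to already exist.
type ChannelState =
  | "INIT"
  | "TRYOPEN"
  | "OPEN"
  | "CLOSED"
  | "FLUSHING"
  | "FLUSHCOMPLETE";

// Hypothetical helper: true while an upgrade handshake is in progress.
function isUpgrading(state: ChannelState): boolean {
  return state === "FLUSHING" || state === "FLUSHCOMPLETE";
}
```

With separate upgrade states removed, an in-progress upgrade is visible directly from the channel state.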

crodriguezvega

I know this is still in draft, but I gave the PR a first read and I noticed one thing about the storage of upgrade timeout...

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

* refactor and fix bugs
* fix packet flushing logic
* remove unnecessary states

crodriguezvega

Thanks @AdityaSripal. I left some comments/questions.

There are still some places in the spec where the old states (TRYUPGRADE, ACKUPGRADE) and FlushStatus are mentioned.

spec/core/ics-004-channel-and-packet-semantics/README.md

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

colin-axner

Looking excellent! Only reviewed up to the confirm step so far. Will continue later today. Only two issues which I noted offline:

need to ensure counterparty upgrade timeout is stored even when we are in flushing
explicitly handle crossing hellos case in try step

spec/core/ics-004-channel-and-packet-semantics/README.md

colin-axner · 2023-07-26T10:09:53Z

spec/core/ics-004-channel-and-packet-semantics/README.md

    abortTransactionUnless(channel !== null)
-    abortTransactionUnless(channel.state !== CLOSED && channel.flushStatus === NOTINFLUSH)
+    abortTransactionUnless(channel.state === OPEN)


This is disabling optimistic sends. Is this intentional? Maybe it should be in a separate PR? The previous behaviour before the channel upgradability changes was channel.state !== CLOSED

Yes, it's necessary now but wasn't done before. I suppose I could do it in a separate PR and then resolve the merge conflict.
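
To make the behavioural difference concrete, here is a minimal sketch comparing the pre-upgradability check (`channel.state !== CLOSED`) with the check in this diff (`channel.state === OPEN`). The function names and the list of states are illustrative assumptions; only the two conditions come from the thread.

```typescript
type ChannelState = "INIT" | "TRYOPEN" | "OPEN" | "CLOSED" | "FLUSHING" | "FLUSHCOMPLETE";

// Pre-upgradability condition quoted in the review: anything but CLOSED may send.
function canSendBefore(state: ChannelState): boolean {
  return state !== "CLOSED";
}

// Condition introduced by this diff: only an OPEN channel may send.
function canSendAfter(state: ChannelState): boolean {
  return state === "OPEN";
}

const allStates: ChannelState[] = ["INIT", "TRYOPEN", "OPEN", "CLOSED", "FLUSHING", "FLUSHCOMPLETE"];

// States that could send under the old condition but cannot under the new one.
const newlyBlocked: ChannelState[] = allStates.filter(
  (s) => canSendBefore(s) && !canSendAfter(s),
);
```

Under these assumptions `newlyBlocked` contains the handshake states INIT and TRYOPEN (the optimistic sends mentioned above) as well as the two flushing states.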

spec/core/ics-004-channel-and-packet-semantics/README.md

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

colin-axner

LGTM

Only recommendation would be to have startFlushUpgradeHandshake only abort (no restores)

Oh and see this #999 (comment)

spec/core/ics-004-channel-and-packet-semantics/README.md

colin-axner · 2023-07-26T14:23:51Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

+    // abort the transaction if the callback returns an error and
+    // there was no existing upgrade. This will allow the counterparty upgrade


Suggested change

-    // abort the transaction if the callback returns an error and
-    // there was no existing upgrade. This will allow the counterparty upgrade
+    // abort the transaction if the callback returns an error
+    // This will allow the counterparty upgrade

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

colin-axner · 2023-07-26T14:39:17Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

-    // counterparty channel must be proved to still be in OPEN state or INITUPGRADE state (crossing hellos)
-    abortTransactionUnless(counterpartyChannel.State === OPEN || counterpartyChannel.State == INITUPGRADE)
+    // counterparty channel must be proved to not have completed flushing after timeout has passed
+    abortTransactionUnless(counterpartyChannel.state !== OPEN || counterpartyChannel.state == FLUSHCOMPLETE)


I think this check needs to be modified. This should only state that counterpartyChannel must be in any state except FLUSHCOMPLETE?

It could have completed the handshake and gone to OPEN. Added more logic to check whether this did indeed happen.

see latest comment on this #999 (comment)

colin-axner · 2023-07-26T14:40:42Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

-    abortTransactionUnless(counterpartyChannel.State === OPEN || counterpartyChannel.State == INITUPGRADE)
+    // counterparty channel must be proved to not have completed flushing after timeout has passed
+    abortTransactionUnless(counterpartyChannel.state !== OPEN || counterpartyChannel.state == FLUSHCOMPLETE)
+    abortTransactionUnless(counterpartyChannel.sequence === currentChannel.sequence)


Just noting, in ibc-go we would need a proof guarantee. That is, don't allow an INIT if a channel has been restored/upgraded in the same block

colin-axner · 2023-07-26T14:43:35Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

@@ -855,12 +885,12 @@ function timeoutChannelUpgrade(
 }
 ```

-Note that the timeout logic only applies to the INIT step. This is to protect an upgrading chain from being stuck in a non-OPEN state if the counterparty cannot execute the TRY successfully. Once the TRY step succeeds, then both sides are guaranteed to have the upgrade feature enabled. Liveness is no longer an issue, because we can wait until liveness is restored to execute the ACK step which will move the channel definitely into an OPEN state (either a successful upgrade or a rollback).
+Both parties must not complete the upgrade handshake if the counterparty upgrade timeout has already passed. Even if both sides could have successfully moved to FLUSHCOMPLETE. This will prevent the channel ends from reaching incompatible states.


minor nit: if the timeout has passed, but each chain moved to FLUSHCOMPLETE before it passed, that's fine
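
The distinction in this nit can be sketched as follows. The timeout shape (a height and a timestamp, with 0 meaning unset), the `>=` comparison, and both function names are assumptions made for illustration, not taken from the spec.

```typescript
// Assumed shape: 0 means the corresponding bound is not set.
interface UpgradeTimeout {
  height: number;
  timestamp: number;
}

// Assumption: a bound counts as passed once the current value reaches it.
function hasPassed(timeout: UpgradeTimeout, currentHeight: number, currentTimestamp: number): boolean {
  const heightPassed = timeout.height !== 0 && currentHeight >= timeout.height;
  const timestampPassed = timeout.timestamp !== 0 && currentTimestamp >= timeout.timestamp;
  return heightPassed || timestampPassed;
}

type FlushState = "FLUSHING" | "FLUSHCOMPLETE";

// Hypothetical check: a channel end that already reached FLUSHCOMPLETE is
// unaffected by a later expiry; one still FLUSHING must not complete after it.
function mayProceed(
  state: FlushState,
  counterpartyTimeout: UpgradeTimeout,
  currentHeight: number,
  currentTimestamp: number,
): boolean {
  if (state === "FLUSHCOMPLETE") {
    return true;
  }
  return !hasPassed(counterpartyTimeout, currentHeight, currentTimestamp);
}
```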

colin-axner · 2023-07-26T14:45:07Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md


-The TRY chain will receive the timeout parameters chosen by the counterparty on INIT, so that it can reject any TRY message that is received after the specified timeout. This prevents the handshake from entering into an invalid state, in which the INIT chain processes a timeout successfully and restores its channel to `OPEN` while the TRY chain at a later point successfully writes a `TRY` state.
+Note that a channel upgrade handshake may never complete successfully if the in-flight packets cannot successfully be cleared. This can happen if the timeout value of a packet is too large, or an acknowledgement never arrives, or if there is a bug that makes acknowledging or timing out a packet impossible. In these cases, some out-of-protocol mechanism (e.g. governance) must step in to clear the packets "manually" perhaps by forcefully clearing the packet commitments before restarting the upgrade handshake.


must step in to clear the packets "manually" perhaps by forcefully clearing the packet commitments before restarting the upgrade handshake.

historical proofs scare me with this statement 😱 Maybe we can leave off any suggestion until the situation arises

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

colin-axner

I like all the changes! I just have some confusion about how timeouts should work. Also, if we write an error receipt for a lower counterparty sequence, I think the cancellation handler should be updated to only fast forward the sequence (and not delete the upgrade) in that situation

colin-axner · 2023-07-27T09:39:22Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

    }

    // connectionHops can change in a channelUpgrade, however both sides must still be each other's counterparty.
-    proposedConnection = provableStore.get(connectionPath(proposedUpgradeFields.connectionHops[0])
+    // since connection hops may be provided by relayer, we will abort to avoid changing state based on relayer-provided value


I think this comment can be removed?

colin-axner · 2023-07-27T09:40:28Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

+// it will set the channel to desiredChannel state and move to flushing mode if we are not already in flushing mode
+// it will store the upgrade timeout in hte upgrade state


This can be updated

colin-axner · 2023-07-27T09:42:52Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md


-    currentChannel.state = desiredChannelState
-    currentChannel.flushState = FLUSHING
+`startFlushUpgradeHandshake` will set the counterparty last packet send and continue blocking the upgrade from continuing until all in-flight packets have been flushed. When the channel is in blocked mode, any packet receive above the counterparty last packet send will be rejected. It will set the channel state to `FLUSHING` and block `sendPackets`. During this time; `receivePacket`, `acknowledgePacket` and `timeoutPacket` will still be allowed and processed according to the original channel parameters. The state machine will set a timer for how long the other side can take before it completes flushing and moves to `FLUSHCOMPLETE`. The new proposed upgrade will be stored in the public store for counterparty verification.


super nit: feel like this could be called startFlushing
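
The receive rule described in the quoted paragraph (reject any packet receive above the counterparty last packet send while flushing) could be sketched like this. The function name, the numeric sequence type, and the restriction to the OPEN and FLUSHING states are assumptions for illustration.

```typescript
// Only the two states the quoted paragraph talks about are modelled here.
type ReceivingState = "OPEN" | "FLUSHING";

// Hypothetical check: while flushing, only packets the counterparty sent
// before it started flushing (sequence <= counterpartyLastPacketSent) pass.
function acceptsReceive(
  state: ReceivingState,
  packetSequence: number,
  counterpartyLastPacketSent: number,
): boolean {
  if (state === "FLUSHING") {
    return packetSequence <= counterpartyLastPacketSent;
  }
  return true; // OPEN: no upgrade-related restriction in this sketch
}
```

For example, with `counterpartyLastPacketSent = 10`, a flushing channel would accept sequence 10 and reject sequence 11.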

colin-axner · 2023-07-27T09:48:16Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

+    abortTransactionUnless(counterpartyChannel.state !== OPEN && counterpartyChannel.state !== FLUSHCOMPLETE)
+    // if counterparty channel state is OPEN, we should abort only if the counterparty has successfully completed upgrade
+    if counterpartyChannel.state === OPEN {


I don't understand. You wrote this:

    if counterpartyChannel.state == OPEN || counterpartyChannel.state == FLUSHCOMPLETE {
        return err
    }
    if counterpartyChannel.state == OPEN {
    }
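
A small sketch of why the second branch can never run as written: the abort already rejects OPEN, so the later `state === OPEN` test is dead code. `abortTransactionUnless` is modelled here as a plain throwing function, and the return strings are purely illustrative.

```typescript
type ChannelState = "INIT" | "TRYOPEN" | "OPEN" | "CLOSED" | "FLUSHING" | "FLUSHCOMPLETE";

// Stand-in for the spec's helper: abort by throwing.
function abortTransactionUnless(condition: boolean): void {
  if (!condition) {
    throw new Error("transaction aborted");
  }
}

// The two checks in the order they appear in the quoted diff.
function timeoutCheckAsWritten(state: ChannelState): string {
  abortTransactionUnless(state !== "OPEN" && state !== "FLUSHCOMPLETE");
  if (state === "OPEN") {
    return "open-branch"; // unreachable: OPEN was rejected on the line above
  }
  return "continued";
}

// Helper that converts an abort into a readable outcome.
function outcome(state: ChannelState): string {
  try {
    return timeoutCheckAsWritten(state);
  } catch {
    return "aborted";
  }
}
```

Running `outcome` over every state yields only "aborted" or "continued", which is the contradiction pointed out above.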

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

colin-axner · 2023-07-27T09:51:11Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

    }
+    abortTransactionUnless(counterpartyChannel.sequence === currentChannel.sequence)
+    abortTransactionUnless(verifyChannelState(connection, proofHeight, proofChannel, currentChannel.counterpartyPortIdentifier, currentChannel.counterpartyChannelIdentifier, counterpartyChannel))


A little odd you do proofs after the if statement. I feel like proof verification should happen first?

colin-axner · 2023-07-27T09:52:32Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

+            version: upgrade.fields.version,
+            sequence: currentChannel.sequence,
+        }
+        abortTransactionUnless(counterpartyChannel != upgradedChannel)


Okay so the logic here is that there's one specific counterparty channel state we don't want to restore on. Still feels a bit odd to prove the counterparty isn't in the upgraded state rather than proving the counterparty is still in the pre-upgrade state
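
One detail worth spelling out for implementers: in a language like TypeScript, `counterpartyChannel != upgradedChannel` on two objects compares references, so the comparison has to be done field by field. The sketch below assumes a channel end with the fields shown; the exact field list and the `channelsEqual` name are illustrative, not taken from the spec.

```typescript
// Assumed channel end shape for this sketch.
interface ChannelEnd {
  state: string;
  ordering: string;
  counterpartyPortIdentifier: string;
  counterpartyChannelIdentifier: string;
  connectionHops: string[];
  version: string;
  sequence: number;
}

// Field-by-field equality, including element-wise comparison of the hops.
function channelsEqual(a: ChannelEnd, b: ChannelEnd): boolean {
  return (
    a.state === b.state &&
    a.ordering === b.ordering &&
    a.counterpartyPortIdentifier === b.counterpartyPortIdentifier &&
    a.counterpartyChannelIdentifier === b.counterpartyChannelIdentifier &&
    a.version === b.version &&
    a.sequence === b.sequence &&
    a.connectionHops.length === b.connectionHops.length &&
    a.connectionHops.every((hop, i) => hop === b.connectionHops[i])
  );
}

// Two distinct objects describing the same upgraded channel (made-up values).
const upgradedChannel: ChannelEnd = {
  state: "OPEN",
  ordering: "UNORDERED",
  counterpartyPortIdentifier: "transfer",
  counterpartyChannelIdentifier: "channel-0",
  connectionHops: ["connection-1"],
  version: "v2",
  sequence: 1,
};
const provenChannel: ChannelEnd = {
  ...upgradedChannel,
  connectionHops: [...upgradedChannel.connectionHops],
};
```

Here `provenChannel !== upgradedChannel` holds as a reference comparison even though `channelsEqual(provenChannel, upgradedChannel)` is true, so a restore guarded only by the reference check would go ahead when it should abort.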

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

colin-axner · 2023-07-27T10:09:16Z

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

+    if counterpartyUpgradeSequence < channel.sequence {     
+        errorReceipt = ErrorReceipt{
+        channel.sequence - 1,
+        "sequence out of sync", // constant string changable by implementation
+        }
+        provableStore.set(channelUpgradeErrorPath(portIdentifier, channelIdentifier), errorReceipt)
+        return


If we are adding this, I think you need to handle this in the cancellation message to not delete the upgrade but just fast forward the sequence
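
For reference, the branch quoted in the diff can be modelled as below. The in-memory `Map` standing in for `provableStore`, the path format, and the function name `writeOutOfSyncReceipt` are assumptions; only the sequence comparison, the `channel.sequence - 1` value, and the message string come from the diff.

```typescript
interface ErrorReceipt {
  sequence: number;
  message: string;
}

interface Channel {
  sequence: number;
}

// In-memory stand-in for the provable store (assumption for this sketch).
const provableStore = new Map<string, ErrorReceipt>();

// Illustrative path format; the real path is defined by the spec.
function channelUpgradeErrorPath(portIdentifier: string, channelIdentifier: string): string {
  return `channelUpgrades/upgradeError/ports/${portIdentifier}/channels/${channelIdentifier}`;
}

// Writes a receipt when the counterparty's upgrade sequence is behind ours,
// mirroring the quoted diff; returns null when no receipt is needed.
function writeOutOfSyncReceipt(
  channel: Channel,
  counterpartyUpgradeSequence: number,
  portIdentifier: string,
  channelIdentifier: string,
): ErrorReceipt | null {
  if (counterpartyUpgradeSequence < channel.sequence) {
    const receipt: ErrorReceipt = {
      sequence: channel.sequence - 1,
      message: "sequence out of sync",
    };
    provableStore.set(channelUpgradeErrorPath(portIdentifier, channelIdentifier), receipt);
    return receipt;
  }
  return null;
}
```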

colin-axner · 2023-07-27T11:49:54Z

I think the app callbacks need to be more explicitly defined. Apps should ensure they are still processing on existing channel field information until the upgrade succeeds. Init/Try/Ack should probably just be used to validate proposed upgrade fields

colin-axner

ACK, noting I gave @AdityaSripal my approval for merge offline

AdityaSripal added 5 commits June 29, 2023 19:19

first pass at changes

49f1fde

progress

2415bee

add confirm step

1526dda

restore on timeout in ack

747614b

add set counterparty timeout in ack

b33c6ee

DimitrisJim mentioned this pull request Jul 13, 2023

ICS 04 (upgrades): Use counterparty connection hops when verifying channel state in UpgradeOpen. #1004

Closed

crodriguezvega reviewed Jul 18, 2023

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md (outdated)

fix docs

171ee4f

AdityaSripal marked this pull request as ready for review July 19, 2023 12:44

AdityaSripal requested review from mpoke, cwgoes, colin-axner and angbrav as code owners July 19, 2023 12:44

Upgrade refactor (#1006)

cd9f4cf

* refactor and fix bugs
* fix packet flushing logic
* remove unnecessary states

AdityaSripal changed the title ~~Extend upgrade timeout~~ Upgrade Refactors Jul 19, 2023

AdityaSripal added 2 commits July 19, 2023 18:15

fix packet procesing

52c1064

cleanup documentation

6fe02ca

charleenfei mentioned this pull request Jul 20, 2023

Handle upgrading channels in TimeoutPacket. cosmos/ibc-go#4138

Closed

9 tasks

crodriguezvega reviewed Jul 21, 2023

crodriguezvega reviewed Jul 24, 2023

spec/core/ics-004-channel-and-packet-semantics/README.md

crodriguezvega reviewed Jul 24, 2023

spec/core/ics-004-channel-and-packet-semantics/UPGRADES.md (outdated)

AdityaSripal added 3 commits July 25, 2023 12:41

add consideration

f939393

remove channel state and flush status references

a75b594

fix try error handling

2968957

colin-axner reviewed Jul 26, 2023

AdityaSripal added 3 commits July 26, 2023 14:01

fix packet processing bug

128bf26

change equality and reorder

168760c

address feedback:

e81dcea

colin-axner reviewed Jul 26, 2023

break issue

1f4f1cd

AdityaSripal added 2 commits July 26, 2023 18:38

fix timeout logic

1a6843a

refactor to simplify start flushing code

53df5e4

colin-axner reviewed Jul 27, 2023

crodriguezvega mentioned this pull request Jul 27, 2023

Check MsgChannelUpgradeInit is signed by authority cosmos/ibc-go#4186

Closed

AdityaSripal added 3 commits July 27, 2023 17:56

fix timeout logic

716f7cf

counterparty hops

98c5616

more explicit equality check

7f5bf68

AdityaSripal merged commit 9c842a4 into main Jul 27, 2023

AdityaSripal deleted the aditya/extend-upgrade-timeout branch July 27, 2023 16:06

crodriguezvega mentioned this pull request Jul 31, 2023

Consider wrapping ChanUpgradeTry call in cache ctx cosmos/ibc-go#3823

Closed

3 tasks

colin-axner reviewed Aug 2, 2023

damiannolan mentioned this pull request Aug 2, 2023

Remove restore logic in upgrade TRY handler cosmos/ibc-go#4239

Closed

3 tasks

DimitrisJim mentioned this pull request Aug 16, 2023

Update callback signatures for channel upgradability #1012

Closed
