Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nodes shouldn't publish their commitment when receiving outdated channel_reestablish #934

Open
t-bast opened this issue Nov 8, 2021 · 22 comments

Comments

@t-bast
Copy link
Collaborator

t-bast commented Nov 8, 2021

This was discussed during the spec meeting and is weakly linked to #932

Let's imagine we have two nodes Alice and Bob.
Alice takes her node offline, does weird stuff with her DB (e.g. a migration from sqlite to postgres) then comes back online.
She sends a channel_reestablish to Bob with outdated values (because she messed up her migration).

Expected behavior

Bob detects that Alice is late, so Bob will likely need to publish his latest commitment to help Alice get her funds back.
Bob waits for Alice to send an error before publishing his commitment.

When Alice receives Bob's channel_reestablish, she realizes she's late.
She stops her node (without sending an error), figures out where she messed up in her migration, fixes her DB, restarts.
Now she sends a channel_reestablish with the up-to-date value, so the channel can resume operating.

Non optimal behavior

Bob detects that Alice is late, so Bob publishes his latest commitment to help Alice.
But now Alice lost her chance to fix her DB.
If Alice is a big node with a ton of channels, she just lost a ton of money on on-chain fees...

Conclusion

Implementations should really wait to receive an error before publishing their commitment!
@Roasbeef is that clearer than during the spec meeting?

EDIT: the spec was clarified in #942 to highlight that this is the desired behavior.

@rustyrussell
Copy link
Collaborator

OK, I looked at this for the latest c-lightning release. Changing behavior to not close when sending error was a big change this close to release, but there is an alternative: send a warning instead on an error here:

BOLT 2:
	 *  - otherwise:
	 *    - if `next_revocation_number` is not equal to 1 greater
	 *      than the commitment number of the last `revoke_and_ack` the
	 *      receiving node has sent:
	 *      - SHOULD fail the channel.

This works for this particular case, at least.

rustyrussell added a commit to rustyrussell/lightning that referenced this issue Apr 1, 2022
This is the minimal change to meet the desired outcome of lightning/bolts#934
which wants to give obsolete-db nodes a chance to fix things up, before we
close the channel.

We need to dance around a bit here, since we *will* close the channel if
we receive an ERROR, so we suppress that.

Signed-off-by: Rusty Russell <[email protected]>
@vincenzopalazzo
Copy link
Contributor

There is another thing that is unclear to me! what does the recipient need to do in the case when he receives the warning message?

Right now, the spec (with warning message proposed by @rustyrussell ) gives the possibility to know where there is some error, but what about the following case:

  • Sender -> outdated channel_reestablish -> receiver
  • Sender <- warning <- receiver
  • Sender -> another outdated channel_reestablish -> receiver

In this case, the sender should have only one choice, or send the right channel_reestablish or he should be able to force close the channel, right?

because in che case of warning message we can fall in a ping pong case, where the sender continue to send wrong data, and the receive continue to response with warnings!

@TheBlueMatt
Copy link
Collaborator

That isn't something the spec should really concern itself with - the node that realized its out of date has to do something, but what that is is up to the implementation - it can suggest the user search for a newer state on drive somewhere, it can ask the peer to force-close (using an error message) or it can throw up its hands and light the cat on fire.

@vincenzopalazzo
Copy link
Contributor

Ok! this is clear, but in the spec should be something that should avoid the ping point event? In other words, the sender can put another wrong channel_reestablish? and in this case, the receiver should force close the channel?

Or this is something that will never happen?

@vincenzopalazzo
Copy link
Contributor

vincenzopalazzo commented Apr 29, 2022

After some reasoning over a discussion that @TheBlueMatt points out on LDK (lightningdevkit/rust-lightning#1430 (comment))

I think that I expressed my idea wrong here, and I agree that all this behavior should not specify in the spec, but in my opinion, here we should have a line that specifies "the sender should not send multiple times the wrong channel_restablish" because we know that if a node receives a wrong channel_reestablish it should send a channel_reestablish with the data updated.

What are the benefits of this addition? that we know when a node is not spec compliant, and we can detect this with a test in lnprototest. I mean what is the benefit to keep receiving them wrong channel_restablish?

On the other hand, this is possible from a software point of view, basically because the sender doesn't take into count this case and ignores the update data received with the sender channel_reestablish and due to a possible DB data lost for the send, the data that it sends are perfectly sane.

cdecker pushed a commit to rustyrussell/lightning that referenced this issue Jul 19, 2022
This is the minimal change to meet the desired outcome of lightning/bolts#934
which wants to give obsolete-db nodes a chance to fix things up, before we
close the channel.

We need to dance around a bit here, since we *will* close the channel if
we receive an ERROR, so we suppress that.

Signed-off-by: Rusty Russell <[email protected]>
cdecker pushed a commit to rustyrussell/lightning that referenced this issue Sep 19, 2022
This is the minimal change to meet the desired outcome of lightning/bolts#934
which wants to give obsolete-db nodes a chance to fix things up, before we
close the channel.

We need to dance around a bit here, since we *will* close the channel if
we receive an ERROR, so we suppress that.

Signed-off-by: Rusty Russell <[email protected]>
cdecker pushed a commit to rustyrussell/lightning that referenced this issue Sep 21, 2022
This is the minimal change to meet the desired outcome of lightning/bolts#934
which wants to give obsolete-db nodes a chance to fix things up, before we
close the channel.

We need to dance around a bit here, since we *will* close the channel if
we receive an ERROR, so we suppress that.

Signed-off-by: Rusty Russell <[email protected]>
cdecker pushed a commit to ElementsProject/lightning that referenced this issue Sep 26, 2022
This is the minimal change to meet the desired outcome of lightning/bolts#934
which wants to give obsolete-db nodes a chance to fix things up, before we
close the channel.

We need to dance around a bit here, since we *will* close the channel if
we receive an ERROR, so we suppress that.

Signed-off-by: Rusty Russell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants