Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server+contractcourt: create breachResolver to ensure htlc's are failed back #6158

Merged
merged 5 commits into from
Jan 26, 2022

Conversation

Crypt-iQ
Copy link
Collaborator

@Crypt-iQ Crypt-iQ commented Jan 13, 2022

This pull request fails back HTLCs if we detect our remote party has broadcasted a breach transaction. This is accomplished by launching a breachResolver like any other resolver. A nice bonus is that since the breach-lifecycle is now tracked in the channel arbitrator, we can tack on an anchorResolver and sweep the anchor output. This would also allow the breacharbiter code to be moved into this breachResolver in a later changeset.

Fixes #3494

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to finally resolve this cancel back issue, and also incrementally refactor things to let us bring this breach logic into a resolver, which was created only after the current breach arb was dreamed up!

Completed an initial pass, need to double check some assumptions here w.r.t restarts, and also handling cases where we didn't have a breach resolver for older nodes.

contractcourt/breach_resolver.go Show resolved Hide resolved
contractcourt/breach_resolver.go Show resolved Hide resolved
contractcourt/breach_resolver.go Outdated Show resolved Hide resolved
contractcourt/channel_arbitrator.go Show resolved Hide resolved
contractcourt/chain_watcher.go Show resolved Hide resolved
Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice set of improvements! Will do a second round to review the tests and to understand the impact, meanwhile got a few questions.

contractcourt/breach_resolver.go Show resolved Hide resolved
contractcourt/briefcase.go Show resolved Hide resolved
@@ -57,6 +57,11 @@ type RemoteUnilateralCloseInfo struct {
CommitSet CommitSet
}

// BreachResolution wraps the outpoint of the breached channel.
type BreachResolution struct {
FundingOutPoint wire.OutPoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the top bucket logScope already has the chanPoint info, seems like a bool can carry enough info here to make sure our db is compact?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, this FundingOutPoint does not need to be here since in the breachResolver it will have access to the chanpoint through the resolver kit. I didn't know what else to put here. What would the bool store? I don't believe we can just store a nil value in bbolt because I would have done this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't want to store this, but want to still store something useful, we could store the state number being breached.

Copy link
Collaborator Author

@Crypt-iQ Crypt-iQ Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking a version field in case the breachresolution needs to change when we upgrade it later?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could achieve that simply with a new TLV type?

Either way we need to settle on what we want to store now. Funding outpoint here is fine, just that it doesn't give us any other unique information fwiw.

We also have the option to use TLV within the serialization as well which gives us a flexible upgrade path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't have a better option here. Initially was thinking storing a 1 might be enough and save us some space. Since a breach rarely happens, I think it's fine to leave it as is.

Copy link
Collaborator Author

@Crypt-iQ Crypt-iQ Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it's fine to store as-is, I would have liked to have caught this earlier in the process, but oh well. Wouldn't a TLV where the value is the version be the same as just storing the version but with slightly more bytes?

contractcourt/chain_watcher.go Show resolved Hide resolved
}

// We'll set the ConfCommitKey here as the remote htlc set. This is
// only used to ensure a nil-pointer-dereference doesn't occur and is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got this doc right, this can be dangerous if we are setting a value just to get around the nil pointer issue. Shouldn't we just check against nil?

Copy link
Collaborator Author

@Crypt-iQ Crypt-iQ Jan 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't use nil without changing the way encodeHtlcSetKey works in InsertConfirmedCommitSet. InsertConfirmedCommitSet will need a ConfCommitKey when encoding the commit set. An alternative to this could be making a new ConfCommitKey since 3/4 are used we can "make" another one. IMO seems like more effort than is necessary as the callpath later on just uses RemoteHtlcSet in prepContractResolutions which calls constructChainActions which the breach case ignores. No codepath currently uses nil in the ConfCommitKey so it would only be dangerous if a code-path used nil.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct though AFAICT: we're handling a breach, which means the remote party force closed, which means we only have one main commitment to play.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Crypt-iQ thanks for the detailed explanation! My main concern is that we are setting a value where it shouldn't, as later on other callers may access it while it doesn't fulfill what it claims to be, thus increasing the cognitive load. Looks like it's the very ending of the code-path so I'd figure it should be fine, plus based on @Roasbeef 's comment I think the ConfCommitKey is indeed the RemoteHtlcSet?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The breach case is technically none of these ConfCommitKey - they track the current state of the commitment transactions (local, remote, remote pending) where the breach case is in the past. It is a remote commit tx, but some parts of the code in the channel arbitrator treats these commit keys as "current" to make decisions with so that's why I say the breach case is not really RemoteHtlcSet. After looking at the possible options I decided on the first one:

  • keep it as is in the pr
  • set nil and adjust all call-sites to handle nil, knowing it's the breach case
  • create a new confcommitkey that is the breach case
    • downside is that IsRemote would be false and IsPending would be true which is not the case and any code relying on IsRemote or IsPending may be messed up in the future
      var (
      // LocalHtlcSet is the HtlcSetKey used for local commitments.
      LocalHtlcSet = HtlcSetKey{IsRemote: false, IsPending: false}
      // RemoteHtlcSet is the HtlcSetKey used for remote commitments.
      RemoteHtlcSet = HtlcSetKey{IsRemote: true, IsPending: false}
      // RemotePendingHtlcSet is the HtlcSetKey used for dangling remote
      // commitment transactions.
      RemotePendingHtlcSet = HtlcSetKey{IsRemote: true, IsPending: true}
      )
  • go from the current scheme to something different

contractcourt/breach_resolver.go Show resolved Hide resolved
@@ -57,6 +57,11 @@ type RemoteUnilateralCloseInfo struct {
CommitSet CommitSet
}

// BreachResolution wraps the outpoint of the breached channel.
type BreachResolution struct {
FundingOutPoint wire.OutPoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't want to store this, but want to still store something useful, we could store the state number being breached.

}

// We'll set the ConfCommitKey here as the remote htlc set. This is
// only used to ensure a nil-pointer-dereference doesn't occur and is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct though AFAICT: we're handling a breach, which means the remote party force closed, which means we only have one main commitment to play.

@Crypt-iQ Crypt-iQ force-pushed the breach_resolver_cancel branch 2 times, most recently from 52a436f to e23fa57 Compare January 24, 2022 20:07
Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🪟

Working on a set up to be able to manually test some of the added backwards compatibility here.

func newBreachResolver(resCfg ResolverConfig) *breachResolver {
r := &breachResolver{
contractResolverKit: *newContractResolverKit(resCfg),
replyChan: make(chan struct{}),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could safely be buffered to a size of 1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted to close it, do you think it should be passed a struct{}{} instead?

@@ -57,6 +57,11 @@ type RemoteUnilateralCloseInfo struct {
CommitSet CommitSet
}

// BreachResolution wraps the outpoint of the breached channel.
type BreachResolution struct {
FundingOutPoint wire.OutPoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could achieve that simply with a new TLV type?

Either way we need to settle on what we want to store now. Funding outpoint here is fine, just that it doesn't give us any other unique information fwiw.

We also have the option to use TLV within the serialization as well which gives us a flexible upgrade path.

Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, just a few nits, also the state transition doc needs an update, feel free to copy this commit.

Meanwhile I think the changes made here have already been covered by our itest right?


// Otherwise since the channel point is not resolved, add a
// subscription. There can only be one subscription per channel point.
b.subscriptionsMtx.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just do b.Lock() here? Why do we need subscriptionsMtx since they are locking the same thing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, habit of mine to have a mutex with a map, will change

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

@@ -45,7 +49,7 @@ func (c *ContractResolutions) IsEmpty() bool {
return c.CommitResolution == nil &&
len(c.HtlcResolutions.IncomingHTLCs) == 0 &&
len(c.HtlcResolutions.OutgoingHTLCs) == 0 &&
c.AnchorResolution == nil
c.AnchorResolution == nil && c.BreachResolution == nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the godoc needs an update

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

@@ -1996,10 +2027,62 @@ func (c *ChannelArbitrator) prepContractResolutions(
commitHash := contractResolutions.CommitHash
failureMsg := &lnwire.FailPermanentChannelFailure{}

var htlcResolvers []ContractResolver
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order doesn't matter here right? Looks like we used to put anchor resolver at the end.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Order doesn't matter, caller just launches them

}

// We'll set the ConfCommitKey here as the remote htlc set. This is
// only used to ensure a nil-pointer-dereference doesn't occur and is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Crypt-iQ thanks for the detailed explanation! My main concern is that we are setting a value where it shouldn't, as later on other callers may access it while it doesn't fulfill what it claims to be, thus increasing the cognitive load. Looks like it's the very ending of the code-path so I'd figure it should be fine, plus based on @Roasbeef 's comment I think the ConfCommitKey is indeed the RemoteHtlcSet?

@@ -686,13 +704,95 @@ func TestChannelArbitratorBreachClose(t *testing.T) {
// It should start out in the default state.
chanArbCtx.AssertState(StateDefault)

outgoingIdx := uint64(2)

rHash1 := [lntypes.PreimageSize]byte{1, 2, 3}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we add some docs here saying we've created one outgoing and one incoming HTLCs on the remote commit and test that only the outgoing one is resolved? Can be helpful for future developers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

…iter

Introduces a breachResolver that subscribes to the breacharbiter to
determine if the final justice transaction has confirmed and can
clean itself up.
Also transitions to the proper state based on if this is a legacy
breach in the channel arbitrator or a modern breach with a resolver.
This also changes the chain_watcher and breacharbiter handoff. The
new logic ensures that the channel is only marked as pending closed
when the channel arbitrator has persisted the resolutions and commit
set.
@Crypt-iQ Crypt-iQ force-pushed the breach_resolver_cancel branch from e23fa57 to 64a51c1 Compare January 25, 2022 20:43
Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice path on this one, LGTM👍

A total off-topic Q, looks like the commits are not signed...?

@Roasbeef Roasbeef merged commit a57c650 into lightningnetwork:master Jan 26, 2022
@Crypt-iQ Crypt-iQ deleted the breach_resolver_cancel branch January 26, 2022 15:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cnct: handle breach in contract resolvers
3 participants