Beat [3/4]: prepare resolvers to handle the blockbeat
#9276
base: yy-feature-blockbeat
Conversation
Minor point, it isn't inheritance, it's composition, with the inner resolver replacing the outer one. Go technically doesn't have inheritance at all.
I didn't review the first two PRs, so one thing I'm missing context on is: why does `Resolve` need to be split up in order to be incorporated into the new blockbeat architecture? Is it that, since we don't know which resolvers will be launched ahead of time, we can't set up the blockbeat consumer DAG upfront?
contractcourt/contract_resolver.go (Outdated)

```go
	prefix)

	// If the ShortChanID is empty, we use the ChannelPoint instead.
	if r.ShortChanID.IsDefault() {
```
But the scid isn't used above?
Yikes, yeah, removed since the SCID isn't that useful for lookups.
```diff
@@ -304,23 +247,14 @@ func (h *htlcSuccessResolver) resolveRemoteCommitOutput() (
 		return nil, err
 	}

 	// Once the transaction has received a sufficient number of
 	// confirmations, we'll mark ourselves as fully resolved and exit.
 	h.resolved = true
```
Is this attribute no longer used? IIRC, it has a role for the subset of outputs that are still sent to the nursery.
yeah it's moved into `checkpointClaim`
```diff
@@ -373,6 +307,7 @@ func (h *htlcSuccessResolver) checkpointClaim(spendTx *chainhash.Hash,
 	}

 	// Finally, we checkpoint the resolver with our report(s).
 	h.resolved = true
```
Ah nvm I see it here restored. Though IIRC, with the current control flow, we shouldn't do this unconditionally.
Yeah, the slight difference is we put it in `checkpointClaim`, after we've done `NotifyFinalHtlcEvent`, and move it closer to `Checkpoint` as that's where we write to disk.
```go
		return err
	}

	h.outputIncubating = true
```
Same comment here re this state value. I do think it's possible to make the resolvers 100% stateless, by relying on chain notifications to figure out whether something has been mined or not. The nursery might need to be queried distinctly, though.
Yes! I also realized this after reviewing #9062. What's more, because the sweeper now handles re-offered inputs by first checking the db and mempool to grab the existing tx, it means it's safe to re-offer the same inputs during startup. And since deep down the nursery also uses the sweeper, it means we can easily make all the resolvers stateless.
Atm another usage of the `outputIncubating` is to decide whether we should resolve the stage one or stage two HTLC, which can also be changed to rely on chain notifications. So the changes would be something like,

- subscribe to the htlc tx spend at the very beginning, when initializing the resolver
- when we `Resolve` it, if the htlc tx is already spent, subscribe to the spend of the htlc tx's output
  - if that output is spent, we are done here
  - otherwise, wait for the spend to resolve the output

Basically we just outsource the info provided by `outputIncubating` to chain notifications, further simplifying the resolvers here. The only part I'm not sure about is the neutrino backend: since there's no mempool, it can only rely on the sweeper's store to decide whether an input is new or not, which means we may need to store more info there (currently only the txid).
I actually plan to do this after the blockbeat is in, to limit the scope. The rough idea is: 1) refactor a bit to get rid of `outputIncubating`; 2) do a db migration to remove the `outputIncubating` (maybe not needed?); 3) completely remove the nursery! WDYT?
```go
//
// NOTE: Part of the ContractResolver interface.
func (c *commitSweepResolver) IsResolved() bool {
	return c.resolved
```
I think we can move this method to the contract kit, that way we hide the underlying atomic details.
done
```diff
@@ -204,7 +204,7 @@ func (h *htlcIncomingContestResolver) Resolve() (ContractResolver, error) {
 	log.Infof("%T(%v): HTLC has timed out (expiry=%v, height=%v), "+
 		"abandoning", h, h.htlcResolution.ClaimOutpoint,
 		h.htlcExpiry, currentHeight)
-	h.resolved = true
+	h.resolved.Store(true)
```
`MarkResolved`?
done
```diff
@@ -1294,13 +1294,13 @@ func (h *htlcTimeoutResolver) resolveTimeoutTxOutput(op wire.OutPoint) error {
 // Launch creates an input based on the details of the outgoing htlc resolution
 // and offers it to the sweeper.
 func (h *htlcTimeoutResolver) Launch() error {
-	if h.launched {
+	if h.launched.Load() {
```
Same here re read/write methods to encapsulate.
yeah I like it, done!
```go
	c.activeResolversLock.Lock()
	c.activeResolvers = resolvers
	c.activeResolversLock.Unlock()

	// Launch all resolvers.
	c.launchResolvers()
```
In a recent PR, we eliminated a circular waiting condition related to tap channels (in the litd
context). With this method now blocking to launch resolvers, we'll need to make sure we don't reintroduce such a livelock.
Do we know why it was blocking? In this version, `Launch` only sends the input to the sweeper, and the waiting for the spend is handled in `Resolve`.
contractcourt/channel_arbitrator.go (Outdated)

```diff
 	var reports []*ContractReport
-	for _, resolver := range c.activeResolvers {
+	for _, resolver := range c.resolvers() {
```
Where was the race? At a glance, everything looks to be mutex protected.
Hmm, looks like it no longer exists due to the fix of `resolved` and `launched`; now removed.
> So we split it into two steps, step one is

Yeah this part too - so these sub-consumers, like the chain watcher and channel arbitrator, are ephemeral and are not managed by the blockbeat dispatcher. Some previous discussion can be found here. Basically it's easier to let the
We now put the outpoint in the resolvers's logging so it's easier to debug. Also use the short channel ID whenever possible to shorten the log line length.
This commit adds a few helper methods to decide how the htlc output should be spent.
This commit is a pure refactor which moves the sweep handling logic into the new methods.
This commit refactors the `Resolve` method by adding two resolver handlers to handle waiting for spending confirmations.
This commit adds new methods to handle making sweep requests based on the spending path used by the outgoing htlc output.
This commit adds checkpoint methods in `htlcTimeoutResolver`, which are similar to those used in `htlcSuccessResolver`.
This commit adds more methods to handle resolving the spending of the output based on different spending paths.
We will use this and its following commits to break the original `Resolve` methods into two parts - the first part is moved to a new method `Launch`, which handles sending a sweep request to the sweeper. The second part remains in `Resolve`, which is mainly waiting for a spending tx. The breach resolver currently doesn't do anything in its `Launch` since the sweeping of justice outputs is not handled by the sweeper yet.
This commit breaks the `Resolve` into two parts - the first part is moved into a `Launch` method that handles sending sweep requests, and the second part remains in `Resolve`, which handles waiting for the spend. Since we are using both the utxo nursery and the sweeper at the same time, to make sure this change doesn't break the existing behavior, we implement `Launch` as follows:

- zero-fee htlc - handled by the sweeper
- direct output from the remote commit - handled by the sweeper
- legacy htlc - handled by the utxo nursery
When calling `NotifyExitHopHtlc` it is allowed to pass a chan to subscribe to the HTLC's resolution when it's settled. However, this method will also return immediately if there's already a resolution, which means it behaves like a notifier and a getter. If the caller decides to only use the getter to do a non-blocking lookup, it can pass a nil subscriber chan to bypass the notification.
A minor refactor is done to support implementing `Launch`.
This commit makes `resolved` an atomic bool to avoid a data race. This field is now defined in `contractResolverKit` to avoid code duplication.
In this commit, we break the old `launchResolvers` into two steps - step one is to launch the resolvers synchronously, and step two is to actually wait for the resolvers to be resolved. This is critical, as in the following commit we will require the resolvers to be launched at the same blockbeat when a force close event is sent by the chain watcher.
We need to offer the outgoing htlc one block earlier to make sure when the expiry height hits, the sweeper will not miss sweeping it in the same block. This also means the outgoing contest resolver now only does one thing - watch for preimage spend till height expiry-1, which can easily be moved into the timeout resolver instead in the future.
Depends on blockbeat #8894. NOTE: itest is fixed in the final PR.
Turns out mounting blockbeat in `ChannelArbitrator` can be quite challenging (unit tests, itest, etc). This PR attempts to implement it in hopefully the least disruptive way - only `chainWatcher` implements `Consumer`, and the contract resolvers are kept stateless (in terms of blocks). The main changes are:

- The `Resolve` method is broken into two steps: 1) `Launch` the resolver, which handles sending the sweep request, and 2) `Resolve` the resolver, which handles monitoring the spending of the output.
- `chainWatcher` implements `Consumer` in the following PR.

Alternatives
The original attempt was to make the resolvers subscribe to a blockbeat chan, as implemented in #8717. The second attempt was to make the resolvers also blockbeat `Consumer`s, as implemented here. This third approach is chosen as 1) it greatly limits the scope, otherwise a bigger refactor of the channel arbitrator may be needed, and 2) the resolvers can be made stateless in terms of blocks and fully managed by the channel arbitrator. In other words, when a new block arrives, the channel arbitrator decides whether to launch the resolvers or not, so the resolvers themselves don't need this block info.
In fact, there are only two resolvers that subscribe to blocks: the incoming contest resolver, which uses the block height to decide whether to give up resolving an expired incoming htlc; and the outgoing contest resolver, which uses the block height to choose to transform itself into a timeout resolver. IMO if we can remove the inheritance pattern used in `contest resolver -> timeout/success resolver` and manage to transform resolvers in the channel arbitrator, we can further remove those two block subscriptions. As for now, we can leave them there as they have little impact on the block consumption order enforced by the blockbeat.