Fix transaction filtering in hbbft handler #72
This reverts commit 327a661.
The CI failure should go away once the core PR is merged.
@@ -79,13 +79,13 @@ handle_command({next_round, NextRound, TxnsToRemove, _Sync}, State=#state{hbbft=
N when N > 0 ->
Ledger0 = blockchain:ledger(Chain),
Ledger1 = blockchain_ledger_v1:new_context(blockchain_ledger_v1:delete_context(Ledger0)),

NewChain = blockchain:ledger(Ledger1, Chain),
I'd like some comments explaining the why of each of these operations
Will do.
ok == blockchain_txn:absorb(Txn, Chain)
case blockchain_txn:is_valid(Txn, Chain) of
ok ->
case blockchain_txn:absorb(Txn, Chain) of
Why no need for a new chain here?
The ledger here has already had its context refreshed by the next_round event; we're just re-adding the pending transactions to the context (if they're still valid).
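The re-adding step described above can be sketched as a filter over the pending buffer. This is only an illustration under stated assumptions, not the handler's actual code: the module name `txn_filter_sketch`, the function `filter_pending/3`, and the `IsValid`/`Absorb` funs are hypothetical stand-ins for `blockchain_txn:is_valid(Txn, Chain)` and `blockchain_txn:absorb(Txn, Chain)` from the diff.

```erlang
-module(txn_filter_sketch).
-export([filter_pending/3]).

%% Keep only the buffered transactions that still validate and absorb
%% against the refreshed ledger context.
%% IsValid and Absorb are hypothetical one-argument funs standing in for
%% blockchain_txn:is_valid/2 and blockchain_txn:absorb/2 with the chain
%% already bound.
filter_pending(Txns, IsValid, Absorb) ->
    lists:filter(
      fun(Txn) ->
              case IsValid(Txn) of
                  ok ->
                      %% Absorbing applies the transaction to the context,
                      %% so later transactions are checked against the
                      %% updated state.
                      case Absorb(Txn) of
                          ok -> true;
                          _ -> false
                      end;
                  _ ->
                      %% Invalid transactions are dropped without being absorbed.
                      false
              end
      end, Txns).
```

For example, `txn_filter_sketch:filter_pending([a, b, c], fun(b) -> {error, invalid}; (_) -> ok end, fun(_) -> ok end)` keeps `a` and `c` and drops `b`.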
Group = ct_rpc:call(Candidate, gen_server, call, [miner, consensus_group, infinity]),
false = Group == undefined,
ok = libp2p_group_relcast:handle_command(Group, SignedTxn2),
ct_rpc:call(Candidate, sys, suspend, [Group]),
Have you seen any evidence that this is racy? I guess it'd be hard to fix if so.
It shouldn't be; it should only race if the block timer fires between these two calls.
@@ -129,6 +129,73 @@ single_payment_test(Config) ->
4000 = PayerBalance + Fee,
6000 = PayeeBalance,

%% put the transaction into and then suspend one of the consensus group members
Txn2 = ct_rpc:call(Payer, blockchain_txn_payment_v1, new, [PayerAddr, PayeeAddr, 1000, Fee, 2]),
How does this fail with just the absorb?
The bug only triggers when there are transactions in the buffer that did not appear in the block. This is hard to provoke without adding latency, so instead I put the transaction in the buffer of one hbbft worker, suspend it, make some blocks using the remaining consensus nodes, and then resume the hbbft worker and check whether the transaction gets incorrectly filtered.
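The suspend/resume mechanics this test relies on can be shown with a plain gen_server. This is a minimal sketch, not the test from this PR: the module `suspend_sketch`, its `{submit, Txn}` cast, and its `buffer` call are hypothetical, and the process only stands in for an hbbft worker. It shows that a suspended process keeps its state and leaves ordinary messages queued until `sys:resume/1` is called.

```erlang
-module(suspend_sketch).
-behaviour(gen_server).
-export([demo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% The state is a list of buffered transactions, newest first.
init([]) -> {ok, []}.

handle_call(buffer, _From, Buf) -> {reply, lists:reverse(Buf), Buf}.

handle_cast({submit, Txn}, Buf) -> {noreply, [Txn | Buf]}.

demo() ->
    {ok, Pid} = gen_server:start_link(?MODULE, [], []),
    %% Put a transaction into the worker's buffer, then suspend it.
    ok = gen_server:cast(Pid, {submit, txn1}),
    [txn1] = gen_server:call(Pid, buffer),
    ok = sys:suspend(Pid),
    %% Ordinary messages sent while suspended stay in the mailbox.
    ok = gen_server:cast(Pid, {submit, txn2}),
    %% System messages are still served, so the state can be inspected.
    [txn1] = sys:get_state(Pid),
    %% After resuming, the queued cast is handled before the later call.
    ok = sys:resume(Pid),
    [txn1, txn2] = gen_server:call(Pid, buffer),
    ok = gen_server:stop(Pid),
    ok.
```

In the real test the same `sys:suspend` call is made remotely through `ct_rpc:call/4` against the consensus group process, as the diff above shows.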
This branch ended up being a bit exploratory, but it exposed some failures in how we were handling transactions. The actual fix is line 84 of hbbft handler, but along the way we improved some other things:
This depends on the corresponding blockchain-core PR helium/blockchain-core#94