-
Notifications
You must be signed in to change notification settings - Fork 130
Ibft notifies EthPeer when remote node has a better block #849
Conversation
There is a failure mode of IBFT whereby a validator fails to import a block, and also fails to receive the NewBlock message from its peers. This means said validator is unable to participate in subsequent rounds, and may cause the network to halt. To overcome this issue, if an IBFT validator receives messages from a future height, it will update the "BestEstimatedHeight" of the corresponding EthPeer object, such that the Synchroniser will (eventually) download the requisite blocks - thus allowing the IBFT network to continue to operate.
3bb8136
to
6663f3f
Compare
|
||
import tech.pegasys.pantheon.ethereum.p2p.api.PeerConnection; | ||
|
||
public interface BaseSynchronizerUpdater { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this a "base" synchronizer updater? Seems like this should be SynchronizerUpdater
and the current SynchronizerUpdater
should be EthSynchronizerUpdater
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was really hoping someone would help out with the naming.... :)
final long knownBlockNumber, final PeerConnection peerConnection) { | ||
final EthPeer ethPeer = ethPeers.peer(peerConnection); | ||
if (ethPeer == null) { | ||
LOG.warn("Received message from a peer with no corresponding EthPeer."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't log this at WARN level. It generally just means the peer disconnected between when the message was received and the processing got to this point. Debug is probably the right level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
} | ||
|
||
if (ethPeer.chainState().getEstimatedHeight() < knownBlockNumber) { | ||
ethPeer.chainState().updateHeightEstimate(knownBlockNumber); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the if condition here, updateHeightEstimate
already checks the new height is higher than the previous. And it can do it safely inside a synchronized
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@@ -88,7 +88,7 @@ public void updateForAnnouncedBlock( | |||
} | |||
} | |||
|
|||
private void updateHeightEstimate(final long blockNumber) { | |||
public void updateHeightEstimate(final long blockNumber) { | |||
estimatedHeightKnown = true; | |||
if (blockNumber > estimatedHeight) { | |||
estimatedHeight = blockNumber; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need a synchronized
block around the whole method content. Previously it was only called from other methods which already had synchronized
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
public class StubbedSynchronizerUpdater implements BaseSynchronizerUpdater { | ||
|
||
private Map<PeerConnection, ValidatorPeer> validatorNodes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What am I missing here, doesn't validatorNodes need initialising to avoid NPE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no NPE as the addNetworkPeers creates the map and assigns the created map to this variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
updater.updatePeerChainState(suppliedChainHeight, createAnonymousPeerConnection()); | ||
|
||
verify(chainState, times(0)).updateHeightEstimate(anyLong()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the anyLong()
be knownChainHeight
? ...unless any value is really acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given this is a verify (rather than a "thenReturns") - I'd like to make sure that this function wasn't invoked at all - regardless of the parameter.
@@ -125,6 +126,10 @@ public void commitForNonProposing(final ConsensusRoundIdentifier roundId, final | |||
nonProposingPeers.forEach(peer -> peer.injectCommit(roundId, hash)); | |||
} | |||
|
|||
public void forNonProposing(final Consumer<ValidatorPeer> assertion) { | |||
nonProposingPeers.forEach(peer -> assertion.accept(peer)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could just do nonProposingPeers.forEach(assertion);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
way better.nice.
public void addNetworkPeers(final Collection<ValidatorPeer> nodes) { | ||
validatorNodes = | ||
nodes.stream() | ||
.collect(Collectors.toMap(ValidatorPeer::getPeerConnection, validator -> validator)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead validator -> validator why just use Function.identity() not really any shorter but it's clearer I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
updater.updatePeerChainState(suppliedChainHeight, createAnonymousPeerConnection()); | ||
|
||
verify(chainState, times(0)).updateHeightEstimate(anyLong()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never() is nicer to use than times(0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
* Moving NodeWhitelistController to permissioning package * Refactoring unit tests * Fix string.format
* check the token if authentication enabled, regardless of user optional status * more tests
* o * add test * clean up * scaffolding * update * update * comments * add test * update * update * update
|
||
public class StubbedSynchronizerUpdater implements BaseSynchronizerUpdater { | ||
|
||
private Map<PeerConnection, ValidatorPeer> validatorNodes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
validatorNodes.get(peerConnection).updateEstimatedChainHeight(knownBlockNumber); | ||
} | ||
|
||
public void addNetworkPeers(final Collection<ValidatorPeer> nodes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this an add
operations or a set
operation? ...what happens when its invoked twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any thoughts? @rain-on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just saw this - create the map instantiation, then putAll in here. Thus will continue to grow as you call "add" - given its called once in the integration test framework setup, should be acceptable (i.e. doesn't require a "clear()" call atm.
@@ -248,5 +254,9 @@ private void addMessageToFutureMessageBuffer(final long chainHeight, final Messa | |||
futureMessages.put(chainHeight, Lists.newArrayList()); | |||
} | |||
futureMessages.get(chainHeight).add(rawMsg); | |||
|
|||
// Notify the synchroniser the transmitting peer must have the parent block to the received |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this would make more sense being in processMessage after addMessageToFutureMessageBuffer call instead of being in the addMessageToFutureMessageBuffer method as notifying the synchroniser is not really part of updating the future message buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had assumed that when the "smart" FutureMessageBuffer came to fruition then this would move across into that class, and it would be the FutureMessageHandler - meaning it would buffer and notify other people, thus the IbftController doesn't need to know about this stuff...
...sus/ibft/src/main/java/tech/pegasys/pantheon/consensus/ibft/statemachine/IbftController.java
Outdated
Show resolved
Hide resolved
Fix for the _“The input line is too long”_ error on Windows. This fix makes Gradle generate a shortened but operational command in `pantheon.bat`. The modified `pantheon.bat`, instead of explicitly referencing each .jar individually when initializing the classpath, uses the wildcard character `*` instead.
…difficulty, not the highest chain. (PegaSysEng#899)
--> adapt `BlockBroadcaster` to use `send` method on `EthPeer` --> adapt `EthProtocolMamnager` to utilize `BlockBroadcaster` to disseminate newly mined block
…sEng#912) * [PAN-2117] Error when removing bootnodes from nodes whitelist * Fixing AT
Gives approximately a 10% perf improvement in isolation and in real-world tests allows the world state requests to actually reach the concurrent limit.
* check nodes or accounts permissioning is enabled if config file is set
There is a failure mode of IBFT whereby a validator fails to import
a block, and also fails to receive the NewBlock message from its
peers. This means said validator is unable to participate in
subsequent rounds, and may cause the network to halt.
To overcome this issue, if an IBFT validator receives messages from
a future height, it will update the "BestEstimatedHeight" of the
corresponding EthPeer object, such that the Synchroniser will
(eventually) download the requisite blocks - thus allowing the
IBFT network to continue to operate.