Skip to content
This repository has been archived by the owner on Sep 26, 2019. It is now read-only.

Ibft notifies EthPeer when remote node has a better block #849

Merged
merged 55 commits into from
Feb 20, 2019

Conversation

rain-on
Copy link
Contributor

@rain-on rain-on commented Feb 13, 2019

There is a failure mode of IBFT whereby a validator fails to import
a block, and also fails to receive the NewBlock message from its
peers. This means said validator is unable to participate in
subsequent rounds, and may cause the network to halt.

To overcome this issue, if an IBFT validator receives messages from
a future height, it will update the "BestEstimatedHeight" of the
corresponding EthPeer object, such that the Synchroniser will
(eventually) download the requisite blocks - thus allowing the
IBFT network to continue to operate.

There is a failure mode of IBFT whereby a validator fails to import
a block, and also fails to receive the NewBlock message from its
peers. This means said validator is unable to participate in
subsequent rounds, and may cause the network to halt.

To overcome this issue, if an IBFT validator receives messages from
a future height, it will update the "BestEstimatedHeight" of the
corresponding EthPeer object, such that the Synchroniser will
(eventually) download the requisite blocks - thus allowing the
IBFT network to continue to operate.

import tech.pegasys.pantheon.ethereum.p2p.api.PeerConnection;

public interface BaseSynchronizerUpdater {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this a "base" synchronizer updater? Seems like this should be SynchronizerUpdater and the current SynchronizerUpdater should be EthSynchronizerUpdater.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was really hoping someone would help out with the naming.... :)

final long knownBlockNumber, final PeerConnection peerConnection) {
final EthPeer ethPeer = ethPeers.peer(peerConnection);
if (ethPeer == null) {
LOG.warn("Received message from a peer with no corresponding EthPeer.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't log this at WARN level. It generally just means the peer disconnected between when the message was received and the processing got to this point. Debug is probably the right level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

}

if (ethPeer.chainState().getEstimatedHeight() < knownBlockNumber) {
ethPeer.chainState().updateHeightEstimate(knownBlockNumber);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need the if condition here, updateHeightEstimate already checks the new height is higher than the previous. And it can do it safely inside a synchronized block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -88,7 +88,7 @@ public void updateForAnnouncedBlock(
}
}

private void updateHeightEstimate(final long blockNumber) {
public void updateHeightEstimate(final long blockNumber) {
estimatedHeightKnown = true;
if (blockNumber > estimatedHeight) {
estimatedHeight = blockNumber;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need a synchronized block around the whole method content. Previously it was only called from other methods which already had synchronized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


public class StubbedSynchronizerUpdater implements BaseSynchronizerUpdater {

private Map<PeerConnection, ValidatorPeer> validatorNodes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What am I missing here, doesn't validatorNodes need initialising to avoid NPE?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no NPE as the addNetworkPeers creates the map and assigns the created map to this variable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


updater.updatePeerChainState(suppliedChainHeight, createAnonymousPeerConnection());

verify(chainState, times(0)).updateHeightEstimate(anyLong());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the anyLong() be knownChainHeight? ...unless any value is really acceptable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this is a verify (rather than a "thenReturns") - I'd like to make sure that this function wasn't invoked at all - regardless of the parameter.

@@ -125,6 +126,10 @@ public void commitForNonProposing(final ConsensusRoundIdentifier roundId, final
nonProposingPeers.forEach(peer -> peer.injectCommit(roundId, hash));
}

public void forNonProposing(final Consumer<ValidatorPeer> assertion) {
nonProposingPeers.forEach(peer -> assertion.accept(peer));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could just do nonProposingPeers.forEach(assertion);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

way better.nice.

public void addNetworkPeers(final Collection<ValidatorPeer> nodes) {
validatorNodes =
nodes.stream()
.collect(Collectors.toMap(ValidatorPeer::getPeerConnection, validator -> validator));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead validator -> validator why just use Function.identity() not really any shorter but it's clearer I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


updater.updatePeerChainState(suppliedChainHeight, createAnonymousPeerConnection());

verify(chainState, times(0)).updateHeightEstimate(anyLong());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never() is nicer to use than times(0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.


public class StubbedSynchronizerUpdater implements BaseSynchronizerUpdater {

private Map<PeerConnection, ValidatorPeer> validatorNodes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

validatorNodes.get(peerConnection).updateEstimatedChainHeight(knownBlockNumber);
}

public void addNetworkPeers(final Collection<ValidatorPeer> nodes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an add operations or a set operation? ...what happens when its invoked twice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any thoughts? @rain-on

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just saw this - create the map instantiation, then putAll in here. Thus will continue to grow as you call "add" - given its called once in the integration test framework setup, should be acceptable (i.e. doesn't require a "clear()" call atm.

@@ -248,5 +254,9 @@ private void addMessageToFutureMessageBuffer(final long chainHeight, final Messa
futureMessages.put(chainHeight, Lists.newArrayList());
}
futureMessages.get(chainHeight).add(rawMsg);

// Notify the synchroniser the transmitting peer must have the parent block to the received
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think this would make more sense being in processMessage after addMessageToFutureMessageBuffer call instead of being in the addMessageToFutureMessageBuffer method as notifying the synchroniser is not really part of updating the future message buffer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had assumed that when the "smart" FutureMessageBuffer came to fruition then this would move across into that class, and it would be the FutureMessageHandler - meaning it would buffer and notify other people, thus the IbftController doesn't need to know about this stuff...

Guillaume Lethuillier and others added 23 commits February 20, 2019 11:29
Fix for the _“The input line is too long”_ error on Windows.

This fix makes Gradle generate a shortened but operational command in `pantheon.bat`.

The modified `pantheon.bat`, instead of explicitly referencing each .jar individually when initializing the classpath, uses the wildcard character `*` instead.
--> adapt `BlockBroadcaster` to use `send` method on `EthPeer`

--> adapt `EthProtocolMamnager` to utilize `BlockBroadcaster` to disseminate newly mined block
…sEng#912)

* [PAN-2117] Error when removing bootnodes from nodes whitelist

* Fixing AT
Gives approximately a 10% perf improvement in isolation and in real-world tests allows the world state requests to actually reach the concurrent limit.
* check nodes or accounts permissioning is enabled if config file is set
@rain-on rain-on merged commit 8b5fada into PegaSysEng:master Feb 20, 2019
@rain-on rain-on deleted the eth_peer_best_block branch February 20, 2019 06:15
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.