Change to at least 1/3 consensus and 13 node address book (#489)
* Change to at least 1/3 consensus and 13 node address book

Signed-off-by: Steven Sheehy <[email protected]>

* Ensure new addressbook is copied during deployment

Signed-off-by: Steven Sheehy <[email protected]>
steven-sheehy authored Jan 8, 2020
1 parent 84eb218 commit 79da58e
Showing 11 changed files with 120 additions and 72 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -36,15 +36,15 @@ The beta mirror node works as follows:
- The signature and record files are then uploaded from the nodes to Amazon S3 and Google File Storage.

- This mirror node software downloads signature files from either S3 or Google File Storage.
- The signature files are validated to ensure more than 2/3 of the nodes in the address book (stored in a `0.0.102` file) have the same signature.
- The signature files are validated to ensure at least 1/3 of the nodes in the address book (stored in a `0.0.102` file) have the same signature.
- For each valid signature file, the corresponding record file is then downloaded from the cloud.
- Record files can then be processed and transactions and records processed for long term storage.

- Event files are handled in exactly the same manner.

- In addition, nodes regularly generate a balance file which contains the list of Hedera accounts and their corresponding balance which is also uploaded to S3 and Google File Storage.
- The files are also signed by the nodes.
- This mirror node software can download the balance files, validate 2/3rd of nodes have signed and then process the balance files for long term storage.
- This mirror node software can download the balance files, validate at least 1/3 of nodes have signed and then process the balance files for long term storage.

## Getting Started

72 changes: 36 additions & 36 deletions docs/troubleshooting.md
@@ -109,7 +109,7 @@ Following is list of error messages and how to begin handling issues when they a
- There is no immediate fix. Bring to team's attention immediately (during reasonable hours, otherwise
next morning).

- `File could not be verified by more than 2/3 of nodes`
- `File could not be verified by at least 1/3 of nodes`
This can happen if

1. Some mainnet nodes are still in the process of uploading their signatures for the latest file (benign case).
@@ -248,41 +248,41 @@ Response: Requires human action at some point.

#### Log-based alerts

| Log Message | Default Priority | Conditional Priority |
| ------------------------------------------------------------------------ | ---------------- | ------------------------------- |
| `Error parsing record file` | HIGH | |
| `Error starting watch service` | HIGH | |
| `ERRORS processing account balances file` | HIGH | |
| `Expecting previous file hash, but found file delimiter` | HIGH | |
| `Failed to parse NodeAddressBook from` | HIGH | |
| `Hash mismatch for file` | HIGH | |
| `Long overflow when converting time to nanos timestamp` | HIGH | |
| `Previous file hash not available` | HIGH | |
| `Unable to copy address book from` | HIGH | |
| `Unable to extract hash and signature from file` | HIGH | |
| `Unable to guess correct transaction type since there's not exactly one` | HIGH | |
| `Unknown file delimiter` | HIGH | |
| `Unknown record file delimiter` | HIGH | |
| `Error processing balances files after` | MEDIUM | |
| `Exception processing account balances file` | MEDIUM | |
| `Encountered unknown transaction type` | LOW | HIGH (if 10 entries over 10 min |
| `Error closing connection` | LOW | HIGH (if 10 entries over 10 min |
| `Account balance dataset timestamp mismatch!` | LOW | |
| `Error decoding hex string` | LOW | |
| `Error reading previous file hash` | LOW | |
| `Failed to verify` | LOW | |
| `Input parameter is not a folder` | LOW | |
| `Failed to verify signature with public key` | LOW | |
| `Missing signature for file` | LOW | |
| `Error saving file in database` | NONE | HIGH (if 30 entries in 1 min) |
| `Failed downloading` | NONE | HIGH (if 30 entries in 1 min) |
| `File could not be verified by more than 2/3 of nodes` | NONE | HIGH (if 30 entries in 1 min) |
| `File watching events may have been lost or discarded` | NONE | HIGH (if 30 entries in 1 min) |
| `Signature verification failed` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to connect to database` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to fetch entity types` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to prepare SQL statements` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to set connection to not auto commit` | NONE | HIGH (if 30 entries in 1 min) |
| Log Message | Default Priority | Conditional Priority |
| ------------------------------------------------------------------------------------------- | ---------------- | ------------------------------- |
| `Error parsing record file` | HIGH | |
| `Error starting watch service` | HIGH | |
| `ERRORS processing account balances file` | HIGH | |
| `Expecting previous file hash, but found file delimiter` | HIGH | |
| `Failed to parse NodeAddressBook from` | HIGH | |
| `Hash mismatch for file` | HIGH | |
| `Long overflow when converting time to nanos timestamp` | HIGH | |
| `Previous file hash not available` | HIGH | |
| `Unable to copy address book from` | HIGH | |
| `Unable to extract hash and signature from file` | HIGH | |
| `Unable to guess correct transaction type since there's not exactly one` | HIGH | |
| `Unknown file delimiter` | HIGH | |
| `Unknown record file delimiter` | HIGH | |
| `Error processing balances files after` | MEDIUM | |
| `Exception processing account balances file` | MEDIUM | |
| `Encountered unknown transaction type` | LOW | HIGH (if 10 entries over 10 min) |
| `Error closing connection` | LOW | HIGH (if 10 entries over 10 min) |
| `Account balance dataset timestamp mismatch!` | LOW | |
| `Error decoding hex string` | LOW | |
| `Error reading previous file hash` | LOW | |
| `Failed to verify` | LOW | |
| `Input parameter is not a folder` | LOW | |
| `Failed to verify signature with public key` | LOW | |
| `Missing signature for file` | LOW | |
| `Error saving file in database` | NONE | HIGH (if 30 entries in 1 min) |
| `Failed downloading` | NONE | HIGH (if 30 entries in 1 min) |
| `File could not be verified by at least 1/3 of nodes` | NONE | HIGH (if 30 entries in 1 min) |
| `File watching events may have been lost or discarded` | NONE | HIGH (if 30 entries in 1 min) |
| `Signature verification failed` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to connect to database` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to fetch entity types` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to prepare SQL statements` | NONE | HIGH (if 30 entries in 1 min) |
| `Unable to set connection to not auto commit` | NONE | HIGH (if 30 entries in 1 min) |

Anything that wakes up a human in the middle of the night should be immediately actionable. For all `HIGH` priority
alerts, there should be a section in the guide above listing immediate actionable steps someone can take to reduce
2 changes: 1 addition & 1 deletion hedera-mirror-grpc/pom.xml
@@ -49,7 +49,7 @@
<dependency>
<groupId>net.devh</groupId>
<artifactId>grpc-spring-boot-starter</artifactId>
<version>2.6.1.RELEASE</version>
<version>2.6.2.RELEASE</version>
<exclusions>
<exclusion>
<groupId>org.springframework.boot</groupId>
3 changes: 3 additions & 0 deletions hedera-mirror-importer/scripts/deploy.sh
@@ -35,6 +35,9 @@ if [[ -f "/usr/lib/mirror-node/mirror-node.jar" || -f "${usrlib}/${name}.jar" ]]
fi
fi

# One time removal of address book since it changed from 10 to 13 nodes. Remove in next release
rm -f "${varlib}/addressbook.bin"

if [[ ! -f "${usretc}/application.yml" ]]; then
echo "Fresh install of ${jarname}"
read -p "Bucket name: " bucketName
@@ -47,6 +47,6 @@ public enum SignatureStatus {
DOWNLOADED, // Signature has been downloaded but not verified
PARSED, // Extracted hash and signature data from file
VERIFIED, // Signature has been verified against the node's public key
CONSENSUS_REACHED; // More than 2/3 of all nodes have been verified
CONSENSUS_REACHED // At least 1/3 of all nodes have been verified
}
}
@@ -40,6 +40,7 @@
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import org.apache.commons.lang3.StringUtils;
import org.apache.logging.log4j.LogManager;
@@ -152,7 +153,7 @@ private Multimap<String, FileStreamSignature> downloadSigFiles() throws Interrup
.maxKeys(listSize)
.build();
CompletableFuture<ListObjectsResponse> response = s3Client.listObjects(listRequest);
var pendingDownloads = new ArrayList<PendingDownload>(downloaderProperties.getBatchSize());
Collection<PendingDownload> pendingDownloads = new ArrayList<>(downloaderProperties.getBatchSize());
// Loop through the list of remote files beginning a download for each relevant sig file
// Note:
// lastValidSigFileName specified as marker above is not returned in these results by AWS S3.
@@ -171,13 +172,11 @@ private Multimap<String, FileStreamSignature> downloadSigFiles() throws Interrup
* With the list of pending downloads - wait for them to complete and add them to the list
* of downloaded signature files.
*/
var ref = new Object() {
int count = 0;
};
AtomicLong count = new AtomicLong();
pendingDownloads.forEach((pd) -> {
try {
if (pd.waitForCompletion()) {
ref.count++;
count.incrementAndGet();
File sigFile = pd.getFile();
FileStreamSignature fileStreamSignature = new FileStreamSignature();
fileStreamSignature.setFile(sigFile);
@@ -188,8 +187,8 @@ private Multimap<String, FileStreamSignature> downloadSigFiles() throws Interrup
log.warn("Failed downloading {} in {}", pd.getS3key(), pd.getStopwatch(), ex);
}
});
if (ref.count > 0) {
log.info("Downloaded {} signatures for node {} in {}", ref.count, nodeAccountId, stopwatch);
if (count.get() > 0) {
log.info("Downloaded {} signatures for node {} in {}", count.get(), nodeAccountId, stopwatch);
}
} catch (Exception e) {
log.error("Error downloading signature files for node {} after {}", nodeAccountId, stopwatch, e);
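
The counter change in the hunk above replaces a mutable field on an anonymous object with an `AtomicLong`. A lambda can update it because the reference itself stays effectively final. The following is a minimal, self-contained sketch of that pattern, not the project's code: `CounterSketch`, `countCompleted`, and the `Supplier<Boolean>` stand-in for `PendingDownload.waitForCompletion()` are hypothetical names used only for illustration.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

final class CounterSketch {
    // Counts how many tasks report success, mirroring the forEach loop in the diff.
    // Each Supplier<Boolean> is a hypothetical stand-in for a pending download.
    static long countCompleted(List<Supplier<Boolean>> pending) {
        AtomicLong count = new AtomicLong();
        pending.forEach(task -> {
            if (task.get()) {
                // Allowed inside the lambda: only the object's state changes,
                // the 'count' reference is never reassigned.
                count.incrementAndGet();
            }
        });
        return count.get();
    }

    public static void main(String[] args) {
        List<Supplier<Boolean>> pending = List.of(() -> true, () -> false, () -> true);
        System.out.println("Completed: " + countCompleted(pending));
    }
}
```

A plain local `int` could not be incremented from the lambda, which is why the original code needed the anonymous-object workaround.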
@@ -262,12 +261,12 @@ private boolean moveFile(File sourceFile, File destinationFile) {
}

/**
* For each group of signature Files with the same file name: (1) verify that the signature files are signed by
* corresponding node's PublicKey; (2) For valid signature files, we compare their Hashes to see if more than 2/3
* Hashes matches. If more than 2/3 Hashes matches, we download the corresponding data file from a node folder which
* has valid signature file. (3) compare the Hash of data file with Hash which has been agreed on by valid
* signatures, if match, move the data file into `valid` directory; else download the data file from other valid
* node folder, and compare the Hash until find a match one
* For each group of signature files with the same file name: (1) verify that the signature files are signed by
* corresponding node's PublicKey; (2) For valid signature files, we compare their Hashes to see if at least 1/3 of
* hashes match. If they do, we download the corresponding data file from a node folder which has valid signature
* file. (3) compare the hash of data file with Hash which has been agreed on by valid signatures, if match, move
* the data file into `valid` directory; else download the data file from other valid node folder and compare the
* hash until we find a match.
*
* @param sigFilesMap
*/
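
Step (2) of the Javadoc above, comparing hashes across verified signatures until at least 1/3 of nodes agree, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the downloader's actual logic: `HashConsensusSketch`, `findConsensusHash`, and the plain `Map<String, String>` of node ID to hash are hypothetical, and hashes are modelled as strings for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class HashConsensusSketch {
    // Returns a hash reported by at least ceil(totalNodes / 3) verified nodes, if any.
    // hashByNode maps a node ID to the hash taken from its verified signature file.
    static Optional<String> findConsensusHash(Map<String, String> hashByNode, int totalNodes) {
        long threshold = (long) Math.ceil(totalNodes / 3.0);
        Map<String, Long> votes = new HashMap<>();
        for (String hash : hashByNode.values()) {
            // merge returns the updated vote count for this hash
            long seen = votes.merge(hash, 1L, Long::sum);
            if (seen >= threshold) {
                return Optional.of(hash);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> hashByNode = new HashMap<>();
        for (int node = 3; node <= 7; node++) {
            hashByNode.put("0.0." + node, "hash-a"); // five nodes agree
        }
        hashByNode.put("0.0.8", "hash-b");
        System.out.println(findConsensusHash(hashByNode, 13));
    }
}
```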
@@ -299,10 +298,8 @@ private void verifySigsAndDownloadDataFiles(Multimap<String, FileStreamSignature
File signedDataFile = downloadSignedDataFile(signature.getFile());
if (signedDataFile != null && Utility.hashMatch(signature.getHash(), signedDataFile)) {
log.debug("Downloaded data file {} corresponding to verified hash", signedDataFile.getName());
// Check that file is newer than last valid downloaded file.
// Additionally, if the file type uses prevFileHash based linking, verify that new file is
// next in
// the sequence.
// Check that file is newer than last valid downloaded file. Additionally, if the file type
// uses prevFileHash based linking, verify that new file is next in the sequence.
if (verifyHashChain(signedDataFile)) {
// move the file to the valid directory
File destination = validPath.resolve(signedDataFile.getName()).toFile();
@@ -328,7 +325,7 @@ private void verifySigsAndDownloadDataFiles(Multimap<String, FileStreamSignature
}

if (!valid) {
log.error("File could not be verified by more than 2/3 of nodes: {}", sigFileName);
log.error("File could not be verified by at least 1/3 of nodes: {}", sigFileName);
}
}
}
@@ -52,14 +52,14 @@ public NodeSignatureVerifier(NetworkAddressBook networkAddressBook) {
.collect(Collectors.toMap(NodeAddress::getId, NodeAddress::getPublicKeyAsObject));
}

private static boolean consensusReached(long n, long N) {
return n > N * 2 / 3.0;
private static boolean consensusReached(long actualNodes, long expectedNodes) {
return actualNodes >= Math.ceil(expectedNodes / 3.0);
}

/**
* Verifies that the signature files are signed by corresponding node's PublicKey. For valid signature files, we
* compare their hashes to see if more than 2/3 Hashes match. If a signature is valid, we put the hash in its
* content and its file to the map, to see if more than 2/3 valid signatures have the same hash.
* compare their hashes to see if at least 1/3 hashes match. If a signature is valid, we put the hash in its content
* and its file to the map, to see if at least 1/3 valid signatures have the same hash.
*
* @param signatures a list of a sig files which have the same timestamp
* @throws SignatureVerificationException
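
To make the effect of the `consensusReached` change concrete, the sketch below places the old and new expressions from this hunk side by side. The two expressions are copied from the diff; the class name `ConsensusThresholdSketch` and the method names are hypothetical. For the 13-node address book this commit introduces, the old rule needs 9 matching nodes (more than 8.67) and the new rule needs 5 (the ceiling of 4.33).

```java
final class ConsensusThresholdSketch {
    // Old rule from the diff: strictly more than 2/3 of the expected nodes.
    static boolean moreThanTwoThirds(long actualNodes, long expectedNodes) {
        return actualNodes > expectedNodes * 2 / 3.0;
    }

    // New rule from the diff: at least 1/3 of the expected nodes, rounded up.
    static boolean atLeastOneThird(long actualNodes, long expectedNodes) {
        return actualNodes >= Math.ceil(expectedNodes / 3.0);
    }

    public static void main(String[] args) {
        long expectedNodes = 13;
        for (long actualNodes = 4; actualNodes <= 9; actualNodes++) {
            System.out.printf("%d of %d: old=%b new=%b%n",
                    actualNodes, expectedNodes,
                    moreThanTwoThirds(actualNodes, expectedNodes),
                    atLeastOneThird(actualNodes, expectedNodes));
        }
    }
}
```

Dividing by `3.0` rather than `3` keeps both comparisons in floating point, so the thresholds are not truncated by integer division.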