diff --git a/_docs/README.md b/_docs/README.md
index 62eba7e..5d53247 100644
--- a/_docs/README.md
+++ b/_docs/README.md
@@ -8,34 +8,34 @@ The architectural diagram
 
 The node consists of loosely coupled components with well defined "edges" -- protocols that are used between these components.
 
-Its a reminiscence of [microservices architecture](https://en.wikipedia.org/wiki/Microservices), where each component has clearly defined reponsibilities and interface. Implementation might vary. In case of Erigon, we use gRPC/protobuf definitions, that allows the components to be written in different languages.
+It's a reminiscence of [microservices architecture](https://en.wikipedia.org/wiki/Microservices), where each component has clearly defined responsibilities and interface. Implementation might vary. In case of Erigon, we use gRPC/protobuf definitions, that allow the components to be written in different languages.
 
 In our experience, each p2p blockchain node has more or less these components, even when those aren't explicitly set up. In that case we have a highly coupled system of the same components but with more resistance to changes.
 
 ## Advantages of loosely coupled architecture
 
-* Less dependencies between components -- less side-effects of chaging one component is on another.
+* Less dependencies between components -- less side-effects of changing one component is on another.
 
-* Team scalability -- with well specified components, its easy to make sub-teams that work on each component with less coordination overhead. Most cross-team communication is around the interface definition and interpretation.
+* Team scalability -- with well specified components, it's easy to make sub-teams that work on each component with less coordination overhead. Most cross-team communication is around the interface definition and interpretation.
 
 * Learning curve reduction -- it is not that easy to find a full-fledged blockchain node developer, but narrowing down the area of responsiblities, makes it easier to both find candidates and coach/mentor the right skillset for them.
 
-* Innovation and improvements of each layer independently -- for specialized teams for each sub-component, its easier to find some more improvements or optimizations or innovative approaches than in a team that has to keep everything about the node in the head.
+* Innovation and improvements of each layer independently -- for specialized teams for each sub-component, it's easier to find some more improvements or optimizations or innovative approaches than in a team that has to keep everything about the node in the head.
 
-## Designing for upgradeabilty
+## Designing for upgradeability
 
 One important part of the design of a node is to make sure that we leave ourselves a room to upgrade it in a simple way. That means a couple of things:
 
-- protocols for each components should be versioned, to make sure that we can't run inconsistent versions together. [semver](https://semver.org) is a better approach there because it allows to parse even future versions and figure out how compatible they are based on a simple convention;
+- protocols for each component should be versioned, to make sure that we can't run inconsistent versions together. [semver](https://semver.org) is a better approach there because it allows to parse even future versions and figure out how compatible they are based on a simple convention;
 
-- trying to keep compatiblity as much as possible, unless there is a very good reason to break it, we will try to keep it. In practice that means:
+- trying to keep compatibility as much as possible, unless there is a very good reason to break it, we will try to keep it. In practice that means:
   - adding new APIs is safe;
   - adding new parameters is safe, taking into account that we can always support them missing and revert to the old behaviour;
   - renaming parameters and methods considered harmful;
-  - removing paramters and methods considered harmful;
+  - removing parameters and methods considered harmful;
   - radically changing the behaviour of the method w/o any changes to the protocol considered harmful;
 
-Tools for automatic checks about compabilitity are available for Protobuf: https://github.com/bufbuild/buf
+Tools for automatic checks about compatibility are available for Protobuf: https://github.com/bufbuild/buf
 
 ## Implementation variants
 ### Microservices
@@ -134,5 +134,5 @@ Erigon has the following interface for the consensus engine:
 
 ## 6. Downloader
 
-Downloader component abstracts away the functionality of deliverying some parts of the database using "out of band" protocols like BitTorrent,
+Downloader component abstracts away the functionality of delivering some parts of the database using "out of band" protocols like BitTorrent,
 IPFS, Swarm and others.
diff --git a/_docs/staged-sync.md b/_docs/staged-sync.md
index 88a9874..52205cf 100644
--- a/_docs/staged-sync.md
+++ b/_docs/staged-sync.md
@@ -25,7 +25,7 @@ Only ID and progress functions are required.
 
 Both progress and unwind functions can have side-effects. In practice, usually only progress do (downloader interaction).
 
-Each function (progress, unwind, prune) have **input** DB buckets and **output** DB buckets. That allows to build a dependency graph and run them in order.
+Each function (progress, unwind, prune) has **input** DB buckets and **output** DB buckets. That allows to build a dependency graph and run them in order.
 
 ![](./stages-ordering.png)
 
@@ -51,7 +51,7 @@ That allows to group similar operations together and optimize each stage for thr
 
 That also allows DB inserts optimisations, see next part.
 
-### ETL and optimial DB inserts
+### ETL and optimal DB inserts
 
 ![](./stages-etl.png)
 
@@ -61,7 +61,7 @@ That all is called **write amplification**. The more random stuff you insert int
 
 Luckily, if we insert keys in a sorted order, this effect is not there, we fill pages one by one.
 
-That is where our ETL framework comes to the rescue. When batch processing data, instead of wrting it directly to a database, we first extract it to a temp folder (could be in ram if fits). When extraction happens, we generate the keys for insertion. Then, we load data from these data files in a sorted manner using a heap. That way, the keys are always inserted sorted.
+That is where our ETL framework comes to the rescue. When batch processing data, instead of writing it directly to a database, we first extract it to a temp folder (could be in ram if fits). When extraction happens, we generate the keys for insertion. Then, we load data from these data files in a sorted manner using a heap. That way, the keys are always inserted sorted.
 
 This approach also allows us to avoid overwrites in certain scenarios, because we can specify the right strategy on loading data: do we want to keep only the latest data, convert it into a list or anything else.
 
diff --git a/remote/kv.proto b/remote/kv.proto
index eebbe9a..ffeef00 100644
--- a/remote/kv.proto
+++ b/remote/kv.proto
@@ -11,7 +11,7 @@ option go_package = "./remote;remote";
 //Variables Naming:
 // ts - TimeStamp
 // tx - Database Transaction
-// txn - Ethereum Transaction (and TxNum - is also number of Etherum Transaction)
+// txn - Ethereum Transaction (and TxNum - is also number of Ethereum Transaction)
 // RoTx - Read-Only Database Transaction
 // RwTx - Read-Write Database Transaction
 // k - key
@@ -102,14 +102,14 @@ message Pair {
   bytes v = 2;
   uint32 cursor_id = 3; // send once after new cursor open
   uint64 view_id = 4; // return once after tx open. mdbx's tx.ViewID() - id of write transaction in db
-  uint64 tx_id = 5; // return once after tx open. internal identifier - use it in other methods - to achieve consistant DB view (to read data from same DB tx on server).
+  uint64 tx_id = 5; // return once after tx open. internal identifier - use it in other methods - to achieve consistent DB view (to read data from same DB tx on server).
 }
 
 enum Action {
   STORAGE = 0; // Change only in the storage
   UPSERT = 1; // Change of balance or nonce (and optionally storage)
   CODE = 2; // Change of code (and optionally storage)
-  UPSERT_CODE = 3; // Change in (balance or nonce) and code (and optinally storage)
+  UPSERT_CODE = 3; // Change in (balance or nonce) and code (and optionally storage)
   REMOVE = 4; // Account is deleted
 }
 
@@ -167,7 +167,7 @@ message SnapshotsReply {
 message RangeReq {
   uint64 tx_id = 1; // returned by .Tx()
 
-  // It's ok to query wide/unlilmited range of data, server will use `pagination params`
+  // It's ok to query wide/unlimited range of data, server will use `pagination params`
   // reply by limited batches/pages and client can decide: request next page or not
 
   // query params
diff --git a/txpool/mining.proto b/txpool/mining.proto
index 9e3286d..a31d3b7 100644
--- a/txpool/mining.proto
+++ b/txpool/mining.proto
@@ -98,6 +98,6 @@ service Mining {
   // HashRate returns the current hashrate for local CPU miner and remote miner.
   rpc HashRate(HashRateRequest) returns (HashRateReply);
 
-  // Mining returns an indication if this node is currently mining and it's mining configuration
+  // Mining returns an indication if this node is currently mining and its mining configuration
   rpc Mining(MiningRequest) returns (MiningReply);
 }