GossipSub: disabling unused From/Seqno/Signature #1981
In case of the golang implementation, this would look like:
This can be done without changing consensus itself, but it still affects the network enough that we should take extra care, allow enough time to update, and iterate on the plan before committing to it. For now, after some initial feedback, it is just an unused option in golang pubsub. Clients can help by:
Tracking table:
|
Here's Nimbus (nim-libp2p) current behavior:
|
@dryajov, sorry, maybe I should have added more questions. When you say "optional, disabled for eth2" for the signature, does that mean you reject messages with a signature, or just ignore the signature? |
It means that we don't require the signature to be present, and if it is present we do not verify it for eth2. However, nim-libp2p allows this to be configured, so one can enable it if desired. |
js-libp2p-gossipsub+lodestar behavior:
|
Teku:
|
@jrhea thanks for looking into the data! Those may be from the validators that got slashed and were publishing double attestations. There's a chance you produce the same vote (not slashable, by luck) when running duplicate nodes, resulting in two messages with the same contents but different seqno. Can you show the distribution of seqno for those messages? We can narrow down which clients are affected based on the seqno range (prysm uses the initial nanosecond time, teku inits with a random value and keeps incrementing, others are always random). |
Most of ours looks correct. On altona we set a 0 multihash/PeerId as the author rather than randomizing it. We don't modify the sequence number when propagating. |
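To make the seqno-range idea above concrete, here is a minimal Python sketch of the three seeding strategies mentioned in the comment, plus a rough heuristic for spotting clock-seeded values. All function names, the 64-bit width, and the 30-day window are assumptions made for illustration; none of this is taken from any client's code.

```python
import itertools
import os
import time

THIRTY_DAYS_NS = 30 * 24 * 60 * 60 * 10**9  # arbitrary window, assumption


def seqno_from_clock():
    """Counter seeded with the current time in nanoseconds, then incremented."""
    counter = itertools.count(time.time_ns())
    return lambda: next(counter) % 2**64


def seqno_from_random_seed():
    """Counter seeded with a random 64-bit value, then incremented."""
    counter = itertools.count(int.from_bytes(os.urandom(8), "big"))
    return lambda: next(counter) % 2**64


def seqno_always_random():
    """Fresh random 64-bit value for every message."""
    return lambda: int.from_bytes(os.urandom(8), "big")


def looks_clock_seeded(seqno, observed_at_ns, max_age_ns=THIRTY_DAYS_NS):
    """Hypothetical heuristic: a clock-seeded seqno sits shortly before the
    nanosecond timestamp at which the message was observed."""
    return observed_at_ns - max_age_ns <= seqno <= observed_at_ns
```

A random 64-bit value only rarely lands inside such a narrow window, so seqnos that pass the check would point towards a clock-seeded client, while the two random strategies can only be told apart by whether consecutive messages from one peer increment.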
This dataset only contains blocks
Sure can...I had something like that in the works, but didn't get a chance to circle back to it today. Will take a look tonight. |
Whoops, my bad. The same can still happen with double block proposals though.
Awesome, thanks |
Creating a compelling visual of the distribution of seqnos by client was non-trivial (or I am just too tired). Instead, I have a visual of the distribution of message ids with multiple seqnos by client. Btw, I have issues staying connected with Teku 😶, which is why it is not in the data.

To orient you, this is what the data looks like in a table after some massaging:

Here is the same data visualized with a heatmap:

At first, it looks like Lighthouse is the main culprit, but let's see what fraction of total messages were received by each client:

The distribution isn't exactly uniform, so I normalized it and attempted the heatmap visualization again:

This normalized view of the data doesn't provide overwhelming evidence of anything, but it does indicate that Nimbus might be the culprit (I wouldn't exactly bet my life on it though).

One more interesting fact: there isn't a single message id with multiple sequence numbers that wasn't delivered by Nimbus at least once. Remember the table I posted at the beginning of this comment? None of the rows have a 0 in the Nimbus column 🤷‍♂️. Otoh, these message ids with modified seqnos only make up ~3% of the messages received from Nimbus, so I don't know how compelling this really is.

That being said, it is all I have right now, and maybe @dryajov wouldn't mind double checking the nimbus logic to see if it is possibly modifying the seqno. In the meantime, I will be in a straitjacket in a padded room. |
@jrhea, is there any way we can check the timestamp of the message? We only recently (~2w ago) added proper message id generation, before it used to be the default

Sorry for the vague dates. I'll try to get to the bottom of this in the morning, but I suspect that either or both of these changes might have had something to do with it. Thanks for the heads up 👍 |
Sure thing! Here are the min and max timestamps for the nimbus messages that I collected:

min: Saturday, July 18, 2020 11:52:42.804 AM

Not sure if that lines up or if the changes made it to Altona nodes. Lmk if you need more info about the nodes. |
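The normalization step described in this comment can be sketched roughly as follows in Python. The observations are made-up placeholder tuples, not the Altona dataset, and `multi_seqno_share` is a hypothetical helper name; the sketch only shows the idea of dividing each client's count of multi-seqno message ids by that client's total deliveries.

```python
from collections import defaultdict

# Hypothetical observations: (message_id, seqno, delivering_client).
observations = [
    ("id-a", 1, "lighthouse"),
    ("id-a", 2, "nimbus"),
    ("id-b", 7, "prysm"),
    ("id-b", 7, "nimbus"),
    ("id-c", 3, "nimbus"),
    ("id-c", 4, "lighthouse"),
    ("id-c", 4, "prysm"),
]


def multi_seqno_share(observations):
    """Per client: fraction of its deliveries whose message id was seen
    with more than one distinct seqno anywhere in the dataset."""
    seqnos_by_id = defaultdict(set)
    for message_id, seqno, _client in observations:
        seqnos_by_id[message_id].add(seqno)
    suspicious = {mid for mid, seqnos in seqnos_by_id.items() if len(seqnos) > 1}

    total = defaultdict(int)
    flagged = defaultdict(int)
    for message_id, _seqno, client in observations:
        total[client] += 1
        if message_id in suspicious:
            flagged[client] += 1
    return {client: flagged[client] / total[client] for client in total}


shares = multi_seqno_share(observations)
```

Comparing these per-client fractions, rather than raw counts, avoids blaming whichever client simply delivered the most messages.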
Forgot to update lighthouse's side:
|
@AgeManning Thanks, updated the table.

Teku is now the only one in the table still requiring one of the fields: the seqno is still required to be present. @Nashatyrev is this still the case? Can you check if the table matches Teku?

Once Teku allows the seqno to be missing, no client requires the fields anymore, and we can plan a grace period: we give users time to update their nodes, and then start leaving the fields out of the messages once the period is over. During this grace period we can do some testing on a devnet (maybe reuse https://github.com/protolambda/fafafa) to see if the new behavior works as expected. |
@protolambda Teku:
|
@Nashatyrev is there an option to omit the seqno yet? It looks like it is still setting it (a new one is generated if you ask it to not have a seqno): https://github.com/libp2p/jvm-libp2p/blob/c0cfb97e68f83f363063c15ca7614df95aa4ce55/src/main/kotlin/io/libp2p/pubsub/PubsubApiImpl.kt#L46

In the near future messages with seqno will be invalid, so it is important to coordinate the changes and the current behavior of clients here. Hopefully with testing soon as well; I'm working on that.

Edit: also, upon propagation, these fields should be omitted (completely, i.e. exclude the optional protobuf key-value pair entirely, rather than sending an empty value) |
Hey @protolambda, no, Teku doesn't have this option yet. Before making a PR I'd like to clarify a couple of questions below:
Does the same apply to
I see 2 options in this regard:
I thought we were going to stick to the option |
Yes, the intention is to omit all unused (w.r.t. eth2) optional (w.r.t. gossipsub) fields.
Intention is indeed option

What I meant with "upon propagation" is that even if you didn't create the message and are only propagating it, the fields should not be added to the message (assuming the message was valid and didn't contain the fields to begin with).

Also, to clarify just in case, there is a difference between an empty value and a nil value: the gossip message fields are

One point of concern though: if the fields are unrecognized (i.e. a gossipsub implementation uses a protobuf definition that does not include these optional fields) and are still passed along, then we have an issue when it comes to peer-scoring, since a peer may unknowingly propagate messages with unasked-for fields. This may be the case with Nimbus. Also, now I wonder what happens if other unknown indices are used as keys for protobuf entries and happen to be propagated (I think in Go this would happen through the

I'll raise this during the networking call; some thoughts and questions from others definitely help (thanks @Nashatyrev). And I am personally not a fan of protobuf for protocols like this. |
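The nil-versus-empty distinction and the "do not add fields upon propagation" rule can be illustrated with a small Python sketch. `GossipMessage`, `is_acceptable` and `forward` are hypothetical names and a simplified stand-in for the real protobuf message, and treating an empty value as "present, hence rejected" is an assumption about how a strict policy would behave, not a statement about any client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GossipMessage:
    """Simplified stand-in for a pubsub message.

    None models a field that is absent on the wire; b"" models a field
    that is present but empty.
    """
    data: bytes
    topic: str
    from_peer: Optional[bytes] = None
    seqno: Optional[bytes] = None
    signature: Optional[bytes] = None


def is_acceptable(msg: GossipMessage) -> bool:
    """Accept only messages where all three unused fields are absent.

    Assumption: an empty value still counts as present and is rejected.
    """
    return msg.from_peer is None and msg.seqno is None and msg.signature is None


def forward(msg: GossipMessage) -> GossipMessage:
    """Propagate a valid message without adding any of the unused fields."""
    if not is_acceptable(msg):
        raise ValueError("message carries from/seqno/signature")
    return GossipMessage(data=msg.data, topic=msg.topic)
```

Rebuilding the outgoing message from `data` and `topic` only is one way to avoid the concern raised above, where unrecognized fields are passed along unknowingly; whether a given protobuf library preserves unknown fields on re-encoding would need to be checked per implementation.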
@protolambda just merged the Teku PR above. |
Opened a libp2p specs PR to standardize the policies around these fields and signature checking: libp2p/specs#294 |
## Issue Addressed

N/A

## Proposed Changes

This will consider all gossipsub messages that have either the `from`, `seqno` or `signature` field as invalid.

## Additional Info

We should not merge this until all other clients have been sending empty fields for a while.

See ethereum/consensus-specs#1981 for reference
This was implemented in all clients, as an option in libp2p, and included in the libp2p pubsub spec. Closing this old issue. |
For initial introduction of options in GossipSub, please have a look at: libp2p/go-libp2p-pubsub#359
The TLDR:
What now?
`optional` in protobuf definition)

cc @jrhea @AgeManning @raulk