[Merged by Bors] - Use the database to persist the pubkey cache #2234

Closed
wants to merge 1 commit from rm-pubkey-cache

Conversation

michaelsproul (Member) commented Mar 1, 2021

## Issue Addressed

Closes #1787

## Proposed Changes

* Abstract the `ValidatorPubkeyCache` over a "backing", which is either a file (legacy) or the database (see the sketch after this list).
* Implement a migration from schema v2 to schema v3, whereby the contents of the cache file are copied to the DB and the file is then deleted. The next release to include this change must be a minor version bump, and we will need to warn users that downgrading is not possible (this is our first DB schema change since mainnet genesis).
* Move the schema migration code from the `store` crate into the `beacon_chain` crate so that it can access the datadir and the `ValidatorPubkeyCache`, etc. It gets injected back into the `store` via a closure (similar to what we do in fork choice).
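
As a rough sketch of the backing abstraction and the v2 → v3 migration, here is a minimal Rust illustration; the names `PubkeyCacheBacking`, `persist` and `migrate_pubkey_cache` are assumptions for this sketch, not Lighthouse's exact API:

```rust
use std::path::{Path, PathBuf};

/// Where the cache is persisted. Illustrative only: in the real crate the
/// database variant would wrap the beacon node's store handle.
enum PubkeyCacheBacking {
    /// Legacy: a flat file in the datadir (schema v2 and earlier).
    File(PathBuf),
    /// New: a record in the beacon node's database (schema v3).
    Database,
}

struct ValidatorPubkeyCache {
    /// Serialized pubkeys, indexed by validator index.
    pubkeys: Vec<Vec<u8>>,
    backing: PubkeyCacheBacking,
}

impl ValidatorPubkeyCache {
    /// Persist the cache to whichever backing is in use.
    fn persist(&self) -> Result<(), String> {
        match &self.backing {
            PubkeyCacheBacking::File(path) => {
                // Legacy behaviour: write the whole cache out as a file.
                std::fs::write(path, self.pubkeys.concat()).map_err(|e| e.to_string())
            }
            PubkeyCacheBacking::Database => {
                // New behaviour: write the cache into the database
                // (actual store call elided in this sketch).
                Ok(())
            }
        }
    }
}

/// v2 -> v3 migration: copy the legacy file's contents into the database,
/// then delete the file so only the DB copy remains.
fn migrate_pubkey_cache(cache_file: &Path) -> Result<(), String> {
    let bytes = std::fs::read(cache_file).map_err(|e| e.to_string())?;
    let _ = bytes; // would be written into the DB here (elided)
    std::fs::remove_file(cache_file).map_err(|e| e.to_string())
}
```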

michaelsproul added the ready-for-review label Mar 1, 2021
michaelsproul requested a review from paulhauner March 1, 2021 04:25
paulhauner (Member) left a comment:

Looks good to me, assuming my comment is correct :)

I'm going to start running this on Pyrmont soon, since I'm going to use this in #2243.

```diff
@@ -45,13 +44,6 @@ pub fn get_config<E: EthSpec>(
             .ok_or("Failed to get freezer db path")?,
     )
     .map_err(|err| format!("Failed to remove chain_db: {}", err))?;

-    // Remove the pubkey cache file if it exists
-    let pubkey_cache_file = client_config.data_dir.join(PUBKEY_CACHE_FILENAME);
```
paulhauner (Member) commented on the diff:
Just to confirm: if someone runs `lighthouse bn --purge-db` the first time they use schema v3, they'll get a v3 DB, never run the v2 migration code, and leave the pubkey cache file dangling.

This seems slightly undesirable, but harmless. Not worth extra code IMO.

michaelsproul (Member Author) replied:
Yeah that's exactly right

michaelsproul (Member Author) added:
Perhaps worth noting that even if they revert to an old version of Lighthouse, it might try to read the undeleted pubkey cache, but it will fail on the schema check.
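
For context, a minimal sketch of the sort of schema check being referred to, with illustrative names and version numbers (the real check lives in Lighthouse's migration code):

```rust
/// The newest schema this (older, reverted) binary understands.
const CURRENT_SCHEMA_VERSION: u64 = 2;

/// Refuse to run against a database written by a newer schema, so a
/// downgraded node fails fast instead of misreading on-disk data.
fn check_schema(on_disk_version: u64) -> Result<(), String> {
    if on_disk_version > CURRENT_SCHEMA_VERSION {
        Err(format!(
            "on-disk schema v{} is newer than supported v{}; refusing to start",
            on_disk_version, CURRENT_SCHEMA_VERSION
        ))
    } else {
        Ok(())
    }
}
```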

michaelsproul added the ready-for-merge label and removed the ready-for-review label Mar 4, 2021
michaelsproul (Member Author) commented:

`bors r+`

bors bot pushed a commit that referenced this pull request Mar 4, 2021
bors bot changed the title from "Use the database to persist the pubkey cache" to "[Merged by Bors] - Use the database to persist the pubkey cache" Mar 4, 2021
bors bot closed this Mar 4, 2021
michaelsproul deleted the rm-pubkey-cache branch March 4, 2021 02:39
michaelsproul added two commits that referenced this pull request Mar 10, 2021
bors bot pushed a commit that referenced this pull request Mar 17, 2021
## Issue Addressed

Closes #2052

## Proposed Changes

- Refactor the attester/proposer duties endpoints in the BN:
    - Performance improvements.
    - Fixes some potential inconsistencies with the dependent root fields.
    - Removes `http_api::beacon_proposer_cache` and just uses the one on the `BeaconChain` instead.
    - Moves the code for the proposer/attester duties endpoints into separate files, for readability.
- Refactor the `DutiesService` in the VC:
    - Required to reduce the delay on broadcasting new blocks.
    - Gets rid of the `ValidatorDuty` shim struct that came about when we adopted the standard API.
    - Separates the block/attestation duty tasks so that they don't block each other when one is slow.
- In the VC, use `PublicKeyBytes` to represent validators instead of `PublicKey`. `PublicKey` is a legit crypto object, whilst `PublicKeyBytes` is just a byte array; it's much faster to clone/hash `PublicKeyBytes`, and this change has had a significant impact on runtimes.
    - Unfortunately this has created lots of dust changes.
- In the BN, store `PublicKeyBytes` in the `beacon_proposer_cache` and allow access to them. The HTTP API always sends `PublicKeyBytes` over the wire, and the conversion from `PublicKey` -> `PublicKeyBytes` is non-trivial, especially when queries have 100s/1000s of validators (like Pyrmont).
- Add the `state_processing::state_advance` mod, which dedups a lot of the "apply `n` skip slots to the state" code (see the sketch after this list).
    - This also fixes a bug with some functions which were failing to include a state root, as per [this comment](https://github.com/sigp/lighthouse/blob/072695284f7eff82c51f79bc921ad942fea7483a/consensus/state_processing/src/state_advance.rs#L69-L74). I couldn't find any instance of this bug that resulted in anything more severe than keying a shuffling cache by the wrong block root.
- Swap the VC block service to use `mpsc` from `tokio` instead of `futures`. This is consistent with the rest of the code base.
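
A minimal, self-contained sketch of the slot-skipping logic this mod dedups, with simplified stand-in types (the real code also threads through the chain spec and error handling):

```rust
/// Simplified stand-ins for the real `BeaconState` and per-slot transition.
struct BeaconState {
    slot: u64,
}
type Hash256 = [u8; 32];

/// The real `per_slot_processing` applies the per-slot state transition;
/// supplying a known `state_root` lets it skip an expensive tree-hash.
fn per_slot_processing(state: &mut BeaconState, state_root: Option<Hash256>) {
    let _ = state_root;
    state.slot += 1;
}

/// Advance `state` to `target_slot` by applying skip slots. The known root
/// is only valid for the first transition, hence `Option::take`.
fn state_advance(state: &mut BeaconState, state_root: Option<Hash256>, target_slot: u64) {
    let mut state_root = state_root;
    while state.slot < target_slot {
        per_slot_processing(state, state_root.take());
    }
}
```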
    
~~This PR *reduces* the size of the codebase 🎉~~ It *used* to reduce the size of the code base before I added more comments. 

## Observations on Pyrmont

- Proposer duties times down from peaks of 450ms to consistent <1ms.
- Current epoch attester duties times down from >1s peaks to a consistent 20-30ms.
- Block production down from 600ms+ to 100-200ms.

## Additional Info

- ~~Blocked on #2241~~
- ~~Blocked on #2234~~

## TODO

- [x] ~~Refactor this into some smaller PRs?~~ Leaving this as-is for now.
- [x] Address `per_slot_processing` roots.
- [x] Investigate slow next epoch times. Not getting added to cache on block processing?
- [x] Consider [this](https://github.com/sigp/lighthouse/blob/072695284f7eff82c51f79bc921ad942fea7483a/beacon_node/store/src/hot_cold_store.rs#L811-L812) in the scenario of replacing the state roots.


Co-authored-by: pawan <[email protected]>
Co-authored-by: Michael Sproul <[email protected]>