[Merged by Bors] - Fix HTTP state API bug and add --epochs-per-migration
#4236
Nice and tidy! I just had a little comment, nothing blocking.
.help("The number of epochs to wait between running the migration of data from the \
       hot DB to the cold DB. Less frequent runs can be useful for minimizing disk \
       writes")
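As a rough illustration of what the flag controls, the sketch below gates a migration run on the number of epochs elapsed since the last one. The function name `should_migrate`, its parameters, and the treatment of `0` as `1` are assumptions made for this example; this is not Lighthouse's actual implementation.

```rust
/// Hypothetical helper: decide whether the hot-to-cold migration should run.
///
/// `last_migrated_epoch` is the finalized epoch at which the previous
/// migration ran, `finalized_epoch` is the current finalized epoch, and
/// `epochs_per_migration` is the value passed via `--epochs-per-migration`.
pub fn should_migrate(
    last_migrated_epoch: u64,
    finalized_epoch: u64,
    epochs_per_migration: u64,
) -> bool {
    // Assumption: treat 0 like 1 so a zero value cannot divide the schedule
    // into "never" or "always" by accident.
    let period = epochs_per_migration.max(1);

    // `saturating_sub` avoids underflow if finalization appears to go backwards.
    finalized_epoch.saturating_sub(last_migrated_epoch) >= period
}
```

With a period of 4, a node that last migrated at epoch 10 would skip epochs 11 to 13 and migrate again at epoch 14, writing one larger batch instead of four small ones.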
I guess the downside of setting this feature too high (e.g., higher than 8096) would be that we start doing state reads from the hybrid iterators because we don't have the linear history in the cold DB yet?
Yeah, I think performance might not be great with super long durations. The reasonable max would be around 8192 slots / 32 = 256 epochs I think.
I know @ariskk was interested in a mode where Lighthouse stores more of the recent history in the hot DB, so maybe we can ask him to test it out and inform us of performance deficits. I suspect just using tree-states with all states reconstructed will be faster and more disk efficient
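The 256-epoch figure can be checked with a few constants. This sketch assumes the 8192 in the comment refers to mainnet's `SLOTS_PER_HISTORICAL_ROOT`, and that an epoch is 32 slots of 12 seconds each; the function names are chosen for illustration only.

```rust
/// Mainnet value; assumed here to be the "8192 slots" mentioned in the review.
pub const SLOTS_PER_HISTORICAL_ROOT: u64 = 8192;
/// Mainnet slots per epoch.
pub const SLOTS_PER_EPOCH: u64 = 32;
/// Mainnet slot duration in seconds.
pub const SECONDS_PER_SLOT: u64 = 12;

/// Largest migration period (in epochs) that still fits inside the
/// historical-roots window: 8192 / 32.
pub const fn max_reasonable_epochs_per_migration() -> u64 {
    SLOTS_PER_HISTORICAL_ROOT / SLOTS_PER_EPOCH
}

/// Wall-clock duration of a number of epochs, in seconds.
pub const fn epochs_to_seconds(epochs: u64) -> u64 {
    epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
}
```

Under these assumptions, 256 epochs is 98304 seconds, or a little over 27 hours between migrations.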
bors r+
## Issue Addressed

Fix an issue observed by `@zlan` on Discord where Lighthouse would sometimes return this error when looking up states via the API:

> {"code":500,"message":"UNHANDLED_ERROR: ForkChoiceError(MissingProtoArrayBlock(0xc9cf1495421b6ef3215d82253b388d77321176a1dcef0db0e71a0cd0ffc8cdb7))","stacktraces":[]}

## Proposed Changes

The error stems from a faulty assumption in the HTTP API logic: that any state in the hot database must have its block in fork choice. This isn't true because the state's hot database may update much less frequently than the fork choice store, e.g. if reconstructing states (where freezer migration pauses), or if the freezer migration runs slowly. There could also be a race between loading the hot state and checking fork choice, e.g. even if the finalization migration of DB+fork choice were atomic, the update could happen between the 1st and 2nd calls.

To address this I've changed the HTTP API logic to use the finalized block's execution status as a fallback where it is safe to do so. In the case where a block is non-canonical and prior to finalization (permanently orphaned) we default `execution_optimistic` to `true`.

## Additional Info

I've also added a new CLI flag to reduce the frequency of the finalization migration as this is useful for several purposes:

- Spacing out database writes (less frequent, larger batches)
- Keeping a limited chain history with high availability, e.g. the last month in the hot database.

This new flag made it _substantially_ easier to test this change. It was extracted from `tree-states` (where it's called `--db-migration-period`), which is why this PR also carries the `tree-states` label.
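The fallback described under Proposed Changes can be sketched as follows. Everything here is a simplified stand-in: `Root`, `ForkChoiceView`, and `execution_optimistic` are hypothetical names, and the choice to return `None` for an unfinalized block that is missing from fork choice is an assumption of this example, not something the PR description specifies.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a 32-byte block root.
pub type Root = [u8; 32];

/// Hypothetical, heavily simplified view of fork choice: one
/// "is optimistic" flag per block that fork choice still knows about.
pub struct ForkChoiceView {
    pub optimistic_by_root: HashMap<Root, bool>,
    pub finalized_root: Root,
    pub finalized_slot: u64,
}

/// Work out `execution_optimistic` for the block behind a state.
pub fn execution_optimistic(
    fc: &ForkChoiceView,
    block_root: &Root,
    block_slot: u64,
    is_canonical: bool,
) -> Option<bool> {
    // Normal case: the block is still in fork choice, so use its own status.
    if let Some(&optimistic) = fc.optimistic_by_root.get(block_root) {
        return Some(optimistic);
    }

    // Assumption: a block newer than finalization that fork choice does not
    // know about has no safe fallback, so report "unknown".
    if block_slot > fc.finalized_slot {
        return None;
    }

    if is_canonical {
        // Canonical and at or before finalization: fall back to the
        // finalized block's execution status.
        fc.optimistic_by_root.get(&fc.finalized_root).copied()
    } else {
        // Non-canonical and prior to finalization (permanently orphaned):
        // default to `true`.
        Some(true)
    }
}
```

The point of the design is that a state lingering in the hot database after its block has been pruned from fork choice no longer produces a `MissingProtoArrayBlock` error; the lookup degrades to a conservative answer instead.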
This PR was included in a batch that timed out, it will be automatically retried
Timed out.
bors r-
bors r+
Build failed (retrying...):
This PR was included in a batch that timed out, it will be automatically retried
This PR was included in a batch that timed out, it will be automatically retried
Timed out.
bors r-
Did you mean "r+"?
bors r+
bors r-
Canceled.
This PR was included in a batch that timed out, it will be automatically retried
This PR was included in a batch that was canceled, it will be automatically retried
bors r-
Canceled.
pls bors r+
Pull request successfully merged into unstable. Build succeeded!
## Issue Addressed

Fix a deadlock introduced in #4236 which was caught during the v4.4.0 release testing cycle (with thanks to @paulhauner and `gdb`).

## Proposed Changes

Avoid re-locking the fork choice read lock when querying a state by root in the HTTP API. This avoids a deadlock due to the lock already being held.

## Additional Info

The [RwLock docs](https://docs.rs/lock_api/latest/lock_api/struct.RwLock.html#method.read) explicitly advise against re-locking:

> Note that attempts to recursively acquire a read lock on a RwLock when the current thread already holds one may result in a deadlock.
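One common way to avoid recursive read-locking is to split each query into a locking wrapper and a helper that works on already-borrowed data. The sketch below shows that pattern with `std::sync::RwLock` rather than the `lock_api` type cited above, purely to stay dependency-free; `Chain`, `ForkChoice`, and the method names are hypothetical and do not reflect Lighthouse's real types or the exact change in the follow-up PR.

```rust
use std::sync::RwLock;

/// Hypothetical, minimal fork choice state.
pub struct ForkChoice {
    pub finalized_slot: u64,
    pub head_slot: u64,
}

/// Hypothetical chain object guarding fork choice with a read-write lock.
pub struct Chain {
    pub fork_choice: RwLock<ForkChoice>,
}

impl Chain {
    /// Locking wrapper. Calling this while the current thread already holds
    /// a read guard is the re-locking pattern the docs warn about.
    pub fn is_finalized(&self, slot: u64) -> bool {
        let fc = self.fork_choice.read().unwrap();
        Self::is_finalized_locked(&fc, slot)
    }

    /// Lock-free helper: operates on data the caller has already borrowed.
    pub fn is_finalized_locked(fc: &ForkChoice, slot: u64) -> bool {
        slot <= fc.finalized_slot
    }

    /// Takes the read lock exactly once and reuses the guard for every
    /// query, instead of calling `self.is_finalized(slot)` under the lock.
    pub fn describe_slot(&self, slot: u64) -> &'static str {
        let fc = self.fork_choice.read().unwrap();
        if Self::is_finalized_locked(&fc, slot) {
            "finalized"
        } else if slot <= fc.head_slot {
            "unfinalized"
        } else {
            "future"
        }
    }
}
```

The re-entrant read only deadlocks when a writer queues up between the two acquisitions, which is why this class of bug tends to surface intermittently during release testing rather than in unit tests.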
commit 8e2f7c5
Author: Michael Sproul
Date:   Thu Feb 8 09:20:28 2024 +1100

    Delete unused epoch processing code (sigp#5170)

    * Delete unused epoch processing code
    * Compare total deltas
    * Remove unnecessary apply_pending
    * cargo fmt
    * Remove newline

commit ae6620e
Author: Michael Sproul
Date:   Fri Feb 2 17:47:14 2024 +1100

    Delete `lighthouse db diff` (sigp#5171)

    * Delete `lighthouse db diff`
    * Fix help text

commit 25bcd2a
Author: Michael Sproul
Date:   Wed Jan 31 10:13:18 2024 +1100

    Tree states v4.6.222-exp (sigp#5147)

commit 68a9a2e
Merge: 8e68926 0f345c7
Author: Michael Sproul
Date:   Tue Jan 30 17:13:57 2024 +1100

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit 8e68926
Author: Michael Sproul
Date:   Tue Jan 30 17:08:43 2024 +1100

    `fsync` during backfill to prevent DB corruption (sigp#5144)

commit 7862c71
Author: Michael Sproul
Date:   Tue Jan 30 15:56:48 2024 +1100

    Fix tree-states sub-epoch diffs (sigp#5097)

commit 11461d8
Author: Michael Sproul
Date:   Tue Jan 30 09:59:25 2024 +1100

    Fix new CLI tests for tree-states (sigp#5132)

commit 262e5f2
Merge: 6262be7 1be5253
Author: Michael Sproul
Date:   Thu Jan 25 15:10:19 2024 +1100

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit 6262be7
Author: Michael Sproul
Date:   Sun Jan 14 09:41:42 2024 +1100

    Don't error on inactive indices in att. rewards

commit 9cd9243
Author: Michael Sproul
Date:   Fri Jan 12 10:50:34 2024 +1100

    Tree states release v4.6.111-exp

commit 664a778
Author: Michael Sproul
Date:   Thu Jan 11 17:13:43 2024 +1100

    Add cache for parallel HTTP requests (sigp#4879)

commit 8db17da
Merge: c8dc082 2e8e160
Author: Michael Sproul
Date:   Thu Jan 11 13:15:06 2024 +1100

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit c8dc082
Merge: 4741bf1 a3a3703
Author: Michael Sproul
Date:   Thu Dec 14 09:59:43 2023 +1100

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit 4741bf1
Author: Michael Sproul
Date:   Tue Dec 5 09:23:24 2023 +1100

    Remove stray println

commit cefe9fd
Author: Michael Sproul
Date:   Mon Dec 4 17:15:25 2023 +1100

    Restore crash safety for database pruning (sigp#4975)

    * Add some DB sanity checks
    * Restore crash safety for database pruning

commit 66d30bc
Merge: e880d9d 44aaf13
Author: Michael Sproul
Date:   Fri Dec 1 12:02:21 2023 +1100

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit e880d9d
Author: Michael Sproul
Date:   Fri Dec 1 11:06:27 2023 +1100

    Fix cache initialization in block rewards API (sigp#4960)

commit 9cdc4b9
Merge: d36ebba 051c3e8
Author: Michael Sproul
Date:   Sun Nov 12 22:19:07 2023 +0300

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit d36ebba
Author: Michael Sproul
Date:   Thu Oct 26 18:09:28 2023 +1100

    Handle out-of-order forks in epoch cache (sigp#4881)

commit 1aca484
Author: Michael Sproul
Date:   Fri Oct 20 12:31:41 2023 +1100

    Tree states release v4.5.444-exp

    - Update xdelta3 to remove dodgy build steps
    - Fix asset paths for draft release

commit 6b4f154
Author: Michael Sproul
Date:   Thu Oct 19 15:57:11 2023 +1100

    Tree states release v4.5.333-exp

commit 0cb8fdf
Author: Michael Sproul
Date:   Thu Oct 19 14:59:29 2023 +1100

    Various small tree-states fixes (sigp#4861)

    * Fix block backfill with genesis skip slots
    * Fix freezer upper limit
    * Fix: write post state in lcli skip-slots (sigp#4843)
    * Added CARGO_USE_GIT_CLI to the Dockerfile (sigp#4828)
    * chore: replace deprecated hub with gh for releases (sigp#4839)
    * Put schema version back to 24 (ignore Deneb)
    * Minimise diff

    ---------

    Co-authored-by: Joe Clapis
    Co-authored-by: Dustin Brickwood

commit 72d8c38
Merge: bb6675e 98cac2b
Author: Michael Sproul
Date:   Thu Oct 19 12:07:35 2023 +1100

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit bb6675e
Author: Michael Sproul
Date:   Fri Oct 13 16:45:56 2023 +1100

    Clean up progressive balance slashings further (sigp#4834)

    * Clean up progressive balance slashings further
    * Fix Rayon deadlock in test utils
    * Fix cargo-fmt

commit b121e69
Author: Michael Sproul
Date:   Fri Oct 13 16:45:35 2023 +1100

    Fix cache logic for epoch boundary skips (sigp#4833)

commit b77de69
Author: Michael Sproul
Date:   Wed Oct 11 14:37:46 2023 +1100

    Re-enable ARM builds

commit dfa3b43
Author: Michael Sproul
Date:   Wed Oct 11 14:03:11 2023 +1100

    Fix Clippy for 1.73

commit 6ae4c22
Author: Michael Sproul
Date:   Wed Oct 11 11:57:39 2023 +1100

    Fix merge snafu in tests

commit e63d02e
Merge: d4f87ef 4ad7e15
Author: Michael Sproul
Date:   Wed Oct 11 11:52:39 2023 +1100

    Merge remote-tracking branch 'origin/deneb-free-blobs' into tree-states-deneb

commit d4f87ef
Author: Michael Sproul
Date:   Wed Oct 11 11:18:22 2023 +1100

    Fix three consensus bugs!

commit ca1abfe
Author: Michael Sproul
Date:   Wed Oct 11 10:43:49 2023 +1100

    Support iterables in compare_fields

commit e2a60a6
Merge: 9446fc8 203ac65
Author: Michael Sproul
Date:   Fri Oct 6 11:11:36 2023 +1100

    Merge remote-tracking branch 'origin/deneb-free-blobs' into tree-states-deneb

commit 9446fc8
Author: Michael Sproul
Date:   Tue Oct 3 16:07:25 2023 +1100

    Fix semantic Deneb <> tree-states conflicts

commit 109c4a5
Merge: f1f76f2 57edc0f
Author: Michael Sproul
Date:   Fri Sep 29 16:34:29 2023 +1000

    Merge remote-tracking branch 'origin/deneb-free-blobs' into tree-states

commit f1f76f2
Author: Michael Sproul
Date:   Tue Sep 26 12:23:28 2023 +1000

    Tree states release v4.5.222-exp (sigp#4782)

commit cae73a4
Merge: 364074d 441fc16
Author: Michael Sproul
Date:   Tue Sep 26 11:21:44 2023 +1000

    Merge tag 'v4.5.0' into tree-states

    v4.5.0

commit 364074d
Author: Michael Sproul
Date:   Fri Sep 22 15:52:23 2023 +1000

    Tree states release v4.5.111-exp (sigp#4769)

commit d24875f
Merge: cd23c89 69c39ad
Author: Michael Sproul
Date:   Fri Sep 22 15:11:42 2023 +1000

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit cd23c89
Author: Michael Sproul
Date:   Fri Sep 22 14:49:15 2023 +1000

    Improve state cache eviction and reduce mem usage (sigp#4762)

    * Improve state cache eviction and reduce mem usage
    * Fix epochs_per_state_diff tests

commit 1b4bc88
Author: Michael Sproul
Date:   Thu Sep 14 10:07:26 2023 +1000

    Release v4.4.111-exp (sigp#4729)

commit 5cb2ed3
Author: Michael Sproul
Date:   Wed Sep 13 14:43:02 2023 +1000

    Restore custom image for Cross

commit f7c6b7d
Author: Michael Sproul
Date:   Wed Sep 13 14:00:28 2023 +1000

    Bump schema version to v24

commit 68f80cc
Author: Michael Sproul
Date:   Wed Sep 13 13:56:53 2023 +1000

    Change default epochs-per-state-diff to 16

    This should make replaying diffs during non-finality a bit quicker.

commit 838e104
Author: Michael Sproul
Date:   Wed Sep 13 13:54:03 2023 +1000

    Attempt to fix flaky test

commit d961d2c
Author: Michael Sproul
Date:   Wed Sep 13 12:51:20 2023 +1000

    Disable ARM docker builds

commit b8e04ce
Merge: 1e4ee7a 35f47f4
Author: Michael Sproul
Date:   Wed Sep 13 11:25:18 2023 +1000

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit 1e4ee7a
Author: Jimmy Chen
Date:   Mon Sep 11 10:19:40 2023 +1000

    Tree states to support per-slot state diffs (sigp#4652)

    * Support per slot state diffs
    * Store HierarchyConfig on disk. Support storing hdiffs at per slot level.
    * Revert HierachyConfig change for testing.
    * Add validity check for the hierarchy config when opening the DB.
    * Update HDiff tests.
    * Fix `get_cold_state` panic when the diff for the slot isn't stored.
    * Use slots instead of epochs for storing snapshots in freezer DB.
    * Add snapshot buffer to `diff_buffer_cache` instead of loading it from db every time.
    * Add `hierarchy-exponents` cli flag to beacon node.
    * Add test for `StorageStrategy::ReplayFrom` and ignore a flaky test.
    * Drop hierarchy_config in tests for more frequent snapshot and fix an issue where hdiff wasn't stored unless it's a epoch boundary slot.

commit e373e9a
Author: Michael Sproul
Date:   Wed Aug 9 19:42:14 2023 +1000

    Fix genesis state storage for genesis sync (sigp#4589)

commit bba1526
Author: Michael Sproul
Date:   Tue Aug 8 13:57:05 2023 +1000

    Fix deadlock in finalization migration (sigp#4576)

commit 18e64e6
Author: Michael Sproul
Date:   Tue Aug 8 11:25:26 2023 +1000

    Optimise mutations in single-pass epoch processing (sigp#4573)

    * Optimise mutations in single-pass epoch processing
    * Use safer Cow::make_mut
    * Update to upstream milhouse

commit 8423e9f
Merge: 5d2063d fc7f1ba
Author: Michael Sproul
Date:   Wed Jul 19 11:23:52 2023 +1000

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit 5d2063d
Author: Michael Sproul
Date:   Tue Jul 18 16:59:55 2023 +1000

    Single-pass epoch processing (sigp#4483)

commit 079cd67
Merge: 0291998 835fa70
Author: Michael Sproul
Date:   Mon Jul 3 15:03:54 2023 +1000

    Merge remote-tracking branch 'origin/tree-states' into tree-states

commit 0291998
Merge: b414c32 46be05f
Author: Michael Sproul
Date:   Mon Jul 3 15:01:21 2023 +1000

    Merge remote-tracking branch 'origin/unstable' into tree-states

commit b414c32
Author: Michael Sproul
Date:   Mon Jul 3 12:03:14 2023 +1000

    Implement activation queue cache

commit 835fa70
Author: Michael Sproul
Date:   Sat Jul 1 09:53:06 2023 +1000

    Fix EpochCache handling in ef-tests (sigp#4454)

commit f631b51
Author: Michael Sproul
Date:   Fri Jun 30 22:57:36 2023 +1000

    Fix EpochCache handling in ef-tests

commit 2df714e
Author: Jimmy Chen
Date:   Fri Jun 30 11:25:51 2023 +1000

    Tree states optimization using `EpochCache` (sigp#4429)

    * Relocate epoch cache to BeaconState
    * Optimize per block processing by pulling previous epoch & current epoch calculation up.
    * Revert `get_cow` change (no performance improvement)
    * Initialize `EpochCache` in epoch processing and load it from state when getting base rewards.
    * Initialize `EpochCache` at start of block processing if required.
    * Initialize `EpochCache` in `transition_blocks` if `exclude_cache_builds` is enabled
    * Fix epoch cache initialization logic
    * Remove FIXME comment.
    * Cache previous & current epochs in `consensus_context.rs`.
    * Move `get_base_rewards` from `ConsensusContext` to `BeaconState`.
    * Update Milhouse version

commit 160bbde
Author: Michael Sproul
Date:   Fri Jun 30 10:29:34 2023 +1000

    Fix db-migration-period default (sigp#4441)

    * Fix db-migration-period default
    * Fix version regex

commit 6954de6
Author: Michael Sproul
Date:   Tue Jun 27 17:34:41 2023 +1000

    Tree states alpha release v4.2.990-exp

commit 8dc374e
Author: Michael Sproul
Date:   Tue Jun 27 17:33:25 2023 +1000

    Temporarily disable ARM builds

commit af5fb20
Author: Michael Sproul
Date:   Tue Jun 27 15:10:52 2023 +1000

    Tree states alpha release v4.2.99-exp

commit 56c7a52
Author: Michael Sproul
Date:   Tue Jun 27 16:52:24 2023 +1000

    Install Clang 5 in Cross builder image

commit 7c2eb96
Author: Michael Sproul
Date:   Tue Jun 27 15:06:43 2023 +1000

    Set epochs per migration to 1

    Workaround for sigp#4236

commit 88e30b6
Author: Michael Sproul
Date:   Tue Jun 27 15:04:39 2023 +1000

    Fix failing tests (sigp#4423)

    * Get tests passing
    * Get benchmarks compiling
    * Fix EF withdrawals test
    * Remove unused deps
    * Fix tree_hash panic in tests
    * Fix slasher compilation
    * Fix ssz_generic test
    * Get more tests passing
    * Fix EF tests for real
    * Fix local testnet scripts

commit ca412ab
Author: Michael Sproul
Date:   Wed Jun 21 11:05:09 2023 +1000

    Use rebasing to minimise BeaconState mem usage (sigp#4416)

    * Use "rebasing" to minimise BeaconState mem usage
    * Update metastruct
    * Use
upstream milhouse, update cargo lock * Rebase caches for extra memory savings commit 6eb1513 Author: Michael Sproul <[email protected]> Date: Tue Jun 20 19:10:05 2023 +1000 Configurable diff buffer cache size (sigp#4420) commit d56cec8 Author: Paul Hauner <[email protected]> Date: Tue Jun 20 11:47:52 2023 +1000 Address clippy lints in `tree-states` (sigp#4414) * Address some clippy lints * Box errors to fix error size lint * Add Default impl for Validator * Address more clippy lints * Re-implement `check_state_diff` * Fix misc test compile errors commit 23db089 Author: Michael Sproul <[email protected]> Date: Mon Jun 19 10:14:47 2023 +1000 Implement tree states & hierarchical state DB
Issue Addressed
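Fix an issue observed by `@zlan` on Discord where Lighthouse would sometimes return this error when looking up states via the API:

> {"code":500,"message":"UNHANDLED_ERROR: ForkChoiceError(MissingProtoArrayBlock(0xc9cf1495421b6ef3215d82253b388d77321176a1dcef0db0e71a0cd0ffc8cdb7))","stacktraces":[]}

Proposed Changes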
The error stems from a faulty assumption in the HTTP API logic: that any state in the hot database must have its block in fork choice. This isn't true because the state's hot database may update much less frequently than the fork choice store, e.g. if reconstructing states (where freezer migration pauses), or if the freezer migration runs slowly. There could also be a race between loading the hot state and checking fork choice, e.g. even if the finalization migration of DB+fork choice were atomic, the update could happen between the 1st and 2nd calls.
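As a rough illustration of why the lookup cannot assume the block is present in fork choice, here is a minimal sketch in which a missing block is handled instead of surfacing as an error. It is not Lighthouse's actual code: `ForkChoice`, `ExecutionStatus`, `BlockRoot` (shortened to a `u64`), and `execution_optimistic` are hypothetical names, and the rule used for "safe to fall back" (the block is canonical and at or before the finalized slot) is an assumption made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a 32-byte block root.
type BlockRoot = u64;

/// Hypothetical, simplified execution status of a block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExecutionStatus {
    Valid,
    Optimistic,
}

impl ExecutionStatus {
    fn is_optimistic(self) -> bool {
        matches!(self, ExecutionStatus::Optimistic)
    }
}

/// Hypothetical view of fork choice: it only tracks blocks that have not
/// been pruned, plus the current finalized checkpoint.
struct ForkChoice {
    blocks: HashMap<BlockRoot, ExecutionStatus>,
    finalized_root: BlockRoot,
    finalized_slot: u64,
}

/// Compute `execution_optimistic` for a block whose state was loaded from
/// the hot database, without assuming the block is still in fork choice.
fn execution_optimistic(
    fork_choice: &ForkChoice,
    block_root: BlockRoot,
    block_slot: u64,
    is_canonical: bool,
) -> bool {
    // Common case: fork choice still knows the block.
    if let Some(status) = fork_choice.blocks.get(&block_root) {
        return status.is_optimistic();
    }
    // Assumed "safe" fallback: a canonical block at or before finalization
    // is an ancestor of the finalized block, so reuse that block's status.
    if is_canonical && block_slot <= fork_choice.finalized_slot {
        return fork_choice
            .blocks
            .get(&fork_choice.finalized_root)
            .map_or(true, |status| status.is_optimistic());
    }
    // Otherwise (for example a permanently orphaned block) be conservative.
    true
}
```

With a finalized block that is `Valid`, a canonical pre-finalization block that has already been pruned from fork choice reports `false`, while a pruned non-canonical block reports `true`.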
To address this I've changed the HTTP API logic to use the finalized block's execution status as a fallback where it is safe to do so. In the case where a block is non-canonical and prior to finalization (permanently orphaned) we default `execution_optimistic` to `true`.

Additional Info
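I've also added a new CLI flag (`--epochs-per-migration`) to reduce the frequency of the finalization migration as this is useful for several purposes:

- Spacing out database writes (less frequent, larger batches)
- Keeping a limited chain history with high availability, e.g. the last month in the hot database.

This new flag made it substantially easier to test this change. It was extracted from `tree-states` (where it's called `--db-migration-period`), which is why this PR also carries the `tree-states` label.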
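
One way to picture the effect of spacing out migrations is a small gate that only lets a migration run once enough epochs have passed since the previous one. The sketch below is an assumption about how such a gate could work, not the code from this PR: `MigrationScheduler` and `should_migrate` are hypothetical names, and only the idea of a configurable number of epochs between migrations comes from the description above.

```rust
/// Hypothetical gate deciding whether a finalization migration should run.
struct MigrationScheduler {
    /// Minimum number of epochs between two migrations (the CLI value).
    epochs_per_migration: u64,
    /// Finalized epoch at which the previous migration ran, if any.
    last_migration_epoch: Option<u64>,
}

impl MigrationScheduler {
    fn new(epochs_per_migration: u64) -> Self {
        Self {
            epochs_per_migration,
            last_migration_epoch: None,
        }
    }

    /// Returns `true` (and records the epoch) when a migration should run
    /// for the given finalized epoch, `false` when it should be skipped.
    fn should_migrate(&mut self, finalized_epoch: u64) -> bool {
        match self.last_migration_epoch {
            // Too soon since the previous migration: skip this one.
            Some(prev)
                if finalized_epoch < prev.saturating_add(self.epochs_per_migration) =>
            {
                false
            }
            // First migration, or enough epochs have elapsed.
            _ => {
                self.last_migration_epoch = Some(finalized_epoch);
                true
            }
        }
    }
}
```

With a value of 1 every newly finalized epoch triggers a migration; larger values batch more states into each run, which means the hot database keeps more recent history between runs.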