Validator fallback does not work when the execution engine is offline or unavailable #3641

Closed
Spacesider opened this issue Oct 14, 2022 · 2 comments

@Spacesider

Description

My validator has multiple consensus node endpoints configured: the first is a Lighthouse-Nethermind pair and the second is a Teku-Besu pair. The Lighthouse validator points to the Lighthouse node as the primary and to the Teku node as the fallback.

While upgrading my Nethermind node, I ran into issues and Nethermind was offline for roughly an hour. After it was back up, I discovered that I had missed attestations the entire time, despite having a Teku-Besu fallback both configured and running.

It appears that when the execution engine paired with Lighthouse goes offline (or is otherwise unavailable), the Lighthouse validator still attempts to use the Lighthouse node even though it is not functional. (Side note: I have only tested this with Nethermind as the execution engine. I would be happy to run the same test with the other execution engines, but I will need some time to do this.)

Pre-merge this wouldn't be a problem because you didn't need the execution engine to attest, but post-merge this has changed.

Version

Running https://github.com/sigp/lighthouse/releases/tag/v3.1.2 > lighthouse-v3.1.2-x86_64-unknown-linux-gnu.tar.gz

Present Behaviour

Currently, the Lighthouse node presents itself as available to the validator even when its execution node is offline. This prevents the validator from switching over to the other configured endpoint.

Expected Behaviour

The Lighthouse node should report itself as something along the lines of "offline", "not in sync" or "unavailable" when the paired execution engine is offline or unavailable. While it is in this state, it is unable to process attestations or block proposals, so for validating purposes it is offline.
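
To make the expected behaviour concrete, here is a minimal sketch in Python of the health check I have in mind. It is not how Lighthouse is implemented. The field names `is_syncing`, `is_optimistic` and `el_offline` are assumptions modelled on the `data` object of the beacon node's `/eth/v1/node/syncing` response, and the URLs in the usage note are hypothetical placeholders.

```python
from typing import Mapping, Optional, Sequence, Tuple


def is_usable_for_validation(syncing: Mapping[str, object]) -> bool:
    """Return True only if the node is synced and its execution engine is reachable.

    `syncing` is an already-parsed status object. The key names are assumptions
    for this sketch, not a confirmed Lighthouse interface.
    """
    # A missing `is_syncing` is treated as "still syncing" to stay on the safe side.
    is_syncing = bool(syncing.get("is_syncing", True))
    # Missing optimistic / execution-engine flags are treated as "no problem reported".
    is_optimistic = bool(syncing.get("is_optimistic", False))
    el_offline = bool(syncing.get("el_offline", False))
    return not (is_syncing or is_optimistic or el_offline)


def pick_beacon_node(
    candidates: Sequence[Tuple[str, Mapping[str, object]]],
) -> Optional[str]:
    """Return the URL of the first usable node, in configured priority order."""
    for url, syncing in candidates:
        if is_usable_for_validation(syncing):
            return url
    # No node is currently able to serve validator duties.
    return None
```

With a check like this, a primary whose status is `{"is_syncing": False, "is_optimistic": True, "el_offline": True}` would be skipped, and a fallback listed after it (for example a hypothetical `http://teku:5051`) that reports all three flags as `False` would be selected instead.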

Steps to resolve

N/A

@michaelsproul
Member

Sorry you ran into this. It's a known problem with our fallback mechanism post-merge, and we are tracking these issues in #3613.

@michaelsproul
Member

Closing as dupe of #3613. The scenario described will be fixed by #4291.
