Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle unknown head block during attestation publishing #4699

Closed
michaelsproul opened this issue Sep 5, 2023 · 14 comments
Closed

Handle unknown head block during attestation publishing #4699

michaelsproul opened this issue Sep 5, 2023 · 14 comments
Assignees
Labels
enhancement New feature or request HTTP-API

Comments

@michaelsproul
Copy link
Member

Description

Presently if an attestation is POSTed to the beacon node for a head block that hasn't finished being imported, Lighthouse will log a nasty ERRO:

ERRO Failure verifying attestation for gossip, attestation_slot: XX, committee_index: XX, request_index: X, error: UnknownHeadBlock { beacon_block_root: 0xa98ce8635c5493907d6536905b46eb09db54f8465a82d95aae90650509884f4a }

It will also return a 400 BAD REQUEST to the VC, which creates a knock-on effect. There's an old issue about this here #1971, which was closed because we started queueing attestations from gossip in #635. We still don't have any queueing for attestations from the HTTP API, which is why this error occurs.

In practice, this will often be encountered by users running Vouch – which broadcasts attestations to all nodes, regardless of whether they already know the head block.

Version

Lighthouse v4.4.1 and older

Present Behaviour

  • Lighthouse rejects any attestation received on the HTTP API that doesn't immediately pass gossip validation. This includes attestations for blocks that are in the process of being imported.

Expected Behaviour

  • Lighthouse queues attestations received on the HTTP API so that it can:
    • Return a 200 OK to the caller, and
    • Not log an error message

Steps to resolve

In terms of implementation I think we could use the reprocessing queue. The hard part is blocking the HTTP handler waiting for the attestation to be dequeued and processed. I had a go at implementing something similar for blocks in #4643, which wasn't merged because it ended up being poorly suited to the problem it was trying to solve. We could probably adapt that code, particularly the use of a channel to send back the completion message, and use it for attestations. We could even extend the existing attestation reprocessing logic with an optional channel that gets initialized (and waited on) only for HTTP attestations.

@michaelsproul michaelsproul added enhancement New feature or request HTTP-API labels Sep 5, 2023
@michaelsproul michaelsproul added the v4.6.0 ETA Q1 2024 label Sep 28, 2023
@eserilev
Copy link
Collaborator

eserilev commented Oct 5, 2023

I'd like to take a stab at this one

@ricardolyn
Copy link

@michaelsproul will this issue be fixed in the 4.6.0 release, by a future PR or other unrelated one?

@michaelsproul
Copy link
Member Author

@ricardolyn We don't have an implementation yet, so probably not in 4.6

@eserilev
Copy link
Collaborator

@michaelsproul i didn't end up getting a chance to work on this, feel free to take it. thanks!

@michaelsproul michaelsproul removed the v4.6.0 ETA Q1 2024 label Dec 13, 2023
@michaelsproul michaelsproul self-assigned this Dec 13, 2023
@michaelsproul
Copy link
Member Author

Thanks. I'll start on this now. Will aim to get it in for 4.7.0/5.0.0 ( 👀 )

@michaelsproul
Copy link
Member Author

Implemented, but I haven't had a chance to test yet and probably won't get to until after the holidays. We are all hands on deck trying to ship v4.6.0.

#5010

@koraykoska
Copy link

@michaelsproul Is it scheduled to be included in v4.6.0? Obol users face this issue.

@michaelsproul
Copy link
Member Author

No it won't make 4.6. Needs more testing

Obol users should still be able to attest successfully via the BN that created the attestation, or am I mistaken?

@koraykoska
Copy link

@michaelsproul Obol is a middleware between the CL and the VC. So basically Obol is the "BN that created the attestation", but they are not distributing. So if Lighthouse drops it, the attestation is missed. Which is what we have seen.

@michaelsproul
Copy link
Member Author

michaelsproul commented Jan 17, 2024

@koraykoska Where does Obol get the attestation from though? My point is that at least one of the BNs in the cluster must have produced it, and that BN should succeed in publishing it. Obol should not be attesting to blocks that aren't validated by any of the BNs in its cluster.

To be clear, we have something like this, don't we?:

VC1 -> Obol1 -> BN1
VC2 -> Obol2 -> BN2
VC3 -> Obol3 -> BN3

If BN2 produces the attestation, but BN1 and BN3 haven't processed the block yet, then when it gets published can't BN2 publish the full attestation with signatures from VC1+VC2+VC3? It shouldn't matter if BN1 and BN3 can't publish

CC @xenowits

@koraykoska
Copy link

koraykoska commented Jan 17, 2024 via email

@michaelsproul
Copy link
Member Author

Thanks @koraykoska. That seems strange to me, so I'll ask the Obol guys to clarify. My understanding was that once the cluster had produced a "full" attestation with aggregated signatures from all participants, it would be able to publish it using any single beacon node.

@michaelsproul
Copy link
Member Author

Confirmed with Obol that the complete attestation is published via all BNs, so the attestation should succeed as long as at least 1 node has finished processing the block and is capable of validating the attestation.

@koraykoska If you check your validator's performance for the slots where your BN fails, do you notice attestation penalties?

Either way, we will fix this shortly after 4.6 with #5010

@michaelsproul
Copy link
Member Author

Closed by #5010 which is in the upcoming v5.0.0 release 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request HTTP-API
Projects
None yet
Development

No branches or pull requests

4 participants