Prune validation callbacks from queues if they are expired #6924

Merged 5 commits on Dec 4, 2020

Conversation

nholland94
Member

@nholland94 nholland94 commented Dec 3, 2020

This addresses #6882. With this change, the daemon skips processing broadcast messages whose validation callbacks have expired. This should help the daemon catch up with the message queue when libp2p is receiving too many broadcast messages.

I want to do more with the validation callback to make this more robust, but this was the minimal change I could make that I felt somewhat confident in. Please review carefully (validation is tricky in our system).

@nholland94 nholland94 requested a review from a team as a code owner December 3, 2020 20:58
Member

@mrmr1993 mrmr1993 left a comment

Looks good, but please see/address mutex comment before merging.

if !ok {
    app.P2p.Logger.Errorf("no deadline set on validation context")
    defer app.ValidatorMutex.Unlock()
    delete(app.Validators, seqno)
Member

Should this delete not precede the Unlock? It seems like the validator mutex ought to be held when we are modifying app.Validators. (Similarly in the if above, if so.)

Member Author

It's because of what defer does in Go. It's honestly a bizarre language construct (but it is kind of useful). It basically stages app.ValidatorMutex.Unlock() so that it is called just before the function returns to the caller, which means the mutex is still held when the delete runs. This follows the pattern for locking and unlocking in other parts of the libp2p_helper code.
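A minimal sketch of the ordering being described, assuming a hypothetical `registry` type in place of the real app struct. The `trace` field exists only to make the order of events visible and is not part of the actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for the app struct in the diff:
// a map of validators guarded by a mutex.
type registry struct {
	mu         sync.Mutex
	validators map[uint64]string
	trace      []string // records event order, for illustration only
}

// remove deletes seqno while holding the lock. The deferred function runs
// when remove returns, so the delete happens before the unlock even though
// the defer statement appears first in the source.
func (r *registry) remove(seqno uint64) {
	r.mu.Lock()
	defer func() {
		r.trace = append(r.trace, "unlock")
		r.mu.Unlock()
	}()
	delete(r.validators, seqno)
	r.trace = append(r.trace, "delete")
}

func main() {
	r := &registry{validators: map[uint64]string{7: "validator"}}
	r.remove(7)
	fmt.Println(r.trace) // prints "[delete unlock]"
}
```

The deferred call also runs on early returns and panics, which is the main reason the lock-then-defer-unlock pattern is common in Go.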

handle_validation_error ~logger ~trust_system ~sender
~state_hash:(With_hash.hash transition_with_hash)
~delta:genesis_constants.protocol.delta error )
if not (Coda_net2.Validation_callback.is_expired valid_cb) then (
Member

Should we be using an Interruptible.t for this, interrupted by the expiration signal, to avoid doing extra work if it expires while we are validating?

Member Author

That's a good idea, let me try and set that up.

@nholland94 added the ci-build-me and not-ready-to-merge labels Dec 3, 2020
in
let%bind () =
Interruptible.lift Deferred.unit
(Coda_net2.Validation_callback.await valid_cb)
Member

It might make more sense to add an await_timeout method for this, since we don't want the callback being called to trigger the interrupt. In the Error branch below, I think the current behaviour may prevent handle_validation_error from being called.
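The distinction raised above can be sketched with two signals, again as a Go analogy with hypothetical names rather than the `Coda_net2.Validation_callback` API. The assumption is that `await` resolves when the callback is fired or expires, whereas the proposed `await_timeout` would resolve only on expiry.

```go
package main

import (
	"fmt"
	"sync"
)

// callback is a hypothetical validation callback with two signals:
// done is closed when the callback is fired OR expires (like await);
// timeout is closed only on expiry (like the proposed await_timeout).
type callback struct {
	done        chan struct{}
	timeout     chan struct{}
	doneOnce    sync.Once
	timeoutOnce sync.Once
}

func newCallback() *callback {
	return &callback{done: make(chan struct{}), timeout: make(chan struct{})}
}

// fire reports a validation result; it must not count as an expiry.
func (c *callback) fire() {
	c.doneOnce.Do(func() { close(c.done) })
}

// expire marks the callback as timed out, which also completes it.
func (c *callback) expire() {
	c.timeoutOnce.Do(func() { close(c.timeout) })
	c.doneOnce.Do(func() { close(c.done) })
}

// isClosed checks a signal without blocking.
func isClosed(ch <-chan struct{}) bool {
	select {
	case <-ch:
		return true
	default:
		return false
	}
}

func main() {
	cb := newCallback()
	cb.fire()
	// Interrupting on done would cancel follow-up work such as error
	// handling; interrupting on timeout would not.
	fmt.Println("done closed:", isClosed(cb.done))
	fmt.Println("timeout closed:", isClosed(cb.timeout))
}
```

After `fire`, only `done` is closed, so work interrupted by `timeout` alone continues, which is the behaviour the review comment asks for.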

@nholland94 added the ready-to-merge-into-develop label and removed the not-ready-to-merge label Dec 4, 2020
@mergify mergify bot merged commit a87ddcf into develop Dec 4, 2020
@mergify mergify bot deleted the fix/validation-queue-pruning branch December 4, 2020 01:54
lk86 added a commit that referenced this pull request Dec 5, 2020
…ue-pruning"

This reverts commit a87ddcf, reversing
changes made to 92ea2c0.
bkase added a commit that referenced this pull request Dec 11, 2020
Labels
ci-build-me, ready-to-merge-into-develop
2 participants