-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prune validation callbacks from queues if they are expired #6924
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but please see/address mutex comment before merging.
if !ok { | ||
app.P2p.Logger.Errorf("no deadline set on validation context") | ||
defer app.ValidatorMutex.Unlock() | ||
delete(app.Validators, seqno) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this delete not precede the Unlock
? It seems like the validator mutex ought to be held when we are modifying app.Validators
. (Similarly in the if
above, if so.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's because of what defer
does in Go. It's honestly a bizarre language construct (but it is kind of useful). It basically stages app.ValidationMutex.Unlock()
so that it will be called before the function returns to the caller. This follows the pattern for locking and unlocking in other parts of the libp2p_helper code.
handle_validation_error ~logger ~trust_system ~sender | ||
~state_hash:(With_hash.hash transition_with_hash) | ||
~delta:genesis_constants.protocol.delta error ) | ||
if not (Coda_net2.Validation_callback.is_expired valid_cb) then ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be using an Interruptible.t
for this, interrupted by the expiration signal, to avoid doing extra work if it expires while we are validating?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea, let me try and set that up.
in | ||
let%bind () = | ||
Interruptible.lift Deferred.unit | ||
(Coda_net2.Validation_callback.await valid_cb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make more sense to add an await_timeout
method for this, since we don't want the callback being called to trigger the interrupt. In the Error
branch below, I think the current behaviour may prevent the handle_validation_error
from being called.
…tion-queue-pruning"" This reverts commit 1b966b2.
This is to address #6882. With this change, the daemon will now skip processing broadcasted messages which have expired validation callbacks. This should help the daemon catch up with the message queue when libp2p is receiving too many broadcast messages.
I want to do more with the validation callback to make this more robust, but this was the minimal change I could make that I felt somewhat confident of. Please review carefully (validation is tricky in our system).