volumewatcher: set maximum batch size for raft update #7907
Conversation
For #7838

The `volumewatcher` has a 250ms batch window, so claim updates will not typically be large enough to risk exceeding the maximum raft message size. But large jobs might have enough volume claims that this could be a danger. Set a maximum batch size of 100 messages per batch (roughly 33K) as a very conservative safety/robustness guard.

In the future I'd like to factor this same logic out of the `deploymentwatcher` batcher so we could share implementations.
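As a rough illustration of the cap described above (a sketch only, not the PR's actual code; the `maxBatchSize` constant and `createBatches` helper are assumed names, and only `structs.CSIVolumeClaimRequest` appears in the diff below), splitting the accumulated claims into raft-safe chunks might look like this:

```go
package volumewatcher

import "github.com/hashicorp/nomad/nomad/structs"

// maxBatchSize caps how many claim updates go into a single raft message.
const maxBatchSize = 100 // roughly 33K of payload, far below the raft limit

// createBatches splits the accumulated claims into chunks of at most
// maxBatchSize so a single raft apply cannot grow unbounded with job size.
func createBatches(claims map[string]structs.CSIVolumeClaimRequest) [][]structs.CSIVolumeClaimRequest {
	var batches [][]structs.CSIVolumeClaimRequest
	batch := make([]structs.CSIVolumeClaimRequest, 0, maxBatchSize)
	for _, claim := range claims {
		batch = append(batch, claim)
		if len(batch) == maxBatchSize {
			batches = append(batches, batch)
			batch = make([]structs.CSIVolumeClaimRequest, 0, maxBatchSize)
		}
	}
	if len(batch) > 0 {
		batches = append(batches, batch)
	}
	return batches
}
```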
nomad/volumewatcher/batcher.go (review comment on an outdated diff)

```diff
-	// Reset the claims list and timer
-	claims = make(map[string]structs.CSIVolumeClaimRequest)
+	// Reset the batches list and timer
+	batches = batches[1:]
```
If I'm reading this correctly, it looks like the loop relies on work coming in on `b.workCh` to reset the timer, same as before. However, with batching, it seems like it's possible that enough work could come in during a single `batchDuration` to create multiple batches. If that happens, `timerCh` will signal a claim request for only the first batch of claims; the rest will not be dispatched until more work comes in to reset the timer (which could hypothetically be never).

So either this loop needs to send off all batches, or it needs to reset `timerCh` (to `batchDuration` or perhaps some appropriate fraction thereof) if `len(batches) > 0`.
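The first option might look roughly like this, continuing the sketch above (same package and imports; `flushAll` and `sendBatch` are hypothetical names, not the PR's code):

```go
// flushAll sends every accumulated batch when the timer fires, so no batch
// is left waiting for more work to arrive and re-arm the timer.
func flushAll(batches [][]structs.CSIVolumeClaimRequest,
	sendBatch func([]structs.CSIVolumeClaimRequest)) [][]structs.CSIVolumeClaimRequest {
	for _, batch := range batches {
		// sendBatch stands in for the raft apply the real batcher performs
		// per batch.
		sendBatch(batch)
	}
	// Nothing is pending now, so the timer has no stragglers to catch.
	return nil
}
```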
Aside: while I appreciate the elegance of this approach, why not just use a `Ticker` for `timerCh`?
> So either this loop needs to send off all batches, or it needs to reset `timerCh` (to `batchDuration` or perhaps some appropriate fraction thereof) if `len(batches) > 0`.

Good catch, that wasn't a problem when we were sending off the whole batch at once!

> Aside: while I appreciate the elegance of this approach, why not just use a `Ticker` for `timerCh`?

Hm, that's a good point. In the original design (copied right out of `deploymentwatcher`), a no-op pass would potentially modify some state by swapping out the future. A ticker would make it a little easier to understand, at the negligible cost of ticking over and checking the length each pass.
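A ticker-based loop could look roughly like the sketch below. The `VolumeUpdateBatcher` receiver and the `ctx`, `workCh`, `batchDuration`, `sendBatch`, and `appendClaim` names are assumptions standing in for the real fields and helpers (and the `time` and `structs` imports are assumed), not the PR's actual implementation:

```go
func (b *VolumeUpdateBatcher) run() {
	var batches [][]structs.CSIVolumeClaimRequest

	ticker := time.NewTicker(b.batchDuration)
	// (stopping the ticker is the minor follow-up change noted further down)

	for {
		select {
		case <-b.ctx.Done():
			return
		case claim := <-b.workCh:
			// Append to the newest batch, starting a fresh one once it
			// reaches the maximum batch size.
			batches = appendClaim(batches, claim)
		case <-ticker.C:
			// Drain everything that has accumulated on every tick. An empty
			// pass is just a length check, which is the negligible cost
			// mentioned above.
			for _, batch := range batches {
				b.sendBatch(batch)
			}
			batches = nil
		}
	}
}
```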
I've updated with these changes. How's that look?
That resolves my concern that all batches might not be sent (or not sent in a timely manner).

(Will mark approve and "take my answer off the air".)
👍 (minor change to stop the ticker)
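Presumably the standard pattern of releasing the ticker when the loop exits, along the lines of:

```go
ticker := time.NewTicker(b.batchDuration)
defer ticker.Stop() // release the ticker's resources when run() returns
```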
Co-authored-by: Chris Baker <[email protected]>
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.