volumewatcher: set maximum batch size for raft update #7907
Conversation
For #7838

The `volumewatcher` has a 250ms batch window, so claim updates will not typically be large enough to risk exceeding the maximum raft message size. But large jobs might have enough volume claims that this could be a danger. Set a maximum batch size of 100 messages per batch (roughly 33K) as a very conservative safety/robustness guard.

In the future I'd like to factor this same logic out of the `deploymentwatcher` batcher so we could share implementations.
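As a rough illustration of the cap described above (a sketch only, not the PR's actual code; the `maxBatchSize` constant and `createBatches` helper are assumed names, and only `structs.CSIVolumeClaimRequest` appears in the diff below), splitting the accumulated claims into raft-safe chunks might look like this:

```go
package volumewatcher

import "github.com/hashicorp/nomad/nomad/structs"

// maxBatchSize caps how many claim updates go into a single raft message.
const maxBatchSize = 100 // roughly 33K of payload, far below the raft limit

// createBatches splits the accumulated claims into chunks of at most
// maxBatchSize so a single raft apply cannot grow unbounded with job size.
func createBatches(claims map[string]structs.CSIVolumeClaimRequest) [][]structs.CSIVolumeClaimRequest {
	var batches [][]structs.CSIVolumeClaimRequest
	batch := make([]structs.CSIVolumeClaimRequest, 0, maxBatchSize)
	for _, claim := range claims {
		batch = append(batch, claim)
		if len(batch) == maxBatchSize {
			batches = append(batches, batch)
			batch = make([]structs.CSIVolumeClaimRequest, 0, maxBatchSize)
		}
	}
	if len(batch) > 0 {
		batches = append(batches, batch)
	}
	return batches
}
```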
nomad/volumewatcher/batcher.go (review comment on an outdated diff)

```diff
-	// Reset the claims list and timer
-	claims = make(map[string]structs.CSIVolumeClaimRequest)
+	// Reset the batches list and timer
+	batches = batches[1:]
```
If I'm reading this correctly, it looks like the loop relies on work coming in on `b.workCh` to reset the timer, same as before. However, with batching, it seems like it's possible that enough work could come in during a single `batchDuration` to create multiple batches. If that happens, `timerCh` will signal a claim request for only the first batch of claims; the rest will not be dispatched until more work comes in to reset the timer (which could hypothetically be never).

So either this loop needs to send off all batches, or it needs to reset `timerCh` (to `batchDuration` or perhaps some appropriate fraction thereof) if `len(batches) > 0`.
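The first option might look roughly like this, continuing the sketch above (same package and imports; `flushAll` and `sendBatch` are hypothetical names, not the PR's code):

```go
// flushAll sends every accumulated batch when the timer fires, so no batch
// is left waiting for more work to arrive and re-arm the timer.
func flushAll(batches [][]structs.CSIVolumeClaimRequest,
	sendBatch func([]structs.CSIVolumeClaimRequest)) [][]structs.CSIVolumeClaimRequest {
	for _, batch := range batches {
		// sendBatch stands in for the raft apply the real batcher performs
		// per batch.
		sendBatch(batch)
	}
	// Nothing is pending now, so the timer has no stragglers to catch.
	return nil
}
```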
Aside: while I appreciate the elegance of this approach, why not just use a `Ticker` for `timerCh`?
> So either this loop needs to send off all batches, or it needs to reset `timerCh` (to `batchDuration` or perhaps some appropriate fraction thereof) if `len(batches) > 0`.

Good catch, that wasn't a problem when we were sending off the whole batch at once!

> Aside: while I appreciate the elegance of this approach, why not just use a `Ticker` for `timerCh`?

Hm, that's a good point. In the original design (copied right out of `deploymentwatcher`), a no-op pass would potentially modify some state by swapping out the future. A ticker would make it a little easier to understand, at the negligible cost of ticking over and checking the length each pass.
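A ticker-based loop could look roughly like the sketch below. The `VolumeUpdateBatcher` receiver and the `ctx`, `workCh`, `batchDuration`, `sendBatch`, and `appendClaim` names are assumptions standing in for the real fields and helpers (and the `time` and `structs` imports are assumed), not the PR's actual implementation:

```go
func (b *VolumeUpdateBatcher) run() {
	var batches [][]structs.CSIVolumeClaimRequest

	ticker := time.NewTicker(b.batchDuration)
	// (stopping the ticker is the minor follow-up change noted further down)

	for {
		select {
		case <-b.ctx.Done():
			return
		case claim := <-b.workCh:
			// Append to the newest batch, starting a fresh one once it
			// reaches the maximum batch size.
			batches = appendClaim(batches, claim)
		case <-ticker.C:
			// Drain everything that has accumulated on every tick. An empty
			// pass is just a length check, which is the negligible cost
			// mentioned above.
			for _, batch := range batches {
				b.sendBatch(batch)
			}
			batches = nil
		}
	}
}
```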
I've updated with these changes. How's that look?
That resolves my concern that all batches might not be sent (or not sent in a timely manner).

(Will mark approve and "take my answer off the air".)
👍 (minor change to stop the ticker)
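Presumably the standard pattern of releasing the ticker when the loop exits, along the lines of:

```go
ticker := time.NewTicker(b.batchDuration)
defer ticker.Stop() // release the ticker's resources when run() returns
```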
Co-authored-by: Chris Baker <[email protected]>
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.