liveness: run sync disk write in a stopper task #81813
Conversation
Release note: None
Force-pushed from 7f36c79 to 8dc745f.
Force-pushed from 8dc745f to acc8755.
Looks like this will resolve #81827; I should have looked before opening that one. Thanks!
CI failures appear to be unrelated flake.
Thanks for jumping on this! I suppose if we were worried about the impact of a disk stall on shutdown, we could do something where we give up on the operation after some amount of time. But, I buy the paragraph you wrote about mitigations.
Reviewed 3 of 3 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @stevendanna, and @tbg)
pkg/kv/kvserver/liveness/liveness.go, line 1277 at r2 (raw file):
resultCs[i], _ = nl.engineSyncs.DoChan(strconv.Itoa(i), func() (interface{}, error) {
    var taskErr error
    if err := nl.stopper.RunTask(ctx, "liveness-hb-diskwrite", func(ctx context.Context) {
Looks like there is also RunTaskWithErr, which would cut down on some of the boilerplate here, but it's fine as is.
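For reference, a minimal sketch of the difference, assuming the stop.Stopper API from pkg/util/stop; syncDisk is a hypothetical stand-in for the engine's sync write, not a name from the PR:

```go
package livenesssketch

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/util/stop"
)

// writeViaRunTask shows the error-capture boilerplate: RunTask's closure
// returns nothing, so the write's error has to escape via an outer variable.
func writeViaRunTask(
	ctx context.Context, stopper *stop.Stopper, syncDisk func(context.Context) error,
) error {
	var taskErr error
	if err := stopper.RunTask(ctx, "liveness-hb-diskwrite",
		func(ctx context.Context) { taskErr = syncDisk(ctx) },
	); err != nil {
		return err // the stopper is quiescing; the task never ran
	}
	return taskErr
}

// writeViaRunTaskWithErr returns the closure's error directly, which removes
// the captured-variable dance above.
func writeViaRunTaskWithErr(
	ctx context.Context, stopper *stop.Stopper, syncDisk func(context.Context) error,
) error {
	return stopper.RunTaskWithErr(ctx, "liveness-hb-diskwrite",
		func(ctx context.Context) error { return syncDisk(ctx) },
	)
}
```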
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @erikgrinaker, @stevendanna, and @tbg)
Reviewed 3 of 3 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @erikgrinaker and @tbg)
Force-pushed from acc8755 to 3271df3.
Previously, stevendanna (Steven Danna) wrote…
I suppose if we were worried about the impact of a disk stall on shutdown, we could do something where we give up on the operation after some amount of time. But, I buy the paragraph you wrote about mitigations.
Yeah, but I don't know if that's always going to be safe -- for example, in this case that means that we'll be closing Pebble while we still have in-flight writes, which is what prompted this in the first place. There's no way to cancel these writes afaik.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @nicktrav, @stevendanna, and @tbg)
pkg/kv/kvserver/liveness/liveness.go, line 1277 at r2 (raw file):
Previously, stevendanna (Steven Danna) wrote…
Looks like there is also RunTaskWithErr that would cut down on some of the boilerplate here, but it's fine as is.
Oh nice, that's better -- thanks! Definitely not "fine" as is, more like functional. :)
Force-pushed from 3271df3 to b2c4afe.
The write can't? I think Go supports this via deadlines on the os.File handle.
Yes, but that applies to all writes, not just this one. We'd also have to propagate that through Pebble. Adding deadlines for all Pebble writes is a riskier change than I want to make here.
TFTRs! bors r=stevendanna,nicktrav
Build succeeded.
Commits
liveness: move stopper to NodeLivenessOptions
Release note: None
liveness: run sync disk write in a stopper task
This patch runs the sync disk write during node heartbeats in a stopper
task. The write is done in a goroutine, so that we can respect the
caller's context cancellation (even though the write itself won't).
However, this could race with engine shutdown when stopping the node,
violating the Pebble contract and triggering the race detector. Running
it as a stopper task will cause the node to wait for the disk write to
complete before closing the engine.
Of course, if the disk stalls then node shutdown will now never
complete. This is very unfortunate, since stopping the node is often
the only mitigation to recover stuck ranges with stalled disks. This is
mitigated by Pebble panic'ing the node on stalled disks, and Kubernetes
and other orchestration tools killing the process after some time.
Touches #81786.
Resolves #81511.
Resolves #81827.
Release note: None
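To make the mechanism described in the commit message concrete, here is a minimal, self-contained sketch. It is not the PR's actual code (which wraps the write in a singleflight DoChan goroutine per engine), and heartbeatDiskSync and syncDisk are illustrative names. The blocking write runs inside a stopper task so that Stop(), and therefore engine close, waits for it, while the surrounding goroutine lets the caller give up on context cancellation:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/cockroachdb/cockroach/pkg/util/stop"
)

// heartbeatDiskSync runs syncDisk inside a stopper task. The task keeps
// stopper.Stop() (and hence engine close) waiting until the write finishes,
// while the caller may still return early when ctx is canceled.
func heartbeatDiskSync(
	ctx context.Context, stopper *stop.Stopper, syncDisk func() error,
) error {
	resultC := make(chan error, 1) // buffered so the goroutine never leaks
	go func() {
		// RunTaskWithErr refuses to start once the stopper is quiescing and
		// otherwise blocks Stop() until the closure returns.
		resultC <- stopper.RunTaskWithErr(ctx, "liveness-hb-diskwrite",
			func(ctx context.Context) error {
				return syncDisk() // blocking; does not respect ctx by design
			})
	}()
	select {
	case err := <-resultC:
		return err
	case <-ctx.Done():
		// The caller gives up, but the write keeps running under the stopper.
		return ctx.Err()
	}
}

func main() {
	stopper := stop.NewStopper()
	defer stopper.Stop(context.Background())

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	err := heartbeatDiskSync(ctx, stopper, func() error {
		time.Sleep(10 * time.Millisecond) // pretend to fsync
		return nil
	})
	fmt.Println("heartbeat disk sync:", err)
}
```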