-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: rollback snapshot to prevent bolt deadlock #1326
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Could you sign your commit with Signed-off-by
?
snapshot/snapshot.go
Outdated
@@ -331,7 +331,8 @@ func (o *snapshotter) commit(ctx context.Context, isRemote bool, name, key strin | |||
} | |||
|
|||
if !isRemote { // skip diskusage for remote snapshots for allowing lazy preparation of nodes | |||
du, err := fs.DiskUsage(ctx, o.upperPath(id)) | |||
var du fs.Usage | |||
du, err = fs.DiskUsage(ctx, o.upperPath(id)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment line of notifying that we need to expose err
to the outer scope?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktock Will do !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktock I went a bit further and used a pattern you used in the createSnapshot
function. I like what you did there as it has an explicit variable rather than having an err with double duty.
stargz-snapshotter/snapshot/snapshot.go
Lines 517 to 524 in 9e9e7a5
rollback := true | |
defer func() { | |
if rollback { | |
if rerr := t.Rollback(); rerr != nil { | |
log.G(ctx).WithError(rerr).Warn("failed to rollback transaction") | |
} | |
} | |
}() |
Signed-off-by: Chris Goller <[email protected]>
8c56c70
to
3486b7f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
The defer function above checks for the err in the outer scope to determine if it should rollback.
The error in the inner-scope of isRemote is a shadow and when it is an error will not cause a rollback of the transaction.
This causes the bolt transaction to keep the write lock open forever. Any other attempts at taking a write lock will never succeed.
It is not uncommon for
fs.DiskUsage
to be context canceled as it can take a long time to walk the filesystem.