Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

file: Fix race in closing #31

Closed
wants to merge 1 commit into from
Closed

Conversation

samuelkarp
Copy link

Hello!

It looks like there's a race condition created in (*win32File).closeHandle() where multiple concurrent goroutines are accessing the (*win32File).closing field. This shows up in the go race detector. This change adds a lock around the closing field to remove the race.

Thanks!
Sam

fsouza added a commit to fsouza/go-dockerclient that referenced this pull request Sep 19, 2016
@samuelkarp samuelkarp mentioned this pull request Sep 22, 2016
@lowenna
Copy link
Contributor

lowenna commented Sep 26, 2016

@jstarks

Copy link
Member

@jstarks jstarks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this change! This is definitely something we should take... see my other comments on a few tweaks that we need before merging this.

alreadyClosing := f.closing
f.closing = true
f.closingLock.Unlock()
if !alreadyClosing {
Copy link
Member

@jstarks jstarks Sep 26, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If alreadyClosing is true, then closeHandle() will no longer block waiting for all IO to complete... Of course this may be better than the current state of things, but I think it might be better to add a wait group or channel to wait for the goroutine that is performing the close.

I would also suggest changing the mutex to just be an atomic.SwapUint32() call in this path, and atomic.LoadUint32() call in isClosing(). We don't need the extra synchronization of the mutex since we are already relying on ordering between the wg.Add(1) call and the read of closing; the atomic makes this ordering contract more explicit and correct without relying on mutual exclusion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If alreadyClosing is true, that implies that closeHandle() was called from another goroutine. This is the same behavior as today, since the existing code sets f.closing = true before waiting.

Does closeHandle() need to wait when called from multiple goroutines concurrently (does each call to closeHandle() need to wait)?

@samuelkarp
Copy link
Author

@jstarks I've pushed a new commit that fixes an error in my previous one. As I mentioned, if alreadyClosing is true, that implies that closeHandle() was called from another goroutine earlier. The goroutine that called earlier is waiting for the existing wait group in order to block waiting for IO to complete. If the desired behavior is that any goroutine calling closeHandle() should block until IO is complete that can be changed.

I looked a bit at removing the sync.RWMutex in favor of just using atomic.SwapUint32()/atomic.LoadUint32(), but that made the race detector complain about concurrent reads/writes involving f.wg.

@samuelkarp
Copy link
Author

@jstarks @jhowardmsft ping?

1 similar comment
@samuelkarp
Copy link
Author

@jstarks @jhowardmsft ping?

@lowenna
Copy link
Contributor

lowenna commented Nov 11, 2016

One for @jstarks.

@lowenna
Copy link
Contributor

lowenna commented Nov 15, 2016

@samuelkarp Looking back, it seems @jstarks suggested some changes. Are you able to address those? Thanks.

@samuelkarp
Copy link
Author

@jhowardmsft I looked at @jstarks's suggestion and it didn't seem to make the race detector happy (unless I misunderstood what was he was asking?), see #31 (comment).

@schmichael
Copy link

We've been hitting a go-winio related panic in Nomad on Windows for some time, and I'm afraid we still hit a panic when building against this PR:

Stack Trace
fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x0 addr=0x2ba pc=0x40e892]

goroutine 245 [running]:
runtime.throw(0x12a448a, 0x2a)
        /usr/local/go/src/runtime/panic.go:596 +0x9c fp=0xc042165e18 sp=0xc042165df8
runtime.sigpanic()
        /usr/local/go/src/runtime/signal_windows.go:155 +0x18b fp=0xc042165e48 sp=0xc042165e18
runtime.unlock(0xc04216ca18)
        /usr/local/go/src/runtime/lock_sema.go:107 +0x72 fp=0xc042165e70 sp=0xc042165e48
runtime.chansend(0x1054200, 0xc04216c9c0, 0xc042165fb0, 0xc042165f01, 0xac71ea, 0xc042165f9c)
        /usr/local/go/src/runtime/chan.go:179 +0x8af fp=0xc042165f20 sp=0xc042165e70
runtime.chansend1(0x1054200, 0xc04216c9c0, 0xc042165fb0)
        /usr/local/go/src/runtime/chan.go:113 +0x4d fp=0xc042165f60 sp=0xc042165f20
github.com/hashicorp/nomad/vendor/github.com/Microsoft/go-winio.ioCompletionProcessor(0x30c)
        /home/schmichael/go/src/github.com/hashicorp/nomad/vendor/github.com/Microsoft/go-winio/file.go:142 +0xea fp=0xc042165fd8 sp=0xc042165f60
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc042165fe0 sp=0xc042165fd8
created by github.com/hashicorp/nomad/vendor/github.com/Microsoft/go-winio.initIo
        /home/schmichael/go/src/github.com/hashicorp/nomad/vendor/github.com/Microsoft/go-winio/file.go:55 +0x87

schmichael added a commit to hashicorp/nomad that referenced this pull request Apr 27, 2017
schmichael added a commit to hashicorp/nomad that referenced this pull request May 4, 2017
@darstahl
Copy link
Contributor

This was fixed by #59. Closing.

@darstahl darstahl closed this Jul 25, 2017
fsouza added a commit to fsouza/go-dockerclient that referenced this pull request Jul 26, 2017
Looks like microsoft/go-winio#31 has been fixed, let's try.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants