Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time: Timer.Stop documentation example easily leads to deadlocks #27169

Closed
palsivertsen opened this issue Aug 23, 2018 · 13 comments
Closed

time: Timer.Stop documentation example easily leads to deadlocks #27169

palsivertsen opened this issue Aug 23, 2018 · 13 comments
Labels
Documentation Issues describing a change to documentation. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@palsivertsen
Copy link

I needed timeout functionality for one of my projects, so I looked in the time package. My timeouts where fallbacks in case a channel receive took too long. Most of the time the channel would receive before the timeout and I wanted to release the timeout resources when they where no longer needed. Documentation for time.After() says:

[...] If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed.

So I used a time.Timer and according to the documentation for time.Timer.Stop() one should drain the channel if time.Timer.Stop() returns false:

if !t.Stop() {
	<-t.C
}

I later discovered that my threads got stuck on receive like in this playground example when timer where triggered before I called stop:

t := time.NewTimer(time.Second * 3)
defer func() {
	if !t.Stop() {
		<-t.C
	}
}()
<-t.C

Wrapping the drain in a select seems to do the trick:

t := time.NewTimer(time.Second * 3)
defer func() {
	t.Stop()
	select {
	case <-t.C:
	default:
	}
}()
<-t.C

Documentation should make it clear how to safely drain the channel.

@gopherbot gopherbot added the Documentation Issues describing a change to documentation. label Aug 23, 2018
@FMNSSun
Copy link

FMNSSun commented Aug 23, 2018

TL/DR: This is incorrect usage and the documentation kinda mentions it but it takes a while to understand it correctly so while it's documented it could be documented better.

For example, assuming the program has not received from t.C already

t := time.NewTimer(time.Second * 3)
defer func() {
	if !t.Stop() {
		<-t.C
	}
}()
<-t.C

Isn't this incorrect usage because you've already received from t.C. Isn't the point of the timer to fire after the delay except if you call Stop on it? <-t.C already waits for the timer to fire so the stop in the defered function is entirely useless because the timer has already fired anyway?

The way it actually works is that Stop() returns false in case the timer has already fired which means UNLESS you haven't ALREADY read from it then there's a value in t.C you might want to read. Obviously this doesn't work if you've already read from t.C. Stop will return false regardless (as the timer has already fired) but you've already read from t.C earlier thus you deadlock on <-t.C.

t := time.NewTimer(time.Second * 3)
defer func() {
	t.Stop()
	select {
	case <-t.C:
	default:
	}
}()
<-t.C

This prevents the deadlock, sure and it's always safe to do that because if Stop() returns true you enter the default case and if it returns false you enter the default case as well because t.C is empty because you've already read it but since you enter the default case anyway in this example you might as well just remove the whole select. Still, this isn't the intended usage of Stop().

@FMNSSun
Copy link

FMNSSun commented Aug 23, 2018

FWIW: This would be an example of proper usage:

package main

import "time"
import "fmt"

func main() {
	t := time.NewTimer(time.Second * 3)
	foo := make(chan int)
	go func() { foo <- 1 }()
	select {
	case <-t.C:
		fmt.Println("timeout")
	case <-foo:
		fmt.Println("foo")
		if !t.Stop() {
			<-t.C
		}
	}
}

@artyom
Copy link
Member

artyom commented Aug 23, 2018

My timeouts where fallbacks in case a channel receive took too long. Most of the time the channel would receive before the timeout and I wanted to release the timeout resources when they where no longer needed.

I believe you don't really need to drain the timer channel for this, calling Timer.Stop will suffice:

timer := time.NewTimer(3 * time.Second)
defer timer.Stop()
select {
case res := <- workChannel:
    return res, nil
case <-timer.C:
    return nil, ErrTimeout
}

You may find such pattern in use in standard library.

@bradfitz bradfitz changed the title time/Timer.Stop documentation example easily leads to deadlocks time: Timer.Stop documentation example easily leads to deadlocks Aug 23, 2018
@palsivertsen
Copy link
Author

@FMNSSun
Thanks for the explanation. Your example looks somewhat like what I did in the first place. But I had more channels in my select and didn't like all the extra t.Stop() calls:

t := time.NewTimer(time.Second * 3)
bar := make(chan int)
chicken := make(chan int)
egg := make(chan int)
go func() { foo <- 1 }()
select {
case <-t.C:
	fmt.Println("timeout")
case <-foo:
	fmt.Println("foo")
	if !t.Stop() {
		<-t.C
	}
case <-chicken:
	fmt.Println("chicken")
	if !t.Stop() {
		<-t.C
	}
case <-egg:
	fmt.Println("egg")
	if !t.Stop() {
		<-t.C
	}
}

@artyom

I believe you don't really need to drain the timer channel for this, calling Timer.Stop will suffice

Looks scary. What if timer triggers between the select block and the defer? Won't you have a thread stuck on channel send?

@FMNSSun
Copy link

FMNSSun commented Aug 24, 2018

@palsivertsen No because timer uses a buffered channel with capacity 1 exactly for this reason: that it can't get stuck if there's nobody reading from it.

Also... it might make sense in that case to move the t.Stop past the select instead of repeating it in every case.

@palsivertsen
Copy link
Author

timer uses a buffered channel with capacity 1 exactly for this reason: that it can't get stuck if there's nobody reading from it.

Cool. I did not know that.

Also... it might make sense in that case to move the t.Stop past the select instead of repeating it in every case.

Wouldn't that deadlock if case <-t.C: happens?

@FMNSSun
Copy link

FMNSSun commented Aug 24, 2018

@palsivertsen it would but you could set a flag in the case <-t.C case and then only invoke stop if that flag isn't already set. But probably matter of personal taste.

@ran-eh
Copy link

ran-eh commented Mar 20, 2019

+1 this issue just cost me an hour :( I also have the keep-alive scenario. Perhaps the the docs should link to this discussion?

@ianlancetaylor ianlancetaylor added this to the Go1.13 milestone Mar 20, 2019
@ianlancetaylor ianlancetaylor added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 20, 2019
@ianlancetaylor
Copy link
Member

I don't think we want to link to this discussion.

Does anyone have specific improvements to suggest? Anyone want to send a pull request? Thanks.

@palsivertsen
Copy link
Author

Some suggestions/thoughts:

  • Update the doc to underline that ignoring the value of the time.Timer.C channel is safe because the channel has a buffer length of one.
  • Remove channel draining from time.Timer.Stop documentation. Are there any usecases where one would need/want to drain the channel when stopping (not resetting) the timer?
  • Make time.Timer.Reset do the draining internally, thus no need to expose the drain concept in the docs. This sadly changes the behaviour of time.Timer.Reset, possibly breaking existing code. Adding time.Timer.ResetAndDrain might be an alternative.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/185245 mentions this issue: time: clarify when draining a Timer's channel is needed

darkfeline added a commit to darkfeline/go that referenced this issue Jul 9, 2019
darkfeline added a commit to darkfeline/go that referenced this issue Jul 10, 2019
darkfeline added a commit to darkfeline/go that referenced this issue Jul 11, 2019
gopherbot pushed a commit that referenced this issue Jul 18, 2019
Updates #27169

Change-Id: I22a6194c06529ba70b1ec648e3188c191224e321
GitHub-Last-Rev: 457b2a6
GitHub-Pull-Request: #32996
Reviewed-on: https://go-review.googlesource.com/c/go/+/185245
Reviewed-by: Rob Pike <[email protected]>
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
searKing added a commit to searKing/golang that referenced this issue Nov 1, 2019
searKing added a commit to searKing/golang that referenced this issue Nov 1, 2019
ddiss added a commit to ddiss/icyci that referenced this issue Nov 14, 2019
If stateTransTimer has already been stopped (without reset) then drain
will deadlock. Avoid this by using a select + default case, as done in
golang/go#27169 .

Signed-off-by: David Disseldorp <[email protected]>
searKing added a commit to searKing/golang that referenced this issue Jan 6, 2021
infraweavers added a commit to infraweavers/monitoring-agent that referenced this issue Feb 19, 2021
…eading to a memory leak.

The documentation implied something which doesn't seem to actually be true when looked at in the detail: golang/go#27169
malud pushed a commit to coupergateway/couper that referenced this issue May 4, 2022
alex-schneider pushed a commit to coupergateway/couper that referenced this issue May 9, 2022
* Add missing bytes field to backend response log

* (docs) remove unknown utf8 chars and reformat tables

* (docs) add response.bytes field to backend log documentation

* Add content-length fallback

* rm obsolete key

* more checks in test

* handle timeout cancel select case

* fmt

* Fix missing token request within custom log context

* Fix possible timer deadlock

golang/go#27169

* cancel while reading results

* fire collected backend logStack at the end

endpoint handler has to many exits; even with panic recovers

* Add changelog entry

Co-authored-by: Alex Schneider <[email protected]>
kriszabala added a commit to RadiusNetworks/pusher-ws-go that referenced this issue Mar 22, 2023
attempting to drain an already drained timer causes a deadlock and prevents activityTimer from being reset therefore heartbeat pings are not periodically sent. Issue documented here: golang/go#27169
@MichaelSnowden
Copy link

@palsivertsen No because timer uses a buffered channel with capacity 1 exactly for this reason: that it can't get stuck if there's nobody reading from it.

Also... it might make sense in that case to move the t.Stop past the select instead of repeating it in every case.

@FMNSSun do you have a source for this?

@rsc
Copy link
Contributor

rsc commented Mar 15, 2024

The documentation no longer gives this example, because it is no longer necessary.

@rsc rsc closed this as completed Mar 15, 2024
Wessie added a commit to Wessie/girc that referenced this issue Mar 16, 2024
searKing added a commit to searKing/golang that referenced this issue Sep 3, 2024
…tation example no longer leads to deadlocks, as Timer/Ticker channels not receivable with old values after Stop or Reset returns.

golang/go#27169
golang/go#37196
golang/go#14383
searKing added a commit to searKing/golang that referenced this issue Sep 18, 2024
…tation example no longer leads to deadlocks, as Timer/Ticker channels not receivable with old values after Stop or Reset returns.

golang/go#27169
golang/go#37196
golang/go#14383
searKing added a commit to searKing/golang that referenced this issue Sep 28, 2024
…tation example no longer leads to deadlocks, as Timer/Ticker channels not receivable with old values after Stop or Reset returns.

golang/go#27169
golang/go#37196
golang/go#14383
searKing added a commit to searKing/golang that referenced this issue Oct 5, 2024
…tation example no longer leads to deadlocks, as Timer/Ticker channels not receivable with old values after Stop or Reset returns.

golang/go#27169
golang/go#37196
golang/go#14383
searKing added a commit to searKing/golang that referenced this issue Oct 9, 2024
…tation example no longer leads to deadlocks, as Timer/Ticker channels not receivable with old values after Stop or Reset returns.

golang/go#27169
golang/go#37196
golang/go#14383
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Issues describing a change to documentation. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

9 participants