Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

staticcheck: detect incorrectly used pointers to loop variables #163

Open
ainar-g opened this issue Aug 24, 2017 · 11 comments
Open

staticcheck: detect incorrectly used pointers to loop variables #163

ainar-g opened this issue Aug 24, 2017 · 11 comments
Labels
aggressive A set of checks that is more prone to false positives but is helpful during code review new-check pta Issues that require points-to analysis

Comments

@ainar-g
Copy link
Contributor

ainar-g commented Aug 24, 2017

I was bitten today by a bug that can be summarised by this code:

type Interval struct {
	Begin int64
	End   int64
}

func main() {
	is := []Interval{{1, 2}, {3, 4}}
	ps := []*Interval{}
	for _, i := range is {
		ps = append(ps, &i)
	}
	fmt.Println(ps[0], ps[1])
}

https://play.golang.org/p/MI_CKwgKQ5

One might incorrectly expect that this prints &{1 2} &{3 4}, while in fact it's &{3 4} &{3 4}. This if of course because we take the address of the loop variable, which is updated on each iteration, and in the end what we have is a slice of pointers, all pointing to the same (last) value. It would be nice to have a check that would look for suspicious addressing of loop variables.

@dominikh
Copy link
Owner

This issue is a specialised form of a check for taking addresses of loop variables that outlive a single loop iteration¹. That may be storing addresses in a slice (like here), passing addresses to functions that retain the address, or closures that close over loop variables (which implicitly takes and stores the address.)

We have to solve the same problem for all cases: accurately track the lifetime of the pointer. In particular, we have to prove that the pointer outlives the loop iteration, i.e. closing over a loop variable is not an error as long as the closure only gets called during the iteration. Similarly, appending the pointer to a slice is fine if that slice is only used during the iteration².

In some cases, we cannot reliably prove this. For example, when concurrency is involved, we would have to model said concurrency, which is non-trivial. Due to the way we treat slices², there is also potential for false positives that needs to be taken into consideration. In other cases, we could add specialised checks. For example, the example provided in this issue is easier to check for than others: we can easily deduct that the slice is alive before and after the loop, and that it contains addresses to the loop variable. However, I'd prefer not adding individual code paths for these special cases. Instead, I want to write a general solution to the problem.

However, writing a solid check is time intensive, touching on multiple analyses (lifetime, PTA, inter-procedural analysis). It's on my todo list, but I reckon it will be a long time before a production-ready check is written.

Anyway, tl;dr: yes, we should check for it. No, it's not easy.

¹: Technically, it has to live long enough to be overwritten. That is different from outliving an iteration when it is the last iteration.
²: More accurately, if the specific slot in the slice is only used during the iteration. However, in analysis we tend to track slices as a single value, and all slots in the slice are treated as one.

Related: #51, #149 and other issues that talk about closing over loop variables.

@dominikh dominikh changed the title staticcheck: detect append of the address of the loop variable staticcheck: detect incorrectly used pointers to loop variables Aug 24, 2017
@dominikh
Copy link
Owner

dominikh commented Aug 28, 2017

Another example of a false positive due to the way we treat slices:

s := make([]string, 1)
for _, x := range xs {
  s[0] = &x
  fn(s...)
}
println(*s[0])

This code behaves correctly (if fn doesn't retain the slice or pointer), yet we would flag the code. We would see the slice s as containing retained pointers.

It might be easier to wait for Go 2 to fix this language wart :)

@ainar-g
Copy link
Contributor Author

ainar-g commented Aug 28, 2017

@dominikh Did you mean to write

s := make([]*string, 1)

? Otherwise it wouldn't compile. Either way, your example seems rather contrived to me. In fact, I would thank staticcheck for pointing to such code, because it needs a rewrite. (Also, Rust doesn't allow such code; although it's kinda apples to oranges.)

All in all, it would be interesting to go through all examples of taking an address of a loop variable in a big code base (e.g., the Go source code), and see how it is actually used. (And maybe find some bugs :))

It might be easier to wait for Go 2 to fix this language wart :)

Do you have something in mind? Lifetimes à la Rust? Or simply forbidding taking the address of the loop variables all together?

@dominikh
Copy link
Owner

Did you mean to write

Yes, I did. I just quickly scribbled it down on GitHub.

Either way, your example seems rather contrived to me

One of the primary design goals of staticcheck is to minimize false positives. Contrived but correct code should, if possible, not be flagged as incorrect. But yes, we'd have to look at existing code to determine the rate of false positives.

Do you have something in mind? Lifetimes à la Rust? Or simply forbidding taking the address of the loop variables all together?

Much simpler than that: each iteration of the loop gets its own memory locations for the loop variables. That way, taking the address would work as expected. And in the common case, where variables don't escape, it can be optimized to sharing a single memory location.

@ainar-g
Copy link
Contributor Author

ainar-g commented Apr 27, 2018

One of my colleagues has been bitten again today by a piece of code that went something like this:

func bulkSend(ps []payload) {
    for _, p := range ps {
        ch <- &p
    }
}

I would really like a check that warns against something like this.

@dominikh
Copy link
Owner

I can appreciate that this is a common mistake, but I am still trying to figure out a way to reliably check for this.

As a stop-gap, there may soon(tm) be (optional) checks in stylecheck (our version of golint) and possibly quickfix (IDE backend that suggests code changes) for specific uses of pointers in range loops.

@bcmills
Copy link

bcmills commented Jan 21, 2020

golang/go#20733 (comment) demonstrates a couple of other interesting edge-cases:

  1. A call to a pointer method on the loop variable implicitly takes its address, so any pointer method that allows the receiver to escape is a potential problem.
  2. A method call in a defer statement implicitly closes over its receiver. Since a defer call within a loop body always executes after the loop, the receiver always escapes the loop body.

@fiber
Copy link

fiber commented Jul 8, 2020

there is a related issue when variadic parameters for func(args ...int64) are passed as a slice (e.g. []int64{1,2,3,4}...) that slice is passed as the variadic parameter. This changes semantics, since args[] will be reference the same underlying array as a slice, and leak that slice, whereas that would be unexpected cf. https://golang.org/ref/spec#Passing_arguments_to_..._parameters

@dgryski
Copy link

dgryski commented Jul 8, 2020

https://github.com/kyoh86/exportloopref attempts to solve this.

@kalexmills
Copy link

I've been thinking about this as part of this project, and it is indeed hard to do this with a fairly low rate of false-positives.

The best approach I've been able to come up so far involves callgraph analysis. It seems that in principle one could trace through the callgraph (once it's constructed) and try to determine if a pointer that starts its life in a range-loop 'lands' in any asynchronous call like a go or defer, or if it is stored anywhere. Unfortunately, that analysis is fairly complex to do, especially once third-party dependencies come into play. Even so, go has tooling already in place, in the ssa and callgraph packages.

Anyway, I thought I might float this approach. I don't claim it would cover all possible cases (I think that a general solution is basically equivalent to Rust's lifetime checker), but I'm fairly certain whatever it catches would be worth warning about.

HTH

@dominikh dominikh added the aggressive A set of checks that is more prone to false positives but is helpful during code review label Oct 17, 2021
@dominikh dominikh added this to the Staticcheck 2022.2 milestone Jan 12, 2022
halkeye added a commit to halkeye/go-bayeux-client that referenced this issue Jun 30, 2022
@seiyab
Copy link
Contributor

seiyab commented Mar 31, 2024

Language change in Go 1.22 looks to resolve this.
https://go.dev/play/p/0X_nne8kIb_Y
https://go.dev/blog/go1.22

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aggressive A set of checks that is more prone to false positives but is helpful during code review new-check pta Issues that require points-to analysis
Projects
None yet
Development

No branches or pull requests

7 participants