Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/vet: analyze 'loopclosure' func literals that preceed some well-understood statements #57173

Closed
thepudds opened this issue Dec 8, 2022 · 9 comments
Assignees
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@thepudds
Copy link
Contributor

thepudds commented Dec 8, 2022

What version of Go are you using (go version)?

tip.

Does this issue reproduce with the latest release?

yes.

What did you do?

Run the 'loopclosure' analyzer via go vet on code that has go or defer statements with a function literal that captures a loop iteration variable, but has trailing statements after the go or defer statement, such as this example from #21412:

func handleBatch(ctx context.Context, batch *Batch) error {
    var chans []chan error
    for _, request := range batch.Requests {
        c := make(chan error)
        go func(c chan error) {
            c <- handle(ctx, request) // wrong: request is captured loop iteration variable
        }(c)
        chans = append(chans, c)
    }
    // ... read chans ...
}

What did you see?

No vet message.

What would you like to see instead?

vet reporting:

loop variable request captured by func literal

Recent background

https://go.dev/cl/452155 recently added the ability to handle additional moderate complexity leading up to loop variables being captured by a function literal in a go or defer statement, as well as certain errgroup.Group.Go calls.

While working on that CL, I also did a quick prototype of handling moderate complexity after the go or defer statement, where a limited number of well-understood trailing statements and expressions could be handled (recursively). Also, during the review of that CL, @timothy-king also asked if the definition of "last" could be modified to handle a trailing i++ for example.

Handling well-understood trailing statements

I have a new WIP CL https://go.dev/cl/455195, which modifies the 'loopclosure' analyzer to handle examples like the one above.

The basic approach is there are now a number of well-understood statements (and expressions within statements) that the analyzer now allows to trail a to-be-analyzed go or defer statement, but only if the analyzer believes those trailing statements cannot derail the loop variable from being modified when returning to the top of the loop.

Because statements trailing a go or defer statement are examined recursively, this loop capture is also now properly flagged:

	var processMore []string
	for _, s := range []string{"one", "two", "three"} {
		go func() { process(s) }() // wrong: s is a captured loop variable
		switch s {
		case "two":
			continue
		default:
			processMore = append(processMore, s)
		}
	}

The analyzer gives up on statements and expressions that it does not understand, which hopefully makes the approach tractable while maintaining no false positives.

For example, the analyzer properly gives up on the following, which is also an example of why we do not report captured loop variables with a go statement trailed by something that waits, might wait, or we can't prove won't wait. (Example adapted from #21412 (comment)).

	for i := range "loop" {
		wg := &sync.WaitGroup{}
		wg.Add(1)
		go func() {
			defer wg.Done()
			print(i) // use of i is correctly not flagged
		}()
		wg.Wait()
	}

In that example, the func literal always completes execution prior to the loop variable changing. The analyzer does not have any explicit knowledge of WaitGroups in order to disallow them. Instead, it gives up when it sees the wg.Wait call, which is not something it understands.

Panics

For now, it does not allow trailing panics.

It is reasonable to debate whether or not it should flag:

	for i := range "loop" {
		go func() { print(i) }()
		panic("this panic means the loop variable will never change")
	}

or:

func _(q int) {
	for i := range "loop" {
		go func() { print(i) }()
		_ = i / q // possible panic means loop var might never change
	}
}

Right now, the analyzer purposefully will not flag either of those because of the explicit panic and the possible panic.

Mainly in the interests of incremental progress with a more conservative approach to start, my vote would be to defer to a later issue the question of whether it should or should not allow any trailing explicit panics or possible panics (and if so, in which circumstances), though of course open to discussing now if people prefer.

CC @adonovan, @findleyr, @timothy-king

@thepudds
Copy link
Contributor Author

thepudds commented Dec 8, 2022

Question: proposal process?

Should this go through the proposal process?

In #16520, Russ replied to a related question:

Given that this is improving the precision of an existing vet check, without introducing any new false positives, I don't believe it needs to go through the proposal process.

The context of that answer was https://go.dev/cl/452155. This new CL might be viewed as a larger change than CL 452155, so happy to send this through the proposal review process if that is better.

@timothy-king
Copy link
Contributor

timothy-king commented Dec 8, 2022

An additional detail is the roll out plan. For 1-2 release cycles it would be good to gate this new feature behind a flag. That allows infrastructure to catch up and for additional testing/adjustments to be done. Related #57001 but mostly out of scope.

For panics, I feel like the challenging case to decide what to do with p.i++ when p is a pointer. The loopclosure does not try to know if a pointer may be nil or not and adding that reasoning seems like a bit too much. So are we going to stop at any dereference or may panic expression? "Must reach the iteration of the loop or panics without form of synchronization" seems like a plausible criteria with low false positives. This criteria can fail when panicking is intended to achieve control flow (weird but well defined). But I think we can "go backwards" across implicit panics but not calls to the panic builtin. This allows for indexing into slices, printing fields from a struct pointer, etc.

This leads to my opinion for whether a proposal is needed, if we go back across any may panics we probably want to have this be a proposal. If the proposal committee feels this is not needed they can always remove it from the process.

The analyzer does not have any explicit knowledge of WaitGroups in order to disallow them. Instead, it gives up when it sees the wg.Wait call, which is not something it understands.

To be a bit more explicit, any function or method call expression that is not modeled by the Analyzer should probably not be stepped through. This would be an allowlist style approach.

This brings us to fmt.Print functions. These trailing at the end of a loop is a common and important case. Do we just trust that String() functions are reasonable, i.e. terminate, are doing no weird lock synchronization, and only have implicit panics? That sounds okay on the surface to me. I am hesitant to try to reason more and only do this to on String() methods we deduce are okay. vet checks need to make sense to users and why one fmt.Print is okay on a *p but not on an any(*p) is not is too insider knowledge.

@thepudds
Copy link
Contributor Author

thepudds commented Dec 8, 2022

An additional detail is the roll out plan. For 1-2 release cycles it would be good to gate this new feature behind a flag.

In the current WIP CL, it is gated behind a flag. My understanding based on what @findleyr did for loopclosure analysis of t.Parallel is it can be disabled in vet, but enabled in gopls, which then gives an incremental rollout, which would be nice. That said, it could also be off in vet for 1-2 release cycles. (The WIP code currently has a TODO that mentions enabling for Go 1.21, which might be too fast).

For panics, I feel like the challenging case to decide what to do with p.i++ when p is a pointer. The loopclosure does not try to know if a pointer may be nil or not and adding that reasoning seems like a bit too much.

FWIW, the current WIP CL attempts to allow p.i++ when p is not a pointer, and attempts to disallow p.i++ when p is a pointer.

This leads to my opinion for whether a proposal is needed, if we go back across any may panics we probably want to have this be a proposal. If the proposal committee feels this is not needed they can always remove it from the process.

I was somewhat hoping to defer the question for now about panics in the hopes of possibly first landing something more conservative. That said, perhaps better to discuss holistically first before landing anything.

To be a bit more explicit, any function or method call expression that is not modeled by the Analyzer should probably not be stepped through. This would be an allowlist style approach.

Yes, the current approach in the WIP CL is an "allow list" approach.

This brings us to fmt.Print functions. These trailing at the end of a loop is a common and important case. Do we just trust that String() functions are reasonable, i.e. terminate, are doing no weird lock synchronization, and only have implicit panics?

The WIP CL currently has a set of TODOs related to that:

  // TODO: consider a possibly large "allow list" of stdlib funcs, such as strings.Contains.
  // TODO: consider allowing println, print, although this would require updating tests that use print now.
  // TODO: consider allowing fmt.Println and similar when arguments are all basic types, or otherwise 
  // restricted from having String, Error, Format, [...] methods

I think those could be viable approaches. Not mentioned explicitly there, but the intent at least at first would be to disallow pointers that could panic under any use of those functions.

@gopherbot
Copy link

Change https://go.dev/cl/455195 mentions this issue: go/analysis/passes/loopclosure: allow certain well-understood statements to trail problematic go and defer statements

@findleyr
Copy link
Contributor

findleyr commented Dec 8, 2022

Should this go through the proposal process?

Without completely absorbing all of the proposed changes, my understanding is that as long as we can preserve a ~0% false positive rate, we can make these changes without a proposal (though if they get really complicated, perhaps we should consider escalating, if only for discussion).

My concern with these extensions to the loopclosure analyzer is that they make it harder and harder to describe precisely what the analyzer checks. In particular, it is harder to explain why one bug was caught and another was not. With that said, as long as the analyzer avoids false positives this is less important: it either finds a bug, or it doesn't.

In the current WIP CL, it is gated behind a flag.

FWIW, this is also important to make it easier to roll out these changes inside Google. When introducing new checks like this, we should leave the flag even after its default value changes to "true". (I didn't do that for the parallel subtest analysis, and it's been a bit harder to roll that out as a result).

@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 8, 2022
@thepudds thepudds self-assigned this Dec 9, 2022
@thepudds
Copy link
Contributor Author

thepudds commented Dec 16, 2022

Related -- given the new machinery, I think it is also likely tractable in the future to handle basic versions of some additional cases like the following:

Example 1: loop variable captured in func literal via method with pointer receiver

From #56010:

	for _, a := range alarms {
 		go a.Monitor(b)
 	}

(That is one of two visually similar examples from #56010. The intent would be to properly handle both cases).

Example 2: func literal stored in slice with captured loop var

From #16520:

	fn := []func(){}
	for _, s := range []string{"a", "b", "c"} {
		fn = append(fn, func() { fmt.Println(s) })
	}
	for _, f := range fn {
		f()
	}

Example 3: pointer to loop variable captured in slice

From #56010:

var all []*Item
for _, item := range items {
	all = append(all, &item)
}

Example 4: pointer to loop variable captured in map

From https://go.dev/cl/40937, which was a x/crypto/ssh/knownhosts bug:

	knownKeys := map[string]*KnownKey{}
	for _, l := range db.lines {
		if l.match(addrs) {
			typ := l.knownKey.Key.Type()
			if _, ok := knownKeys[typ]; !ok {
				knownKeys[typ] = &l.knownKey
			}
		}
	}

Without solving the general case, it could be following a similar approach of handling examples that are simple enough to understand explicitly while giving up on more complex examples. For example, slices can be "well understood" data structures, but only analyzed if they are used simply prior to determining they have been misused.

Aspirationally, I would like to take a run at also handling these examples, but I think those likely should be discussed in other issues.

@timothy-king
Copy link
Contributor

#57173 (comment) Example 1 is very doable.

Aspirationally, I would like to take a run at also handling these examples, but I think those likely should be discussed in other issues.

+1 to discussing in another issue. Examples 2, 3 and 4 all require an additional escape[/pointer] analysis to conclude that the lifetime of what you are storing the into always exceeds the lifetime of the loop. Even in the "basic" form you do need to worry about this. The current loopclosure analysis is using that go stores a global reference, defer stores a reference on the stack frame, etc. You might be okay if you are only storing into variables of known types that are scoped outside of the loop and no other write may have happened to a value of the same type in the "skippable tail".

FWIW examples 3 and 4 are not "loopclosure" (no closure) so vet would likely have these in another checker.

@adonovan
Copy link
Member

I propose we close this issue given that go1.22 will fix the longstanding problem that this analyzer detects. Of course it might be a year or two before every project has caught up, but it doesn't seem work investing more in loopclosure.

@thepudds
Copy link
Contributor Author

Hi @alandonovan, I definitely agree. Thanks again for the feedback on https://go.dev/cl/455195!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Analysis Issues related to static analysis (vet, x/tools/go/analysis) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
6 participants