cmd/vet: add check for sync.WaitGroup abuse #18022

dsnet · 2016-11-22T20:45:25Z

The API for WaitGroup is very easy for people to misuse. The documentation for Add says:

calls to Add should execute before the statement creating the goroutine or other event to be waited for.

However, it is very common to see this incorrect pattern:

var wg sync.WaitGroup
defer wg.Wait()

go func() {
	wg.Add(1)
	defer wg.Done()
}()

This usage is fundamentally racy and probably does not do what the user wanted. Worse yet, it is not detectable by the race detector.

Since the above pattern is common, I propose that we add a method Go that essentially does the Add(1) and subsequent call to Done in the correct way. That is:

func (wg *WaitGroup) Go(f func()) {
	wg.Add(1)
	go func() {
		defer wg.Done()
		f()
	}()
}


cznic · 2016-11-23T07:17:22Z

I don't like the idea. It's IMHO perhaps a task for go vet, if not implemented already. Also, the proposed method would add a(nother) closure layer when the go-ed function has parameters.

minux · 2016-11-23T08:11:12Z

I think this has been proposed before (probably on the mailing lists) and rejected.

dsnet · 2016-11-23T08:18:57Z

A go vet check seems pretty reasonable. I just tried it right now on the following:

func main() {
	var wg sync.WaitGroup
	defer wg.Wait()
	go func() {
		wg.Add(1)
		defer wg.Done()
	}()
}

and vet doesn't report anything.

In terms of vet's requirements:

Correctness: The problem is clearly a race.
Frequency: My gut feeling is that this is fairly common. A number of internal bugs that I've fixed are of this nature, so it subjectively feels common enough. I don't have hard numbers.
Precision: I don't have an algorithm in mind that can accurately identify the pattern and I can imagine some number of false positives.

@dominikh , staticcheck does report a problem:
main.go:10:3: should call wg.Add(1) before starting the goroutine to avoid a race (SA2000)
I'm wondering how accurate the check is and whether it is worth adding to vet.

mvdan · 2016-11-23T10:21:17Z

As far as the proposal goes, I don't like how it limits the func signature to func().

dominikh · 2016-11-23T16:57:41Z

@dsnet The check in staticcheck has no (known) false positives. It shouldn't have a significant number of false negatives, either. The implementation is a simple pattern-based check, detecting go with function literals where the first statement is a call to wg.Add – this avoids flagging wg.Add calls further down the goroutine, which tend to be valid uses.

I'm -1 on the proposed Go function. I'd prefer not having to read code with an unnecessary level of nesting that looks callback-esque and reminds me of JavaScript.

mvdan · 2016-11-23T17:02:43Z

@dominikh I don't see any extra level of nesting here, though (assuming any func signature is allowed).

To be nitpicky, another thing that stands out from the proposal is how wg.Go() will create a goroutine even though go is never directly used by the user. I don't know if the standard library does this anywhere else, but I would prefer if it was left explicit.

dominikh · 2016-11-23T17:06:56Z

@mvdan The extra level of nesting would come from a predicted usage that looks something like this:

wg.Go(func() {
  // do stuff
})

as opposed to

go func() {
  // do stuff
}()

Admittedly the same level of indentation, but syntactically it's one extra level of nesting.

mvdan · 2016-11-23T17:10:05Z

Ah yes, I was thinking indentation there.

cznic · 2016-11-23T17:12:33Z

The problematic case is that

wg.Add(1)
go func(i int) { ... }(42)

// becomes

wg.Go(func() {
        go func(i int) { ... }(42)
})

mvdan · 2016-11-23T17:14:23Z

@cznic if you mean without the extra go, this would be solved if the restriction on the func() signature was removed.

dominikh · 2016-11-23T17:17:06Z

@mvdan Do you mean by allowing something like the following?

wg.Go(func(x, y int) { ...}, v1, v2)

IMHO that's way too much interface{} and not enough type safety.

mattn · 2016-11-23T17:18:13Z

panic is recovered?

mvdan · 2016-11-23T17:18:56Z

@dominikh true; I was simply pointing at the issue without contemplating a solution :)

rsc · 2016-11-28T21:18:57Z

The API change here has the problems identified above with argument evaluation. Also, in general the libraries do not typically phrase functionality in terms of callbacks. If we're going to start using callbacks broadly, that should be a separate decision (and not one to make today). For both these reasons, it seems like .Go is not a clear win.

It would be nice to have a vet check that we trust (no false positives). Perhaps it is enough to look for two statements

wg.Add(1)
defer wg.Done()

back to back and reject that always. Thoughts about how to make vet catch this reliably?

renannprado · 2016-12-04T04:28:11Z

I agree that vet is a better place for this. The proposed API reminds me of JavaScript, and it would often force us to wrap the code or function in a function with no arguments, when you could have just written go func()....
Still, nothing can stop you from creating such a helper method, even though I don't see the need for it.

rsc · 2016-12-05T21:07:51Z

It sounds like we are deciding to make go vet check this and not add new API here. Any arguments against that?

dsnet · 2016-12-05T21:09:19Z

SGTM

rsc · 2020-08-05T17:36:24Z

I've added this proposal to the proposal process bin, but it's blocked on someone figuring out how to implement a useful check. Is anyone interested in doing that?

dominikh · 2020-08-06T08:04:16Z

Staticcheck has a fairly trivial check: for a GoStmt of a FuncLit, if the first statement in the FuncLit is a call to (*sync.WaitGroup).Add, we flag it. That has potential for false positives, but none have been reported in all the years that the check has existed.

The check could be trivially hardened by

looking for an immediately following defer of (*sync.WaitGroup).Done and comparing the two receivers.
checking that the argument to Add is 1, not some other number.

Edit: which is pretty much what you have suggested in #18022 (comment)

rsc · 2020-08-12T17:41:33Z

Thanks for the info @dominikh.
Would it be OK with you for vet to do the same thing?
Would you want to send the code?

lpxz · 2024-12-12T18:42:19Z

https://go-mod-viewer.appspot.com/github.com/getlantern/eventual@v1.0.0/eventual_test.go#L102 is a delightful muddle! ;-)

OK, the improved linter will still produce a false positive on this until we use inspect() to recursively look into all sibling nodes (which may not be worth the machine cost).

But I'm not sure that all of the new positives are true: Consider this one:
https://go-mod-viewer.appspot.com/github.com/status-im/status-go@v1.1.0/protocol/messenger_handler.go#L1762

m.shutdownWaitGroup.Add(1)
defer m.shutdownWaitGroup.Done()

I think it is true positive, given this pattern, the goroutine may not start yet, when the shutdownWaitGroup.Wait() is reached, leaving that goroutine leaky. Did I miss some subtlety here?

A relevant question, what is the next step then? :)

adonovan · 2024-12-12T18:52:46Z

https://go-mod-viewer.appspot.com/github.com/getlantern/eventual@v1.0.0/eventual_test.go#L102 is a delightful muddle

To be clear, I meant that the code here is wrong in more ways than I can count; your linter algorithm is absolutely right to flag it.

https://go-mod-viewer.appspot.com/github.com/status-im/status-go@v1.1.0/protocol/messenger_handler.go#L1762
m.shutdownWaitGroup.Add(1)
defer m.shutdownWaitGroup.Done()
I think it is true positive, given this pattern, the goroutine may not start yet, when the shutdownWaitGroup.Wait() is reached, leaving that goroutine leaky. Did I miss some subtlety here?

A WaitGroup is a counting semaphore. The most common use is to count unfinished goroutines, but in this case I think it is counting various "holds" that prevent the program from completing its shutdown (like electricians each padlocking the main breaker so that they know they are safe while they are working). That is, the goroutines are already running, and any of them may temporarily increment the counter, far from where they are created. Presumably at the end the main goroutine will wait for the counter to fall to zero.

The clue to recognizing this pattern is that there is no nearby Add; go func() { ... Done; } (); Wait structure (or bungled attempt at that structure).

A relevant question, what is the next step then? :)

We should probably reclassify the new positives in light of these observations. Ideally the false positive rate should be 5% or less.

Groxx · 2024-12-13T02:24:56Z

re https://go-mod-viewer.appspot.com/github.com/status-im/status-go@v1.1.0/protocol/messenger_handler.go#L1762 :

Since it is incorrect to Wait() on a semaphore and then Add() to it, and nothing seems to guarantee that the goroutine is scheduled before shutdown, I think that can be called "true". Nothing related to that goroutine is passed anywhere after it is created, nor are there any references to it (afaict) in that method that could lead to "wait for it to start m.downloadAndImportHistoryArchives before allowing shutdown".

(obviously this is worth verifying, but it seems extremely unlikely to be guaranteed-safe to me)

lpxz · 2024-12-13T04:23:44Z

I agree with @Groxx on this,

If the goroutines want to declare that they are live, why don't they do so at the beginning, or even better, before the goroutine starts? I am not sure why they do so in the middle.

By flagging this, we suggest that developers move the Add to an earlier point. This can only improve quality; at the very least it will not degrade it.

I may miss some points, please let me know.

egonelbre · 2024-12-13T09:04:54Z

But I'm not sure that all of the new positives are true: Consider this one: https://go-mod-viewer.appspot.com/github.com/status-im/status-go@v1.1.0/protocol/messenger_handler.go#L1762

That's definitely a bug. (As other people already deduced). The easiest way to see this is to imagine a time.Sleep(10*time.Second) before line m.shutdownWaitGroup.Add(1) or task.Waiter.Add(1)... then any task.Waiter.Wait() or m.shutdownWaitGroup.Wait() may miss the first add and hence the wait is unnecessary.
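The thought experiment above can be sketched deterministically by replacing the imagined time.Sleep with a channel that holds the goroutine back before it reaches Add. The `waitMissesLateAdd` function and both channel names are made up for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// waitMissesLateAdd models a goroutine that is delayed before it reaches
// wg.Add(1). It reports whether the goroutine's work had finished by the
// time Wait returned.
func waitMissesLateAdd() (workDoneAtWait bool) {
	var wg sync.WaitGroup
	gate := make(chan struct{})     // stands in for the sleep before Add
	finished := make(chan struct{}) // closed once the goroutine's work is done

	go func() {
		<-gate    // delayed: Add has not run yet
		wg.Add(1) // too late: Wait has already returned
		defer wg.Done()
		close(finished)
	}()

	wg.Wait() // the counter is still zero, so this returns immediately

	select {
	case <-finished:
		workDoneAtWait = true
	default:
		workDoneAtWait = false
	}

	close(gate) // let the goroutine finish so it does not leak
	<-finished
	return workDoneAtWait
}

func main() {
	fmt.Println(waitMissesLateAdd()) // prints false
}
```

Because the goroutine cannot pass the gate until after Wait has returned, the result does not depend on scheduler timing.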

adonovan · 2024-12-13T13:08:08Z

Thanks, you have all convinced me that this is indeed a bug (and thus a true positive of the new algorithm).

@lpxz, would you like to send me a CL to incorporate the new algorithm into the waitgroup analyzer (currently in gopls, but eventually to be promoted to vet too)? The CL should include a test for each distinct case you encountered (and false positives too, since they document the limitations). Thanks for improving this analyzer!

lpxz · 2024-12-13T13:57:00Z

@adonovan great, happy to make the contributions!

Will create a CL and add tests.
It may not be ready until early next January, let me know if this is more urgent.

Peng

aclements · 2025-01-16T00:54:59Z

Based on the discussion above, this proposal seems like a likely accept.

The proposal is to add a vet check for sync.WaitGroup misuse where Add can race with Wait, such as in:

var wg sync.WaitGroup
defer wg.Wait()

go func() {
        wg.Add(1)
        defer wg.Done()
}()

While complex heuristics are certainly possible here, for now we're going to limit this to checking for wg.Add on the first line of a closure. This has a zero false positive rate on the data we sampled, and is simple to implement.

lpxz · 2025-01-16T01:31:41Z

I am in the process of creating the CL, which has no false positives, fewer false negatives, and is also simple.

Feel free to let me know if the final decision is to move on with the first-line approach (which is a fine approach).

adonovan · 2025-01-16T17:08:32Z

Feel free to let me know if the final decision is to move on with the first-line approach (which is a fine approach).

This proposal is about adding the basic functionality into vet, which can likely proceed for go1.25.

Independent of that, you should feel free to improve the algorithm; any improvements can be used immediately by gopls, and eventually by vet (without needing a separate proposal process), assuming they meet the usual criteria for precision and frequency.

aclements · 2025-01-23T03:00:54Z

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— aclements for the proposal review group

The proposal is to add a vet check for sync.WaitGroup misuse where Add can race with Wait, such as in:

var wg sync.WaitGroup
defer wg.Wait()

go func() {
        wg.Add(1)
        defer wg.Done()
}()

While complex heuristics are certainly possible here, for now we're going to limit this to checking for wg.Add on the first line of a closure. This has a zero false positive rate on the data we sampled, and is simple to implement.

adonovan · 2025-03-26T22:11:49Z

Related: #63796 takes up the proposal originally made in this issue before (in Dec 2016) the API change was rejected and this issue was repurposed for the vet check.

gopherbot · 2025-03-28T22:09:45Z

Change https://go.dev/cl/661518 mentions this issue: cmd/vendor: update golang.org/x/tools to v0.31.1-0.20250328151535-a857356d5cc5

gopherbot · 2025-03-28T22:09:46Z

Change https://go.dev/cl/661519 mentions this issue: cmd/vet: add waitgroup analyzer

…7356d5cc5 Also, sys@v0.31.1. Updates #18022 Change-Id: I15a6d1979cc1e71d3065bc50f09dc8d3f6c6cdc0 Reviewed-on: https://go-review.googlesource.com/c/go/+/661518 Auto-Submit: Alan Donovan <adonovan@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Commit-Queue: Alan Donovan <adonovan@google.com>

dsnet added the Proposal label Nov 22, 2016

dsnet added this to the Proposal milestone Nov 22, 2016

dsnet mentioned this issue Dec 2, 2016

testing: add T.Context() accessor #16221

Closed

dsnet changed the title ~~proposal: sync: add Go method to WaitGroup~~ cmd/vet: add check for sync.WaitGroup abuse Dec 5, 2016

dsnet removed the Proposal label Dec 9, 2016

dsnet mentioned this issue Nov 6, 2017

proposal: testing: add T.Context #18368

Closed

dsnet mentioned this issue Jan 24, 2018

sync: Proposal: Add a Go() function to sync.WaitGroup #23538

Closed

nhooyr mentioned this issue Mar 5, 2019

testing: add (*T).Deadline #28135

Closed

bcmills mentioned this issue Jun 26, 2020

proposal: sync: add WaitGroup.Go method #39863

Closed

rsc mentioned this issue Aug 5, 2020

proposal: review meeting minutes #33502

Open

aclements moved this from Active to Likely Accept in Proposals Jan 16, 2025

aclements added the Proposal-FinalCommentPeriod label Jan 16, 2025

aclements moved this from Likely Accept to Accepted in Proposals Jan 23, 2025

aclements changed the title ~~proposal: cmd/vet: add check for sync.WaitGroup abuse~~ cmd/vet: add check for sync.WaitGroup abuse Jan 23, 2025

aclements modified the milestones: Proposal, Backlog Jan 23, 2025

aclements added Proposal-Accepted and removed Proposal-FinalCommentPeriod labels Jan 23, 2025

dsnet mentioned this issue Jan 27, 2025

proposal: sync/v2: new package #71076

Open

dmitshur added the FixPending label Mar 29, 2025

dmitshur modified the milestones: Backlog, Go1.25 Mar 29, 2025

gopherbot closed this as completed in dceb77a Apr 1, 2025
