Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: Effective Go's "safelyDo" does not demonstrate safe use of recover #19070

Open
tombergan opened this issue Feb 13, 2017 · 6 comments
Open
Labels
Documentation NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@tombergan
Copy link
Contributor

Effective Go gives an example of a server that captures and logs panics so that a single panicking goroutine does not take down the entire server. Copying the example below:

func server(workChan <-chan *Work) {
    for work := range workChan {
        go safelyDo(work)
    }
}

func safelyDo(work *Work) {
    defer func() {
        if err := recover(); err != nil {
            log.Println("work failed:", err)
        }
    }()
    do(work)
}

This design is controversial and I recommend that the example be either removed or updated with caveats. In general, there is no guarantee that shared data structures are in a usable and safe state after a panic occurs. It is arguably better to let panics crash the program. Empirically, the above design has led to real bugs in production systems. Specific examples of how bugs can arise:

  1. A held mutex is not unlocked after the panic because the caller did not use defer m.Unlock(). This is a specific example of how do(work) may not be panic safe. Note that do(work) is especially vulnerable if it calls library code that is not panic safe.

  2. The panic is the result of an invariant violation in a shared data structure. Program execution should not continue because the shared data structure is in an invalid state.

  3. The panic is the result of a data race. Technically, program behavior is undefined from this point (especially if the data race involved an interface or slice).

Given these potential problems, I would argue that safelyDo is actually not safe. That said, the general design is not entirely invalid. It is possible to write a safe version of safelyDo, but to do so, one of the following properties must be guaranteed:

  1. safelyDo must only recover from explicitly-thrown panics that it knows about. An example of this is shown in the Compile regexp function just below the safelyDo example.

  2. The goroutines must not use any mutable shared state, i.e., safelyDo(a) touches completely disjoint state than safelyDo(b) for all a != b. This implies that safelyDo cannot invoke any libraries that use mutable shared state.

  3. The implementation of do(work) must be completely panic safe, including all library code used by the implementation.

@bradfitz bradfitz added this to the Go1.9Maybe milestone Feb 13, 2017
@bradfitz
Copy link
Contributor

/cc @robpike @rsc @broady

@bcmills
Copy link
Contributor

bcmills commented Feb 13, 2017

A held mutex is not unlocked after the panic because the caller did not use defer m.Unlock().

It's even more insidious than that. The program might be using channels for synchronization instead of mutexes, and while it's fairly easy to spot an m.Unlock() that ought to be deferred, that's much less obvious for many channel operations.

@minux
Copy link
Member

minux commented Feb 13, 2017 via email

@bcmills
Copy link
Contributor

bcmills commented Feb 13, 2017

Precedence for a bug-prone pattern does not make it any less bug-prone.

@minux
Copy link
Member

minux commented Feb 14, 2017 via email

@rsc rsc self-assigned this Feb 14, 2017
@rsc
Copy link
Contributor

rsc commented Feb 14, 2017

I will revise the docs (slightly). There's no need to bikeshed now. There will be plenty of bikeshedding on the CL itself I am sure.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.10, Go1.9Maybe Aug 3, 2017
@rsc rsc modified the milestones: Go1.10, Go1.11 Dec 1, 2017
@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 23, 2018
@ianlancetaylor ianlancetaylor added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 23, 2018
@rsc rsc removed their assignment Jun 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

6 participants