Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database/sql: ExecContext leaks goroutine on panic. #21248

Closed
carl-mastrangelo opened this issue Aug 1, 2017 · 7 comments
Closed

database/sql: ExecContext leaks goroutine on panic. #21248

carl-mastrangelo opened this issue Aug 1, 2017 · 7 comments
Milestone

Comments

@carl-mastrangelo
Copy link
Contributor

What version of Go are you using (go version)?

go version devel +04c446f Mon Jul 3 15:39:44 2017 -0700 linux/amd64

What operating system and processor architecture are you using (go env)?

~

What did you do?

While playing around with the sql package, I tried putting a panic in namedValueToValue to see how it was called (I was investigating an unrelated bug). Instead of the panic unwinding the stack, the code hung and never returned. Looking at a stacktrace shows that sql.Tx.Rollback() tries to get write access to the closemu lock. It does this as part of a defer in my application code while the panic is propagating up.

The lock is never acquired because RUnlock has not yet been called on closemu. Investigating where this should have happened reveals Stmt.ExecContext. Unlike pretty much every other cleanup idiom used in Go, this one leavea a brief window before properly doing cleanup:

func (s *Stmt) ExecContext(ctx context.Context, args ...interface{}) (Result, error) {
	s.closemu.RLock()
	defer s.closemu.RUnlock()

	var res Result
	strategy := cachedOrNewConn
	for i := 0; i < maxBadConnRetries+1; i++ {
		if i == maxBadConnRetries {
			strategy = alwaysNewConn
		}
		dc, releaseConn, ds, err := s.connStmt(ctx, strategy)
		if err != nil {
			if err == driver.ErrBadConn {
				continue
			}
			return nil, err
		}

		res, err = resultFromStatement(ctx, dc.ci, ds, args...)
		releaseConn(err)
		if err != driver.ErrBadConn {
			return res, err
		}
	}
	return nil, driver.ErrBadConn
}

When there is a panic in resultFromStatement, releaseConn is never called. I put the panic there in my testing, but it's still pretty bad practice to not immediately schedule the cleanup after construction. Deferring cleanup immediately after checking the returned error from s.connStmt lets the panic propagate properly.

@gopherbot
Copy link

Change https://golang.org/cl/52370 mentions this issue: database/sql: clean up Tx when panicking

@odeke-em odeke-em added this to the Go1.10 milestone Aug 1, 2017
@kardianos
Copy link
Contributor

@carl-mastrangelo Thank you for the report. I apologize for the delay in getting to this issue.

I'd like to back up a bit. Why do you consider the current behavior a bug? Or more exactly, can you describe how this behavior would be triggered outside of manually adding a panic?

@carl-mastrangelo
Copy link
Contributor Author

@kardianos in the event of a panic, the connection is not properly released. I manually added one, but they can occur naturally. Better to program defensively.

@kardianos
Copy link
Contributor

I'm willing to listen to another opinion on this subject.

Unless I can see how a panic can happen in this situation I'm inclined to close this issue.

@carl-mastrangelo
Copy link
Contributor Author

@kardianos what? The code is already written. It brings the sql code in the style of the rest of Go, and fixes the possibility of a panic. What's your hesitation? Perhaps someone else can review it?

@kardianos
Copy link
Contributor

I'm not seeing an error to fix here.
I'm not seeing the "defense" her as something needing to be defended against.

The proposed CL adds another closure which is another allocation and another function call. I would also argue it is slightly more complicated then the original code (although I like the switch statement).

Let me explain my reasoning. The following is common Go code:

f, err := os.Open("myfile.txt")
if err != nil {
    return err
}
_, err = io.Copy(os.Stdout, f) // This will not panic.
f.Close() // So this is safe.
if err != nil {
   return err
}

Or this example:

mutex.Lock()
sharedMap["foo"] = 42 // This will not panic.
mutex.Unlock() // So this is safe.

Yes, we commonly use defer to ensure resources are closed. However a deeper style/convention is to ensure functions don't panic in the first place. Go makes this easy to check and ensure. Panics don't just "happen". One way the sql package ensures locks are properly used is to use defer when unlocking locks around driver (user package) code. So in principle I agree with you. But defer is cheep and not free. Allocation is cheep, but not free. I have no issue making an allocation when needed, but unlike other languages panics / exceptions are tightly controlled.

On another note, you are not responding to my questions, but just saying "Why? Just merge it already, it is already there and it is better." This is not a productive way forward in any conversation. I've already told you my hesitation. I've now expanded on my hesitation. You are free to refer to someone else to this issue, but I would prefer to first direct your attention and reply to my questions and statements. If we discover they are false or without merit I will happily reverse my decision.

@carl-mastrangelo
Copy link
Contributor Author

carl-mastrangelo commented Sep 11, 2017

I'm not seeing an error to fix here.

There is no error, just a sharp edge. That is what needs to be fixed.

I'm not seeing the "defense" her as something needing to be defended against.

Moving cleanup code as close to the creation point is good style, in any language.

However a deeper style/convention is to ensure functions don't panic in the first place.

I think you are intentionally missing the intent of this change. Unless you can statically prove there are no panics possible, why take the chance? Furthermore, it is closer stylistically to the rest of the sql package. Why be a cowboy?

The proposed CL adds another closure which is another allocation and another function call.

That is the worst possible reason. If you look at my other contributions to the Go project, they have all been performance related. The change as written will not have a measurable impact.

On another note, you are not responding to my questions, but just saying "Why? Just merge it already, it is already there and it is better."

Because you don't see the problem you act on behalf of the whole project. Did you look at the code at all? Even if it didn't match up with anything I said prior, you can see the code is structured better:

  • The retry logic is separate from the execution logic.
  • The code is now resilient to unforseen panics.
  • Even if panics are provably impossible today, code changes; they could be possible in the future.
  • The maxBadConnRetries + 1 wart is gone.

You are free to refer to someone else to this issue

I would actually. Who is the next most appropriate reviewer for the database/sql package?

@golang golang locked and limited conversation to collaborators Sep 11, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants