-
Notifications
You must be signed in to change notification settings - Fork 18k
database/sql: add extra locking when calling driver #21117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
/cc @kardianos @bradfitz |
In my case I have a pool of database connections. I open a transaction and run queries. If I use BeginTx() I get an crash panic overtime. Part of why I need a transaction is so I can use postgresql statement_timeout. The only way to use it with a connection pool is to use a transaction. If I just use Begin() without context it never crashes. In my case though I need to be able to cancel the transaction if conditions are met in another goroutine. |
Hey y'all, It looks like driver libs are still tossing out |
This might not be a documentation issue but a driver issue. In database/sql/sql.go
I need to do a full evaluation of this solution. But you could try it out and see if the issue goes away. |
I applied the patch and that seems to fix the panic and satisfy the race detector! Now that the log lines aren't mostly race detector output/panic stacktrace I've started noticing that quite a lot of the errors that come back are Here's some sample data after running the repro for a few minutes:
|
@blinsay Good catch. Next step will be to determine why it is reporting no rows sometimes. I suspect it may be driver issue of returning the wrong error (NoRows rather then ctx.Error), but I haven't looked. |
Change https://golang.org/cl/65731 mentions this issue: |
Once fixed can this be released in go 1.9.1 or 1.9.2? This is causing me problems in my application and waiting for 1.10 would be hard. Thank you |
I'm not that familiar with the Go release process. This issue appears to be fixed but sitting in Code Review. What comes next? Is there any hope of getting this in a 1.9.x release? |
I've marked this for consideration for a 1.9 release. |
Every time we touch the locking in database/sql it seems like we introduce a deadlock. What is the condition under which the problem being fixed happens? Are we sure that the expected benefit here outweighs the very real chance of introducing a deadlock in some other case? |
Also, is this a regression from Go 1.8? If not, then we really should leave it alone and let the ordinary Go 1.10 testing happen. |
@rsc I don't believe this is a regression from go1.8. The benefit is it prevents real races in real programs. In this case the risk of deadlock seems non-existent. Unlike previous issues, this just using the existing locking in *driverConn as it is meant to be used, locking for the short period of time around driver connection calls. The only way this could introduce a deadlock is if a driver calls some public sql API referencing the same connection from within the driver.Next, which would be a blatant driver issue. I offer no opinion on when it should be released. |
Change https://golang.org/cl/71450 mentions this issue: |
Change https://golang.org/cl/71433 mentions this issue: |
This reverts commit 897080d. Reason for revert: Fails to fix all the locking issues. Updates #21117 Change-Id: I6fc9cb7897244d6e1af78c089a2bf383258ec049 Reviewed-on: https://go-review.googlesource.com/71450 Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/71510 mentions this issue: |
@blinsay and @chrispassas and anyone else seeing these crashes: I am assuming you are using Linux. Can you please try https://golang-release-staging.storage.googleapis.com/go1.9.2/go1.9.2rc2.linux-amd64.tar.gz and see if the problem disappears? Thank you. |
@rsc I just tried with 1.9.1 and can reproduce the crash reliably in a few seconds. Using 1.9.2rc2 I've let it run for a few minutes 3 times with 0 crashes. |
@chrispassas, thanks very much. Based on that, I'm comfortable with including that fix in the final Go 1.9.2. |
@rsc Thank you, I'm looking forward to having it! |
OK, I am going to approve the cherry-pick to Go 1.9.2 over in #22314 and leave this bug open to track additional locking fixes in Go 1.10. |
…g dc in Next Database drivers should be called from a single goroutine to ease driver's design. If a driver chooses to handle context cancels internally it may do so. The sql package violated this agreement when calling Next or NextResultSet. It was possible for a concurrent rollback triggered from a context cancel to call a Tx.Rollback (which takes a driver connection lock) while a Rows.Next is in progress (which does not tack the driver connection lock). The current internal design of the sql package is each call takes roughly two locks: a closemu lock which prevents an disposing of internal resources (assigning nil or removing from lists) and a driver connection lock that prevents calling driver code from multiple goroutines. Fixes #21117 Change-Id: Ie340dc752a503089c27f57ffd43e191534829360 Reviewed-on: https://go-review.googlesource.com/65731 Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-on: https://go-review.googlesource.com/71510 Run-TryBot: Russ Cox <rsc@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Daniel Theophanes <kardianos@gmail.com>
I'm reopening this because the commit that auto-closed it was reverted afterwards by 292366e
AFAICT this issue has not been addressed for 1.10. Feel free to close if that's not correct. (I was brought here by some serious Tx+context races/crashes in both 1.9.2 and 1.10beta1.) |
The fix that went into version 1.9.2 did fix the crashes for QueryContext. If I use ExecContext and PrepareContext I do get crashes fairly often. It looked like right after 1.9.2 was tagged they went back and fixed the muteness for 1.10. If I can get some time this week I'll do some test with 1.10 beta 1 to see if it still crashes. |
The fix for this did get in shortly after that revert. Yes, please verify @chrispassas , but as far as I know this is fixed. |
@kardianos can you please link the fix here? |
@kardianos thanks, I missed that. After running my crashing code with on tip + CL 85175 (for #23208) I don't see any issues. I'm going to re-close this as we don't seem to have any reason to believe that there's still a locking issue here. We can revisit if @chrispassas finds something. |
I have tested version 1.9.2 and QueryContext is good but ExecContext will trigger a depending how often your calling it. I still need to test that 1.10-beta1 ExecContext is OK. |
What version of Go are you using (
go version
)?go version go1.8.3 darwin/amd64
What did you do?
Unlike
Conn
orStmt
there's no guidance onTx
indicating how the stdlib uses implementations. This seems to cause bugs in driver implementations, as aTx
is expected to handle concurrent calls from multiple goroutines unlike most of the other interfaces in the package.This came up as I was debugging a panic inside of lib/pq and discovered it was a data race due to the authors not assuming that their
Tx
would have to deal with calls from multiple goroutines (the repro is here. It slowly adjusts timeouts to try and make the Context time out as results are coming back). Out of curiosity, I looked atgo-sql-driver/mysql
and the lack of a race there seems like a happy accident - the Context expiring occasionally causes an internal error to bubble up, since the authors appear to aggressively check their invariants.I'm not sure whether this is a documentation problem or if there's something database/sql could do to make life easier for driver implementors.
The text was updated successfully, but these errors were encountered: