
proposal: database/sql: add SetMaxBadConnRetries #52886

Closed
jinzhu opened this issue May 13, 2022 · 23 comments

@jinzhu
Contributor

jinzhu commented May 13, 2022

This method adds the ability to set the maximum number of retries when the driver returns driver.ErrBadConn.

Currently, database/sql has a hardcoded maxBadConnRetries, set to 2, and it reruns the SQL statement whenever the driver returns a bad-connection error, which causes problems in many cases.

For example, with a long-running INSERT, the first attempt might be killed by a TCP timeout (or for some other reason) after it has already taken effect, and the automatic retry then inserts duplicate data, which is confusing to end users.

With this option, we can set the maximum number of bad-connection retries to 0 and write our own retry logic to avoid the issue. It also opens the opportunity for optimizations such as https://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/fulltext
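
For illustration, a minimal sketch of how the proposed API could be used; SetMaxBadConnRetries is the method proposed here (not an existing API), and the driver/DSN/query are placeholders:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // driver and DSN are placeholders
)

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Proposed: disable the built-in bad-connection retries entirely, so a
	// statement is never silently re-run after the driver returns driver.ErrBadConn.
	db.SetMaxBadConnRetries(0)

	// Retry decisions are now left to the application, e.g. only for
	// statements known to be idempotent.
	if _, err := db.Exec("INSERT INTO events(name) VALUES (?)", "signup"); err != nil {
		log.Fatal(err)
	}
}
```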

Related issues:

#48309
#15608

Implementation:

I have tried to submit a possible implementation: https://go-review.googlesource.com/c/go/+/404935

@gopherbot

Change https://go.dev/cl/404935 mentions this issue: database/sql: add SetMaxBadConnRetries

@ianlancetaylor
Contributor

CC @kardianos

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) May 13, 2022
@kardianos
Contributor

To prevent a lock on max retries, I would prefer this to be set in OpenDB from the Connector. Maybe the connector could have an optional configuration interface. Unsure. However, the associated CL is not correct: it doesn't lock correctly, and I think it also changes the BadConn error.

Should we do this? Probably. I had ideally wanted to create a v2 sql package, but realistically my time is very limited. So if this were set when the DB is created, so that no lock is required, I'd be alright with it.

Another option would be two retries or no retries.

I'm okay with accepting this, on the condition that a good design can be found.
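
As an illustration of the "optional configuration interface" idea above (the interface name and method are hypothetical, not from the CL):

```go
package sqlsketch

// BadConnRetryConfigurer is a hypothetical optional interface a
// driver.Connector could implement, so that sql.OpenDB can read the retry
// limit once, at construction time, with no locking afterwards.
type BadConnRetryConfigurer interface {
	MaxBadConnRetries() int
}

// Sketch of what OpenDB could do with it (names are illustrative):
//
//	retries := defaultMaxBadConnRetries // today the hardcoded const 2
//	if c, ok := connector.(BadConnRetryConfigurer); ok {
//		retries = c.MaxBadConnRetries()
//	}
//	db.maxBadConnRetries = retries // fixed for the lifetime of the DB
```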

@rsc
Contributor

rsc commented May 18, 2022

We can avoid the lock by setting this value atomically, I think.
So the API should be OK either way.

@rsc rsc moved this from Incoming to Active in Proposals (old) May 18, 2022
@rsc
Contributor

rsc commented May 18, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@jinzhu
Contributor Author

jinzhu commented May 19, 2022

Thank you for your review; I have fixed the lock issue with an atomic value.

@MatthewJamesBoyle

This seems like a good proposal to me.

A couple of thoughts:

  • Does it make sense to have an upper bound on the number of retries we allow a user to configure?
  • As far as I can see, the retry strategies today do not perform any exponential backoff/jitter. This was likely fine for most use cases since the maximum was 2, but as users may put higher numbers here, we may want to do more to prevent a thundering herd problem (a rough backoff sketch follows below).
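
Purely to show the shape of what that could look like, a small backoff-with-jitter helper (nothing like this is in the proposal or the CL):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// backoff returns how long to wait before retry number attempt (0-based),
// using capped exponential growth plus full jitter to avoid a thundering herd.
func backoff(attempt int) time.Duration {
	const base = 10 * time.Millisecond
	const cap = 1 * time.Second
	d := base << uint(attempt) // 10ms, 20ms, 40ms, ...
	if d <= 0 || d > cap {
		d = cap // cap the growth (and guard against shift overflow)
	}
	return time.Duration(rand.Int63n(int64(d))) // random wait in [0, d)
}

func main() {
	for i := 0; i < 5; i++ {
		fmt.Printf("attempt %d: wait %v\n", i, backoff(i))
	}
}
```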

@rsc
Contributor

rsc commented May 25, 2022

The CL would be a lot clearer if the code was not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries() ?
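
A sketch of that shape, with the limit stored atomically as suggested above; the field name is only illustrative and DB stands in for database/sql's type:

```go
package sqlsketch

import "sync/atomic"

// DB stands in for database/sql's DB; only the retry-limit plumbing is shown.
type DB struct {
	badConnRetries int64 // read/written atomically; the default would stay at 2
}

// maxBadConnRetries is the accessor the retry loops would call in place of the
// package-level const, i.e. `for i := 0; i < db.maxBadConnRetries(); i++ { ... }`.
func (db *DB) maxBadConnRetries() int {
	return int(atomic.LoadInt64(&db.badConnRetries))
}

// SetMaxBadConnRetries is the proposed setter; storing the value atomically
// avoids taking a lock on every query.
func (db *DB) SetMaxBadConnRetries(n int) {
	if n < 0 {
		n = 0
	}
	atomic.StoreInt64(&db.badConnRetries, int64(n))
}
```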

@jinzhu
Contributor Author

jinzhu commented May 26, 2022

The CL would be a lot clearer if the code was not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries() ?

Ok, will do that.

@jinzhu
Contributor Author

jinzhu commented May 26, 2022

The CL would be a lot clearer if the code was not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries() ?

But with the old style of code, we can't change maxBadConnRetries to 0, or it will never reuse a connection from the pool. Is that OK?

@jinzhu
Contributor Author

jinzhu commented Jun 3, 2022

The CL would be a lot clearer if the code was not being refactored at the same time. Can you instead add a db.maxBadConnRetries() method and replace all the i < maxBadConnRetries with i < db.maxBadConnRetries() ?

Hello @rsc, which connReuseStrategy should we use if maxBadConnRetries is set to 0? alwaysNewConn or cachedOrNewConn?

@rsc
Contributor

rsc commented Jun 8, 2022

@kardianos Do you have any suggestions or preferences here?

@kardianos
Contributor

@jinzhu Can you explain to me when you would want a connection retry count other than the default (2) or zero?

This "retry" is to try to take a connection out of the connection pool, then if it is not a valid connection, return an error with BadConn. It is fundamentally about connection pooling, though the side effect might be to re-run a query if a driver mis-reports the connection to be bad, when it actually had side effects.

Turning off this connection retry I can understand: the driver is (now) provided many ways to report a connection as bad before we actually attempt a query, and we want to give the user an option not to retry.

Alternatively, the option to not retry could be implemented as a query parameter, say const NoRetry ..., which would then be passed in as a parameter to the query.

So questions:

  1. Can you justify any other option besides Default and Off?
  2. Would the use case always be for the entire connection pool? Or would this be better per query?

To answer your question:

Hello @rsc, which connReuseStrategy should we use if maxBadConnRetries is set to 0? alwaysNewConn or cachedOrNewConn?

If max retries is set to zero, you must use cachedOrNewConn. If you used alwaysNewConn, it would negate the entire point of a connection pool.
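
For context, the existing retry shape in database/sql looks roughly like the sketch below; this is a simplified paraphrase that refers to unexported internals, not the exact source. Each top-level call tries the pool up to maxBadConnRetries times with cachedOrNewConn and only then falls back to one attempt with alwaysNewConn, which is why a retry count of zero should still use cachedOrNewConn:

```go
// Simplified paraphrase of the retry loop used by DB.ExecContext and friends;
// db.exec, cachedOrNewConn, and alwaysNewConn are unexported internals.
func (db *DB) execWithRetry(ctx context.Context, query string, args []any) (Result, error) {
	var res Result
	var err error
	for i := 0; i < maxBadConnRetries; i++ {
		// Prefer a pooled connection; the driver may report it as bad.
		res, err = db.exec(ctx, query, args, cachedOrNewConn)
		if !errors.Is(err, driver.ErrBadConn) {
			return res, err
		}
	}
	// Last resort: dial a brand-new connection instead of retrying the pool again.
	return db.exec(ctx, query, args, alwaysNewConn)
}
```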

@jinzhu
Contributor Author

jinzhu commented Jun 22, 2022

Hi @kardianos ,

Thank you for your response

@jinzhu Can you explain to me when you would want a connection retry count other than the default (2) or zero?

For my use case, default or zero should be enough.

Alternatively, the option to not retry could be implemented as a query parameter, say const NoRetry ..., which would then be passed in as a parameter to the query.

That sounds like it would make things even more tricky ;)

Compared to a per-query solution, I would prefer to set up two database pools with different retry options in this case, which works perfectly for read-write splitting.
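
A sketch of that two-pool setup, assuming the proposed setter existed; the driver and DSNs are placeholders:

```go
package pools

import (
	"database/sql"

	_ "github.com/go-sql-driver/mysql" // driver and DSNs are placeholders
)

// openPools opens one pool for reads (keeping the default bad-connection
// retries) and one for writes with retries disabled, so a write is never
// silently re-run. SetMaxBadConnRetries is the proposed method, not an
// existing API.
func openPools(readDSN, writeDSN string) (readDB, writeDB *sql.DB, err error) {
	readDB, err = sql.Open("mysql", readDSN)
	if err != nil {
		return nil, nil, err
	}
	writeDB, err = sql.Open("mysql", writeDSN)
	if err != nil {
		readDB.Close()
		return nil, nil, err
	}
	writeDB.SetMaxBadConnRetries(0)
	return readDB, writeDB, nil
}
```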

@kardianos
Contributor

kardianos commented Jun 22, 2022 via email

@jinzhu
Contributor Author

jinzhu commented Jun 22, 2022

So it sounds like you do have a case where some connections (reads) would retry, and write would not.

Yes, that sounds reasonable in some cases, but in our case we would like to disable retries entirely, for https://cacm.acm.org/magazines/2013/2/160173-the-tail-at-scale/fulltext

Do you mean tricky from a db user or implementation?

For a db user

@jinzhu
Contributor Author

jinzhu commented Jun 27, 2022

Hi @kardianos

For the implementation (https://go-review.googlesource.com/c/go/+/404935), do you have any suggestions?

@rsc
Contributor

rsc commented Jul 20, 2022

ping @kardianos

@kardianos
Contributor

@jinzhu I like how your CL simplifies existing code. I can see the use case.

Upon examining the code further, only queries made directly on the DB and on a Stmt use maxBadConnRetries; Conn and Tx do not (because once the connection is acquired, it is assumed to be good). As such, if you need to ensure no retries, you can pull a dedicated *Conn or start a Tx. Most DB systems have a way to automatically roll back a transaction if the underlying network connection is terminated.

  • I would be okay merging your CL without the configuration on the DB, just as a simplification where the retry logic takes the retry count from the const rather than a DB atomic.
  • I would be okay with possibly adding a *Conn.StmtContext(*Stmt) *Stmt to allow calling a stmt on a specific Conn.
  • I would consider maybe a switch just to turn off the bad-conn retries for a pool; I don't think I would want to allow an arbitrary N to be configured.

Can you describe why we would want to make the retry configurable (on/off), rather than starting a Tx or grabbing an explicit Conn?
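
For reference, a sketch of the dedicated-*Conn approach with today's API; DB.Conn and Conn.ExecContext are existing methods, while the driver, DSN, and query are placeholders:

```go
package main

import (
	"context"
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql" // driver and DSN are placeholders
)

// insertOnce runs the statement on a dedicated connection. As described above,
// once the *Conn is acquired, statements on it are not subject to the
// bad-connection retry loop, so the INSERT is not silently re-run.
func insertOnce(ctx context.Context, db *sql.DB) error {
	conn, err := db.Conn(ctx)
	if err != nil {
		return err
	}
	defer conn.Close()

	_, err = conn.ExecContext(ctx, "INSERT INTO events(name) VALUES (?)", "signup")
	return err
}

func main() {
	db, err := sql.Open("mysql", "user:pass@tcp(127.0.0.1:3306)/app")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	if err := insertOnce(context.Background(), db); err != nil {
		log.Fatal(err)
	}
}
```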

@jinzhu
Contributor Author

jinzhu commented Jul 25, 2022

Hi @kardianos

if you need to ensure no retries, you can pull a dedicated *Conn or start a Tx. Most DB systems have a way to automatically roll back a transaction if the underlying network connection is terminated.

Yeah, this should work for me. 👍

Then I will go with the first option; I have just created another CL, thank you for your review.

https://go-review.googlesource.com/c/go/+/419182

@rsc
Contributor

rsc commented Jul 27, 2022

If we're down to CL 419182, it sounds like there are no API changes being proposed anymore, in which case we can close this issue as retracted. Do I have that right?

@jinzhu
Contributor Author

jinzhu commented Jul 28, 2022

If we're down to CL 419182, it sounds like there are no API changes being proposed anymore, in which case we can close this issue as retracted. Do I have that right?

Yes, I'm going to close this one. Thank you all for the reviews.

@jinzhu jinzhu closed this as completed Jul 28, 2022
@rsc rsc moved this from Active to Declined in Proposals (old) Aug 3, 2022
@rsc
Contributor

rsc commented Aug 3, 2022

This proposal has been declined as retracted.
— rsc for the proposal review group

@golang golang locked and limited conversation to collaborators Aug 3, 2023