proposal: database/sql: Add MaxRepreparedStatements and MaxRetries to DBStats #49078

Closed
j-delaney opened this issue Oct 19, 2021 · 7 comments

Comments

@j-delaney

This is a proposal to add 2 new fields to the DBStats struct:

```go
// Counters
// ...
MaxRetries              int64 // The total number of operation retries
MaxRepreparedStatements int64 // The total number of times a prepared statement has been reprepared due to not existing on the chosen connection
```

Implementation would use atomic.AddInt64 (similar to this PR for another new DBStat). I'm happy to handle the implementation here.
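
For concreteness, a minimal sketch of the counting pattern. The type, field, and method names below are stand-ins, not the actual database/sql internals: a cumulative counter bumped with atomic.AddInt64 on the hot path and copied out with atomic loads by a Stats method.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// statsDB and stats are stand-ins for database/sql's DB and DBStats;
// the names are illustrative only.
type statsDB struct {
	reprepared int64 // bumped each time a statement has to be re-prepared
	retried    int64 // bumped each time an operation is retried
}

type stats struct {
	MaxRepreparedStatements int64
	MaxRetries              int64
}

// noteReprepare and noteRetry are the hot-path increments.
func (d *statsDB) noteReprepare() { atomic.AddInt64(&d.reprepared, 1) }
func (d *statsDB) noteRetry()     { atomic.AddInt64(&d.retried, 1) }

// Stats copies the counters out with atomic loads, mirroring how
// DB.Stats assembles a DBStats value.
func (d *statsDB) Stats() stats {
	return stats{
		MaxRepreparedStatements: atomic.LoadInt64(&d.reprepared),
		MaxRetries:              atomic.LoadInt64(&d.retried),
	}
}

func main() {
	var d statsDB
	d.noteReprepare()
	d.noteRetry()
	fmt.Printf("%+v\n", d.Stats())
}
```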

MaxRepreparedStatements

An int64 field that exposes the cumulative number of times a prepared statement was executed on a connection that did not already have the statement prepared on it, so the statement had to be re-prepared first. This happens here in database/sql.
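
For illustration, the decision being counted looks roughly like the following. This is a toy sketch, not the actual connStmt code in database/sql; the real code walks the statement's per-connection list of prepared handles.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// stmt is a toy stand-in for a prepared statement that tracks which
// connections it has already been prepared on.
type stmt struct {
	query      string
	preparedOn map[int]bool // connection ID -> already prepared?
	reprepared int64        // the counter the proposal would surface
}

// useOn reports whether the statement had to be re-prepared because the
// chosen connection had never seen it.
func (s *stmt) useOn(connID int) bool {
	if s.preparedOn[connID] {
		return false // fast path: already prepared on this connection
	}
	atomic.AddInt64(&s.reprepared, 1) // proposed MaxRepreparedStatements++
	s.preparedOn[connID] = true
	return true
}

func main() {
	// Simulate the initial Prepare happening on connection 1.
	s := &stmt{query: "SELECT 1", preparedOn: map[int]bool{1: true}}
	s.useOn(1) // already prepared: no re-prepare
	s.useOn(2) // different connection: re-prepare, counter++
	fmt.Println("reprepared:", atomic.LoadInt64(&s.reprepared))
}
```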

Why?

Prepared statements offer a performance benefit since the server can use more efficient protocols and/or skip re-analyzing and planning the query (PostgreSQL docs, MySQL docs). This benefit can be lost, though, if the statement needs to be reprepared before being executed, as that is another network round trip and the server may need to re-plan and re-analyze the query.

This metric would be useful to monitor to see whether it correlates with increased latencies. It's also useful for finding optimal values for settings like SetConnMaxLifetime. If MaxRepreparedStatements is constantly increasing, it might mean you need to increase the lifespan of your connections so statements don't need to be reprepared on new connections.
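
For context, this is how such counters would typically be consumed: the application polls db.Stats() on a timer and ships the values to its metrics system. A small sketch using only fields that exist in DBStats today; the proposed fields would be read the same way, and the package name and logging backend here are placeholders.

```go
package dbmetrics

import (
	"database/sql"
	"log"
	"time"
)

// PollStats periodically reads DBStats and hands the counters to a
// metrics backend (stubbed out as log.Printf here). The proposed
// MaxRepreparedStatements and MaxRetries fields would be reported in
// the same way as existing counters such as WaitCount.
func PollStats(db *sql.DB, interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-stop:
			return
		case <-t.C:
			s := db.Stats()
			log.Printf("open=%d wait_count=%d wait_duration=%s",
				s.OpenConnections, s.WaitCount, s.WaitDuration)
			// If the proposal were accepted, s.MaxRepreparedStatements
			// and s.MaxRetries would be reported here as well.
		}
	}
}
```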

What about using server metrics for this?

Some SQL databases expose a counter for how often PREPARE is called (e.g. Com_stmt_prepare in MySQL). While this is useful, it does not cover the same use case as a driver metric. Driver metrics can be more specific (since each service might have its own driver instance, you could attach that as a metric tag). Com_stmt_prepare can also increase in cases where MaxRepreparedStatements doesn't: if a driver is used by only a single goroutine, prepares a statement, executes it 5 times synchronously, and then closes the statement, then Com_stmt_prepare will increase on the database but MaxRepreparedStatements shouldn't increase in the driver.

MaxRetries

An int64 field that exposes the cumulative number of times an operation has been retried. For example, when calling ExecContext, if the initial attempt fails with driver.ErrBadConn, database/sql will retry once with a cached or new connection and then once more with a forced new connection. Each of these retries would increment the field by 1.
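
A simplified, self-contained sketch of that retry shape follows. It is not the actual database/sql code: execWithRetry and the counter placement are illustrative, and the constant only mirrors the maxBadConnRetries value database/sql uses.

```go
package main

import (
	"database/sql/driver"
	"errors"
	"fmt"
	"sync/atomic"
)

// retries is the cumulative counter the proposal would add to DBStats.
var retries int64

// execWithRetry mirrors the shape of database/sql's ErrBadConn handling:
// up to two attempts that may reuse a cached connection, then one last
// attempt that forces a brand-new connection. Every attempt after the
// first bumps the proposed counter.
func execWithRetry(attempt func(forceNewConn bool) error) error {
	const maxBadConnRetries = 2
	var err error
	for i := 0; i < maxBadConnRetries; i++ {
		if i > 0 {
			atomic.AddInt64(&retries, 1) // proposed MaxRetries++
		}
		err = attempt(false)
		if !errors.Is(err, driver.ErrBadConn) {
			return err
		}
	}
	atomic.AddInt64(&retries, 1) // proposed MaxRetries++
	return attempt(true)         // last resort: force a new connection
}

func main() {
	attempts := 0
	_ = execWithRetry(func(forceNewConn bool) error {
		attempts++
		if !forceNewConn {
			return driver.ErrBadConn // simulate a stale pooled connection
		}
		return nil
	})
	fmt.Printf("attempts=%d retries counted=%d\n",
		attempts, atomic.LoadInt64(&retries))
}
```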

Why?

This metric is useful for determining the source of tail latencies. Retries can lead to establishing an entirely new connection, which has overhead. This would be a useful metric to monitor to see whether it correlates with increased latencies. It could indicate networking issues or other problems connecting and sending data to the DB.

@ianlancetaylor ianlancetaylor changed the title database/sql: Add MaxRepreparedStatements and MaxRetries to DBStats proposal: database/sql: Add MaxRepreparedStatements and MaxRetries to DBStats Oct 20, 2021
@gopherbot gopherbot added this to the Proposal milestone Oct 20, 2021
@ianlancetaylor
Contributor

CC @kardianos

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Oct 20, 2021
@rsc
Contributor

rsc commented Oct 27, 2021

This seems incredibly specific. Do we really need to add it to the standard package?
Could a custom driver watch for this kind of thing?
Or is there some better 'back ends might not be healthy' metric?

@rsc rsc moved this from Incoming to Active in Proposals (old) Oct 27, 2021
@rsc
Contributor

rsc commented Oct 27, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Contributor

rsc commented Nov 3, 2021

ping @kardianos. Still not sure this is the right approach for the general problem of detecting unhealthy backends.

@kardianos
Contributor

I don't think this would help much. Knowing the current DB connection pool size, either from the client or the server, and comparing it to the tail latency would probably be more useful. In a sql/v2, retry behavior/implementation would be something I would want to be able to swap out, but I don't think adding this statistic would be worthwhile.

Possibly one proxy you could use instead is to look at new connections by creating a Connector wrapper. If a DB restarts and all the connections close and must be "retried", then you would see many new DB connections being made.
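
For reference, a minimal sketch of that Connector wrapper, using only the existing driver.Connector and sql.OpenDB APIs; the package name, OpenCounted, and the counter are illustrative, and the inner connector would come from whichever driver is in use.

```go
package connmetrics

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"sync/atomic"
)

// countingConnector wraps any driver.Connector and counts how many new
// connections get established; a burst of new connections is a rough
// proxy for "retries after the backend went away".
type countingConnector struct {
	driver.Connector // Driver() is promoted from the wrapped connector
	newConns int64
}

func (c *countingConnector) Connect(ctx context.Context) (driver.Conn, error) {
	conn, err := c.Connector.Connect(ctx)
	if err == nil {
		atomic.AddInt64(&c.newConns, 1)
	}
	return conn, err
}

// NewConns reports how many connections have been opened so far.
func (c *countingConnector) NewConns() int64 {
	return atomic.LoadInt64(&c.newConns)
}

// OpenCounted wires the wrapper into database/sql via sql.OpenDB. The
// inner connector comes from your driver (e.g. mysql.NewConnector or
// pq.NewConnector, not shown here).
func OpenCounted(inner driver.Connector) (*sql.DB, *countingConnector) {
	cc := &countingConnector{Connector: inner}
	return sql.OpenDB(cc), cc
}
```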

I would suggest "decline".

@rsc rsc moved this from Active to Likely Decline in Proposals (old) Nov 10, 2021
@rsc
Contributor

rsc commented Nov 10, 2021

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@rsc rsc moved this from Likely Decline to Declined in Proposals (old) Dec 1, 2021
@rsc
Contributor

rsc commented Dec 1, 2021

No change in consensus, so declined.
— rsc for the proposal review group

@rsc rsc closed this as completed Dec 1, 2021
@golang golang locked and limited conversation to collaborators Dec 1, 2022