New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/pkgsite: SQL error on insert #43899
Comments
I wasn't able to duplicate this running locally against the dev DB. Nothing to do right now; leaving open in case it happens again. |
Saw it again for src.elv.sh@v0.7.0. Could not repro. |
Change https://golang.org/cl/287793 mentions this issue: |
Add logging to the insertUnits function to see if we can understand why we occasionally see "ON CONFLICT DO UPDATE command cannot affect row a second time". According to https://pganalyze.com/docs/log-insights/app-errors/U126, it happens because we have duplicate conflict columns. For golang/go#43899 Change-Id: Idba3018708c3f5ee451e574b2c747929a5e2c30d Reviewed-on: https://go-review.googlesource.com/c/pkgsite/+/287793 Trust: Jonathan Amsterdam <jba@google.com> Run-TryBot: Jonathan Amsterdam <jba@google.com> TryBot-Result: kokoro <noreply+kokoro@google.com> Reviewed-by: Jamal Carvalho <jamal@golang.org>
Change https://golang.org/cl/290269 mentions this issue: |
Increase the isolation level of the transaction that inserts a module to a level which does not allow non-repeatable reads. This fixes a bug in which insertPaths, which purports to return a map of all paths relevant to the module, fails to include some paths. To understand the bug, it is necessary to understand exactly how insertPaths works. To avoid a large and slow bulk insert, it first reads all the paths relevant to the module, then inserts only those that are missing. The first part is done with a SELECT, and the second with an INSERT ... ON CONFLICT DO NOTHING ... RETURNING. The paths it returns combine the ones it read in the SELECT with the new ones that are inserted. The INSERT statement only returns data for rows it inserted, not rows it skipped due to a conflict. The key point to notice here is that insertPaths performs two reads of the database. The first is, of course, the SELECT. The second occurs inside the INSERT statement: the database must read the table for potentially conflicting rows. At the default isolation level, a transaction can experience non-repeatable reads. As defined in https://www.postgresql.org/docs/11/transaction-iso.html, that is when "a transaction re-reads data it has previously read and finds that data has been modified by another transaction (that committed since the initial read)." To see how a non-repeatable read can happen in insertPaths, imagine that two different versions of the same module are being inserted at the same time. They will probably have many of the same paths; certainly they will share the module path itself. Also imagine that this is the first time we've seen this module. The insertPaths call for one version reads the paths and does not find any of them. It prepares its insert statement. Meanwhile, the transaction for the other version runs all the way to completion, committing all the module's paths to the database. Now the insert statement for the first version runs. It finds that many (perhaps all) of its paths are already present, because the other transaction committed them and this second read is not obligated to repeat the results of the first read. Each path that is present results in a conflict, and conflicts do not result in a returned row. These paths were not in the initial SELECT, and they are not returned by the INSERT, so they do not appear in the result. By running the transaction at a level that disallows non-repeatable reads, we can be sure that the second read sees exactly what the first read saw. Every path will appear in either the SELECT or the rows returned from the INSERT. I was able to confirm that this worked with high probability by repeatedly running a test that inserted multiple versions of the same module. It failed most of the time at default isolation and never failed at RepeatableRead. The test will be in a forthcoming CL. Although this fixes the problem with insertPaths, there may still be a second problem that results in duplicate-row problem described in the issue below. So not marking it fixed just yet. For golang/go#43899 Change-Id: Ia3f0f89da1f2327d9f3d16d1826c5376ee6e5b47 Reviewed-on: https://go-review.googlesource.com/c/pkgsite/+/290269 Trust: Jonathan Amsterdam <jba@google.com> Run-TryBot: Jonathan Amsterdam <jba@google.com> TryBot-Result: kokoro <noreply+kokoro@google.com> Reviewed-by: Julie Qiu <julie@golang.org>
Change https://golang.org/cl/291609 mentions this issue: |
We've noticed that we aren't retrying errors that have the right serialization code. The likely reason is that the error type is not what we expect. Get information about the type so we can recognize it in the code. For golang/go#43899 Change-Id: I72aceb12002eecbf580654b77c68d845759e8182 Reviewed-on: https://go-review.googlesource.com/c/pkgsite/+/291609 Trust: Jonathan Amsterdam <jba@google.com> Run-TryBot: Jonathan Amsterdam <jba@google.com> TryBot-Result: kokoro <noreply+kokoro@google.com> Reviewed-by: Julie Qiu <julie@golang.org>
I think this may have been due to bad logging: we were logging serialization-failure errors at a low level, without realizing that they would be retried higher up. https://golang.org/cl/302190 fixed that. |
at 2021-01-24T20:48:17.318677958Z
The text was updated successfully, but these errors were encountered: