sync: occasional panic "inconsistent mutex state" on sync.allPoolsMu #56648

Closed
supergao222 opened this issue Nov 8, 2022 · 7 comments
Labels
compiler/runtime: Issues related to the Go compiler and/or runtime.
FrozenDueToAge
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
WaitingForInfo: Issue is not actionable because of missing required information, which needs to be provided.

supergao222 commented Nov 8, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18.3 linux/amd64

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

What did you expect to see?

no panic

What did you see instead?

I encounter an occasional panic, "inconsistent mutex state", when the server calls sync.allPoolsMu.Lock() in pinSlow(). The panic always occurs on sync.allPoolsMu; all other mutexes work fine.

I collected core dumps from six of these panics and printed the value of allPoolsMu using Delve. The values look implausible:

1. sync.allPoolsMu = sync.Mutex {state: -1159398752, sema: 192}
2. sync.allPoolsMu = sync.Mutex {state: -679418736, sema: 192}
3. sync.allPoolsMu = sync.Mutex {state: -1955878282, sema: 192}
4. sync.allPoolsMu = sync.Mutex {state: 1481213382, sema: 192}
5. sync.allPoolsMu = sync.Mutex {state: -1298326986, sema: 192}
6. sync.allPoolsMu = sync.Mutex {state: -565380458, sema: 192}
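
To see why these values look implausible, it can help to decode them the way sync.Mutex interprets its state word. The sketch below assumes the bit layout found in sync/mutex.go around this release: bit 0 is the locked flag, bit 1 the woken flag, bit 2 the starving flag, and the remaining bits hold the waiter count. Those constants are unexported, so they are copied here by hand, and `mutexState` and `decodeState` are hypothetical helpers, not part of the standard library.

```go
package main

import "fmt"

// Bit layout of sync.Mutex.state, copied by hand from sync/mutex.go.
// These constants are unexported in the standard library; treat the
// values here as an assumption about the Go version in use.
const (
	mutexLocked      = 1 << iota // bit 0: mutex is held
	mutexWoken                   // bit 1: a waiter has been woken
	mutexStarving                // bit 2: mutex is in starvation mode
	mutexWaiterShift = iota      // waiter count starts at bit 3
)

// mutexState is a hypothetical, human-readable view of the state word.
type mutexState struct {
	Locked, Woken, Starving bool
	Waiters                 int32
}

// decodeState splits a raw state word into its flags and waiter count.
func decodeState(state int32) mutexState {
	return mutexState{
		Locked:   state&mutexLocked != 0,
		Woken:    state&mutexWoken != 0,
		Starving: state&mutexStarving != 0,
		Waiters:  state >> mutexWaiterShift, // arithmetic shift keeps the sign
	}
}

func main() {
	// The six values reported from the core dumps.
	states := []int32{
		-1159398752, -679418736, -1955878282,
		1481213382, -1298326986, -565380458,
	}
	for _, s := range states {
		fmt.Printf("%11d -> %+v\n", s, decodeState(s))
	}
}
```

Under this layout, the first value decodes to a negative waiter count and the fourth to roughly 185 million waiters. Neither is something a healthy mutex should reach, which fits the reading that the word was overwritten by something other than the mutex code.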
@supergao222 changed the title from 'affected/package: sync, occasional panic "inconsistent mutex state" on sync.allPoolsMu' to 'sync: occasional panic "inconsistent mutex state" on sync.allPoolsMu' Nov 8, 2022
@gopherbot added the compiler/runtime label Nov 8, 2022
mknyszek (Contributor) commented Nov 8, 2022

That looks like memory corruption. Do you have a way to reproduce? What happens if you run your application with the race detector?

CC @golang/runtime
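
For readers unfamiliar with the race detector mentioned above: it is enabled with the `-race` flag of `go build`, `go run`, and `go test`. The following is a minimal sketch of the kind of bug it reports, not code from the affected application. `racyCount` and `safeCount` are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// racyCount increments a shared counter from n goroutines with no
// synchronization. Running this under -race should produce a report
// for the unsynchronized accesses to count.
func racyCount(n int) int {
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			count++ // unsynchronized read-modify-write: a data race
		}()
	}
	wg.Wait()
	return count
}

// safeCount does the same work but guards the counter with a mutex,
// so every increment is ordered with respect to the others.
func safeCount(n int) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	fmt.Println("racy:", racyCount(1000)) // result is not guaranteed
	fmt.Println("safe:", safeCount(1000))
}
```

The race detector only reports races on code paths that actually execute, so it is most useful when run against a workload that resembles the one in which the failure occurs.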

@mknyszek added this to the Backlog milestone Nov 8, 2022
@mknyszek added the NeedsInvestigation label Nov 8, 2022
@bingzhuo2008

I have this problem too. It happens sporadically in our production environment and is hard to reproduce with stress tests. Besides unsafe.Pointer, what other operations can lead to memory corruption?

@satyrswang

> That looks like memory corruption. Do you have a way to reproduce? What happens if you run your application with the race detector?
>
> CC @golang/runtime

@mknyszek Why would memory corruption in other variables, caused by a data race, make the state of the mutex unpredictable?

@satyrswang

Could it be an integer overflow, or something like that? But is there no memory protection?

CAFxX (Contributor) commented Nov 9, 2022

@satyrswang If your program imports unsafe, imports packages that transitively import unsafe, or (IIRC) uses cgo, then yes, memory corruption can occur anywhere, including in the mutex state. Data races can produce similar effects.
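
As a rough illustration of how unsafe code can damage unrelated data, the sketch below overruns a small buffer and clobbers the field that follows it. The `victim` struct and `overrun` function are hypothetical; the `state` and `sema` fields merely stand in for the two words of a sync.Mutex and are not the real thing.

```go
package main

import (
	"fmt"
	"unsafe"
)

// victim is a hypothetical layout: a small buffer followed by fields
// that stand in for sync.Mutex's state and sema words.
type victim struct {
	buf   [8]byte
	state int32
	sema  uint32
}

// overrun fills n bytes starting at v.buf with 0xAB. Nothing stops n
// from exceeding len(v.buf); any extra bytes land in the fields that
// follow the buffer in memory.
func overrun(v *victim, n int) {
	p := unsafe.Slice(&v.buf[0], n) // bounds are whatever the caller claims
	for i := range p {
		p[i] = 0xAB
	}
}

func main() {
	var v victim
	overrun(&v, 12) // 8 bytes fill buf, 4 more overwrite state
	fmt.Printf("state=%d sema=%d\n", v.state, v.sema)
}
```

The write never touches `state` by name, yet `state` ends up holding a large negative number. In a real program the overrun and the victim can live in entirely different packages, which is why a symptom in one place says little about where the bug is.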

@mknyszek added the WaitingForInfo label Nov 9, 2022
mknyszek (Contributor) commented Nov 9, 2022

I think we need a way to reproduce this to move forward; otherwise we'll just be guessing. Off the top of my head, I can't think of a change involving mutexes in the last couple of releases, and memory corruption is too broad a category to pin on any particular recent change.

@gopherbot

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@golang locked and limited conversation to collaborators Dec 9, 2023