Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: fatal error: notewakeup - double wakeup #19305

Closed
a-robinson opened this issue Feb 27, 2017 · 18 comments
Closed

runtime: fatal error: notewakeup - double wakeup #19305

a-robinson opened this issue Feb 27, 2017 · 18 comments

Comments

@a-robinson
Copy link

What version of Go are you using (go version)?

go version go1.8 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/go"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="/x-tools/x86_64-unknown-linux-gnu/bin/x86_64-unknown-linux-gnu-gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/go/src/github.com/cockroachdb/cockroach/artifacts/go-build113899826=/tmp/go-build -gno-record-gcc-switches"
CXX="/x-tools/x86_64-unknown-linux-gnu/bin/x86_64-unknown-linux-gnu-g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-static-libgcc -static-libstdc++"

What did you do?

Unfortunately, we have no idea yet whether this is reproducible and we don't have anything resembling a minimal test case for it. This triggered in one of CockroachDB's nightly test runs, which runs inside a docker image. To re-run the test that hit this, run:

git clone https://github.com/cockroachdb/cockroach.git
cd cockroach
git checkout 382709bd132741c65ef9759c02b08d52b7510d86
./build/builder.sh env COCKROACH_NIGHTLY_STRESS=false COCKROACH_PROPOSER_EVALUATED_KV=true make stress PKG=github.com/cockroachdb/cockroach/pkg/sql/sqlbase GOFLAGS= TAGS= TESTTIMEOUT=30m TESTFLAGS=-test.v 'STRESSFLAGS=-maxtime 15m -maxfails 1 -stderr'

What did you expect to see?

The tests to run successfully for up to 15 minutes.

What did you see instead?

A go runtime failure:

notewakeup - double wakeup (1)
fatal error: notewakeup - double wakeup

I couldn't find any reference to errors like this since #5139, so I'm opening this as a tracker in case others hit the same issue even though we don't have a great repro case.

The full output of the build log is here, also copied below:

[02:43:00][Step 2/2] 58 runs so far, 0 failures, over 5s
[02:43:05][Step 2/2] 121 runs so far, 0 failures, over 10s
[02:43:10][Step 2/2] 188 runs so far, 0 failures, over 15s
[02:43:10][Step 2/2] 189 runs completed, 1 failures, over 15s
[02:43:10][Step 2/2] FAIL
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] I170227 07:43:10.629748 1 rand.go:76  Random seed: 934809813366473345
[02:43:10][Step 2/2] === RUN   TestEncDatum
[02:43:10][Step 2/2] --- PASS: TestEncDatum (0.02s)
[02:43:10][Step 2/2] === RUN   TestEncDatumNull
[02:43:10][Step 2/2] --- PASS: TestEncDatumNull (0.00s)
[02:43:10][Step 2/2] === RUN   TestEncDatumCompare
[02:43:10][Step 2/2] --- PASS: TestEncDatumCompare (0.00s)
[02:43:10][Step 2/2] === RUN   TestEncDatumFromBuffer
[02:43:10][Step 2/2] --- PASS: TestEncDatumFromBuffer (0.01s)
[02:43:10][Step 2/2] === RUN   TestEncDatumRowCompare
[02:43:10][Step 2/2] --- PASS: TestEncDatumRowCompare (0.00s)
[02:43:10][Step 2/2] === RUN   TestEncDatumRowAlloc
[02:43:10][Step 2/2] notewakeup - double wakeup (1)
[02:43:10][Step 2/2] fatal error: notewakeup - double wakeup
[02:43:10][Step 2/2] --- PASS: TestEncDatumRowAlloc (0.01s)
[02:43:10][Step 2/2] === RUN   TestKeyAddress
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] runtime stack:
[02:43:10][Step 2/2] runtime.throw(0x1aea9b9, 0x1a)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/panic.go:596 +0x95 fp=0x2b2d7ac00a70 sp=0x2b2d7ac00a50
[02:43:10][Step 2/2] runtime.notewakeup(0x2b4e2d8)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/lock_futex.go:133 +0xb4 fp=0x2b2d7ac00a98 sp=0x2b2d7ac00a70
[02:43:10][Step 2/2] runtime.gchelper()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1923 +0xf5 fp=0x2b2d7ac00ac8 sp=0x2b2d7ac00a98
[02:43:10][Step 2/2] runtime.stopm()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:1653 +0xdd fp=0x2b2d7ac00af0 sp=0x2b2d7ac00ac8
[02:43:10][Step 2/2] runtime.findrunnable(0xc420022600, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:2102 +0x2e4 fp=0x2b2d7ac00b88 sp=0x2b2d7ac00af0
[02:43:10][Step 2/2] runtime.schedule()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:2222 +0x14c fp=0x2b2d7ac00bc8 sp=0x2b2d7ac00b88
[02:43:10][Step 2/2] runtime.park_m(0xc42039b040)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:2285 +0xab fp=0x2b2d7ac00c00 sp=0x2b2d7ac00bc8
[02:43:10][Step 2/2] runtime.mcall(0x2b2d77b842c0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:269 +0x5b fp=0x2b2d7ac00c10 sp=0x2b2d7ac00c00
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 1 [chan receive]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3acc8, 0xc42032a0b8, 0x1ad9fc1, 0xc, 0xc42001e117, 0x3)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc4201c5af0 sp=0xc4201c5ac0
[02:43:10][Step 2/2] runtime.goparkunlock(0xc42032a0b8, 0x1ad9fc1, 0xc, 0x17, 0x3)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:277 +0x5e fp=0xc4201c5b30 sp=0xc4201c5af0
[02:43:10][Step 2/2] runtime.chanrecv(0x18d1680, 0xc42032a060, 0x0, 0xc4201c5c01, 0x77492a)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/chan.go:513 +0x371 fp=0xc4201c5bd0 sp=0xc4201c5b30
[02:43:10][Step 2/2] runtime.chanrecv1(0x18d1680, 0xc42032a060, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/chan.go:395 +0x35 fp=0xc4201c5c08 sp=0xc4201c5bd0
[02:43:10][Step 2/2] testing.(*T).Run(0xc4203aeb60, 0x1adc03f, 0xe, 0x1b38ef0, 0xc4200bfc01)
[02:43:10][Step 2/2] 	/usr/local/go/src/testing/testing.go:698 +0x2f4 fp=0xc4201c5cb0 sp=0xc4201c5c08
[02:43:10][Step 2/2] testing.runTests.func1(0xc4203aeb60)
[02:43:10][Step 2/2] 	/usr/local/go/src/testing/testing.go:882 +0x67 fp=0xc4201c5d00 sp=0xc4201c5cb0
[02:43:10][Step 2/2] testing.tRunner(0xc4203aeb60, 0xc4200bfdb0)
[02:43:10][Step 2/2] 	/usr/local/go/src/testing/testing.go:657 +0x96 fp=0xc4201c5d28 sp=0xc4201c5d00
[02:43:10][Step 2/2] testing.runTests(0xc4203a8e20, 0x2650ee0, 0x16, 0x16, 0x2b4c640)
[02:43:10][Step 2/2] 	/usr/local/go/src/testing/testing.go:888 +0x2c1 fp=0xc4201c5de0 sp=0xc4201c5d28
[02:43:10][Step 2/2] testing.(*M).Run(0xc4201c5f20, 0x2b90f40)
[02:43:10][Step 2/2] 	/usr/local/go/src/testing/testing.go:822 +0xfc fp=0xc4201c5ed0 sp=0xc4201c5de0
[02:43:10][Step 2/2] github.com/cockroachdb/cockroach/pkg/sql/sqlbase_test.TestMain(0xc4200bff20)
[02:43:10][Step 2/2] 	/go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/main_test.go:34 +0x8e fp=0xc4201c5f00 sp=0xc4201c5ed0
[02:43:10][Step 2/2] main.main()
[02:43:10][Step 2/2] 	github.com/cockroachdb/cockroach/pkg/sql/sqlbase/_test/_testmain.go:84 +0xf7 fp=0xc4201c5f88 sp=0xc4201c5f00
[02:43:10][Step 2/2] runtime.main()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:185 +0x20a fp=0xc4201c5fe0 sp=0xc4201c5f88
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4201c5fe8 sp=0xc4201c5fe0
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 17 [syscall, locked to thread]:
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc420042fe8 sp=0xc420042fe0
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 2 [force gc (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3acc8, 0x2b4b650, 0x1add4b8, 0xf, 0x1b3ab14, 0x1)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc420030768 sp=0xc420030738
[02:43:10][Step 2/2] runtime.goparkunlock(0x2b4b650, 0x1add4b8, 0xf, 0xc420000114, 0x1)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:277 +0x5e fp=0xc4200307a8 sp=0xc420030768
[02:43:10][Step 2/2] runtime.forcegchelper()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:226 +0x9e fp=0xc4200307e0 sp=0xc4200307a8
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4200307e8 sp=0xc4200307e0
[02:43:10][Step 2/2] created by runtime.init.4
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:215 +0x35
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 3 [runnable]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3acc8, 0x2b4b940, 0x1adac0a, 0xd, 0x6db714, 0x1)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc420030f58 sp=0xc420030f28
[02:43:10][Step 2/2] runtime.goparkunlock(0x2b4b940, 0x1adac0a, 0xd, 0x14, 0x1)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:277 +0x5e fp=0xc420030f98 sp=0xc420030f58
[02:43:10][Step 2/2] runtime.bgsweep(0xc42005e000)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgcsweep.go:72 +0x12e fp=0xc420030fd8 sp=0xc420030f98
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc420030fe0 sp=0xc420030fd8
[02:43:10][Step 2/2] created by runtime.gcenable
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:212 +0x61
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 4 [finalizer wait]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3acc8, 0x2b91370, 0x1adc36b, 0xe, 0x14, 0x1)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc420031718 sp=0xc4200316e8
[02:43:10][Step 2/2] runtime.goparkunlock(0x2b91370, 0x1adc36b, 0xe, 0x14, 0x1)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:277 +0x5e fp=0xc420031758 sp=0xc420031718
[02:43:10][Step 2/2] runtime.runfinq()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mfinal.go:161 +0xb2 fp=0xc4200317e0 sp=0xc420031758
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4200317e8 sp=0xc4200317e0
[02:43:10][Step 2/2] created by runtime.createfing
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mfinal.go:142 +0x62
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 5 [chan receive]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3acc8, 0xc4202d0058, 0x1ad9fc1, 0xc, 0xc4202dc017, 0x3)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc420031e70 sp=0xc420031e40
[02:43:10][Step 2/2] runtime.goparkunlock(0xc4202d0058, 0x1ad9fc1, 0xc, 0x17, 0x3)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:277 +0x5e fp=0xc420031eb0 sp=0xc420031e70
[02:43:10][Step 2/2] runtime.chanrecv(0x18d1340, 0xc4202d0000, 0xc420031fb0, 0x6fc23ac01, 0xc4202d4000)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/chan.go:513 +0x371 fp=0xc420031f50 sp=0xc420031eb0
[02:43:10][Step 2/2] runtime.chanrecv2(0x18d1340, 0xc4202d0000, 0xc420031fb0, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/chan.go:400 +0x35 fp=0xc420031f88 sp=0xc420031f50
[02:43:10][Step 2/2] github.com/cockroachdb/cockroach/pkg/util/log.(*loggingT).flushDaemon(0x2b4dec0)
[02:43:10][Step 2/2] 	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:1012 +0x77 fp=0xc420031fd8 sp=0xc420031f88
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc420031fe0 sp=0xc420031fd8
[02:43:10][Step 2/2] created by github.com/cockroachdb/cockroach/pkg/util/log.init.1
[02:43:10][Step 2/2] 	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:583 +0xd2
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 6 [chan receive]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3acc8, 0xc42023cfb8, 0x1ad9fc1, 0xc, 0xc4203c6017, 0x3)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc420032688 sp=0xc420032658
[02:43:10][Step 2/2] runtime.goparkunlock(0xc42023cfb8, 0x1ad9fc1, 0xc, 0x17, 0x3)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:277 +0x5e fp=0xc4200326c8 sp=0xc420032688
[02:43:10][Step 2/2] runtime.chanrecv(0x18d20c0, 0xc42023cf60, 0xc4200327c0, 0x1, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/chan.go:513 +0x371 fp=0xc420032768 sp=0xc4200326c8
[02:43:10][Step 2/2] runtime.chanrecv2(0x18d20c0, 0xc42023cf60, 0xc4200327c0, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/chan.go:400 +0x35 fp=0xc4200327a0 sp=0xc420032768
[02:43:10][Step 2/2] github.com/cockroachdb/cockroach/pkg/util/log.(*loggingT).gcDaemon(0x2b4dec0)
[02:43:10][Step 2/2] 	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:1039 +0x5b fp=0xc4200327d8 sp=0xc4200327a0
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4200327e0 sp=0xc4200327d8
[02:43:10][Step 2/2] created by github.com/cockroachdb/cockroach/pkg/util/log.init.1
[02:43:10][Step 2/2] 	/go/src/github.com/cockroachdb/cockroach/pkg/util/log/clog.go:584 +0xf6
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 18 [syscall]:
[02:43:10][Step 2/2] runtime.notetsleepg(0x2b4c338, 0x12a05a637, 0x1)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/lock_futex.go:205 +0x42 fp=0xc42002c760 sp=0xc42002c730
[02:43:10][Step 2/2] runtime.timerproc()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/time.go:209 +0x327 fp=0xc42002c7e0 sp=0xc42002c760
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc42002c7e8 sp=0xc42002c7e0
[02:43:10][Step 2/2] created by runtime.addtimerLocked
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/time.go:116 +0xed
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 9 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc420377ec0, 0x1addf05, 0x10, 0x258ac614, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc420033750 sp=0xc420033720
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420020000)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc4200337d8 sp=0xc420033750
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4200337e0 sp=0xc4200337d8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 10 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc420377ed0, 0x1addf05, 0x10, 0x14, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc420033f50 sp=0xc420033f20
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420021300)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc420033fd8 sp=0xc420033f50
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc420033fe0 sp=0xc420033fd8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 11 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc420377ee0, 0x1addf05, 0x10, 0x14, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc4205d6750 sp=0xc4205d6720
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420022600)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc4205d67d8 sp=0xc4205d6750
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4205d67e0 sp=0xc4205d67d8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 12 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc420377ef0, 0x1addf05, 0x10, 0x14, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc4205d6f50 sp=0xc4205d6f20
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420023900)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc4205d6fd8 sp=0xc4205d6f50
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc4205d6fe0 sp=0xc4205d6fd8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 19 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc4205da000, 0x1addf05, 0x10, 0x14, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc42002cf50 sp=0xc42002cf20
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420024c00)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc42002cfd8 sp=0xc42002cf50
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc42002cfe0 sp=0xc42002cfd8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 20 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc4205da010, 0x1addf05, 0x10, 0x14, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc42002d750 sp=0xc42002d720
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420026000)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc42002d7d8 sp=0xc42002d750
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc42002d7e0 sp=0xc42002d7d8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 21 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc4205da020, 0x1addf05, 0x10, 0x14, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc42002df50 sp=0xc42002df20
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420027300)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc42002dfd8 sp=0xc42002df50
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc42002dfe0 sp=0xc42002dfd8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 22 [GC worker (idle)]:
[02:43:10][Step 2/2] runtime.gopark(0x1b3ab48, 0xc4205da030, 0x1addf05, 0x10, 0x14, 0x0)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/proc.go:271 +0x13a fp=0xc42002e750 sp=0xc42002e720
[02:43:10][Step 2/2] runtime.gcBgMarkWorker(0xc420028600)
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1491 +0x138 fp=0xc42002e7d8 sp=0xc42002e750
[02:43:10][Step 2/2] runtime.goexit()
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc42002e7e0 sp=0xc42002e7d8
[02:43:10][Step 2/2] created by runtime.gcBgMarkStartWorkers
[02:43:10][Step 2/2] 	/usr/local/go/src/runtime/mgc.go:1412 +0x98
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] goroutine 50 [running]:
[02:43:10][Step 2/2] 	goroutine running on other thread; stack unavailable
[02:43:10][Step 2/2] created by testing.(*T).Run
[02:43:10][Step 2/2] 	/usr/local/go/src/testing/testing.go:697 +0x2ca
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] ERROR: exit status 2
[02:43:10][Step 2/2] 
[02:43:10][Step 2/2] Makefile:177: recipe for target 'stress' failed
[02:43:10][Step 2/2] make: *** [stress] Error 1
@josharian
Copy link
Contributor

josharian commented Feb 27, 2017

Thanks for the report. The usual questions for runtime bugs: Any cgo? Any unsafe? Clean bill of health from the race detector and msan?

cc @aclements

@a-robinson
Copy link
Author

cgo: yes, but not in this the code path for these tests
unsafe: yes
race detector: clean
msan: I'm not sure whether we've run with msan recently -- @tamird, do you know?

@tamird
Copy link
Contributor

tamird commented Feb 27, 2017

@bdarnell recently has; clean AFAIK.

@ianlancetaylor
Copy link
Contributor

Does your program ever call runtime.GOMAXPROCS?

CC @aclements @RLH

@ianlancetaylor ianlancetaylor added this to the Go1.9 milestone Feb 27, 2017
@bdarnell
Copy link

We were msan-clean a month ago. My scripts for msan builds appear to have rotted so I don't have a fresh run.

There are no calls to GOMAXPROCS in our code.

@tamird
Copy link
Contributor

tamird commented Feb 28, 2017

Confirmed that we are msan-clean.

@bradfitz
Copy link
Contributor

@a-robinson
Copy link
Author

We hit this again in cockroachdb/cockroach#14060 (logs here).

Is there anything we could instrument our code with to help get some info if we hit it again?

@tamird
Copy link
Contributor

tamird commented Mar 28, 2017

Seen again in cockroachdb/cockroach#14407.

@tamird
Copy link
Contributor

tamird commented Mar 29, 2017

Another cockroachdb/cockroach#14440.

@tamird
Copy link
Contributor

tamird commented Apr 9, 2017

Another cockroachdb/cockroach#14726.

@quentinmit
Copy link
Contributor

This just happened in a trybot run while running heapsampling.go:

http://storage.googleapis.com/dev-go-build-log/cbb91820/windows-amd64-gce_3070529c.log

@aclements
Copy link
Member

Sorry for the slow response on this. I've looked at this several times but today I finally came up with a strong theory.

@a-robinson or @tamird, is this happening often enough that you'd be able to test and confirm a small fix? If not, that's fine too. The change really does fix a bug and it's zero risk, so we can put it in 1.8.2. I'm just not 100% positive it's the cause and it would be nice to confirm that if possible.

@tamird
Copy link
Contributor

tamird commented May 11, 2017

@aclements unfortunately I don't think so. We've been linking every occurrence to this issue, so if you scroll through the "referenced this issue" list in this thread, you'll see that it last happened 15 days ago, and before that it was 26 days ago, and before that it was 32 days ago.

Is the fix already in 1.9? Could you link to the CL/issue/commit?

@aclements
Copy link
Member

Is the fix already in 1.9? Could you link to the CL/issue/commit?

Not yet. It's trivial to work around in the runtime, but the bug is actually in the compiler (#20335), so this may be just one instance of it.

@aclements aclements modified the milestones: Go1.8.2, Go1.9 May 11, 2017
@gopherbot
Copy link

CL https://golang.org/cl/43311 mentions this issue.

@aclements
Copy link
Member

Reopening: CL 43311 is not on release branch.

@aclements aclements reopened this May 16, 2017
@gopherbot
Copy link

CL https://golang.org/cl/43412 mentions this issue.

@bradfitz bradfitz modified the milestones: Go1.8.2, Go1.8.3 May 18, 2017
gopherbot pushed a commit that referenced this issue May 23, 2017
Cherry-pick of CL 43311.

runtime.gchelper depends on the non-atomic load of work.ndone
happening strictly before the atomic add of work.nwait. Until very
recently (commit 978af9c, fixing #20334), the compiler reordered
these operations. This created a race since work.ndone can change as
soon as work.nwait is equal to work.ndone. If that happened, more than
one gchelper could attempt to wake up the work.alldone note, causing a
"double wakeup" panic.

This was fixed in the compiler, but to make this code less subtle,
make the load of work.ndone atomic. This clearly forces the order of
these operations, ensuring the race doesn't happen.

Fixes #19305 (though really 978af9c fixed it).

Change-Id: Ieb1a84e1e5044c33ac612c8a5ab6297e7db4c57d
Reviewed-on: https://go-review.googlesource.com/43412
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@broady broady closed this as completed May 23, 2017
@golang golang locked and limited conversation to collaborators May 23, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

10 participants