Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: apparent deadlock in TestDebugCallUnsafePoint #57311

Open
bcmills opened this issue Dec 14, 2022 · 8 comments
Open

runtime: apparent deadlock in TestDebugCallUnsafePoint #57311

bcmills opened this issue Dec 14, 2022 · 8 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Dec 14, 2022

#!watchflakes
post <- pkg == "runtime" && `TestDebugCallUnsafePoint` && `panic: test timed out`

https://storage.googleapis.com/go-build-log/e9552219/linux-amd64-nounified_87ac7ca9.log (a TryBot on https://go.dev/cl/457336):

panic: test timed out after 3m0s
running tests:
	TestDebugCallUnsafePoint (2m51s)
…
goroutine 23224 [stopping the world]:
runtime.systemstack_switch()
	/workdir/go/src/runtime/asm_amd64.s:463 fp=0xc000077dc0 sp=0xc000077db8 pc=0x478540
runtime.stopTheWorld({0x824124, 0xa})
	/workdir/go/src/runtime/proc.go:1162 +0xa5 fp=0xc000077e10 sp=0xc000077dc0 pc=0x443665
runtime.stopTheWorldGC({0x824124, 0xa})
	/workdir/go/src/runtime/proc.go:1210 +0x45 fp=0xc000077e48 sp=0xc000077e10 pc=0x443865
runtime.GOMAXPROCS(0x10)
	/workdir/go/src/runtime/debug.go:28 +0x85 fp=0xc000077e70 sp=0xc000077e48 pc=0x40ada5
runtime_test.TestDebugCallUnsafePoint.func2()
	/workdir/go/src/runtime/debug_test.go:245 +0x26 fp=0xc000077e88 sp=0xc000077e70 pc=0x6e3ce6
runtime_test.TestDebugCallUnsafePoint(0xc0005b24e0)
	/workdir/go/src/runtime/debug_test.go:267 +0x307 fp=0xc000077f70 sp=0xc000077e88 pc=0x6e3b67
testing.tRunner(0xc0005b24e0, 0x84b578)
	/workdir/go/src/testing/testing.go:1575 +0x10b fp=0xc000077fc0 sp=0xc000077f70 pc=0x50602b
testing.(*T).Run.func1()
	/workdir/go/src/testing/testing.go:1628 +0x2a fp=0xc000077fe0 sp=0xc000077fc0 pc=0x50706a
runtime.goexit()
	/workdir/go/src/runtime/asm_amd64.s:1598 +0x1 fp=0xc000077fe8 sp=0xc000077fe0 pc=0x47a761
created by testing.(*T).Run
	/workdir/go/src/testing/testing.go:1628 +0x3e5
@bcmills bcmills added this to the Backlog milestone Dec 14, 2022
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Dec 14, 2022
@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 14, 2022
@mknyszek mknyszek self-assigned this Dec 21, 2022
@mknyszek
Copy link
Contributor

We already know this code plays it fast and loose a bit. There's a requirement that the test code here doesn't have any write barriers, and yet we know that it does. But since this is stalling when trying to set GOMAXPROCS, i.e. when it's trying to STW and preempt everything, we might be leaving behind some bad state. I'll take a look.

@mknyszek
Copy link
Contributor

Also, this is almost certainly a test-only bug. Those issues are with the test code, not the typical use of this feature. (This feature is also only used when debugging, and even then somewhat rarely, so it's not on a critical path.) Not a release blocker, IMO.

@aclements
Copy link
Member

One more interesting goroutine from that trace:

goroutine 23250 [runnable]:
runtime.goexit1()
	/workdir/go/src/runtime/proc.go:3616 +0x54 fp=0xc000750fe0 sp=0xc000750fd8 pc=0x448934
runtime.goexit()
	/workdir/go/src/runtime/asm_amd64.s:1599 +0x6 fp=0xc000750fe8 sp=0xc000750fe0 pc=0x47a766
created by runtime_test.TestDebugCallUnsafePoint

@cherrymui
Copy link
Member

goroutine 23250 [runnable]:
runtime.goexit1()
/workdir/go/src/runtime/proc.go:3616 +0x54 fp=0xc000750fe0 sp=0xc000750fd8 pc=0x448934
runtime.goexit()
/workdir/go/src/runtime/asm_amd64.s:1599 +0x6 fp=0xc000750fe8 sp=0xc000750fe0 pc=0x47a766
created by runtime_test.TestDebugCallUnsafePoint

This looks like the debugCallUnsafePointWorker goroutine is just exiting. It is preempted at goexit1's prologue. That by itself shouldn't cause a problem for stopping the world?

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "runtime" && `TestDebugCallUnsafePoint` && `panic: test timed out`
2023-02-04 07:02 android-amd64-emu go@864b54e2 runtime.TestDebugCallPanic (log)
panic: test timed out after 5m0s
running tests:
	TestDebugCallUnsafePoint (4m47s)

runtime.semrelease1(0x7199d3ddad18?, 0x0, 0x0)
	/workdir/go/src/runtime/sema.go:175 +0x1e9 fp=0xc00097ada8 sp=0xc00097ada0 pc=0x7199d3769e89
runtime.semrelease(...)
	/workdir/go/src/runtime/sema.go:172
runtime.gcStart({0xc0000001fe?, 0x64d3725265?, 0xd3740e40?})
	/workdir/go/src/runtime/mgc.go:752 +0x4dc fp=0xc00097ae40 sp=0xc00097ada8 pc=0x7199d373933c
runtime.GC()
	/workdir/go/src/runtime/mgc.go:444 +0x3e fp=0xc00097ae78 sp=0xc00097ae40 pc=0x7199d3738b9e
runtime_test.TestDebugCallPanic(0xc000990680)
	/workdir/go/src/runtime/debug_test.go:286 +0xac fp=0xc00097af70 sp=0xc00097ae78 pc=0x7199d39e570c
testing.tRunner(0xc000990680, 0x7199d3bb89b8)

watchflakes

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "runtime" && `TestDebugCallUnsafePoint` && `panic: test timed out`
2023-05-23 11:36 linux-amd64-boringcrypto go@0cd04724 runtime (log)
panic: test timed out after 3m0s
running tests:
	TestDebugCallUnsafePoint (2m53s)

goroutine 23243 [running]:
panic({0x9399e0?, 0xc00011a040?})
	/workdir/go/src/runtime/panic.go:1017 +0x3ac fp=0xc000072f10 sp=0xc000072e60 pc=0x5b546c
testing.(*M).startAlarm.func1()
	/workdir/go/src/testing/testing.go:2259 +0x3b9 fp=0xc000072fe0 sp=0xc000072f10 pc=0x682019
runtime.goexit()
	/workdir/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc000072fe8 sp=0xc000072fe0 pc=0x5f3b21
created by time.goFunc
	/workdir/go/src/time/sleep.go:176 +0x2d

watchflakes

@mknyszek
Copy link
Contributor

Sorry for the delay here, still planning to look into it.

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "runtime" && `TestDebugCallUnsafePoint` && `panic: test timed out`
2023-05-25 00:49 linux-amd64 go@558301a5 runtime.TestDebugCallUnsafePoint (log)
panic: test timed out after 3m0s
running tests:
	TestDebugCallUnsafePoint (2m51s)

runtime.systemstack_switch()
	/workdir/go/src/runtime/asm_amd64.s:474 +0x8 fp=0xc00064bdb0 sp=0xc00064bda0 pc=0x478f48
runtime.stopTheWorld(0x9)
	/workdir/go/src/runtime/proc.go:1231 +0xc9 fp=0xc00064be10 sp=0xc00064bdb0 pc=0x442549
runtime.stopTheWorldGC(0xf3?)
	/workdir/go/src/runtime/proc.go:1279 +0x34 fp=0xc00064be50 sp=0xc00064be10 pc=0x442714
runtime.GOMAXPROCS(0x10)
	/workdir/go/src/runtime/debug.go:28 +0x69 fp=0xc00064be70 sp=0xc00064be50 pc=0x40abc9
runtime_test.TestDebugCallUnsafePoint.func2()
	/workdir/go/src/runtime/debug_test.go:245 +0x25 fp=0xc00064be88 sp=0xc00064be70 pc=0x6fc065
runtime_test.TestDebugCallUnsafePoint(0xc0007ea340)
	/workdir/go/src/runtime/debug_test.go:267 +0x2f1 fp=0xc00064bf70 sp=0xc00064be88 pc=0x6fbf11
testing.tRunner(0xc0007ea340, 0x881430)

watchflakes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: No status
Development

No branches or pull requests

5 participants