Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/debug: SetGCPercent grabs heap lock on G leading to potential self-deadlock #32105

Closed
Helflym opened this issue May 17, 2019 · 2 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@Helflym
Copy link
Contributor

Helflym commented May 17, 2019

The first failure on the dashboard is
https://build.golang.org/log/b3defcc2b2f4353229022a8bc9c810b2b1f5d37e

But it might have occur before as runtime tests were failing, runtime/debug tests weren't run.

It is freezing during a copystack. Using stackDebug variable, I've this kind of trace.

...
copystack gp=0xa000100000b0480 [0xa00010000088000 0xa00010000088f70 0xa00010000089000] -> [0xa00010000208000 0xa00010000209f70 0xa0001000020a000]/8192
    adjusting testing.tRunner.func1 frame=[0x0,0x0] pc=0x1000cc420 continpc=0x1000cc420
    adjusting runtime/debug_test.TestReadGCStats frame=[0xa00010000209f70,0xa00010000209f70] pc=0x1001038f8 continpc=0x1001038f8
    adjusting testing.tRunner frame=[0xa00010000209f70,0xa00010000209fb0] pc=0x1000c8bcc continpc=0x1000c8bcc
    adjusting runtime.goexit frame=[0xa00010000209fb0,0xa00010000209fb0] pc=0x100063614 continpc=0x100063614
stackfree 0xa00010000088000 4096
stack grow done
runtime: newstack sp=0xa000100002086a0 stack=[0xa00010000208000, 0xa0001000020a000]
        morebuf={pc:0x10001b9c0 sp:0xa000100002086a0 lr:0x0}
        sched={pc:0x100023994 sp:0xa000100002086a0 lr:0x10001b9c0 ctxt:0x0}
stackalloc 16384
stackcacherefill order=3
SIGQUIT: quit
PC=0x900000000547fd0 m=1 sigcode=8

goroutine 0 [idle]:
runtime.usleep(0x2710cd99d5e8)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/os2_aix.go:509 +0x54 fp=0x110167698 sp=0x110167650 pc=0x100030264
runtime.sysmon()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:4296 +0xa4 fp=0x110167708 sp=0x110167698 pc=0x100040954
runtime.mstart1()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:1201 +0xf4 fp=0x110167740 sp=0x110167708 pc=0x100038144
runtime.mstart()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:1167 +0x5c fp=0x110167770 sp=0x110167740 pc=0x10003802c

goroutine 1 [chan receive]:
runtime.gopark(0x10015cfe8, 0xa000100000ee238, 0xe17000100154f7d, 0x3)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:304 +0x118 fp=0xa00010000207ab0 sp=0xa00010000207a80 pc=0x100035298
runtime.goparkunlock(...)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:310
runtime.chanrecv(0xa000100000ee1e0, 0xa00010000207c00, 0x100010000000180, 0x1000c8f48)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/chan.go:524 +0x2cc fp=0xa00010000207b50 sp=0xa00010000207ab0 pc=0x100004c9c
runtime.chanrecv1(0xa000100000ee1e0, 0xa00010000207c00)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/chan.go:406 +0x24 fp=0xa00010000207b90 sp=0xa00010000207b50 pc=0x1000049b4
testing.(*T).Run(0xa00010000216100, 0x100154f7d, 0xf, 0x10015d0a8, 0x8002306ae)
        /opt/freeware/src/packages/BUILD/go-root/src/testing/testing.go:961 +0x334 fp=0xa00010000207c50 sp=0xa00010000207b90 pc=0x1000c8f64
testing.runTests.func1(0xa00010000216000)
        /opt/freeware/src/packages/BUILD/go-root/src/testing/testing.go:1207 +0x78 fp=0xa00010000207cb0 sp=0xa00010000207c50 pc=0x1000cc888
testing.tRunner(0xa00010000216000, 0xa00010000207db8)
        /opt/freeware/src/packages/BUILD/go-root/src/testing/testing.go:909 +0xcc fp=0xa00010000207cf0 sp=0xa00010000207cb0 pc=0x1000c8bcc
testing.runTests(0xa000100000de040, 0x110015f80, 0x7, 0x7, 0x0)
        /opt/freeware/src/packages/BUILD/go-root/src/testing/testing.go:1205 +0x2d0 fp=0xa00010000207dd8 sp=0xa00010000207cf0 pc=0x1000ca430
testing.(*M).Run(0xa00010000200000, 0x0)
        /opt/freeware/src/packages/BUILD/go-root/src/testing/testing.go:1122 +0x18c fp=0xa00010000207ed0 sp=0xa00010000207dd8 pc=0x1000c949c
main.main()
        _testmain.go:58 +0x130 fp=0xa00010000207f68 sp=0xa00010000207ed0 pc=0x100105d50
runtime.main()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:203 +0x28c fp=0xa00010000207fc0 sp=0xa00010000207f68 pc=0x100034dec
runtime.goexit()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xa00010000207fc0 sp=0xa00010000207fc0 pc=0x100063614

goroutine 2 [force gc (idle)]:
runtime.gopark(0x10015cfe8, 0x110021490, 0x1114000000000000, 0x1)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:304 +0x118 fp=0xa00010000054f80 sp=0xa00010000054f50 pc=0x100035298
runtime.goparkunlock(...)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:310
runtime.forcegchelper()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:253 +0xd0 fp=0xa00010000054fc0 sp=0xa00010000054f80 pc=0x100035100
runtime.goexit()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xa00010000054fc0 sp=0xa00010000054fc0 pc=0x100063614
created by runtime.init.4
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:242 +0x34

goroutine 3 [GC sweep wait]:
runtime.gopark(0x10015cfe8, 0x110021680, 0xc14000000000000, 0x1)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:304 +0x118 fp=0xa000100000c0f78 sp=0xa000100000c0f48 pc=0x100035298
runtime.goparkunlock(...)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:310
runtime.bgsweep(0xa00010000022070)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mgcsweep.go:70 +0xac fp=0xa000100000c0fb8 sp=0xa000100000c0f78 pc=0x1000247cc
runtime.goexit()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xa000100000c0fb8 sp=0xa000100000c0fb8 pc=0x100063614
created by runtime.gcenable
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mgc.go:210 +0x58

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x10015cfe8, 0x1100217e0, 0xd14000000000000, 0x1)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:304 +0x118 fp=0xa00010000084ef0 sp=0xa00010000084ec0 pc=0x100035298
runtime.goparkunlock(...)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:310
runtime.bgscavenge(0xa00010000022070)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mgcscavenge.go:257 +0xf0 fp=0xa00010000084fb8 sp=0xa00010000084ef0 pc=0x100023e30
runtime.goexit()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xa00010000084fb8 sp=0xa00010000084fb8 pc=0x100063614
created by runtime.gcenable
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mgc.go:211 +0x78

goroutine 17 [finalizer wait]:
runtime.gopark(0x10015cfe8, 0x11004cd38, 0x1014000000000000, 0x1)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:304 +0x118 fp=0xa00010000055720 sp=0xa000100000556f0 pc=0x100035298
runtime.goparkunlock(...)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/proc.go:310
runtime.runfinq()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mfinal.go:175 +0xc8 fp=0xa000100000557c0 sp=0xa00010000055720 pc=0x100019998
runtime.goexit()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xa000100000557c0 sp=0xa000100000557c0 pc=0x100063614
created by runtime.createfing
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mfinal.go:156 +0x90

goroutine 18 [copystack]:
runtime.gcPaceScavenger()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mgcscavenge.go:83 +0x14 fp=0xa000100002086a0 sp=0xa000100002086a0 pc=0x100023994
runtime.gcSetTriggerRatio(0x3fec000000000000)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mgc.go:858 +0x150 fp=0xa00010000208700 sp=0xa000100002086a0 pc=0x10001b9c0
runtime/debug.setGCPercent(0xffffffff00000000, 0x0)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/mgc.go:227 +0x98 fp=0xa00010000208730 sp=0xa00010000208700 pc=0x10001a998
runtime/debug.SetGCPercent(0xffffffffffffffff, 0x0)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/debug/garbage.go:92 +0x2c fp=0xa00010000208760 sp=0xa00010000208730 pc=0x1000c054c
runtime/debug_test.TestReadGCStats(0xa00010000216100)
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/debug/garbage_test.go:16 +0x40 fp=0xa00010000209f70 sp=0xa00010000208760 pc=0x100103910
testing.tRunner(0xa00010000216100, 0x10015d0a8)
        /opt/freeware/src/packages/BUILD/go-root/src/testing/testing.go:909 +0xcc fp=0xa00010000209fb0 sp=0xa00010000209f70 pc=0x1000c8bcc
runtime.goexit()
        /opt/freeware/src/packages/BUILD/go-root/src/runtime/asm_ppc64x.s:884 +0x4 fp=0xa00010000209fb0 sp=0xa00010000209fb0 pc=0x100063614
created by testing.(*T).Run
        /opt/freeware/src/packages/BUILD/go-root/src/testing/testing.go:960 +0x318

r0   0xffffffffffffffff r1   0x110167290
r2   0xffffffffffffffff r3   0xffffffffffffffff
r4   0xffffffffffffffff r5   0xffffffffffffffff
r6   0xffffffffffffffff r7   0xffffffffffffffff
r8   0xffffffffffffffff r9   0xffffffffffffffff
r10  0xffffffffffffffff r11  0xffffffffffffffff
r12  0xffffffffffffffff r13  0x11016f800
r14  0x0        r15  0x0
r16  0x0        r17  0x0
r18  0x0        r19  0x0
r20  0x0        r21  0x0
r22  0x0        r23  0x0
r24  0x0        r25  0x0
r26  0x110167430        r27  0x0
r28  0xfffffffffffffe0  r29  0x9001000a00927d0
r30  0x9001000a00043d8  r31  0x0
pc   0x900000000547fd0  ctr  0xffffffff00000000
link 0xffffffffffffffff xer  0xffffffff
ccr  0x0        trap 0x0

@mknyszek is it related to #31966 or similar issues ?

@mknyszek
Copy link
Contributor

No this is different.

What's happening is a self-deadlock on the G calling SetGCPercent.

  1. G calls SetGCPercent,
  2. ...which grabs the heap lock,
  3. ...then calls into gcSetTriggerRatio,
  4. ...which calls into gcPaceScavenger (new in 1.13),
  5. ...which is just enough to tip the G into requiring more stack space,
  6. ...and there's no free stack space of that size,
  7. ...which means that it tries to allocate a new stack from mheap,
  8. ...thereby trying to grab the heap lock, which it already owns.

SetGCPercent is just incorrect. It should only grab the heap lock on the system stack.

What's surprising to me here is that this wasn't happening before. I guess calls to SetGCPercent are just rare enough that they don't trigger a morestack?

@mknyszek mknyszek changed the title runtime/debug: TestReadGCStats freezing on AIX runtime/debug: SetGCPercent grabs heap lock on G leading to potential self-deadlock May 17, 2019
@mknyszek mknyszek self-assigned this May 17, 2019
@gopherbot
Copy link

Change https://golang.org/cl/177857 mentions this issue: runtime: document and maintain heap lock invariant

@mknyszek mknyszek added this to the Go1.13 milestone May 17, 2019
@mknyszek mknyszek added the NeedsFix The path to resolution is known, but the work has not been done. label May 17, 2019
@golang golang locked and limited conversation to collaborators May 23, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

3 participants