Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build: Builders takes too long to complete tests #45916

Closed
mengzhuo opened this issue May 3, 2021 · 11 comments
Closed

x/build: Builders takes too long to complete tests #45916

mengzhuo opened this issue May 3, 2021 · 11 comments
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@mengzhuo
Copy link
Contributor

mengzhuo commented May 3, 2021

Take a look at https://build.golang.org now, some of builders that running for more than 20000 seconds (5 h).
But when I login into the machine, it got "hang up" since 5 hours ago.

截屏2021-05-03 09 19 51

https://farmer.golang.org/temporarylogs?name=linux-mips64le-mengzhuo&rev=95c5f4da80960d0e3511d39c9a9db7280099a37e&st=0xc0167edba0&subName=net&subRev=89ef3d95e781148a0951956029c92a211477f7f9

:: Running /tmp/workdir-host-linux-mipsle-mengzhuo/go/bin/go with args ["/tmp/workdir-host-linux-mipsle-mengzhuo/go/bin/go" "test" "-short" "golang.org/x/net/http2/h2demo/..."] and env ["LANG=zh_CN.UTF-8" "PATH=/tmp/workdir-host-linux-mipsle-mengzhuo/go/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin" "HOME=/home/gopher" "LOGNAME=gopher" "USER=gopher" "SHELL=/bin/bash" "INVOCATION_ID=d5d020da84d24f069076b4fbb99b636d" "JOURNAL_STREAM=8:130301336" "GO_BUILDER_ENV=host-linux-mipsle-mengzhuo" "WORKDIR=/tmp/workdir-host-linux-mipsle-mengzhuo" "HTTPS_PROXY=http://proxy:8123" "HTTP_PROXY=http://proxy:8123" "GO_STAGE0_NET_DELAY=1.3s" "GO_STAGE0_DL_DELAY=2.6s" "GOROOT_BOOTSTRAP=/tmp/workdir-host-linux-mipsle-mengzhuo/go1.4" "GO_BUILDER_NAME=linux-mips64le-mengzhuo" "GO_BUILDER_FLAKY_NET=1" "GOROOT_BOOTSTRAP=/usr/lib/golang" "GOMIPS64=hardfloat" "GOARCH=mips64le" "GOHOSTARCH=mips64le" "GOROOT=/tmp/workdir-host-linux-mipsle-mengzhuo/go" "GOPATH=/tmp/workdir-host-linux-mipsle-mengzhuo/gopath" "GOPROXY=http://10.240.0.32:30157" "GOPROXY=https://proxy.golang.org" "TMPDIR=/tmp/workdir-host-linux-mipsle-mengzhuo/tmp" "GOCACHE=/tmp/workdir-host-linux-mipsle-mengzhuo/gocache"] in dir /tmp/workdir-host-linux-mipsle-mengzhuo/gopath/src/golang.org/x/net/http2/h2demo

go: downloading cloud.google.com/go v0.38.0
go: downloading go4.org v0.0.0-20190218023631-ce4c26f7be8e
go: downloading golang.org/x/build v0.0.0-20190509182522-45de920fc22c
go: downloading golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9
go: downloading github.com/googleapis/gax-go/v2 v2.0.4
go: downloading google.golang.org/api v0.4.0
go: downloading google.golang.org/genproto v0.0.0-20190502173448-54afdca5d873
go: downloading go.opencensus.io v0.21.0
go: downloading google.golang.org/grpc v1.20.1
go: downloading golang.org/x/oauth2 v0.0.0-20190402181905-9f3314589c9a
go: downloading github.com/hashicorp/golang-lru v0.5.1
go: downloading github.com/golang/protobuf v1.3.1

@bcmills

@gopherbot gopherbot added the Builders x/build issues (builders, bots, dashboards) label May 3, 2021
@gopherbot gopherbot added this to the Unreleased milestone May 3, 2021
@bcmills
Copy link
Contributor

bcmills commented May 3, 2021

Can you send a SIGQUIT to the stuck go test process? That might give us enough of a stack dump to figure out what's going on.

@mengzhuo
Copy link
Contributor Author

mengzhuo commented May 3, 2021

Done.

htop shows vet might be the culprit.
截屏2021-05-03 09 38 50

PC=0x8e264 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/sys_linux_mips64x.s:458 +0x1c
runtime.futexsleep(0x538dd8, 0x0, 0xffffffffffffffff)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/os_linux.go:44 +0x40
runtime.notesleep(0x538dd8)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/lock_futex.go:160 +0xd0
runtime.mPark()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:1441 +0x28
runtime.stopm()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:2406 +0xbc
runtime.findrunnable()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:2982 +0xd18
runtime.schedule()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:3355 +0x334
runtime.park_m(0xc000083380)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:3504 +0x1c4
runtime.mcall()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/asm_mips64x.s:143 +0x50

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc00012dd94)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/sema.go:56 +0x4c
sync.(*WaitGroup).Wait(0xc00012dd94)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/sync/waitgroup.go:130 +0xc0
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6({0xc00009c4f0, 0x1, 0x1})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:368 +0x7c
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run(0xc0001a4b40, 0xc00014a9c0, {0xc0001a4a00, 0x8, 0x8})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:371 +0x840
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.Run({0xfffbba79db, 0x4b}, {0xc0001a4a00, 0x8, 0x8})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:132 +0xf4
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.Main({0xc0001a4a00, 0x8, 0x8})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:119 +0x420
main.main()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vet/main.go:40 +0x7e8

goroutine 22 [semacquire]:
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6({0x536fa8, 0x1, 0x1})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:360 +0x4c
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func5.1()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:305 +0xbc
sync.(*Once).doSlow(0xc000046c30, 0xc00002e738)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/sync/once.go:68 +0x104
sync.(*Once).Do(...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/sync/once.go:59
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func5(0x52de20)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:304 +0x174
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6.1(0xc00009c7c0, 0xc00012dd94, 0x52de20)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:364 +0x48
created by cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:363 +0xf8

r0   0x0	r1   0x1
r2   0x144a	r3   0x538720
r4   0x538dd8	r5   0x80
r6   0x0	r7   0x0
r8   0x0	r9   0x0
r10  0xfffe514030	r11  0xaaaaa7
r12  0x3	r13  0x1
r14  0x4b219dd2	r15  0x18
r16  0xfffbba36f8	r17  0x538c80
r18  0x188	r19  0x31
r20  0x538f10	r21  0x0
r22  0x36bea0	r23  0x520000
r24  0x341d3df61ce01e	r25  0xd21a09776ea74
r26  0x0	r27  0x0
r28  0x0	r29  0xfffbba3560
r30  0x538720	r31  0x4c6a8
pc   0x8e264	link 0x4c6a8
lo   0xaccec	hi   0x0
?   	golang.org/x/net/http2/h2demo	[no test files]
FAIL

@bcmills
Copy link
Contributor

bcmills commented May 3, 2021

That's a really weird stack trace. Goroutine 22 is stuck on a semaquire, but the line that it claims to be stuck on is just a variable declaration:

I'm inclined to suspect a runtime bug. (CC @prattmic @mknyszek)

@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 3, 2021
@prattmic
Copy link
Member

prattmic commented May 3, 2021

w.r.t. the line number on goroutine 22, here's the objdump from vet built at 95c5f4d:

0027b268 <cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6>:
cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6():
/usr/local/google/home/mpratt/src/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:359
        execAll = func(analyzers []*analysis.Analyzer) {
  27b268:       8fc10008        lw      at,8(s8)
  27b26c:       003d082b        sltu    at,at,sp
  27b270:       14200006        bnez    at,27b28c <cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6+0x24>
  27b274:       00000000        nop
  27b278:       001f1825        or      v1,zero,ra
  27b27c:       0c021dc9        jal     87724 <runtime.morestack>
  27b280:       00000000        nop
  27b284:       1000fff8        b       27b268 <cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6>
  27b288:       00000000        nop
  27b28c:       afbfffd4        sw      ra,-44(sp)
  27b290:       27bdffd4        addiu   sp,sp,-44
  27b294:       8ec10004        lw      at,4(s6)
  27b298:       afa10028        sw      at,40(sp)
/usr/local/google/home/mpratt/src/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:360
                var wg sync.WaitGroup
  27b29c:       3c020030        lui     v0,0x30
  27b2a0:       2442de60        addiu   v0,v0,-8608
  27b2a4:       afa20004        sw      v0,4(sp)
  27b2a8:       0c00783d        jal     1e0f4 <runtime.newobject>
  27b2ac:       00000000        nop
  27b2b0:       8fa10008        lw      at,8(sp)
/usr/local/google/home/mpratt/src/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:361
                for _, a := range analyzers {
  27b2b4:       8fa20034        lw      v0,52(sp)
  27b2b8:       18400006        blez    v0,27b2d4 <cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6+0x6c>
  27b2bc:       00000000        nop
/usr/local/google/home/mpratt/src/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:360
                var wg sync.WaitGroup
  27b2c0:       afa10024        sw      at,36(sp)
/usr/local/google/home/mpratt/src/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:361
                for _, a := range analyzers {
  27b2c4:       8fa30030        lw      v1,48(sp)
  27b2c8:       00002025        move    a0,zero
  27b2cc:       1000000c        b       27b300 <cmd/vendor/golang.org/x/tools/go/analysis/unitchecker.run.func6+0x98>
  27b2d0:       00000000        nop
/usr/local/google/home/mpratt/src/go/src/cmd/vendor/golang.org/x/tools/go/analysis/unitchecker/unitchecker.go:368
                wg.Wait()
  27b2d4:       afa10004        sw      at,4(sp)
  27b2d8:       0c025153        jal     9454c <sync.(*WaitGroup).Wait>
  27b2dc:       00000000        nop

... more ...

...func6 +0x4c is 0x27b2b4, which objdump thinks is line 361, but we may be decrementing the PC for the traceback?

Just looking at this, the line numbers seem reasonable (the runtime.newobject should be associated with the WaitGroup declaration), but I don't see how that would end up in a semacquire.

@prattmic
Copy link
Member

prattmic commented May 3, 2021

Just looking at this, the line numbers seem reasonable (the runtime.newobject should be associated with the WaitGroup declaration), but I don't see how that would end up in a semacquire.

Actually, runtime.newobject could decide to start a GC, which would use semacquire, though it looks like we are a few instructions past newobject. It would be be more helpful to have this with GOTRACEBACK=system.

@gopherbot
Copy link

Change https://golang.org/cl/316469 mentions this issue: cmd/dist: set GOTRACEBACK to "system" for testing

@mengzhuo
Copy link
Contributor Author

mengzhuo commented May 4, 2021

Another one.
https://build.golang.org/log/6e8e622f180dcd1dcbd00747d987147fb909aeba

# cmd/link/internal/loadxcoff
SIGQUIT: quit
PC=0x9680c m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/sys_linux_mips64x.s:458 +0x1c
runtime.futexsleep(0x1199c18, 0x0, 0xffffffffffffffff)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/os_linux.go:44 +0x40
runtime.notesleep(0x1199c18)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/lock_futex.go:160 +0xd0
runtime.mPark()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:1441 +0x28
runtime.stopm()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:2406 +0xbc
runtime.findrunnable()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:2982 +0xd18
runtime.schedule()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:3355 +0x334
runtime.park_m(0xc000500ea0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/proc.go:3504 +0x1c4
runtime.mcall()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/asm_mips64x.s:143 +0x50

goroutine 1 [semacquire, 261 minutes]:
sync.runtime_Semacquire(0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/sema.go:56 +0x4c
sync.(*WaitGroup).Wait(0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/sync/waitgroup.go:130 +0xc0
cmd/compile/internal/gc.compileFunctions()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:153 +0x2d8
cmd/compile/internal/gc.Main(0xb3e8a0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/main.go:287 +0x1704
main.main()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/main.go:55 +0x1a0

goroutine 8 [semacquire]:
cmd/compile/internal/ssa.zcse(0xc000083a00)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/zcse.go:23 +0x604
cmd/compile/internal/ssa.Compile(0xc000083a00)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/compile.go:96 +0xc6c
cmd/compile/internal/ssagen.buildssa(0xc000138420, 0x1)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssagen/ssa.go:642 +0x1ae4
cmd/compile/internal/ssagen.Compile(0xc000138420, 0x1)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssagen/pgen.go:165 +0x54
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000138420, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:136 +0x64
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 10 [semacquire]:
cmd/compile/internal/ssa.(*Func).newValue(0xc0000011e0, 0x985, 0xc00037e380, 0xc0007b6c30, {0x0, 0x0})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/func.go:219 +0x260
cmd/compile/internal/ssa.(*Block).NewValue0(0xc0007b6c30, {0x0, 0x0}, 0x985, 0xc00037e380)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/func.go:375 +0x64
cmd/compile/internal/ssa.rewriteValuegeneric_OpConstString(0xc0007877e0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/rewritegeneric.go:3765 +0x588
cmd/compile/internal/ssa.rewriteValuegeneric(0xc0007877e0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/rewritegeneric.go:50 +0x21f0
cmd/compile/internal/ssa.applyRewrite(0xc0000011e0, 0xb3ebd8, 0xb3ec50, 0x1)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/rewrite.go:129 +0xd3c
cmd/compile/internal/ssa.opt(0xc0000011e0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/opt.go:9 +0x64
cmd/compile/internal/ssa.Compile(0xc0000011e0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/compile.go:96 +0xc6c
cmd/compile/internal/ssagen.buildssa(0xc0001386e0, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssagen/ssa.go:642 +0x1ae4
cmd/compile/internal/ssagen.Compile(0xc0001386e0, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssagen/pgen.go:165 +0x54
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc0001386e0, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:136 +0x64
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 14 [semacquire]:
cmd/internal/obj.Addrel(...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/obj/data.go:205
cmd/internal/obj.(*LSym).writeAddr(0xc0008b8800, 0xc000174400, 0x4e, 0x4, 0xc00058d280, 0x0, 0x1)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/obj/data.go:121 +0x270
cmd/internal/obj.(*LSym).WriteAddr(...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/obj/data.go:135
cmd/internal/obj.dwCtxt.AddDWARFAddrSectionOffset({0xc000174400}, {0xc6c9c0, 0xc0008b8800}, {0xafc720, 0xc00058d280}, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/obj/dwarf.go:256 +0xc0
cmd/internal/dwarf.putattr({0xc80e20, 0xc000174400}, {0xc6c9c0, 0xc0008b8800}, 0x10, 0x10, 0x6, 0x0, {0xafc720, 0xc00058d280})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/dwarf/dwarf.go:996 +0x908
cmd/internal/dwarf.putvar({0xc80e20, 0xc000174400}, 0xc000939cd0, 0xc0009851f0, {0xc6c9c0, 0x0}, 0x3, 0xffffffffffffffff, {0xc00052d620, 0x0, ...})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/dwarf/dwarf.go:1589 +0x314
cmd/internal/dwarf.putscope({0xc80e20, 0xc000174400}, 0xc000939cd0, {0xc0008bf240, 0x1, 0x1}, 0x0, 0x3, {0xc00052d620, 0x0, ...})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/dwarf/dwarf.go:1427 +0x6c0
cmd/internal/dwarf.putPrunedScopes({0xc80e20, 0xc000174400}, 0xc000939cd0, 0x3)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/dwarf/dwarf.go:1172 +0x130
cmd/internal/dwarf.PutDefaultFunc({0xc80e20, 0xc000174400}, 0xc000939cd0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/dwarf/dwarf.go:1399 +0x48c
cmd/internal/obj.(*Link).populateDWARF(0xc000174400, {0xaff6e0, 0xc000720580}, 0xc000724280, {0xfffbd1b874, 0x1b})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/obj/dwarf.go:383 +0x49c
cmd/internal/obj.Flushplist(0xc000174400, 0xc000939f50, 0xc0009823d0, {0xfffbd1b874, 0x1b})
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/internal/obj/plist.go:140 +0x65c
cmd/compile/internal/objw.(*Progs).Flush(...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/objw/prog.go:124
cmd/compile/internal/ssagen.Compile(0xc000720580, 0x3)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssagen/pgen.go:190 +0x67c
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000720580, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:136 +0x64
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 49 [semacquire]:
cmd/compile/internal/ssa.schedule(0xc00037a1a0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/schedule.go:112 +0x154
cmd/compile/internal/ssa.Compile(0xc00037a1a0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssa/compile.go:96 +0xc6c
cmd/compile/internal/ssagen.buildssa(0xc0007209a0, 0x2)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssagen/ssa.go:642 +0x1ae4
cmd/compile/internal/ssagen.Compile(0xc0007209a0, 0x2)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/ssagen/pgen.go:165 +0x54
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc0007209a0, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:136 +0x64
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 50 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000720b00, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 52 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000720dc0, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 53 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000720f20, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 54 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000721080, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 55 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc0007211e0, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 56 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000721340, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 57 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc0007214a0, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 58 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000721600, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 59 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000721760, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 60 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc0007218c0, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

goroutine 61 [chan receive]:
cmd/compile/internal/gc.compileFunctions.func2.1(0xc000166480, 0xc000721a20, 0xc00000f960, 0xc00052b784)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:135 +0x48
created by cmd/compile/internal/gc.compileFunctions.func2
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/compile/internal/gc/compile.go:134 +0x98

r0   0x0	r1   0xc0007b9230
r2   0x144a	r3   0x11992a0
r4   0x1199c18	r5   0x80
r6   0x0	r7   0x0
r8   0x0	r9   0x0
r10  0xc000039ed0	r11  0x1
r12  0x0	r13  0xc0007bd208
r14  0x1	r15  0x1
r16  0xfffbd196d0	r17  0x1199ac0
r18  0x0	r19  0x1
r20  0x1199d50	r21  0x0
r22  0xb3fbc8	r23  0x1110000
r24  0x117c9302a54798	r25  0xdb28a5d49ab09
r26  0x0	r27  0x0
r28  0x0	r29  0xfffbd19520
r30  0x11992a0	r31  0x4fd40
pc   0x9680c	link 0x4fd40
lo   0x0	hi   0x1
go tool dist: FAILED: /tmp/workdir-host-linux-mipsle-mengzhuo/go/pkg/tool/linux_mips64le/go_bootstrap install -gcflags=all= -ldflags=all= -a -i cmd/asm cmd/cgo cmd/compile cmd/link: exit status 2```

@prattmic
Copy link
Member

prattmic commented May 4, 2021

Was this issue first seen 2 days ago? If this is a runtime issue, http://golang.org/cl/310850 is a good candidate for potentially missing work. Ideally we could get a core file from one of these runs (@mknyszek perhaps golang.org/cl/316469 should use 'crash' after all?).

@mengzhuo If you see this again, I wonder if you could use gcore to generate a core file (assuming gdb is installed on the builders)?

gopherbot pushed a commit that referenced this issue May 4, 2021
If we're testing through dist, we're testing the implementation of Go,
so we're interested in any package failing with potential runtime
issues. In these cases, we'd like to have as much relevant detail as
possible, but currently runtime stack frames and goroutines are
suppressed due to the default GOTRACEBACK setting.

So, try to set GOTRACEBACK to system if it's unset. Check if it's unset
first so we don't override the user asking for a lower or higher level.

This change was brought up in the context of #45916, since there's an
apparent deadlock (or something!) in the runtime that appears when
running other code, but it's difficult to see exactly where it's
blocked. However, this change is very generally useful.

This change also runs scripted tests with GOTRACEBACK=system, upgrading
from GOTRACEBACK=all. Often, script tests can trigger failures deep in
the runtime in interesting ways because they start many individual Go
processes, so being able to identify points of interest in the runtime
is quite useful.

For #45916.

Change-Id: I3d50658d0d0090fb4c9182b87200d266c7f8f915
Reviewed-on: https://go-review.googlesource.com/c/go/+/316469
Trust: Michael Knyszek <mknyszek@google.com>
Trust: Bryan C. Mills <bcmills@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/318569 mentions this issue: runtime: hold sched.lock across atomic pidleget/pidleput

@prattmic
Copy link
Member

prattmic commented May 10, 2021

It is hard to be certain without more complete stacks, but I think http://golang.org/cl/318569 will fix this. i.e., that #45975, #45916, #45885, and #45884 all have the same cause.

gopherbot pushed a commit that referenced this issue May 11, 2021
As a cleanup, golang.org/cl/307914 unintentionally caused the idle GC
work recheck to drop sched.lock between acquiring a P and committing to
keep it (once a worker G was found).

This is unsafe, as releasing a P requires extra checks once sched.lock
is taken (such as for runSafePointFn). Since checkIdleGCNoP does not
perform these extra checks, we can now race with other users.

In the case of #45975, we may hang with this sequence:

1. M1: checkIdleGCNoP takes sched.lock, gets P1, releases sched.lock.
2. M2: forEachP takes sched.lock, iterates over sched.pidle without
   finding P1, releases sched.lock.
3. M1: checkIdleGCNoP puts P1 back in sched.pidle.
4. M2: forEachP waits forever for P1 to run the safePointFn.

Change back to the old behavior of releasing sched.lock only after we
are certain we will keep the P. Thus if we put it back its removal from
sched.pidle was never visible.

Fixes #45975
For #45916
For #45885
For #45884

Change-Id: I191a1800923b206ccaf96bdcdd0bfdad17b532e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/318569
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
@prattmic
Copy link
Member

I think this bug should be fixed, but can't be sure. Please note here if you see more occurrences.

@prattmic prattmic added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label May 11, 2021
@golang golang locked and limited conversation to collaborators May 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Builders x/build issues (builders, bots, dashboards) FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

5 participants