Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: crash on Plan 9 since CL 46033 #22227

Open
0intro opened this issue Oct 12, 2017 · 18 comments
Open

runtime: crash on Plan 9 since CL 46033 #22227

0intro opened this issue Oct 12, 2017 · 18 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Plan9
Milestone

Comments

@0intro
Copy link
Member

0intro commented Oct 12, 2017

#!watchflakes
post <- goos == "plan9" && log ~ `fatal error: unexpected signal` && (log ~ `&\s*os\.StartProcess` || log ~ `^\s*runtime\.newm1`)

Since the runtime.LockOSThread change in CL 46033, Go programs are crashing on Plan 9 when running commands with os/exec.

fatal error: unexpected signal during runtime execution
[signal sys: trap: fault read code=0x0 addr=0x0 pc=0x5ef]

runtime stack:
runtime.throw(0x36dca9, 0x2a)
	/tmp/workdir/go/src/runtime/panic.go:616 +0x6a
runtime.sigpanic()
	/tmp/workdir/go/src/runtime/os_plan9.go:79 +0x240

goroutine 1 [chan receive]:
syscall.startProcess(0x1087b310, 0x7, 0x1085db60, 0x4, 0x4, 0x107b7290, 0x10, 0x1087bc80, 0x107b7290)
	/tmp/workdir/go/src/syscall/exec_plan9.go:514 +0xac
syscall.StartProcess(0x1087b310, 0x7, 0x1085db60, 0x4, 0x4, 0x107b7290, 0x2, 0x4, 0x0, 0x0)
	/tmp/workdir/go/src/syscall/exec_plan9.go:525 +0x48
os.startProcess(0x1087b310, 0x7, 0x1085db60, 0x4, 0x4, 0x108a1c3c, 0x1084ca00, 0x3d, 0x3d)
	/tmp/workdir/go/src/os/exec_plan9.go:33 +0x191
os.StartProcess(0x1087b310, 0x7, 0x1085db60, 0x4, 0x4, 0x108a1c3c, 0x0, 0x0, 0x41dbde8)
	/tmp/workdir/go/src/os/exec.go:94 +0x48
os/exec.(*Cmd).Start(0x107c0580, 0x1086b201, 0x1076f7a0)
	/tmp/workdir/go/src/os/exec/exec.go:363 +0x3c1
os/exec.(*Cmd).Run(0x107c0580, 0x1076f7a0, 0x1086b858)
	/tmp/workdir/go/src/os/exec/exec.go:286 +0x20
os/exec.(*Cmd).CombinedOutput(0x107c0580, 0x1, 0x1, 0x1086b4a0, 0x3c, 0x3c)
	/tmp/workdir/go/src/os/exec/exec.go:506 +0xed
cmd/go/internal/work.(*Builder).gccSupportsFlag(0x107e2230, 0x10766ea8, 0x1, 0x1, 0x3661a1, 0x18, 0x0)
	/tmp/workdir/go/src/cmd/go/internal/work/build.go:3514 +0x437
cmd/go/internal/work.(*Builder).compilerCmd(0x107e2230, 0x35d3b1, 0x2, 0x35d387, 0x2, 0x35d2b1, 0x1, 0x0, 0x0, 0x302ce0, ...)
	/tmp/workdir/go/src/cmd/go/internal/work/build.go:3454 +0x61b
cmd/go/internal/work.(*Builder).GccCmd(0x107e2230, 0x35d2b1, 0x1, 0x0, 0x0, 0x107c29a0, 0xb, 0x16)
	/tmp/workdir/go/src/cmd/go/internal/work/build.go:3390 +0x64
cmd/go/internal/envcmd.MkEnv(0x1086a000, 0x3b, 0x3b)
	/tmp/workdir/go/src/cmd/go/internal/envcmd/env.go:76 +0x31a
main.main()
	/tmp/workdir/go/src/cmd/go/main.go:122 +0x479

goroutine 5 [syscall]:
os/signal.signal_recv(0x0, 0x0)
	/tmp/workdir/go/src/runtime/sigqueue_plan9.go:109 +0x69
os/signal.loop()
	/tmp/workdir/go/src/os/signal/signal_plan9.go:27 +0x19
created by os/signal.init.0
	/tmp/workdir/go/src/os/signal/signal_plan9.go:22 +0x36

goroutine 13 [runnable, locked to thread]:
syscall.Close(0x7, 0x1085dbe0, 0x5)
	/tmp/workdir/go/src/syscall/zsyscall_plan9_386.go:245 +0x86
syscall.forkExec(0x1087b310, 0x7, 0x1085db60, 0x4, 0x4, 0x107b7290, 0x108159d0, 0x0, 0x0)
	/tmp/workdir/go/src/syscall/exec_plan9.go:434 +0x4c0
syscall.startProcess.func1(0x1087b310, 0x7, 0x1085db60, 0x4, 0x4, 0x107b7290, 0x1084f600)
	/tmp/workdir/go/src/syscall/exec_plan9.go:488 +0x51
created by syscall.startProcess
	/tmp/workdir/go/src/syscall/exec_plan9.go:484 +0x80
2017/10/11 15:05:51 Failed: exit status: 'go 2462: 2'

See https://build.golang.org/log/c14eefc1c70d14ad2d0f768ff9f13222fa8d435c

@0intro 0intro added this to the Go1.10 milestone Oct 12, 2017
@0intro 0intro self-assigned this Oct 12, 2017
@0intro
Copy link
Member Author

0intro commented Oct 12, 2017

After disabling the use of template thread in newm, the build passes. However, I get another issue when running the net tests:

fatal error: runtime·lock: lock count
fatal error: runtime·unlock: lock count
fatal error: fatal error: systemstack called from unexpected goroutineruntime·lock: lock count

This issue seems to disappear once I add a call to runtime.UnlockOSThread in newAsyncIO. Why is it necessary?

Also, why the template thread doesn't work on Plan 9?

@aclements

@gopherbot
Copy link

Change https://golang.org/cl/70231 mentions this issue: runtime: disable use of template thread on Plan 9

@gopherbot
Copy link

Change https://golang.org/cl/70450 mentions this issue: internal/poll: call UnlockOSThread in Plan 9 netpoller

@0intro
Copy link
Member Author

0intro commented Oct 12, 2017

I propose CL 70231 and CL 70450 which should get the Plan 9 builders back to "ok".

@0intro
Copy link
Member Author

0intro commented Oct 13, 2017

The "lock count" panics are related to the change to terminate locked OS threads in CL 46038 (see goexit0).

As I said earlier, this issue can be worked around by calling runtime.UnlockOSThread in newAsyncIO (see CL 70450), so the OS threads are reused instead of being terminated.

I'm not sure why I get "lock count" panics when locked OS threads are terminated.

@0intro
Copy link
Member Author

0intro commented Oct 13, 2017

Any help would be kindly appreciated.

@ianlancetaylor
Copy link
Contributor

CC @aclements

@ianlancetaylor
Copy link
Contributor

The "lock count" panic means that the lock count of the m is negative. That clearly shouldn't happen, so the question is why it does happen. Can you get a stack trace of where it happens?

@0intro
Copy link
Member Author

0intro commented Oct 13, 2017

I couldn't manage to get a proper stack trace. It looks like the stacks are corrupted and it usually ends with a recursive panic.

The output usually looks like:

fatal error: runtime·lock: lock count
panic during panic
fatal error: systemstack called from unexpected goroutine

runtime stack:

stack trace unavailable
fatal error: systemstack called from unexpected goroutine
fatal error: runtime·lock: lock count
fatal error: systemstack called from unexpected goroutine
fatal error: systemstack called from unexpected goroutine
[...]
fatal error: systemstack called from unexpected goroutine
fatal error: systemstack called from unexpected goroutine
runtime: newstack called from g=0x1061ee00
	m=0x10620000 m->curg=0x10746fc0 m->g0=0x1061e0e0 m->gsignal=0x1061e000
runtime.recordForPanic(0x1ae92f, 0xd, 0xd)
	/usr/go/src/runtime/print.go:40 +0x1d
runtime.gwrite(0x1ae92f, 0xd, 0xd)
	/usr/go/src/runtime/print.go:89 +0x3c
runtime.printstring(0x1ae92f, 0xd)
	/usr/go/src/runtime/print.go:236 +0x6c
runtime.throw(0x1b7412, 0x2c)
	/usr/go/src/runtime/panic.go:610 +0x1e
runtime.badsystemstack()
	/usr/go/src/runtime/stubs.go:57 +0x2a
runtime.systemstack(0x1bb398)
	/usr/go/src/runtime/asm_386.s:371 +0x2e
runtime.startpanic()
	/usr/go/src/runtime/panic.go:589 +0x11
runtime.throw(0x1b7412, 0x2c)
	/usr/go/src/runtime/panic.go:615 +0x5e
[...]
fatal error: runtime: wrong goroutine in newstack

It looks like there is an issue in the way OS threads are terminated on Plan 9 and the stacks end being corrupted.

@golang golang deleted a comment Oct 14, 2017
@gopherbot
Copy link

Change https://golang.org/cl/71230 mentions this issue: runtime: don't terminate locked OS threads on Plan 9

@0intro
Copy link
Member Author

0intro commented Oct 16, 2017

I've replaced CL 70450 by CL 71230 so it makes more clear what is the problem worked around in this change.

I still don't know why termination of OS threads isn't working properly on Plan9. I'm investigating.

@0intro
Copy link
Member Author

0intro commented Oct 16, 2017

To sum it up, there are two issues on Plan 9 caused by the recent runtime changes.

  • CL 46033, which implements "template threads", lead to fault read/write panic when running os.StartProcess (worked around in CL 70231).
  • CL 46037 and CL 46038, which implement termination of locked OS thread, lead to lock count panic when using runtime.LockOSThread without calling runtime.UnlockOSThread (notably in the net tests) (worked around in CL 71230).

The first issue prevents Go to be compiled because cmd/go uses os.StartProcess, which uses runtime.LockOSThread. Once the first issue worked around, the second issues prevents Go to complete the net tests, because implementation of network timelines use runtime.LockOSThread.

gopherbot pushed a commit that referenced this issue Oct 17, 2017
CL 46033 added a "template thread" mechanism to
allow creation of thread with a known-good state
from a thread of unknown state.

However, we are experiencing issues on Plan 9
with programs using the os/exec and net package.
These package are relying on runtime.LockOSThread.

Updates #22227.

Change-Id: I85b71580a41df9fe8b24bd8623c064b6773288b0
Reviewed-on: https://go-review.googlesource.com/70231
Run-TryBot: David du Colombier <0intro@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
gopherbot pushed a commit that referenced this issue Oct 17, 2017
CL 46037 and CL 46038 implemented termination of
locked OS threads when the goroutine exits.

However, this behavior leads to crashes of Go programs
using runtime.LockOSThread on Plan 9. This is notably
the case of the os/exec and net packages.

This change disables termination of locked OS threads
on Plan 9.

Updates #22227.

Change-Id: If9fa241bff1c0b68e7e9e321e06e5203b3923212
Reviewed-on: https://go-review.googlesource.com/71230
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017
@ianlancetaylor
Copy link
Contributor

@0intro Do you know what the status of this issue is? Thanks.

@0intro
Copy link
Member Author

0intro commented Jul 10, 2018

The issue has been worked around with CL 70231 and CL 71230. Unfortunately, I've not been able to find a better solution yet.

Maybe we can postpone this issue to Go 1.12.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Go1.12 Jul 10, 2018
@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 10, 2018
@bradfitz bradfitz removed this from the Go1.12 milestone Jan 18, 2019
@bradfitz bradfitz added this to the Unplanned milestone Jan 18, 2019
@gopherbot
Copy link

Change https://golang.org/cl/233797 mentions this issue: dashboard: change "plan9" SlowBot alias to refer to plan9-arm

gopherbot pushed a commit to golang/build that referenced this issue May 13, 2020
The plan9-arm builder is currently passing fairly reliably on the
dashboard, whereas the plan9-386-0intro builder is frequently failing
with an "unexpected fault address".

Until the plan9-386-0intro build is fixed, the plan9-arm builder seems
more useful as a SlowBot to check architecture-agnostic changes that
affect plan9.

Updates golang/go#35456
Updates golang/go#22227

Change-Id: Ib5ecddafec91ca9274f9f0e15cb98621b41dda85
Reviewed-on: https://go-review.googlesource.com/c/build/+/233797
Run-TryBot: Bryan C. Mills <bcmills@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@bcmills bcmills added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Mar 16, 2023
@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- goos == "plan9" && log ~ `fatal error: unexpected signal` && log ~ `^\s*runtime\.newm1`
2023-03-15 07:19 plan9-arm go@71f82df1 cmd/vet.TestVet (log)
--- FAIL: TestVet (0.01s)
    --- FAIL: TestVet/print (12.44s)
        vet_test.go:146: error check failed: 
            Unmatched Errors:
            fatal error: unexpected signal during runtime execution
            [signal sys: fp: code=0x0 addr=0x0 pc=0x73ac4]
            runtime stack:
            runtime.throw({0x33dd8f, 0x2a})
            	../../runtime/panic.go:1076 +0x4c fp=0x3ffffcb0 sp=0x3ffffc9c pc=0x3e054
            runtime.sigpanic()
...
            runtime.systemstack_switch()
            	../../runtime/asm_arm.s:274 +0x4 fp=0x8c37f44 sp=0x8c37f40 pc=0x711f0
            runtime.gcMarkDone()
            	../../runtime/mgc.go:806 +0xf4 fp=0x8c37f80 sp=0x8c37f44 pc=0x1b7ac
            runtime.gcBgMarkWorker()
            	../../runtime/mgc.go:1406 +0x3dc fp=0x8c37fec sp=0x8c37f80 pc=0x1d66c
            runtime.goexit()
            	../../runtime/asm_arm.s:831 +0x4 fp=0x8c37fec sp=0x8c37fec pc=0x72d5c
            created by runtime.gcBgMarkStartWorkers in goroutine 1
            	../../runtime/mgc.go:1198 +0x20

watchflakes

@bcmills bcmills removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Plan9
Projects
Status: No status
Development

No branches or pull requests

6 participants