runtime: fatal errors on netbsd #13945

Closed
rillig opened this issue Jan 14, 2016 · 14 comments

@rillig
Contributor

rillig commented Jan 14, 2016

  • go version go1.5.2 netbsd/amd64 (used for bootstrapping the master)
  • master 66a7097 (being built)
  • NetBSD 7.0 (GENERIC.201509250726Z) amd64

I ran the following command 14 times. It always failed to build the Go distribution, with differing results. Attached are the outputs. I saved the intermediate files, so if you need them I can provide them as well.

$ env GOROOT_BOOTSTRAP=$HOME/pkg/go GOROOT_FINAL=$HOME/gossa GOARCH=amd64 GOOS=netbsd bash ./all.bash
@mikioh mikioh changed the title Go master 66a7097 does not build on NetBSD/amd64 runtime: Go master 66a7097 does not build on NetBSD/amd64 Jan 14, 2016
@mikioh mikioh changed the title runtime: Go master 66a7097 does not build on NetBSD/amd64 runtime: fatal error: exitsyscall: syscall frame is no longer valid on netbsd/amd64 Jan 14, 2016
@mikioh
Contributor

mikioh commented Jan 14, 2016

Dying message in try014.txt:

ok      crypto/rand     0.040s
ok      crypto/rc4      0.342s
ok      crypto/rsa      0.241s
ok      crypto/rsa      0.241s
fatal error: exitsyscall: syscall frame is no longer valid
panic during panic

goroutine 0 [idle]:
runtime.startpanic_m()
        /home/rillig/gossa/src/runtime/panic.go:587 +0x13a
runtime.systemstack(0xac66b0)
        /home/rillig/gossa/src/runtime/asm_amd64.s:307 +0xab
runtime.startpanic()
        /home/rillig/gossa/src/runtime/panic.go:508 +0x14
runtime.sighandler(0xc800000005, 0xc820009c00, 0xc820009c80, 0xc820357380)
        /home/rillig/gossa/src/runtime/signal_amd64x.go:158 +0x38c
runtime.sigtrampgo(0x5, 0xc820009c00, 0xc820009c80)
        /home/rillig/gossa/src/runtime/signal_sigtramp.go:48 +0xf2
runtime.sigtramp(0x100000005, 0x0, 0x0, 0x1, 0x0, 0xfffffe8003738de0, 0xfffffe803fa59378, 0x800000000001, 0xffffffff80718da6, 0xffffffff80fa6d40, ...)
        /home/rillig/gossa/src/runtime/sys_netbsd_amd64.s:252 +0x17
runtime.sigreturn_tramp(0x0, 0x0, 0x1, 0x0, 0xfffffe8003738de0, 0xfffffe803fa59378, 0x800000000001, 0xffffffff80718da6, 0xffffffff80fa6d40, 0x43, ...)
        /home/rillig/gossa/src/runtime/sys_netbsd_amd64.s:220

goroutine 22 [running]:
runtime.systemstack_switch()
        /home/rillig/gossa/src/runtime/asm_amd64.s:245 fp=0xc820c3ece0 sp=0xc820c3ecd8
runtime.startpanic()
        /home/rillig/gossa/src/runtime/panic.go:508 +0x14 fp=0xc820c3ecf0 sp=0xc820c3ece0
runtime.throw(0xa6dea0, 0x2d)
        /home/rillig/gossa/src/runtime/panic.go:529 +0x83 fp=0xc820c3ed08 sp=0xc820c3ecf0
runtime.exitsyscall(0x860fa0)
        /home/rillig/gossa/src/runtime/proc.go:2400 +0x62 fp=0xc820c3ed30 sp=0xc820c3ed08
syscall.Syscall6(0x49bd59, 0xc8207406a8, 0xc820c3edd8, 0x8, 0xc800000001, 0xc8207406a8, 0x4981cc, 0x9553a0, 0xc8207406a8, 0xc820c3edd8)
        /home/rillig/gossa/src/syscall/asm_netbsd_amd64.s:63 +0x61 fp=0xc820c3ed38 sp=0xc820c3ed30
created by main.(*builder).do
        /home/rillig/gossa/src/cmd/go/build.go:1315 +0x39e

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc82048682c)
        /home/rillig/gossa/src/runtime/sema.go:47 +0x26
sync.(*WaitGroup).Wait(0xc820486820)
        /home/rillig/gossa/src/sync/waitgroup.go:127 +0xb4
main.(*builder).do(0xc8201742a0, 0xc820bdaea0)
        /home/rillig/gossa/src/cmd/go/build.go:1318 +0x3c6
main.runTest(0xc766e0, 0xc82000ab20, 0xa1, 0xae)
        /home/rillig/gossa/src/cmd/go/test.go:595 +0x2836
main.main()
        /home/rillig/gossa/src/cmd/go/main.go:181 +0x783
(snip)

@mikioh mikioh added this to the Go1.6 milestone Jan 14, 2016
@ianlancetaylor
Contributor

These results look frankly impossible. Can anybody reproduce them on a different machine?

@rillig
Contributor Author

rillig commented Jan 14, 2016

I have some more details to share:

  • My NetBSD installation runs in a VirtualBox, hosted by Windows 10
  • I just tried to recompile the bootstrap compiler, and the unit tests failed as detailed below
ok      reflect 0.559s
ok      regexp  2.441s
ok      regexp/syntax   1.007s
--- FAIL: TestCgoExternalThreadSIGPROF (6.04s)
        crash_cgo_test.go:95: expected "OK\n", but got ""
FAIL
FAIL    runtime 157.238s
ok      runtime/debug   0.437s
exit status 255
FAIL    runtime/pprof   0.176s
ok      runtime/trace   39.029s
ok      sort    0.174s

I had built that compiler before without running the unit tests, and then I used that one as the bootstrap compiler. Maybe I shouldn’t have done that. Therefore, I’m currently rebuilding all my Go compilers from scratch, with unit tests enabled, hoping that it will work better.

@mikioh
Contributor

mikioh commented Jan 14, 2016

My netbsd7-amd64 VM can reproduce this sort of runtime crash (close to 100% of the time). It looks like the same problem as #13947, and it is easy to reproduce when GOMAXPROCS>1.

runtime: newstack sp=0xc820069e40 stack=[0xc820025800, 0xc820025fe0]
        morebuf={pc:0xc82001c2c0 sp:0xc820069e48 lr:0x0}
        sched={pc:0x4c5d92 sp:0xc820069e40 lr:0x0 ctxt:0x0}
created by os/exec.(*Cmd).Start
        /home/mikioh/go/src/os/exec/exec.go:345 +0x967
fatal error: runtime: stack split at bad time
panic during panic
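A rough sketch of the kind of load that might exercise this path, assuming (from the `created by os/exec.(*Cmd).Start` line and the GOMAXPROCS>1 remark) that spawning child processes from several goroutines at once is relevant. This is not a confirmed reproducer; `spawnConcurrently` and the choice of the `true` command are illustrative only.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"sync"
)

// spawnConcurrently starts `workers` goroutines, each of which runs the
// given command `iterations` times via os/exec, and returns how many runs
// returned an error. Hypothetical helper for illustration.
func spawnConcurrently(workers, iterations int, name string, args ...string) int {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		failures int
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				if err := exec.Command(name, args...).Run(); err != nil {
					mu.Lock()
					failures++
					mu.Unlock()
				}
			}
		}()
	}
	wg.Wait()
	return failures
}

func main() {
	// Make sure more than one P is available, matching the GOMAXPROCS>1
	// condition mentioned in the comment.
	if runtime.GOMAXPROCS(0) < 2 {
		runtime.GOMAXPROCS(2)
	}
	failed := spawnConcurrently(4, 50, "true")
	fmt.Printf("GOMAXPROCS=%d failed=%d\n", runtime.GOMAXPROCS(0), failed)
}
```

On a healthy runtime this should simply report zero failures; on an affected system the expectation would be a runtime throw rather than a counted failure.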

@mikioh mikioh changed the title runtime: fatal error: exitsyscall: syscall frame is no longer valid on netbsd/amd64 runtime: fatal errors on netbsd Jan 14, 2016
@ianlancetaylor
Contributor

Thanks, I find it easier to believe "stack split at bad time" than I do "syscall frame is no longer valid."

@gopherbot

CL https://golang.org/cl/18716 mentions this issue.

@gopherbot

CL https://golang.org/cl/18776 mentions this issue.

@rillig
Contributor Author

rillig commented Jan 21, 2016

I checked out https://go-review.googlesource.com/#/c/18776/3 and tried the following command again:

$ env GOROOT_BOOTSTRAP=$HOME/pkg/go GOROOT_FINAL=$HOME/gossa GOARCH=amd64 GOOS=netbsd bash ./all.bash

It still fails; the output is:

…
ok      os/exec 1.353s
runtime: newstack sp=0xc82001c6f0 stack=[0xc82001c000, 0xc82001c7e0]
    morebuf={pc:0x42a4a9 sp:0xc82001c6f8 lr:0x0}
    sched={pc:0x42ad0d sp:0xc82001c6f0 lr:0x0 ctxt:0x0}
syscall.Syscall(0x25, 0x8eb, 0x1e, 0x0, 0x8eb, 0xb3b, 0x0)
    panic during panic

goroutine 0 [idle]:
runtime.startpanic_m()
    /home/rillig/git/go/src/runtime/panic.go:587 +0x13a
runtime.systemstack(0x5e0d40)
    /home/rillig/git/go/src/runtime/asm_amd64.s:307 +0xab
runtime.startpanic()
    /home/rillig/git/go/src/runtime/panic.go:508 +0x14
runtime.sighandler(0xc800000005, 0xc82009dc00, 0xc82009dc80, 0xc820001680)
    /home/rillig/git/go/src/runtime/signal_amd64x.go:158 +0x38c
runtime.sigtrampgo(0x5, 0xc82009dc00, 0xc82009dc80)
    /home/rillig/git/go/src/runtime/signal_sigtramp.go:48 +0xf2
runtime.sigtramp(0x100000005, 0x0, 0x0, 0x1, 0x0, 0xfffffe800337daa0, 0xffffffff8094a64f, 0x232000, 0x300000000, 0xfffffe80159b1b98, ...)
    /home/rillig/git/go/src/runtime/sys_netbsd_amd64.s:252 +0x17
runtime.sigreturn_tramp(0x0, 0x0, 0x1, 0x0, 0xfffffe800337daa0, 0xffffffff8094a64f, 0x232000, 0x300000000, 0xfffffe80159b1b98, 0x22a000, ...)
    /home/rillig/git/go/src/runtime/sys_netbsd_amd64.s:220

goroutine 11 [syscall]:
runtime.throw(0x5cce40, 0x2d)
    /home/rillig/git/go/src/runtime/panic.go:524 +0x9 fp=0xc82001c710 sp=0xc82001c6f8
runtime.exitsyscall(0x48a6ab)
    /home/rillig/git/go/src/runtime/proc.go:2404 +0x62 fp=0xc82001c738 sp=0xc82001c710
syscall.Syscall(0x25, 0x8eb, 0x1e, 0x0, 0x8eb, 0xb3b, 0x0)
    /home/rillig/git/go/src/syscall/asm_netbsd_amd64.s:21 +0x5 fp=0xc82001c740 sp=0xc82001c738
syscall.Kill(0x8eb, 0x1e, 0x0, 0x0)
    /home/rillig/git/go/src/syscall/zsyscall_netbsd_amd64.go:657 +0x4b fp=0xc82001c780 sp=0xc82001c740
os/signal.TestStress.func2(0xc820066660, 0xc8200666c0)
    /home/rillig/git/go/src/os/signal/signal_test.go:96 +0x8d fp=0xc82001c7b0 sp=0xc82001c780
runtime.goexit()
    /home/rillig/git/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82001c7b8 sp=0xc82001c7b0
created by os/signal.TestStress
    /home/rillig/git/go/src/os/signal/signal_test.go:101 +0x10b

goroutine 5 [syscall]:
os/signal.signal_recv(0x7f7ff7f70078)
    /home/rillig/git/go/src/runtime/sigqueue.go:116 +0x132
os/signal.loop()
    /home/rillig/git/go/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
    /home/rillig/git/go/src/os/signal/signal_unix.go:28 +0x37

goroutine 10 [runnable]:
os/signal.TestStress.func1(0xc820066660, 0xc8200666c0)
    /home/rillig/git/go/src/os/signal/signal_test.go:81 +0x235
created by os/signal.TestStress
    /home/rillig/git/go/src/os/signal/signal_test.go:88 +0xdf
FAIL    os/signal   0.026s
ok      os/user 0.022s
ok      path    0.017s
ok      path/filepath   0.185s
ok      reflect 0.966s
ok      regexp  1.224s
ok      regexp/syntax   0.907s
--- FAIL: TestCgoExternalThreadSIGPROF (0.01s)
    crash_cgo_test.go:96: expected "OK\n", but got:
--- FAIL: TestSignalExitStatus (0.00s)
    crash_unix_test.go:145: test program succeeded unexpectedly
FAIL
FAIL    runtime 90.092s
ok      runtime/debug   0.977s
ok      runtime/internal/atomic 0.397s
exit status 255
FAIL    runtime/pprof   0.085s
ok      runtime/trace   30.445s
ok      sort    0.170s
…
ok      cmd/vet 13.394s
2016/01/21 05:56:16 Failed: exit status 1

@mikioh
Contributor

mikioh commented Jan 21, 2016

@rillig,

I think they are different bugs.

  • runtime/cgo related: TestCgoExternalThreadSIGPROF and TestSignalExitStatus failures
  • runtime related: throw("exitsyscall: syscall frame is no longer valid") followed by panic during panic
  • unknown: runtime/internal/atomic test failures

Let's open a new issue for each and keep this issue focused on the signal stack setup problem. I skimmed the existing issues, and it looks like the second one also happens on other platforms, while the others happen only on NetBSD.

@rillig
Contributor Author

rillig commented Jan 21, 2016

Created issues #14050, #14051, #14052 for the above three.

@krytarowski
Contributor

If there is a buildbot, I can contribute a NetBSD-7.0 amd64 buildslave.

@krytarowski
Contributor

@minux ^

@gopherbot

CL https://golang.org/cl/18814 mentions this issue.

@gopherbot

CL https://golang.org/cl/29971 mentions this issue.

gopherbot pushed a commit that referenced this issue Sep 28, 2016
… thread on dragonfly

This change reverts CL 18814 which is a workaround for older DragonFly
BSD kernels, and fixes #13945 and #13947 in a more general way the
same as other platforms except NetBSD.

This is a followup to CL 29491.

Updates #16329.

Change-Id: I771670bc672c827f2b3dbc7fd7417c49897cb991
Reviewed-on: https://go-review.googlesource.com/29971
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@golang golang locked and limited conversation to collaborators Sep 28, 2017