runtime: tests fail with "unexpected signal during runtime execution" when building on solaris #7860
Hi, sorry, I missed that comment/mail. This is on a box of my own, not on Joyent's cloud, so compared to them I guess one difference would be a newer platform image. The zone is a 13.4.1 image with 2GB RAM and no CPU restrictions, on a single Xeon X3440 in a Supermicro box. I got the same error on an older zone running the 13.1.0 image, but I would guess that if something relevant differs in the OS, it would be the base image.
For what it's worth, I spun up a zone on another machine. Platform image joyent_20140307T223339Z, base64 13.4.2 zone, 8 GB, Xeon E-1230. Clean install, `pkgin in build-essential mercurial`, create a user, clone the repo, `hg update default` (824f981dd4b7) and `./all.bash`. On this setup it's not 100% reproducible: if I just run ./all.bash repeatedly I get the above panic about half the time, and the other half the tests pass. Sorry for the obviously vague info, not sure what more I can provide. :/
Please post the output of these commands on the 2GB image:

```
/bin/prctl -n zone.max-swap $$
/bin/prctl -n zone.max-physical-memory $$
/bin/prctl -n zone.max-locked-memory $$
/bin/prctl -n zone.cpu-shares $$
/bin/prctl -n zone.cpu-cap $$
/bin/prctl -n zone.max-lwps $$
/bin/prctl -n process.max-data-size $$
/bin/prctl -n process.max-address-space $$
/bin/prctl -n process.max-cpu-time $$
```
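For convenience, the same output can be collected in one pass with a small loop over the resource-control names (purely a wrapper around the commands above):

```sh
for rc in zone.max-swap zone.max-physical-memory zone.max-locked-memory \
          zone.cpu-shares zone.cpu-cap zone.max-lwps \
          process.max-data-size process.max-address-space process.max-cpu-time; do
  /bin/prctl -n "$rc" $$
done
```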
```
jb@zlogin3:~ $ /bin/prctl -n zone.max-swap $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
zone.max-swap
        usage          40.5MB
        privileged      2.00GB      -   deny                    -
        system          16.0EB    max   deny                    -
jb@zlogin3:~ $ /bin/prctl -n zone.max-physical-memory $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
zone.max-physical-memory
        usage          41.9MB
        privileged      2.00GB      -   deny                    -
        system          16.0EB    max   deny                    -
jb@zlogin3:~ $ /bin/prctl -n zone.max-locked-memory $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
zone.max-locked-memory
        usage              0B
        privileged      2.00GB      -   deny                    -
        system          16.0EB    max   deny                    -
jb@zlogin3:~ $ /bin/prctl -n zone.cpu-shares $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
zone.cpu-shares
        usage             100
        privileged        100       -   none                    -
        system          65.5K     max   none                    -
jb@zlogin3:~ $ /bin/prctl -n zone.cpu-cap $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
zone.cpu-cap
        usage               0
        system          4.29G     inf   deny                    -
jb@zlogin3:~ $ /bin/prctl -n zone.max-lwps $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
zone.max-lwps
        usage              59
        privileged      2.00K       -   deny                    -
        system          2.15G     max   deny                    -
jb@zlogin3:~ $ /bin/prctl -n process.max-data-size $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
process.max-data-size
        privileged      16.0EB    max   deny                    -
        system          16.0EB    max   deny                    -
jb@zlogin3:~ $ /bin/prctl -n process.max-address-space $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
process.max-address-space
        privileged      16.0EB    max   deny                    -
        system          16.0EB    max   deny                    -
jb@zlogin3:~ $ /bin/prctl -n process.max-cpu-time $$
process: 14393: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION          RECIPIENT
process.max-cpu-time
        privileged      18.4Es    inf   signal=XCPU             -
        system          18.4Es    inf   none                    -
jb@zlogin3:~ $
```
This is possibly related to issue #7554. It seems to be caused by resource exhaustion, but we haven't yet figured out exactly what happens. If you say this happens in production, I will dedicate more time to fixing it. In the meantime, a core dump would be helpful.
Yes, it looks like that could be it - the stack trace is always in the GC when the crash happens. It's not "in production" in the sense that money is riding on it, just a personal project that I'd rather run directly on SmartOS than under KVM. How do I create such a core dump (as opposed to the usual stack print)?
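In general it takes two things: core dumps enabled for the process, and GOTRACEBACK=crash in the environment. A minimal sketch, assuming a runtime.test binary as in the failing run (the coreadm step needs root and is an assumption about this zone's config):

```sh
# inside the zone, as root: enable per-process core dumps
# (a global-dumps-only setup writes cores to the global pattern instead)
coreadm -e process
# allow unlimited-size core files in the shell running the test
ulimit -c unlimited
# GOTRACEBACK=crash makes the runtime abort after printing the
# traceback, so the kernel can write a core file
GOTRACEBACK=crash ./runtime.test -test.short
```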
Hm. Sorry, but I can't get that to work. I can get core dumps from non-Go processes by sending them a SIGABRT (so my coreadm settings etc. are OK), but Go binaries don't generate a core dump even with GOTRACEBACK=crash. I've tried with both this "real" crash and with manually sending SIGABRT and SIGQUIT, to no avail.

```
jb@zlogin3:~ $ ls -l /var/cores
total 0
jb@zlogin3:~ $ coreadm
     global core file pattern: /var/cores/%f.%n.%p.%t.core
     global core file content: default
       init core file pattern: /%Z/cores/core.%f.%p
       init core file content: default
            global core dumps: enabled
       per-process core dumps: disabled
      global setid core dumps: enabled
 per-process setid core dumps: disabled
     global core dump logging: enabled
jb@zlogin3:~ $ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 10
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32725
virtual memory          (kbytes, -v) unlimited
jb@zlogin3:~ $ sleep 100 &
[2] 35014
jb@zlogin3:~ $ kill -ABRT %
jb@zlogin3:~ $
[2]+  Abort                   (core dumped) sleep 100
jb@zlogin3:~ $ ls -l /var/cores/
total 2569
-rw-------   1 root     root     5527218 Apr 30 17:56 gsleep.zlogin3.35014.1398873368.core
jb@zlogin3:~ $ cat sleep.go
package main

import "time"

func main() {
	for {
		time.Sleep(1 * time.Second)
	}
}
jb@zlogin3:~ $ go build sleep.go
jb@zlogin3:~ $ GOTRACEBACK=crash ./sleep &
[2] 35032
jb@zlogin3:~ $ kill -ABRT %
SIGABRT: used by abort, replace SIGIOT in the future
PC=0xfffffd7fff2bf5e7

goroutine 0 [idle]:

goroutine 17 [syscall]:
runtime.notetsleepg(0xfffffd7ffefaff68, 0xdf8475800)
	/home/jb/go/src/pkg/runtime/lock_sema.c:263 +0x71 fp=0xfffffd7ffefaff40
runtime.MHeap_Scavenger()
	/home/jb/go/src/pkg/runtime/mheap.c:531 +0xa3 fp=0xfffffd7ffefaffa8
runtime.goexit()
	/home/jb/go/src/pkg/runtime/proc.c:1430 fp=0xfffffd7ffefaffb0
created by runtime.main
	/home/jb/go/src/pkg/runtime/proc.c:207

goroutine 16 [sleep]:
time.Sleep(0x3b9aca00)
	/home/jb/go/src/pkg/runtime/time.goc:39 +0x31
main.main()
	/home/jb/sleep.go:7 +0x26

rax     0x5b
rbx     0xfffffd7fff182a40
rcx     0xfffffd7ffef96000
rdx     0x0
rdi     0x0
rsi     0xfffffd7fffdff970
rbp     0xfffffd7fffdff960
rsp     0xfffffd7fffdff8f8
r8      0xfffffd7ffef88a40
r9      0x76
r10     0xfffffd7ffef96000
r11     0xfffffffffbc05648
r12     0x494700
r13     0xfffffd7fff337a00
r14     0x0
r15     0xfffffd7fffdff970
rip     0xfffffd7fff2bf5e7
rflags  0x247
cs      0x53
fs      0x0
gs      0x0
[2]+  Exit 2                  GOTRACEBACK=crash ./sleep
jb@zlogin3:~ $ ls -l /var/cores/
total 2569
-rw-------   1 root     root     5527218 Apr 30 17:56 gsleep.zlogin3.35014.1398873368.core
jb@zlogin3:~ $
```
I'm confused, your paste indicates the existence of a core file.

```
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$ cat panic.go
package main

func main() {
	panic("test")
}
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$ go build -o panic panic.go
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$ ls
panic*  panic.go
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$ ./panic
panic: test

goroutine 16 [running]:
runtime.panic(0x424d60, 0xc208000010)
	/home/aram/go/src/pkg/runtime/panic.c:279 +0xf5
main.main()
	/tmp/2/panic.go:4 +0x73

goroutine 17 [runnable]:
runtime.MHeap_Scavenger()
	/home/aram/go/src/pkg/runtime/mheap.c:507
runtime.goexit()
	/home/aram/go/src/pkg/runtime/proc.c:1446

goroutine 18 [runnable]:
bgsweep()
	/home/aram/go/src/pkg/runtime/mgc0.c:1891
runtime.goexit()
	/home/aram/go/src/pkg/runtime/proc.c:1446
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$ ls
panic*  panic.go
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$ GOTRACEBACK=crash ./panic
panic: test

goroutine 16 [running]:
runtime.panic(0x424d60, 0xc208000010)
	/home/aram/go/src/pkg/runtime/panic.c:279 +0xf5
main.main()
	/tmp/2/panic.go:4 +0x73
runtime.main()
	/home/aram/go/src/pkg/runtime/proc.c:243 +0x11a
runtime.goexit()
	/home/aram/go/src/pkg/runtime/proc.c:1446
created by _rt0_go
	/home/aram/go/src/pkg/runtime/asm_amd64.s:97 +0x132

goroutine 17 [runnable]:
runtime.MHeap_Scavenger()
	/home/aram/go/src/pkg/runtime/mheap.c:507
runtime.goexit()
	/home/aram/go/src/pkg/runtime/proc.c:1446
created by runtime.main
	/home/aram/go/src/pkg/runtime/proc.c:203

goroutine 18 [runnable]:
bgsweep()
	/home/aram/go/src/pkg/runtime/mgc0.c:1891
runtime.goexit()
	/home/aram/go/src/pkg/runtime/proc.c:1446
created by runtime.gc
	/home/aram/go/src/pkg/runtime/mgc0.c:2179
Abort (core dumped)
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$ ls
core  panic*  panic.go
04757c51-b5c2-4d23-9a69-1e9e305bc4da:2$
```
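Once a core file does exist, the stock Solaris tools can read it next to the binary; a quick sketch using the file names from the session above:

```sh
# one-shot stack dump of the thread(s) captured in the core
pstack core
# or inspect interactively with the modular debugger:
# '::status' shows why the process aborted, '$C' prints the C-level stack
mdb ./panic core
```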
The core file that is created is from the regular system /opt/local/bin/sleep, just to make sure the system actually does write core files - it was maybe not the clearest example in the world to use the same name for my test binary. But panic.go gives me the same result: no "Abort (core dumped)" at the end and no core file. Am I on the wrong Go version?

go version devel +f8b50ad4cac4 Mon Apr 21 17:00:27 2014 -0700 solaris/amd64

```
jb@zlogin3:~ $ cat >panic.go
package main

func main() {
	panic("test")
}
jb@zlogin3:~ $ go build panic.go
jb@zlogin3:~ $ ./panic
panic: test

goroutine 16 [running]:
runtime.panic(0x425ba0, 0xc208000010)
	/home/jb/go/src/pkg/runtime/panic.c:279 +0xf5
main.main()
	/home/jb/panic.go:4 +0x61

goroutine 17 [runnable]:
runtime.MHeap_Scavenger()
	/home/jb/go/src/pkg/runtime/mheap.c:507
runtime.goexit()
	/home/jb/go/src/pkg/runtime/proc.c:1430

goroutine 18 [runnable]:
bgsweep()
	/home/jb/go/src/pkg/runtime/mgc0.c:1960
runtime.goexit()
	/home/jb/go/src/pkg/runtime/proc.c:1430
[2]
jb@zlogin3:~ $ GOGCTRACE=crash ./panic
panic: test

goroutine 16 [running]:
runtime.panic(0x425ba0, 0xc208000010)
	/home/jb/go/src/pkg/runtime/panic.c:279 +0xf5
main.main()
	/home/jb/panic.go:4 +0x61

goroutine 17 [runnable]:
runtime.MHeap_Scavenger()
	/home/jb/go/src/pkg/runtime/mheap.c:507
runtime.goexit()
	/home/jb/go/src/pkg/runtime/proc.c:1430

goroutine 18 [runnable]:
bgsweep()
	/home/jb/go/src/pkg/runtime/mgc0.c:1960
runtime.goexit()
	/home/jb/go/src/pkg/runtime/proc.c:1430
[2]
jb@zlogin3:~ $ ls -l /var/cores/
total 2569
-rw-------   1 root     root     5527218 Apr 30 17:56 gsleep.zlogin3.35014.1398873368.core
```
Please update and try again; specifically, check out https://golang.org/cl/97800045/
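That is, roughly (assuming the Mercurial checkout from earlier in the thread):

```sh
cd go/src
hg pull            # fetch the latest revisions, including that CL
hg update default  # move the working copy to the tip of default
./all.bash         # rebuild and rerun the tests
```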
Indeed! Here's a core file and the corresponding runtime.test binary, run as `runtime.test -test.short -test.cpu 1,2,4`. Attachments:
|
Actually no, sorry - I was too quick there. It does crash, but with "fatal error: runtime: mcall called on m->g0 stack", i.e. issue #6193, though I guess in a new context.
Ah, glad to hear it - I was concerned for a moment. Please post the stack trace you get with GOTRACEBACK=2 in issue #6193.
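For reference, that would be something like the following (flags taken from the earlier runs; the output file name is just an example):

```sh
# GOTRACEBACK=2 prints all goroutines, including runtime frames
GOTRACEBACK=2 ./runtime.test -test.short -test.cpu 1,2,4 2>traceback.txt
```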
This issue was closed.