runtime: garbage collection crash in freebsd/386 runtime running on freebsd/amd64 #2675
Comments
On tip:
    [dho@meep /usr/home/dho/go-old/src]$ GOARCH=386 gdb73.1 --args /home/dho/go-old/bin/go install -a -v std
    GNU gdb (GDB) 7.3.1 [GDB v7.3.1 for FreeBSD]
    Copyright (C) 2011 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-portbld-freebsd8.1".
    For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>...
    Reading symbols from /usr/home/dho/go-old/bin/go...done.
    (gdb) r
    Starting program: /usr/home/dho/go-old/bin/go install -a -v std
    runtime
    Program received signal SIGSEGV, Segmentation fault.
    nextgandunlock () at /usr/home/dho/go-old/src/pkg/runtime/./proc.c:602
    602 if(m->helpgc) {
    (gdb) bt
    #0 nextgandunlock () at /usr/home/dho/go-old/src/pkg/runtime/./proc.c:602
    #1 0x08067a41 in schedule (gp=void) at /usr/home/dho/go-old/src/pkg/runtime/./proc.c:856
    #2 0x0806ec3c in runtime.mcall (fn=void) at /usr/home/dho/go-old/src/pkg/runtime/./asm_386.s:172
    #3 0x3825d000 in ?? ()
    Backtrace stopped: previous frame inner to this frame (corrupt stack?) |
Owner changed to builder@golang.org. |
Are you sure?
    [dho@meep ~/go/src]$ GOARCH=386 ./all.bash
    ...
    # Building packages and commands.
    runtime
    ./make.bash: line 67: 23446 Segmentation fault: 11 (core dumped) ../bin/tool/go_bootstrap install -a -v std
    [dho@meep ~/go/src]$ hg summ
    parent: 11883:4a0c77722a5e tip
    gc: diagnose field+method of same name
    branch: default
    commit: 10 unknown (clean)
    update: (current)
    Program received signal SIGSEGV, Segmentation fault.
    runtime.exitsyscall () at /home/dho/go/src/pkg/runtime/proc.c:956
    956 runtime·exitsyscall(void)
    (gdb) bt
    #0 runtime.exitsyscall () at /home/dho/go/src/pkg/runtime/proc.c:956
    #1 0x08104e97 in syscall.Syscall () at /home/dho/go/src/pkg/syscall/asm_freebsd_386.s:34
    #2 0x081077ae in syscall.Read (fd=4, p=..., n=3, err=...) at /home/dho/go/src/pkg/syscall/zsyscall_freebsd_386.go:810
    #3 0x080994e2 in os.(*File).read (f=0x38545540, b=..., n=134748239, err=...) at /home/dho/go/src/pkg/os/file_unix.go:163
    #4 0x08097b2d in os.(*File).Read (f=0x38545540, b=..., n=0, err=...) at /home/dho/go/src/pkg/os/file.go:60
    #5 0x080816c7 in bytes.(*Buffer).ReadFrom (b=0x382e3720, r=..., n=0, err=...) at /home/dho/go/src/pkg/bytes/buffer.go:153
    #6 0x080955be in io.Copy (dst=..., src=..., written=0, err=...) at /home/dho/go/src/pkg/io/io.go:326
    #7 0x0809ed3c in os/exec._func_003 (&w=void, &pr=void, noname=void) at /home/dho/go/src/pkg/os/exec/exec.go:201
    #8 0x383d2765 in ?? ()
    Backtrace stopped: previous frame inner to this frame (corrupt stack?) |
Why does the message say:
    Starting program: /usr/home/dho/go-old/bin/cgo -- cgo_unix.go
    warning: `/usr/libexec/ld-elf.so.1': Shared library architecture i386:x86-64 is not compatible with target architecture i386.
    warning: `/usr/libexec/ld-elf.so.1': Shared library architecture i386:x86-64 is not compatible with target architecture i386.
Are you on an x86-64 machine doing a 386 cross-compile? Status changed to Accepted. |
> Are you sure? Sure, majidesu. --- cd ../test 0 known bugs; 0 unexpected bugs ALL TESTS PASSED --- Installed Go for freebsd/386 in /home/mikioh/go Installed commands in /home/mikioh/go/bin vm5% uname -a FreeBSD vm5.localdomain 8.2-RELEASE-p3 FreeBSD 8.2-RELEASE-p3 #0: Tue Sep 27 18:07:27 UTC 2011 root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC i386 |
Hi Devon, We assume you are using the i386 runtime on freebsd/amd64, correct? I have no experience setting up an i386 runtime on freebsd/amd64 along the following lines: cd /usr/src; make build32; make install32; ldconfig -v -m -R /usr/lib32. If so, I'm not sure whether it's worth diving into. Labels changed: added os-freebsd. |
Ack, sorry! I missed the updates here in the floods of stuff in my inbox. Yeah, this is indeed a 386 crossbuild (though for some reason I thought it was a 386 machine). This machine has 32-bit compat installed, but I'm in the process of upgrading it to RELENG_9 right now, so it'll be a bit before I can test this again. (Previously was RELENG_8). |
Devon, if you can still produce these crashes on demand, please post a core file and corresponding binary as an attachment in this issue. If the Go SIGSEGV handler is keeping the kernel from creating a core file, please edit src/pkg/runtime/signals_freebsd.h to change /* 11 */ P, "SIGSEGV: segmentation violation", to /* 11 */ 0, "SIGSEGV: segmentation violation", which will keep the Go runtime from trying to handle the signal. Thanks. Russ |
In addition to the attached core files, I'm seeing panics with "throw: entersyscall" and "runtime: split stack overflow". This is basically doing GOARCH=386 ./all.bash. Attached is a core file. It's actually pretty painful for me to get these due to all the cleanup that the go tool does, so if you need more information, let me know. I'm not able to deduce what's going on based on the binary/core file, and for some reason gdb 7.3.1 stopped working for me with Go programs. (In this case, I get "/usr/home/dho/go/src/./pkg/time/time.test.core" is not a core dump: File format is ambiguous, but when running live programs I'll frequently get other things). I'll keep this core / file around in case there's anything extra you'd like me to do. If you have any pointers as to what might be going on, that'd be great -- I can probably fix this, just not sure where to start. Some relevant info:
    [dho@meep ~/go/src]$ gdb73.1 time.test
    GNU gdb (GDB) 7.3.1 [GDB v7.3.1 for FreeBSD]
    Copyright (C) 2011 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
    This GDB was configured as "x86_64-portbld-freebsd8.1".
    For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>...
    Reading symbols from /usr/home/dho/go/src/time.test...done.
    (gdb) r
    Starting program: /usr/home/dho/go/src/time.test
    Program received signal SIGSEGV, Segmentation fault.
    runtime.exitsyscall () at /usr/home/dho/go/src/pkg/runtime/proc.c:966
    966 runtime·exitsyscall(void)
    (gdb) bt
    #0 runtime.exitsyscall () at /usr/home/dho/go/src/pkg/runtime/proc.c:966
    #1 0x0805db42 in timerproc () at /home/dho/go/src/pkg/runtime/time.goc:3488
    #2 0x08055292 in schedunlock () at /usr/home/dho/go/src/pkg/runtime/proc.c:259
    #3 0x00000000 in ?? ()
    (gdb) r
    The program being debugged has been started already.
    Start it from the beginning? (y or n) y
    Starting program: /usr/home/dho/go/src/time.test
    throw: entersyscall

    goroutine 1 [syscall]:

    goroutine 2 [chan receive]:
    testing.RunTests(0x8048c00, 0x81f6f00, 0x2c, 0x2c, 0x81e8101, ...)
        /usr/home/dho/go/src/pkg/testing/testing.go:347 +0x6a7
    testing.Main(0x8048c00, 0x81f6f00, 0x2c, 0x2c, 0x81f5cd8, ...)
        /usr/home/dho/go/src/pkg/testing/testing.go:282 +0x46
    main.main()
        /tmp/go-build848228700/time/_test/_testmain.go:153 +0x4e
    created by _rt0_386
        /usr/home/dho/go/src/pkg/runtime/asm_386.s:80 +0xbe

    goroutine 3 [sleep]:
    time.Sleep(0x5f5e100, 0x0)
        /usr/home/dho/go/src/pkg/runtime/ztime_386.c:21 +0x4a
    time_test.TestSleep(0x3820fb80, 0xe)
        /usr/home/dho/go/src/pkg/time/sleep_test.go:24 +0x69
    testing.tRunner(0x3820fb80, 0x81f6f00, 0x0)
        /usr/home/dho/go/src/pkg/testing/testing.go:271 +0x6e
    created by testing.RunTests
        /usr/home/dho/go/src/pkg/testing/testing.go:346 +0x687

    goroutine 4 [runnable]:
    time.Sleep(0x2faf080, 0x0)
        /usr/home/dho/go/src/pkg/runtime/ztime_386.c:21 +0x4a
    time_test._func_001()
        /usr/home/dho/go/src/pkg/time/sleep_test.go:20 +0x2b
    created by time_test.TestSleep
        /usr/home/dho/go/src/pkg/time/sleep_test.go:22 +0x2d

    goroutine 5 [syscall]:
    created by addtimer
        /usr/home/dho/go/src/pkg/runtime/ztime_386.c:69
    [Inferior 1 (process 54458) exited with code 02]
    (gdb) r
    Starting program: /usr/home/dho/go/src/time.test
    throw: gosched of g0

    goroutine 1 [syscall]:

    goroutine 2 [chan receive]:
    testing.RunTests(0x8048c00, 0x81f6f00, 0x2c, 0x2c, 0x81e8101, ...)
        /usr/home/dho/go/src/pkg/testing/testing.go:347 +0x6a7
    testing.Main(0x8048c00, 0x81f6f00, 0x2c, 0x2c, 0x81f5cd8, ...)
        /usr/home/dho/go/src/pkg/testing/testing.go:282 +0x46
    main.main()
        /tmp/go-build848228700/time/_test/_testmain.go:153 +0x4e
    created by _rt0_386
        /usr/home/dho/go/src/pkg/runtime/asm_386.s:80 +0xbe

    goroutine 3 [runnable]:
    time.Sleep(0x5f5e100, 0x0)
        /usr/home/dho/go/src/pkg/runtime/ztime_386.c:21 +0x4a
    time_test.TestSleep(0x3820fb80, 0xe)
        /usr/home/dho/go/src/pkg/time/sleep_test.go:24 +0x69
    testing.tRunner(0x3820fb80, 0x81f6f00, 0x0)
        /usr/home/dho/go/src/pkg/testing/testing.go:271 +0x6e
    created by testing.RunTests
        /usr/home/dho/go/src/pkg/testing/testing.go:346 +0x687

    goroutine 5 [syscall]:
    created by addtimer
        /usr/home/dho/go/src/pkg/runtime/ztime_386.c:69
    [Inferior 1 (process 54460) exited with code 02]
Attachments:
|
The core you posted is dying in nextgandunlock after a call to notesleep returns:
    nextgandunlock+0x15e 0x08055a62 MOVL GS:fffffffc,AX
    nextgandunlock+0x165 0x08055a69 ADDL $84,AX
    nextgandunlock+0x16a 0x08055a6e MOVL AX,0(SP)
    nextgandunlock+0x16d 0x08055a71 CALL runtime.notesleep(SB)
    nextgandunlock+0x172 0x08055a76 MOVL GS:fffffffc,AX <<<<<
    nextgandunlock+0x179 0x08055a7d MOVL 74(AX),AX
    nextgandunlock+0x17c 0x08055a80 CMPL AX,$0
This strongly suggests that the thread-local storage is being reset or otherwise mishandled. The fault is _reading_ the thread-local storage word, not _using_ it. So it is like our thread-local storage disappeared completely! Does FreeBSD have cgo? I wonder if it is messing things up. TLS mishaps causing g not to point at a G structure would explain the throw("entersyscall") and the runtime split stack overflow failures too. Maybe it would make sense to try to use thr_new's tls_base instead of doing it ourselves in the new threads. Note that for bizarre ELF reasons, tls_base points _after_ the tls section. So you'd want to try making m->tls be an array of void*, then set tls[0] = g and tls[1] = m and then use &tls[2] as tls_base in the thr_new parameters. |
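The slot layout suggested above can be sketched in portable C. This is only an illustration, not the runtime's code: the G and M structs are reduced stand-ins, and tls_setup, tls_g and tls_m are hypothetical helper names. It does not call thr_new; it only shows why the value handed over as tls_base would be &tls[2], so that the two slots sit at negative offsets from the base (on 386, with 4-byte pointers, that is -8 for g and -4 for m, the latter matching the GS:fffffffc load in the disassembly).

```c
#include <assert.h>
#include <stddef.h>

/* Reduced, hypothetical stand-ins for the runtime's G and M structures. */
typedef struct G { int goid; } G;
typedef struct M {
    int   helpgc;
    void *tls[2];            /* tls[0] holds g, tls[1] holds m */
} M;

/*
 * Fill the two TLS slots and return the address just past them.
 * That address is what would be passed as tls_base: the slots then
 * live at negative offsets from it. The returned pointer is a
 * one-past-the-end pointer and is never dereferenced at index 0.
 */
static void **tls_setup(M *m, G *g)
{
    assert(m != NULL && g != NULL);
    m->tls[0] = g;
    m->tls[1] = m;
    return &m->tls[2];
}

/* Read g and m back the way the generated code does: below the base. */
static G *tls_g(void **tls_base) { return (G *)tls_base[-2]; }
static M *tls_m(void **tls_base) { return (M *)tls_base[-1]; }
```

If the base were instead left at &tls[0], the same two loads would read whatever happens to sit just before the array, which is the kind of bogus g or m that could produce the throws reported earlier in the thread.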
Yes, FreeBSD has cgo support. For proper tls handling on amd64, feel free to use http://golang.org/cl/5689065/ . I haven't had time to finish the same for FreeBSD/386. |
Issue #3115 has been merged into this issue. |
On 25 February 2012 at 13:08, <go@googlecode.com> wrote: The change you made in 5689065 works for me on amd64. I'm still unable to get an i386 version of this put together -- partially because I don't have a FreeBSD/i386 machine, and also partially because cross-compiling and running the 386 binary on the amd64 machine just doesn't work for me when I try to set it up "properly." I'd definitely appreciate input / suggestions for how to go about this, because the "straightforward" fix doesn't seem to work and my knowledge of i386 is significantly worse than my knowledge of amd64. --dho |
This happens on a 64-bit system compiling with GOARCH=386. It looks like somehow the tls pointer is being set to &m->tls[0] instead of &m->tls[0] + 2*sizeof(uintptr), at least if uc->uc_mcontext.gsbase is to be believed. This would happen if setldt were being ignored and thr_new's param.tls_base were used instead. However, I tried fixing param.tls_base and commenting out the settls in thr_start and that did not fix anything. Since this only happens on a cross-compile, I think this can wait until after Go 1. Labels changed: added priority-later, removed priority-go1. |
Issue #3452 has been merged into this issue. |
FWIW the FreeBSD team has just committed two fixes related to running i386 LDT code on amd64 kernels. http://svnweb.freebsd.org/base?view=revision&revision=266846 http://svnweb.freebsd.org/base?view=revision&revision=266901 It is possible these fix the problem. There is the beginning of a discussion here: http://lists.freebsd.org/pipermail/freebsd-amd64/2014-May/thread.html#16093 I do not know whether or when these patches will hit earlier versions of FreeBSD. We will fix the one error-checking problem identified on that thread, but it's minor compared to the FreeBSD fixes. |
CL https://golang.org/cl/99680044 mentions this issue. |
This issue was updated by revision 19c8f67. The code here was using the error check for Linux/386, not the one for FreeBSD/386. Most of the time it worked. Thanks to Neel Natu (FreeBSD developer) for finding this. The s/JCC/JAE/ a few lines later is a no-op but makes the test match the rest of the file. Why we write JAE instead of JCC I don't know, but the two are equivalent and the file might as well be consistent. LGTM=bradfitz, minux R=golang-codereviews, bradfitz, minux CC=golang-codereviews https://golang.org/cl/99680044 |
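For readers unfamiliar with the difference between the two error checks, here is a rough C model of it. It assumes the usual conventions: on Linux/386 a failed system call returns -errno in AX, while on FreeBSD/386 failure is signalled by the carry flag with a positive errno left in AX. The function names are made up for illustration and are not taken from the runtime source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Linux/386 style: failure is -errno in AX, which as an unsigned
 * value falls in the range [0xfffff001, 0xffffffff]. */
static bool linux386_syscall_failed(uint32_t ax)
{
    return ax >= 0xfffff001u;
}

/* FreeBSD/386 style: failure is reported through the carry flag;
 * AX then holds a positive errno and is not inspected here. */
static bool freebsd386_syscall_failed(uint32_t ax, bool carry)
{
    (void)ax;
    return carry;
}

/* True when a FreeBSD-style failure slips past the Linux-style test. */
static bool linux_check_misses(uint32_t ax, bool carry)
{
    return freebsd386_syscall_failed(ax, carry) && !linux386_syscall_failed(ax);
}

/* Self-check: EINVAL (22) reported the FreeBSD way is not caught by
 * the Linux-style comparison. */
static void demo(void)
{
    assert(freebsd386_syscall_failed(22, true));
    assert(!linux386_syscall_failed(22));
    assert(linux_check_misses(22, true));
}
```

Under this model a successful call passes both tests, which would be consistent with "most of the time it worked"; only a failing call goes unnoticed by the Linux-style check.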
Neel Natu tells me that both of the FreeBSD fixes will be merged into the FreeBSD stable branches in the next 2-3 weeks. Can someone please post a comment once you've seen the binaries working on a FreeBSD stable kernel? Thanks. Owner changed to @rsc. |
This was merged into the FreeBSD stable branches: http://svnweb.freebsd.org/base?view=revision&revision=267085 http://svnweb.freebsd.org/base?view=revision&revision=267086 http://svnweb.freebsd.org/base?view=revision&revision=267083 |
With Go 1.5 on FreeBSD built from the FreeBSD stable branches, GOARCH=386 CGO_ENABLED=1 make.bash now runs on a freebsd/amd64 host. I just confirmed it on freebsd-amd64 10.1-RELEASE-p10 and freebsd9-amd64 9.3-RELEASE-p13.