runtime: crash in glibc in new thread #58422

Closed
yangboyd opened this issue Feb 9, 2023 · 8 comments
Labels
compiler/runtime: Issues related to the Go compiler and/or runtime.
FrozenDueToAge
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
WaitingForInfo: Issue is not actionable because of missing required information, which needs to be provided.

Comments

@yangboyd

yangboyd commented Feb 9, 2023

What version of Go are you using (go version)?

$ go version

go version go1.19.5 linux/amd64

Does this issue reproduce with the latest release?

maybe

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.5"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1914819968=/tmp/go-build -gno-record-gcc-switches

What did you do?

Go calls a C library via cgo.
The C library code calls isspace.
isspace reads memory that is initialized in __ctype_init, and that state is lost after clone.

This is similar to #29689, but this bug occurs after a thread clone.

What did you expect to see?

No crash.

What did you see instead?

Crash.

Thread 8 "xxxxxxxx" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc95ef700 (LWP 3523)]
0x00007fffc85eebd0 in __nss_readline (fp=0x7fffd0103540, buf=0x7fffc95ebd20 "root:x:0:0:root:/root:/bin/bash\n", len=<optimized out>, poffset=0x7fffc95ebbb0) at ./nss_readline.c:68
68	      while (isspace (*p))
(gdb) bt
#0  0x00007fffc85eebd0 in __nss_readline (fp=0x7fffd0103540, buf=0x7fffc95ebd20 "root:x:0:0:root:/root:/bin/bash\n", len=<optimized out>, poffset=0x7fffc95ebbb0) at ./nss_readline.c:68
#1  0x00007fffc85ebc1f in internal_getent (stream=0x7fffd0103540, result=result@entry=0x7fffc95ebcf0, buffer=buffer@entry=0x7fffc95ebd20 "root:x:0:0:root:/root:/bin/bash\n", buflen=buflen@entry=1024, 
    errnop=errnop@entry=0x7fffc95ef6a8) at nss_files/files-XXX.c:152
#2  0x00007fffc85ebff3 in _nss_files_getpwuid_r (uid=0, result=0x7fffc95ebcf0, buffer=0x7fffc95ebd20 "root:x:0:0:root:/root:/bin/bash\n", buflen=1024, errnop=0x7fffc95ef6a8) at nss_files/files-pwd.c:39
#3  0x0000000000ce586d in getpwuid_r ()
#4  0x0000000000c907d2 in cuserid ()
#5  0x0000000000c69d2b in xxxx1 ()
#6  0x0000000000c67a3a in xxxx2 ()
#7  0x0000000000c64c37 in xxxx3 ()
#8  0x0000000000c652a6 in xxxx4 ()
#9  0x0000000000c5e26f in _cgo_f28213f6c123_Cfunc_xxxx4 (v=0xc000524888) at cgo-gcc-prolog:55
#10 0x000000000046d424 in runtime.asmcgocall () at /root/.cache/xxxxx-go/src/runtime/asm_amd64.s:848
#11 0x000000c000058a00 in ?? ()
#12 0x000000c00010a8a8 in ?? ()
#13 0x00007fffc95ef1e0 in ?? ()
#14 0x000000000041e665 in runtime.gcAssistAlloc.func1 () at /root/.cache/xxxxx-go/src/runtime/mgcmark.go:475
#15 0x000000000046b5a9 in runtime.systemstack () at /root/.cache/xxxxx-go/src/runtime/asm_amd64.s:496
#16 0x0000000000000280 in ?? ()
#17 0x01007fffac000020 in ?? ()
#18 0x0000000000800000 in regexp/syntax.(*parser).parseNamedClass (p=0x0, Python Exception <class 'OverflowError'> signed integer is greater than maximum: 
s=, r=[]int32 = {...}, out=..., rest=..., err=...) at /root/.cache/xxxxx-go/src/regexp/syntax/parse.go:1602
#19 0x0000000000c75b8a in start_thread ()
#20 0x0000000000ced693 in clone ()
(gdb) f 0
#0  0x00007fffc85eebd0 in __nss_readline (fp=0x7fffd0103540, buf=0x7fffc95ebd20 "root:x:0:0:root:/root:/bin/bash\n", len=<optimized out>, poffset=0x7fffc95ebbb0) at ./nss_readline.c:68
68	      while (isspace (*p))
(gdb) p p
$1 = 0x7fffc95ebd20 "root:x:0:0:root:/root:/bin/bash\n"
(gdb) 


Backtrace at clone:
(gdb) bt
#0  0x0000000000ced650 in clone ()
#1  0x0000000000c74a4f in create_thread ()
#2  0x0000000000c7641b in pthread_create ()
#3  0x0000000000c5e4f1 in _cgo_try_pthread_create (thread=thread@entry=0x7fffca7fbfb0, attr=attr@entry=0x7fffca7fbfc0, pfn=pfn@entry=0xc5e5a0 <threadentry>, arg=arg@entry=0x189e960) at gcc_libinit.c:100
#4  0x0000000000c5e71f in _cgo_sys_thread_start (ts=0x189e960) at gcc_linux_amd64.c:75
#5  0x000000000046d461 in runtime.asmcgocall () at /root/.cache/xxxxxx-go/src/runtime/asm_amd64.s:878
#6  0x0000000001888d58 in runtime.newmHandoff ()
#7  0x0000000000000080 in ?? ()
#8  0x0000000000000000 in ?? ()
(gdb)

It seems the ctype initialized in golang/cgo is different from the ctype in the C library; the latter is not initialized.

Or the ctype that the C library initialized in a previous thread was garbage collected.

@ianlancetaylor
Contributor

Can you show us a complete standalone test case that demonstrates the problem? Thanks.

ianlancetaylor changed the title from "cgo crash after clone" to "runtime: crash in glibc in new thread" on Feb 9, 2023
gopherbot added the compiler/runtime label on Feb 9, 2023
@yangboyd
Author

yangboyd commented Feb 9, 2023

> Can you show us a complete standalone test case that demonstrates the problem? Thanks.

Uploaded test project here: https://github.com/yangboyd/golangcrashtest

Environment:

[/]#cat /etc/redhat-release
CentOS Stream release 8
[/]# uname -a
Linux 4.18.0-394.el8.x86_64 #1 SMP Tue May 31 16:19:11 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

@ianlancetaylor
Contributor

Is there a way to recreate this problem without using a modified version of the Go libraries?

@yangboyd
Author

yangboyd commented Feb 9, 2023

> Is there a way to recreate this problem without using a modified version of the Go libraries?

It's the Go net library. Does it affect the functions related to thread initialization?

@ianlancetaylor
Contributor

I'm more concerned about the fact that you are using the Tailscale Go distribution rather than the standard one.

And while I don't know why the Go net package would affect initializing threads, I also don't know why the problem would only be observed if you change the net package. Is it impossible to recreate the problem with ordinary Go code that does not modify the net package? Because if you have to modify the net package to see this problem, then it's not a bug that can affect ordinary Go users and I'm only mildly interested in what is causing it.
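
As an illustration of what "ordinary Go code" could look like here, the sketch below performs a user lookup from several freshly locked OS threads using only the standard library. It is an assumption that this exercises a comparable path: on Linux with cgo enabled, os/user resolves users through the C library, whereas the report reaches getpwuid_r through cuserid in its own C code. `lookupOnNewThreads` and `lookupResult` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
	"runtime"
	"strconv"
	"sync"
)

// lookupResult holds the outcome of one lookup on one thread.
type lookupResult struct {
	Username string
	Err      error
}

// lookupOnNewThreads calls user.LookupId from `workers` goroutines, each
// locked to an OS thread, and returns one result per goroutine.
func lookupOnNewThreads(uid string, workers int) []lookupResult {
	results := make([]lookupResult, workers)
	var locked, done sync.WaitGroup
	locked.Add(workers)
	done.Add(workers)
	for i := 0; i < workers; i++ {
		go func(i int) {
			defer done.Done()
			runtime.LockOSThread()
			defer runtime.UnlockOSThread()
			locked.Done()
			locked.Wait() // keep all threads alive at the same time

			u, err := user.LookupId(uid)
			if err != nil {
				results[i].Err = err
				return
			}
			results[i].Username = u.Username
		}(i)
	}
	done.Wait()
	return results
}

func main() {
	uid := strconv.Itoa(os.Getuid())
	for i, r := range lookupOnNewThreads(uid, 4) {
		fmt.Printf("thread %d: user=%q err=%v\n", i, r.Username, r.Err)
	}
}
```

If a program like this crashes with an unmodified Go distribution, that would be the kind of standalone reproduction the maintainers ask for; if it does not, the cause more likely lies in the modified libraries or in how the C code is linked.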

seankhliao added the WaitingForInfo label on Feb 10, 2023
@yangboyd
Author

It seems the ctype is initialized only in the dynamic library (.so file) that is first loaded by dlopen in the first thread. In later threads dlopen is not called again, so the ctype is not reinitialized, and those threads crash.

@ianlancetaylor
Contributor

That sounds like a glibc bug, not a Go bug. What version of glibc are you using?

dr2chase added the NeedsInvestigation label on Feb 10, 2023
@yangboyd
Author

[]# gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/8/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --disable-libmpx --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 8.5.0 20210514 (Red Hat 8.5.0-15) (GCC)

[]# rpm -qa |grep glibc
glibc-debuginfo-2.28-224.el8.x86_64
glibc-gconv-extra-2.28-224.el8.x86_64
glibc-common-2.28-224.el8.x86_64
glibc-headers-2.28-224.el8.x86_64
glibc-2.28-224.el8.x86_64
glibc-static-2.28-224.el8.x86_64
glibc-langpack-en-2.28-224.el8.x86_64
glibc-devel-2.28-224.el8.x86_64
glibc-debugsource-2.28-224.el8.x86_64

golang locked and limited conversation to collaborators on Feb 15, 2024

5 participants