runtime/cgo: dlopen shared library cause bus error in pthread_create with raspberry pi (Arch Linux) #58548

Closed
dong-zeyu opened this issue Feb 15, 2023 · 11 comments
Labels
compiler/runtime: Issues related to the Go compiler and/or runtime.
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
WaitingForInfo: Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@dong-zeyu

What version of Go are you using (go version)?

$ go version
go version go1.20.1 linux/arm

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm"
GOBIN=""
GOCACHE="$HOME/.cache/go-build"
GOENV="$HOME/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="$HOME/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="$HOME/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/mnt/share/go1.20.1"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/mnt/share/go1.20.1/pkg/tool/linux_arm"
GOVCS=""
GOVERSION="go1.20.1"
GCCGO="gccgo"
GOARM="6"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -marm -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3555080228=/tmp/go-build -gno-record-gcc-switches"

What did you do?

The system is a Raspberry Pi Model 3B+ running an up-to-date Arch Linux ARM (32-bit).

Create main.go

package main

func main() {

}

Create main.c

#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>

int main(int argc, char** argv)
{
    void *handle = dlopen("./libtest.so", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "Error: %s\n", dlerror());
        return EXIT_FAILURE;
    }

    dlclose(handle);
    return EXIT_SUCCESS;
}

Build libtest.so and the executable, then run the program

$ go build -buildmode=c-shared -o libtest.so main.go
$ gcc -g main.c -o main
$ ./main
Bus error (core dumped)

What did you expect to see?

The program should exit normally

What did you see instead?

The program fails with a bus error

$ gdb --args ./main
(gdb) run
Program received signal SIGBUS, Bus error.
(gdb) bt
#0  0xb6e9be08 in create_thread (pd=pd@entry=0xb6d1f3a0, attr=attr@entry=0xbefff1a8, stopped_start=0xbefff1a6, stopped_start@entry=0xbefff19e, stackaddr=stackaddr@entry=0xb651f000, 
    stacksize=<optimized out>, stacksize@entry=8388128, thread_ran=<optimized out>, thread_ran@entry=0xbefff19f) at pthread_create.c:295
#1  0xb6e9cae8 in __pthread_create_2_1 (newthread=0xb6d89414 <_rt0_arm_lib_go>, newthread@entry=0xbefff2c0, attr=0xbefff1a7, attr@entry=0x0, start_routine=0x0, 
    start_routine@entry=0xb6d89414 <_rt0_arm_lib_go>, arg=0x0, arg@entry=0xb6d8c0f8 <_cgo_try_pthread_create+68>) at pthread_create.c:828
#2  0xb6d8c0f8 in _cgo_try_pthread_create (thread=0xbefff2c0, thread@entry=0xbefff2b8, attr=attr@entry=0x0, pfn=0xb6d89414 <_rt0_arm_lib_go>, arg=0xb6d8c0f8 <_cgo_try_pthread_create+68>)
    at gcc_libinit.c:100
#3  0xb6d8c16c in x_cgo_sys_thread_create (func=<optimized out>, arg=<optimized out>) at gcc_libinit.c:27
#4  0xb6d89574 in _rt0_arm_lib () at /usr/lib/go/src/runtime/asm_arm.s:70
#5  0xb6fc83d4 in call_init (env=0xbefffb5c, argv=0xae, argc=1, l=<optimized out>) at dl-init.c:70
#6  call_init (l=<optimized out>, argc=1, argv=0xae, env=0xbefffb5c) at dl-init.c:26
#7  0xb6fc84e0 in _dl_init (main_map=0x4121a8, argc=1, argv=0xbefffb54, env=0xbefffb5c) at dl-init.c:117
#8  0xb6f6d718 in __GI__dl_catch_exception (exception=0x0, operate=0x0, args=0xbefff578) at dl-error-skeleton.c:182
#9  0xb6fcfb10 in dl_open_worker (a=0xbefff730) at dl-open.c:808
#10 0xb6f6d6b8 in __GI__dl_catch_exception (exception=0x400760, operate=0xbefff724, args=0xbefff7a8) at dl-error-skeleton.c:208
#11 0xb6fcfecc in _dl_open (file=0x400760 "./libtest.so", mode=-2147483647, caller_dlopen=0x400658 <main+48>, nsid=-2, argc=<optimized out>, argv=<optimized out>, env=<optimized out>)
    at dl-open.c:883
#12 0xb6e9762c in dlopen_doit (a=0xbefff9a4) at dlopen.c:56
#13 0xb6f6d6b8 in __GI__dl_catch_exception (exception=0xb6ff91a0, exception@entry=0xbefff948, operate=0xbefff950, args=0x0, args@entry=0xb6ffed00 <_rtld_global_ro>)
    at dl-error-skeleton.c:208
#14 0xb6f6d7a8 in __GI__dl_catch_error (objname=0xbefff97c, errstring=0xbefff980, mallocedp=0xbefff97b, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:227
#15 0xb6e96fc0 in _dlerror_run (operate=0xb6e97598 <dlopen_doit>, args=0xbefff9a4, args@entry=0xbefff99c) at dlerror.c:138
#16 0xb6e97738 in dlopen_implementation (dl_caller=<optimized out>, mode=<optimized out>, file=<optimized out>) at dlopen.c:71
#17 ___dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:81
#18 0x00400658 in main (argc=0, argv=0x0) at main.c:7
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) x/i $pc-4
   0xb6e9be04 <create_thread+228>:      vst1.8  {d16-d17}, [r2 :64]
(gdb) p/x $r2
$2 = 0xbefff114
(gdb) p/x $sp
$3 = 0xbefff0ec
$ dmesg | tail
[14017413.131684] Alignment trap: not handling instruction f4420a1f at [<b6e9be04>]
[14017413.131720] 8<--- cut here ---
[14017413.131729] Unhandled fault: alignment exception (0xa21) at 0xbefff114
[14017413.131742] pgd = ff224ad3
[14017413.131755] [befff114] *pgd=0d951003, *pmd=05314003, *pte=e00000093cbf5f
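
The two register values printed above can be checked against the 8-byte alignment that the `:64` qualifier on the `vst1.8` store requests. The following is a minimal sketch in Go; the helper name `misalignment` is hypothetical and exists only for this illustration, and the addresses are copied from the gdb session.

```go
package main

import "fmt"

// misalignment reports how many bytes addr lies past the previous
// align-byte boundary; 0 means aligned. align must be a power of two.
// (Hypothetical helper, not part of the Go runtime.)
func misalignment(addr, align uint32) uint32 {
	return addr & (align - 1)
}

func main() {
	const r2 = 0xbefff114 // store address, from `p/x $r2`
	const sp = 0xbefff0ec // stack pointer, from `p/x $sp`

	fmt.Printf("r2 %% 8 = %d\n", misalignment(r2, 8))
	fmt.Printf("sp %% 8 = %d\n", misalignment(sp, 8))
}
```

Both values come out 4 bytes past an 8-byte boundary, which is consistent with the alignment exception reported by the kernel at 0xbefff114.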

I suspect the problem is caused by a misaligned stack, so I traced the stack pointer back and found that the misalignment is introduced at

TEXT _rt0_arm_lib(SB),NOSPLIT,$104

where the function _rt0_arm_lib declares a frame size of 104. Changing it to 108 solves the issue.

I don't have much knowledge of assembly or processor architecture, so I'm not sure why the misalignment happens here or what the correct fix would be.

@gopherbot added the compiler/runtime label Feb 15, 2023
@thanm added the NeedsInvestigation label Feb 21, 2023
@tmm1
Contributor

tmm1 commented Feb 22, 2023

I'm experiencing the same issue after dlopen-ing a Go-built shared object (.so) from an ARM binary.

Program terminated with signal SIGBUS, Bus error.
#0  0xf30329a0 in ?? () from /usr/lib/libc.so.6
[Current thread is 1 (Thread 0xdc7f92c0 (LWP 5308))]

Thread 1 (Thread 0xdc7f92c0 (LWP 5308)):
#0  0xf30329a0 in ?? () from /usr/lib/libc.so.6
#1  0xf30333e4 in pthread_create () from /usr/lib/libc.so.6
#2  0x01ee2080 in ?? ()

@mknyszek
Contributor

CC @golang/arm @golang/runtime

In triage now, it looks like it's an unaligned 8 byte store, but it's not clear why this doesn't happen consistently on all Raspberry Pis. One theory is that it's this particular implementation of pthreads (which libc is this? glibc? musl?) expecting an aligned stack pointer. But it could just be a general issue on our side in failing to follow the C ABI fully.

@mknyszek added this to the Backlog milestone Feb 22, 2023
@mknyszek added the WaitingForInfo label Feb 22, 2023
@mknyszek
Contributor

@randall77 points out that we do 8-byte align the stack on arm (regardless of what the CPU actually requires, if it requires less).

@dong-zeyu
Author

I'm using the official build of Arch Linux ARM (32-bit). The glibc link is here.

@tmm1
Contributor

tmm1 commented Feb 23, 2023

FWIW, I'm experiencing this issue with go1.19. It was working fine previously with go1.17.

@tmm1
Contributor

tmm1 commented Feb 24, 2023

Can confirm that changing $104 to $108 fixes the SIGBUS

@dong-zeyu
Author

dong-zeyu commented Jul 3, 2023

This issue is not specific to Go. Closing; I will seek help from Arch Linux.

@elgatito

Can confirm that changing $104 to $108 fixes the SIGBUS

@tmm1 How did you make that change?

@nandra

nandra commented Feb 2, 2024

I hit the same issue with Go 1.19.4 + glibc 2.35. The fix is to patch the Go runtime as follows (as stated in the original issue):

From 5551b0fbcbd5d715a5ac2b142fd45cdf38e62aa7 Mon Sep 17 00:00:00 2001
From: Marek Belisko <marek.belisko@open-nandra.com>
Date: Fri, 2 Feb 2024 08:02:44 +0100
Subject: [PATCH] Fix bus error issue

See https://github.com/golang/go/issues/58548
---
 src/runtime/asm_arm.s | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/runtime/asm_arm.s b/src/runtime/asm_arm.s
index 591ef2a..f5d62ba 100644
--- a/src/runtime/asm_arm.s
+++ b/src/runtime/asm_arm.s
@@ -28,7 +28,7 @@ TEXT main(SB),NOSPLIT|NOFRAME,$0
 // c-archive) or when the shared library is loaded (for c-shared).
 // We expect argc and argv to be passed in the usual C ABI registers
 // R0 and R1.
-TEXT _rt0_arm_lib(SB),NOSPLIT,$104
+TEXT _rt0_arm_lib(SB),NOSPLIT,$108
        // Preserve callee-save registers. Raspberry Pi's dlopen(), for example,
        // actually cares that R11 is preserved.
        MOVW    R4, 12(R13)
--
2.34.1

@dong-zeyu I've tried the same code with glibc 2.35 and 2.36, and the issue was still present until the Go runtime was patched.

@cherrymui
Member

@nandra thanks. Could you send a CL? (See https://go.dev/doc/contribute for how to contribute, if you haven't seen it.) Thanks.

@cherrymui
Member

I think it may be that the original frame size was 104, but we add 4 bytes for the stack slot to save the LR, which makes the total not a multiple of 8. Changing it to 108 makes the total a multiple of 8 after adding the LR slot. (If you send a patch, a comment would be helpful. Thanks.)
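
The arithmetic behind this explanation can be sketched in a few lines of Go. This only models the reasoning in the comment (declared frame size plus a 4-byte LR slot, checked against 8-byte alignment); it is not the assembler's actual frame-layout logic, and the names `lrSlot`, `stackAlign`, and `totalFrame` are hypothetical.

```go
package main

import "fmt"

const (
	lrSlot     = 4 // bytes assumed for the saved link register (LR)
	stackAlign = 8 // stack alignment discussed in this thread for 32-bit ARM
)

// totalFrame returns the declared frame size plus the LR slot.
// (Hypothetical helper for illustration only.)
func totalFrame(declared int) int {
	return declared + lrSlot
}

func main() {
	for _, declared := range []int{104, 108} {
		total := totalFrame(declared)
		fmt.Printf("$%d: total %d, remainder mod %d = %d\n",
			declared, total, stackAlign, total%stackAlign)
	}
}
```

Under these assumptions, $104 gives a total of 108 (remainder 4), while $108 gives 112 (remainder 0), matching the observation that the patched frame size keeps the stack pointer 8-byte aligned.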

8 participants