cmd/go: Segfault on ppc64le during Go 1.18 build on Alpine Linux #51787

Closed
nmeum opened this issue Mar 18, 2022 · 17 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@nmeum

nmeum commented Mar 18, 2022

What version of Go are you using (go version)?

$ go version
go version go1.18 linux/ppc64le

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

Since the Go build itself fails, I cannot provide this information. I am using Alpine Linux Edge (which uses musl libc) on ppc64le. I added the go env output for Go 1.17 (which compiles fine on ppc64le Alpine Linux Edge) below.

go env Output
$ go env
GO111MODULE=""
GOARCH="ppc64le"
GOBIN=""
GOCACHE="/home/buildozer/.cache/go-build"
GOENV="/home/buildozer/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="ppc64le"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/buildozer/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/buildozer/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/buildozer/aports/community/go/src/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/buildozer/aports/community/go/src/go/pkg/tool/linux_ppc64le"
GOVCS=""
GOVERSION="go1.17.8"
GCCGO="gccgo"
GOPPC64="power8"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2007078828=/tmp/go-build -gno-record-gcc-switches

What did you do?

I am working on upgrading the Alpine Linux Go package from 1.17.8 to 1.18. While doing so, I noticed that 1.18 fails to compile on our ppc64le CI. On all other architectures supported by Alpine (aarch64, armhf, armv7, s390x, x86 and x86_64) Go 1.18 builds fine and passes all tests. I would also like to point out that Go 1.17.8 compiles fine and passes tests on ppc64le Alpine Linux.

I assume the Go 1.18 build has been tested on a glibc-based Linux system before? For this reason, I suspect that this might be related to musl libc. To reproduce this issue, it should be sufficient to run ./make.bash on a ppc64le-based Alpine Linux Edge system (see our build recipe for details).

What did you expect to see?

A successful Go build.

What did you see instead?

A segfault while compiling cmd/go:

…
cmd/go/internal/generate
cmd/go/internal/get
cmd/go/internal/run
cmd/go/internal/list
cmd/go/internal/modget
cmd/go/internal/test
cmd/go/internal/vet
cmd/go/internal/bug
cmd/go
go tool dist: FAILED: /builds/alpine/aports/community/go/src/go/bin/go list -gcflags=all= -ldflags=all= -f={{if .Stale}}	STALE {{.ImportPath}}: {{.StaleReason}}{{end}} std cmd: signal: segmentation fault (core dumped)

Full build log with ./make.bash -v from our Alpine ppc64le CI: alpine-linux-edge-ppc64le-go-1.18.txt

Since Go 1.17.8 builds fine, I think this is a regression introduced in Go 1.18. Any ideas what might be causing this?

@ALTree
Member

ALTree commented Mar 18, 2022

One thing that changed for ppc64le in 1.18 is that it now uses the new register ABI. You could disable it by setting a GOEXPERIMENT=noregabi env variable and see if it stops crashing. (I'm shooting in the dark).

@nmeum
Author

nmeum commented Mar 18, 2022

One thing that changed for ppc64le in 1.18 is that it now uses the new register abi. You could disable it by setting a GOEXPERIMENT=noregabi env variable and see if it stops crashing. (I'm shooting in the dark).

Thanks for your suggestion, unfortunately it still segfaults with that environment variable set.

@ALTree
Member

ALTree commented Mar 18, 2022

Thanks for trying.

AFAIK we don't have an Alpine or a musl builder [1], so some breakage seems inevitable. Some would say that any platform we don't have a builder for is not officially supported.

Footnotes

  1. I think there used to be one, but it was discontinued due to constant failures: x/build: get Alpine builders passing #19938

@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 18, 2022
@nmeum
Author

nmeum commented Mar 18, 2022

Apart from the cgo tests (#39857) we didn't have any major issues with Go and its test suite in the past years. I think that a Go builder for Alpine/musl would be nice, let me know if I can be of assistance in that regard.

@cherrymui
Member

cc @laboger

@laboger
Contributor

laboger commented Mar 18, 2022

@pmur and I are looking at this. This could be related to #51790.

@pmur
Contributor

pmur commented Mar 18, 2022

When the musl linker starts the Go binary, it seems to be passing argc/argv/env through the stack, and the registers r3/r4/r5 do not contain these pointers as Go expects, nor is r4 0.
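
The failure mode described in this comment can be modelled in a few lines of Go. This is an illustration only, not the runtime's entry code (which is written in assembly). It assumes the entry code tells the two conventions apart by testing whether r4 is zero, as the comment implies. The names `entryState`, `usesRegisters`, and `argc`, and all register values, are hypothetical.

```go
package main

import "fmt"

// entryState is a hypothetical snapshot of what the program entry point sees.
type entryState struct {
	r3, r4    uint64 // register contents at entry
	stackArgc uint64 // argc as stored at the top of the stack
}

// usesRegisters models the ASSUMED check: a non-zero r4 is taken to mean
// "the loader passed argc/argv in registers".
func usesRegisters(s entryState) bool {
	return s.r4 != 0
}

// argc returns the argument count the modelled entry code would pick up.
func argc(s entryState) uint64 {
	if usesRegisters(s) {
		return s.r3
	}
	return s.stackArgc
}

func main() {
	// Loader that passes arguments in registers (the glibc-like case).
	inRegisters := entryState{r3: 2, r4: 0x7fff0000}
	// Arguments on the stack and r4 is zero, so the check works.
	onStackCleanR4 := entryState{stackArgc: 2}
	// Arguments on the stack, but r3/r4 hold leftover values (the musl case).
	onStackDirtyR4 := entryState{r3: 0xdeadbeef, r4: 0x1234, stackArgc: 2}

	fmt.Println(argc(inRegisters))    // 2
	fmt.Println(argc(onStackCleanR4)) // 2
	fmt.Println(argc(onStackDirtyR4)) // 3735928559: garbage read as argc
}
```

In the last case the model trusts a stale r3, which is the kind of misread that would plausibly lead to a crash early in startup.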

@pmur
Contributor

pmur commented Mar 18, 2022

cmd/go is now linked with the internal linker on 1.18. I suspect musl needs its startup code linked when using cgo with internal linking. I am not entirely sure how glibc gets away without it.

@pmur
Contributor

pmur commented Mar 18, 2022

On ppc64le, musl's loader calls into the program differently from glibc's. Explicitly requesting external linking mode will avoid this. Is there an option to do this without forcing external linking of pure Go binaries?

@cherrymui
Member

Users could pass -ldflags=-linkmode=external. If you want to change the default, you could do something like https://cs.opensource.google/go/go/+/master:src/cmd/link/internal/ld/config.go;l=215 , adding little-endian PPC64 there as well. However, it would be nice if we could support the musl startup code. Could we support both at the same time? I guess it is fine to support redundant arguments.

algitbot pushed a commit to alpinelinux/aports that referenced this issue Mar 19, 2022
This is a workaround for an upstream Go issue. Some tests fail as they
also try to use internal link mode. Since this is (hopefully) a
temporary workaround, just disable tests on ppc64le for now.

See golang/go#51787 (comment)
@nmeum
Author

nmeum commented Mar 19, 2022

I can confirm that forcing external link mode via config.go "fixes" the encountered segfault on Alpine Linux Edge ppc64le. I have employed this as a (hopefully temporary) workaround in our Alpine testing repository.

@gopherbot

Change https://go.dev/cl/394654 mentions this issue: runtime: make static/dynamic detection work with musl on ppc64le

@pmur
Contributor

pmur commented Mar 22, 2022

@gopherbot please consider this for backport to 1.18, it is a severe bug for alpine.

@gopherbot

Backport issue(s) opened: #51874 (for 1.18).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@nmeum
Author

nmeum commented Apr 13, 2022

With Go 1.18.1, the Go build succeeds and the tests (with the exception of the cgo tests, due to #39857) pass on ppc64le musl. However, we are seeing a lot of build/linking failures of Go packages on ppc64le. They often look as follows (and don't happen on other architectures):

github.com/xanzy/go-cloudstack/cloudstack.(*UsageService).AddTrafficType: unexpected trampoline for shared or dynamic linking
github.com/xanzy/go-cloudstack/cloudstack.(*UsageService).DeleteTrafficType: unexpected trampoline for shared or dynamic linking
github.com/xanzy/go-cloudstack/cloudstack.(*UsageService).GetTrafficTypeID: unexpected trampoline for shared or dynamic linking

or:

runtime/cgo(.text): relocation target _restgpr0_30 not defined
runtime/cgo(.text): relocation target _restgpr0_31 not defined
runtime/cgo(.text): relocation target _savegpr0_23 not defined
runtime/cgo(.text): relocation target _restgpr0_23 not defined
runtime/cgo(.text): relocation target _restgpr0_30 not defined

Example build logs:

Do you want me to open a separate issue for this or can this issue be reopened?

@pmur
Contributor

pmur commented Apr 13, 2022

I think this is a separate issue. I would prefer a separate issue. Thanks.

@cherrymui
Member

Yeah, it is probably better as a separate issue. @nmeum when you open a new issue, could you include the command you used to run "go build"? Thanks.
