cmd/link: ppc64x direct call too far: net(.text).malloc while building kubernetes #19425

Closed
mkumatag opened this issue Mar 6, 2017 · 19 comments
@mkumatag

mkumatag commented Mar 6, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

[root@localhost kubernetes]# go version
go version go1.8 linux/ppc64le
[root@localhost kubernetes]#

What operating system and processor architecture are you using (go env)?

[root@localhost kubernetes]# go env
GOARCH="ppc64le"
GOBIN=""
GOEXE=""
GOHOSTARCH="ppc64le"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/root/k8s_ws"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/linux_ppc64le"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build603675669=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
[root@localhost kubernetes]#

What did you do?

  1. Download and install Go 1.8 from https://storage.googleapis.com/golang/go1.8.linux-ppc64le.tar.gz
  2. Clone the latest Kubernetes code: https://github.com/kubernetes/kubernetes.git
  3. Run make.

What did you expect to see?

make should pass without any issues, but it failed to generate one of the binaries.

What did you see instead?

[root@localhost kubernetes]# make
+++ [0306 09:26:33] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [0306 09:26:33] Generating bindata:
test/e2e/generated/gobindata_util.go
~/k8s_ws/src/k8s.io/kubernetes ~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
+++ [0306 09:26:34] Building go targets for linux/ppc64le:
cmd/libs/go2idl/deepcopy-gen
+++ [0306 09:26:41] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [0306 09:26:41] Generating bindata:
test/e2e/generated/gobindata_util.go
~/k8s_ws/src/k8s.io/kubernetes ~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
+++ [0306 09:26:42] Building go targets for linux/ppc64le:
cmd/libs/go2idl/defaulter-gen
+++ [0306 09:26:49] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [0306 09:26:49] Generating bindata:
test/e2e/generated/gobindata_util.go
~/k8s_ws/src/k8s.io/kubernetes ~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
+++ [0306 09:26:50] Building go targets for linux/ppc64le:
cmd/libs/go2idl/conversion-gen
+++ [0306 09:26:57] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [0306 09:26:57] Generating bindata:
test/e2e/generated/gobindata_util.go
~/k8s_ws/src/k8s.io/kubernetes ~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
+++ [0306 09:26:58] Building go targets for linux/ppc64le:
cmd/libs/go2idl/openapi-gen
+++ [0306 09:27:06] Building the toolchain targets:
k8s.io/kubernetes/hack/cmd/teststale
k8s.io/kubernetes/vendor/github.com/jteeuwen/go-bindata/go-bindata
+++ [0306 09:27:06] Generating bindata:
test/e2e/generated/gobindata_util.go
~/k8s_ws/src/k8s.io/kubernetes ~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
~/k8s_ws/src/k8s.io/kubernetes/test/e2e/generated
+++ [0306 09:27:07] Building go targets for linux/ppc64le:
cmd/kube-proxy
cmd/kube-apiserver
cmd/kube-controller-manager
cmd/cloud-controller-manager
cmd/kubelet
cmd/kubeadm
cmd/hyperkube
vendor/k8s.io/kube-aggregator
plugin/cmd/kube-scheduler
cmd/kubectl
federation/cmd/kubefed
cmd/gendocs
cmd/genkubedocs
cmd/genman
cmd/genyaml
cmd/mungedocs
cmd/genswaggertypedocs
cmd/linkcheck
examples/k8petstore/web-server/src
federation/cmd/genfeddocs
vendor/github.com/onsi/ginkgo/ginkgo
test/e2e/e2e.test
cmd/kubemark
vendor/github.com/onsi/ginkgo/ginkgo
test/e2e_node/e2e_node.test
cmd/gke-certificates-controller

k8s.io/kubernetes/federation/cmd/genfeddocs

net(.text): direct call too far: net(.text).malloc 20b3d18
net(.text): direct call too far: net(.text).malloc 20b3cd4
net(.text): direct call too far: net(.text).__errno_location 20b3c94
net(.text): direct call too far: net(.text).getnameinfo 20b3c7c
net(.text): direct call too far: net(.text).getnameinfo 20b3bd4
net(.text): direct call too far: net(.text).__errno_location 20b3b34
net(.text): direct call too far: net(.text).getaddrinfo 20b3b40
net(.text): direct call too far: net(.text).free 20b3ae8
net(.text): direct call too far: net(.text).freeaddrinfo 20b3ac0
net(.text): direct call too far: net(.text).gai_strerror 20b3a7c
net(.text): direct call too far: net(.text).getaddrinfo 20b39a8
os/user(.text): direct call too far: os/user(.text).malloc 20b3990
os/user(.text): direct call too far: os/user(.text).malloc 20b394c
os/user(.text): direct call too far: os/user(.text).getgrouplist 20b3900
os/user(.text): direct call too far: os/user(.text).free 20b38b0
os/user(.text): direct call too far: os/user(.text).getgrgid_r 20b385c
os/user(.text): direct call too far: os/user(.text).getgrnam_r 20b37e4
os/user(.text): direct call too far: os/user(.text).getpwnam_r 20b376c
os/user(.text): direct call too far: os/user(.text).getpwuid_r 20b36f4
os/user(.text): direct call too far: os/user(.text).realloc 20b3688
os/user(.text): direct call too far: os/user(.text).sysconf 20b3624
/usr/local/go/pkg/tool/linux_ppc64le/link: too many errors
!!! [0306 09:29:01] Call tree:
!!! [0306 09:29:01] 1: /root/k8s_ws/src/k8s.io/kubernetes/hack/lib/golang.sh:740 kube::golang::build_binaries_for_platform(...)
!!! [0306 09:29:01] 2: hack/make-rules/build.sh:27 kube::golang::build_binaries(...)
!!! [0306 09:29:01] Call tree:
!!! [0306 09:29:01] 1: hack/make-rules/build.sh:27 kube::golang::build_binaries(...)
!!! [0306 09:29:01] Call tree:
!!! [0306 09:29:01] 1: hack/make-rules/build.sh:27 kube::golang::build_binaries(...)
make: *** [all] Error 1
[root@localhost kubernetes]#

@bradfitz
Contributor

bradfitz commented Mar 6, 2017

@laboger, what's the status here? I thought these were all fixed. Guess not?

bradfitz added this to the Go1.9 milestone Mar 6, 2017
@laboger
Contributor

laboger commented Mar 6, 2017

@cherrymui This problem is related to the big text problem, and I suspect it would happen on arm too unless you did something to fix it. I need help on this one because there is a lot about Go linking I don't understand.

In this case, the problem occurs because there is a call out to something in glibc, so it gets a call stub in the PLT, but the PLT is placed so far away in the binary that the bl to the call stub is out of range.

If we use -linkmode=external on this one, the error doesn't occur. Actually I didn't even realize that you could call something in glibc when using internal linking.

Any suggestions on solutions for this one? Even if the PLT was put up front there could still be code later on in the binary that is too far from the PLT.
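
To make the range limit concrete: the ppc64 bl instruction encodes a 24-bit word offset, which gives a signed 26-bit byte displacement (roughly ±32 MiB). The Go sketch below checks that constraint. fitsBL is a hypothetical helper written for illustration, not linker code, and it assumes the hex value printed in the error lines above is the branch displacement.

```go
package main

import "fmt"

// fitsBL reports whether a byte displacement can be encoded in a ppc64
// I-form branch (b/bl): a 24-bit field holding a word offset, i.e. a
// signed 26-bit, 4-byte-aligned byte displacement.
// Hypothetical helper for illustration; not part of cmd/link.
func fitsBL(disp int64) bool {
	const limit = int64(1) << 25 // 32 MiB
	return disp&3 == 0 && disp >= -limit && disp < limit
}

func main() {
	// 0x20b3d18 is the value from the first error line in the report,
	// assumed here to be the displacement to the PLT call stub.
	for _, d := range []int64{0x1fffffc, 0x2000000, 0x20b3d18} {
		fmt.Printf("%#x fits in bl: %v\n", d, fitsBL(d))
	}
}
```

Under that assumption, every value reported in the errors is just past the 0x2000000 boundary, which matches a text segment that has grown slightly beyond 32 MiB.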

bradfitz changed the title from "direct call too far: net(.text).malloc while building latest kubernetes with golang 1.8" to "cmd/link: ppc64x direct call too far: net(.text).malloc while building kubernetes" Mar 6, 2017
@randall77
Contributor

@ianlancetaylor for linker expertise.

@laboger
Contributor

laboger commented Mar 6, 2017

@bradfitz I thought they were all fixed too :(
This one is related to a call to something in the PLT which I haven't seen before.
This just started happening when Kubernetes updated to a new beta version. It didn't happen in 1.6.0-beta.0 but now fails in 1.6.0-beta.1.

@cherrymui
Member

I tried go build -ldflags=-linkmode=internal k8s.io/kubernetes/federation/cmd/genfeddocs with cgo enabled, for both GOARCH=arm and GOARCH=ppc64le. Both succeeded.
Could you let me know how to reproduce it?

@cherrymui
Member

The PLT stubs are generated before the trampoline insertion pass. So the trampoline insertion pass should be able to take care of it. Maybe a relocation case is missing?

@laboger
Contributor

laboger commented Mar 7, 2017

The problem happens at the instruction that does the bl to the PLT stub. At the point of trampoline insertion, it doesn't know where the PLT will be, does it? That isn't known until the linker links it all together, because I believe the PLT stubs are in a separate section.

This only fails with the latest Kubernetes (after v1.6.0-beta.1). Here is what I see after the build has failed:

~/kub/kublatest/kubernetes/_output/local/go$ go build k8s.io/kubernetes/federation/cmd/genfeddocs
//# k8s.io/kubernetes/federation/cmd/genfeddocs
net(.text): direct call from: net(.text) too far: net(.text).malloc 20b3e00
net(.text): direct call from: net(.text) too far: net(.text).malloc 20b3dbc
net(.text): direct call from: net(.text) too far: net(.text).__errno_location 20b3d7c
net(.text): direct call from: net(.text) too far: net(.text).getnameinfo 20b3d64
net(.text): direct call from: net(.text) too far: net(.text).getnameinfo 20b3cbc
net(.text): direct call from: net(.text) too far: net(.text).__errno_location 20b3c1c
net(.text): direct call from: net(.text) too far: net(.text).getaddrinfo 20b3c28
net(.text): direct call from: net(.text) too far: net(.text).free 20b3bd0
net(.text): direct call from: net(.text) too far: net(.text).freeaddrinfo 20b3ba8
net(.text): direct call from: net(.text) too far: net(.text).gai_strerror 20b3b64
net(.text): direct call from: net(.text) too far: net(.text).getaddrinfo 20b3a90
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).malloc 20b3a78
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).malloc 20b3a34
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).getgrouplist 20b39e8
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).free 20b3998
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).getgrgid_r 20b3944
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).getgrnam_r 20b38cc
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).getpwnam_r 20b3854
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).getpwuid_r 20b37dc
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).realloc 20b3770
os/user(.text): direct call from: os/user(.text) too far: os/user(.text).sysconf 20b370c

@laboger
Contributor

laboger commented Mar 7, 2017

OK, now I see what you were saying earlier: the trampoline code is not handling the SDYNIMPORT type correctly. I'll look into that.

@laboger
Contributor

laboger commented Mar 7, 2017

I found that the problem happens when C code calls functions in glibc: the PLT call stub is too far away and no trampoline is generated for that case. For example, this happens in net.a because C code is generated through cgo for the call to C.malloc. When the C code is processed, the function IsDirectJump() must return true before trampoline() is called, but that function only checks for Go relocation types and not C relocation types, so it returns false for C code. I think this same problem could happen on arm at some point, but since ppc64le is 64-bit, programs are generally larger, so we are hitting it first.

@laboger
Contributor

laboger commented Mar 9, 2017

Here is a simpler summary:

  1. net.a is read in from ldelf.go. This generates a symbol called net(.text) which contains various functions such as _cgo_cb4050e05860_Cfunc__Cmalloc, and some of those make calls to functions like malloc which require a call stub. These symbols are added to Textp first.
  2. All the other packages needed by the program are read in and their symbols are added to Textp.
  3. The call stubs are generated, and one must be created to be used by the function mentioned above, because it does a bl malloc. It is called net(.text).malloc. The call stubs are added after all the packages have been added, and since Kubernetes has a very large number of very large packages, the call stubs end up too far for the code in net(.text). The trampoline processing doesn't handle this because the call stub (target of the bl) hasn't had its code address assigned at the point the code for the bl is processed.

I think the solution would be to just put the call stubs closer to where they are used, and actually I think it should be OK to just put them within the range of the outer sym, i.e., as a sub of net(.text) since the way these call stubs are named, they are unique within the outer sym. net(.text).malloc can only be called within net(.text).

@cherrymui or @ianlancetaylor I need some direction how to do this. I've made a few naive attempts at getting the callstubs in a different location but I can't get it to build.
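
The layout described in the summary above can be sketched as a toy address assignment in Go. Everything here is hypothetical: the sym type, the sizes, and the base address are made up for illustration and are not taken from cmd/link or from the failing binary. The distance is measured from the start of net(.text) as a stand-in for the actual call site.

```go
package main

import "fmt"

// sym is a simplified stand-in for a linker text symbol (hypothetical).
type sym struct {
	name string
	size int64
}

// assign lays symbols out back to back starting at base, in slice order,
// and returns a map from symbol name to assigned address.
func assign(textp []sym, base int64) map[string]int64 {
	addrs := make(map[string]int64, len(textp))
	addr := base
	for _, s := range textp {
		addrs[s.name] = addr
		addr += s.size
	}
	return addrs
}

// stubDistance returns the displacement from net(.text) to its malloc stub
// for a given ordering of the text symbols.
func stubDistance(textp []sym) int64 {
	a := assign(textp, 0x10000)
	return a["net(.text).malloc"] - a["net(.text)"]
}

func main() {
	cgo := sym{"net(.text)", 0x4000}
	rest := sym{"other packages", 0x2100000} // made-up size, about 33 MiB
	stub := sym{"net(.text).malloc", 0x20}

	const limit = int64(1) << 25 // reach of a ppc64 bl
	layouts := map[string][]sym{
		"stubs last":  {cgo, rest, stub},
		"stubs first": {stub, cgo, rest},
	}
	for _, name := range []string{"stubs last", "stubs first"} {
		d := stubDistance(layouts[name])
		fmt.Printf("%s: displacement %#x, in range: %v\n", name, d, d >= -limit && d < limit)
	}
}
```

With the stubs appended after all other packages, the displacement grows with the total text size; with the stubs placed next to the code that calls them, it stays small regardless of how large the rest of the program is.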

@cherrymui
Member

@laboger thank you for looking into it.

Would it be possible to add call stubs to Textp before other packages?

If that doesn't work, maybe you can pre-compute the size of the program before assigning the addresses, and insert trampolines when assigning addresses for functions in 1. I think the size of the text is known except for trampolines inserted when assigning addresses. You could make some conservative estimate of the trampoline sizes.

I'll take a careful look over the weekend or early next week.
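
One possible reading of the pre-computation idea above, sketched in Go. This is an assumption about how such a check might look, not what the linker does: estimateTextEnd, needTrampoline, the per-call-site bound, and the 16-byte trampoline size are all hypothetical.

```go
package main

import "fmt"

const branchLimit = int64(1) << 25 // reach of a ppc64 bl, in bytes

// estimateTextEnd returns a conservative upper bound on the end of the text
// segment: all known function sizes plus one trampoline per call site.
func estimateTextEnd(base int64, funcSizes []int64, callSites int, trampSize int64) int64 {
	end := base
	for _, sz := range funcSizes {
		end += sz
	}
	return end + int64(callSites)*trampSize
}

// needTrampoline decides whether a call at callerAddr needs a trampoline.
// A negative targetAddr means the target has not been placed yet; in that
// case the decision falls back to the estimated end of text, since the
// target may land anywhere up to that address.
func needTrampoline(callerAddr, targetAddr, estTextEnd int64) bool {
	if targetAddr >= 0 {
		d := targetAddr - callerAddr
		return d < -branchLimit || d >= branchLimit
	}
	return estTextEnd-callerAddr >= branchLimit
}

func main() {
	sizes := []int64{0x4000, 0x2100000} // made-up: cgo text, then everything else
	end := estimateTextEnd(0x10000, sizes, 1000, 16)
	fmt.Printf("estimated text end: %#x\n", end)
	fmt.Println("trampoline needed for unplaced stub:", needTrampoline(0x10000, -1, end))
}
```

The trade-off in this sketch is that a conservative estimate may insert trampolines that turn out to be unnecessary, in exchange for never leaving an out-of-range branch behind.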

@laboger
Contributor

laboger commented Mar 9, 2017

The workaround is to add -ldflags '-linkmode=external' when building the non-static binaries in Kubernetes. The static binaries built by Kubernetes should not be affected by this problem and should not have this option added.

@cherrymui
Member

@laboger CL https://go-review.googlesource.com/c/38131/ puts the stubs at the beginning of Textp. Could you try that? With the stubs moved, the calls to them are no longer too far in that binary. But I think it should work even if a call is still too far.

@gopherbot

CL https://golang.org/cl/38131 mentions this issue.

@laboger
Contributor

laboger commented Mar 13, 2017

Yes I will try it.

@luxas

luxas commented Mar 19, 2017

Is it possible that https://go-review.googlesource.com/c/38131/ can be cherry-picked into the go1.8 release branch? The risk seems to be quite low, and the gain pretty high if it fixes the net and os/user problems with internal linking for ppc64le and maybe ARM.

WDYT?

@cherrymui
Member

Issue #19578 has been opened for cherry-picking.

@anguslees

I see this same issue on arm (also building kubernetes; go 1.8; works with -linkmode=external).
Is there a similar patch around for arm? Can it also be backported to 1.8?

@luxas

luxas commented Aug 26, 2017

@anguslees Please open a new issue with more details and cc me.
I really want to pin that down if there's an issue; I don't think it's a known one.

golang locked and limited conversation to collaborators Aug 26, 2018