Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: does not support multiple TOCs in internal linking mode #21961

Closed
mkumatag opened this issue Sep 21, 2017 · 31 comments
Closed

cmd/link: does not support multiple TOCs in internal linking mode #21961

mkumatag opened this issue Sep 21, 2017 · 31 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@mkumatag
Copy link

mkumatag commented Sep 21, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

master

root@localhost:~/go# git rev-parse HEAD
eca45997dfd6cd14a59fbdea2385f6648a0dc786
root@localhost:~/go#

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

ppc64le

What did you do?

export PATH=/usr/local/go/bin/:$PATH
export GOROOT_BOOTSTRAP=go env GOROOT

root@localhost:~/go/bin# go version
go version go1.8.3 linux/ppc64le

cloned the latest code here:

root@localhost:~/go/src# pwd
/root/go/src

run make.bash
root@localhost:~/go/src# ./make.bash

Use the build go binary:

root@localhost:~/go/bin# pwd
/root/go/bin
root@localhost:~/go/bin# ./go version
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0xc88320 pc=0x112c8]

runtime stack:
invalid spdelta runtime/cgo(.text).sigfillset 0x112c0 0x112c8 0x0 -1
runtime.throw(0x44825a, 0x2a)
	/root/go/src/runtime/panic.go:606 +0x68
runtime.sigpanic()
	/root/go/src/runtime/signal_unix.go:360 +0x264
invalid spdelta runtime/cgo(.text).sigfillset 0x112c0 0x112c8 0x0 -1

goroutine 1 [running]:
runtime.systemstack_switch()
	/root/go/src/runtime/asm_ppc64x.s:209 +0x10 fp=0xc420036758 sp=0xc420036738 pc=0x64e40
runtime.main()
	/root/go/src/runtime/proc.go:128 +0x60 fp=0xc4200367c0 sp=0xc420036758 pc=0x3cf80
runtime.goexit()
	/root/go/src/runtime/asm_ppc64x.s:1353 +0x4 fp=0xc4200367c0 sp=0xc4200367c0 pc=0x67464
root@localhost:~/go/bin#

If possible, provide a recipe for reproducing the error.
A complete runnable program is good.
A link on play.golang.org is best.

What did you expect to see?

Instead, I should see a golang version

What did you see instead?

I see a SIGSEGV

@mkumatag
Copy link
Author

/cc @laboger

@mkumatag
Copy link
Author

@ceseo

@mundaym
Copy link
Member

mundaym commented Sep 21, 2017

@mkumatag which revision of Go are you using? Can you try rolling back to 101fbc2 (git checkout 101fbc2) and retrying? It looks like the ppc64le builder has been failing since 8e5ac83 (cmd/go: stop linking cgo objects together with ld -r).

@mkumatag
Copy link
Author

@mundaym I'm using latest one

root@localhost:~/go# git rev-parse HEAD
eca45997dfd6cd14a59fbdea2385f6648a0dc786
root@localhost:~/go#

@mkumatag
Copy link
Author

I rolled back to 101fbc2 and things are fine,

root@localhost:~/go/bin# ./go version
go version devel +101fbc2 Wed Sep 20 20:27:13 2017 +0000 linux/ppc64le
root@localhost:~/go/bin#

@mundaym @ianlancetaylor not really sure what is wrong with the latest version!

@mundaym mundaym added this to the Go1.10 milestone Sep 21, 2017
@mundaym mundaym added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 21, 2017
@laboger
Copy link
Contributor

laboger commented Sep 21, 2017

Yikes I haven't been watching the build board lately and I see we have been randomly failing in runtime since e7e4a4f and this latest failure of SEGV was introduced with @ianlancetaylor 's change 8e5ac83.

segv happens here:

0x112c0 <runtime/cgo(.text).sigfillset>: std r2,24(r1)
0x112c4 <runtime/cgo(.text).sigfillset+4>: addis r12,r2,100
=> 0x112c8 <runtime/cgo(.text).sigfillset+8>: ld r12,192(r12)
0x112cc <runtime/cgo(.text).sigfillset+12>: mtctr r12
0x112d0 <runtime/cgo(.text).sigfillset+16>: bctr

In golang prior to the change to omit the ld -r where it works, this code looks like this:

=> 0x112c0 <runtime/cgo(.text).sigfillset>: std r2,24(r1)
0x112c4 <runtime/cgo(.text).sigfillset+4>: addis r12,r2,-1
0x112c8 <runtime/cgo(.text).sigfillset+8>: ld r12,32368(r12)
0x112cc <runtime/cgo(.text).sigfillset+12>: mtctr r12
0x112d0 <runtime/cgo(.text).sigfillset+16>: bctr

So my theory is that the TOCs from the individual .o files should be getting merged into a single TOC at some point, but if the external linker isn't involved in the final link, how's it going to happen except by the ld -r? My understanding is that the go tool is linked with the internal linker and not the external linker, so in that case the ld -r is still useful for ppc64le.

This brings up a different concern however, why this wasn't caught in the buildbots?

@mundaym
Copy link
Member

mundaym commented Sep 21, 2017

You mean the trybots? They don't run on ppc64x, they only check that the compilation works.

@laboger
Copy link
Contributor

laboger commented Sep 21, 2017

FYI... this doesn't affect ppc64 because 'go' is built statically there.

@laboger
Copy link
Contributor

laboger commented Sep 22, 2017

@mundaym This breaks the 'go' tool so I was somewhat thinking that it should fail during the build. But I see now that go_bootstrap is built as static but the final go tool is built as a dynamic binary containing cgo using the internal linker which is the case that is broken by the removal of ld -r.

@mkumatag
Copy link
Author

Any updates? builds are still failing with the latest code. @ianlancetaylor any comments?

@laboger
Copy link
Contributor

laboger commented Sep 25, 2017

@ianlancetaylor I don't know what your thoughts are on this. This regression makes the go tool unusable on the golang build so we would like to get this fixed.

I can create a fix to back out your change for ppc64le, i.e., continue to use the ld -r on ppc64le but not on other platforms.

I believe the other option is to fix the internal linker to correctly handle merging of the TOCs, but that might take me a while because I don't know the internal linker well enough.

@ianlancetaylor
Copy link
Contributor

What happens if we simply add sys.PPC64 to this list in cmd/link/internal/ld/config.go?

	if iscgo && SysArch.InFamily(sys.ARM64, sys.MIPS64, sys.MIPS) {
		return true, objabi.GOARCH + " does not support internal cgo"
	}

That seems to me like the right temporary fix unless and until somebody fixes the internal linker. The downside is that we won't generate fully statically linked binaries for pure Go programs on ppc64le. Is that acceptable?

@laboger
Copy link
Contributor

laboger commented Sep 25, 2017

Isn't this going to force external linking only when iscgo is true? So pure Go programs on ppc64le should still be truly statically linked with this change.

I just tried it, and that's what happened with a simple hello.go without cgo, which is pure Go AFAIK?

I think this is the correct temporary fix, but this means the 'go' tool is linked with the external linker and not the internal linker. In both cases it is a dynamic binary. I'm not sure what the downside might be for using a different linker.

@ianlancetaylor
Copy link
Contributor

Sorry, what I wrote was a little confusing. I think this will change the behavior of any pure Go program that imports the net or os/user packages. I think that currently they will use internal linking, and with my suggestion they will use external linking. You're right that they will be dynamically linked either way.

I think the only significant difference is that with this change a pure Go program (that imports the net or os/user package) will only be buildable if the C linker is installed on the system where the Go program is being built. Without this change such a program would be buildable even if the C linker was not installed at all. On GNU/Linux I doubt that is worth worrying about (on Windows, on the other hand, it would be a problem).

Does the change actually fix the problem?

@laboger
Copy link
Contributor

laboger commented Sep 26, 2017

Yes this fixes the problem, sorry I wasn't explicit on that.

I don't have a good feel for how many programs might be affected or who would care. Obviously the building of the 'go' tool requires the 'ld -r' so it must have the C linker on the system. I think we fix it and see who complains.

@ianlancetaylor
Copy link
Contributor

You're right that clearly invoking ld -r means that you have the C linker. I must be misremembering something about this.

@ianlancetaylor ianlancetaylor changed the title SIGSEGV while running go on ppc64le cmd/link: does not support multiple TOCs in internal linking mode Sep 26, 2017
@gopherbot
Copy link

Change https://golang.org/cl/66270 mentions this issue: cmd/link: don't use internal linking mode for cgo on PPC64

gopherbot pushed a commit that referenced this issue Sep 26, 2017
The internal linker doesn't know how to handle multiple TOC sections
in internal linking mode. This used to work because before CL 64793 we
invoked ld -r on multiple objects, and that merged the TOC sections
for us.

Updates #21961

Change-Id: I48260a7195be660016f2f358ebc8cb79652210ab
Reviewed-on: https://go-review.googlesource.com/66270
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@mwhudson
Copy link
Contributor

This is related to #15409 is it not?

From my POV I'd be fine with deleting all the code related to internal linking support for cgo but that might not be terribly friendly.

@ianlancetaylor
Copy link
Contributor

Yes, it seems likely that fixing this would fix #15409.

Internal linking is quite important on Windows but I agree that it is less interesting on GNU/Linux systems. Still, in general, it is a goal that people can install a Go distribution and, as long as they don't use cgo, use it with no C compiler or linker installed. With the current standard library that means that the internal linker must support at least the code generated for the net and os/user packages.

@laboger
Copy link
Contributor

laboger commented Jun 14, 2018

The recent breakage in the ppc64x builds and the discussion about cross builds made me curious, so I did a cross build of golang for ppc64le and found that the go binary created is statically (and therefore internally) linked. I moved the cross built golang over to a ppc64le machine and the tests all passed. So I think that means an installation of a Go distribution for ppc64le from a cross build on an x86 could be installed with no C compiler or linker.

At this point I don't understand why the go binary is built differently when doing a cross. It seems that it should be built the same way whether cross or natively built. It should still have multiple TOCs in both cases?

@ianlancetaylor
Copy link
Contributor

By default when cross-compiling we assume that we don't have a C cross-compiler available, so we build without cgo support. You should be able to get the same result cross and native, assuming you have a C cross-compiler, if you set CGO_ENABLED=1 in the environment.

@laboger
Copy link
Contributor

laboger commented Aug 13, 2020

@cherrymui @jeremyfaller Do you have a guess on how much work it would be to get internal linking to work with cgo on ppc64le? Maybe some relocation types are missing and then this TOC merging would have to be done.

@cherrymui
Copy link
Member

I'm not familiar with the ABI to tell. I would guess as long as we actually understand the ABI, it won't take much work. Maybe keep track the multiple TOCs, and group them together (?) according to the ABI?

@laboger
Copy link
Contributor

laboger commented Aug 17, 2020

Actually I'm not sure if the title truly explains the problem here. It's been a while so I'm looking back through to try and understand what's really wrong. I know the problem happened because there were two relocatable files to be linked together and they both had references which needed plt generation. They previously worked because 'ld -r' was done for all those files prior to the final link. When the 'ld-r' was removed this quit working and the fix was to disable internal linking with cgo on ppc64le.

Is this the ABI you mean? https://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html

@laboger
Copy link
Contributor

laboger commented Aug 17, 2020

I tried to enable internal linking with cgo on master in cmd/link/internal/ld/config.go and got this error during the build:

 ./make.bash
Building Go cmd/dist using /home/boger/golang/go-linux-ppc64le-bootstrap. (devel +91f997b Wed Jan 6 06:08:34 2016 +0000 linux/ppc64le)
Building Go toolchain1 using /home/boger/golang/go-linux-ppc64le-bootstrap.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for linux/ppc64le.
# cmd/trace
.plt: initialize bounds (16 < 24)
panic: unexpected empty container symbol

goroutine 1 [running]:
cmd/link/internal/loader.(*Loader).AddInteriorSym(0xc000912000, 0x22a5e, 0x22a68)
	/home/boger/golang/link/go/src/cmd/link/internal/loader/loader.go:1678 +0x2d0
cmd/link/internal/ld.(*dodataState).allocateDataSections(0xc000012d80, 0xc000276000)
	/home/boger/golang/link/go/src/cmd/link/internal/ld/data.go:1661 +0x3874
cmd/link/internal/ld.(*Link).dodata(0xc000276000, 0xc0034b6000, 0x23a38, 0x23a38)
	/home/boger/golang/link/go/src/cmd/link/internal/ld/data.go:1503 +0xa20
cmd/link/internal/ld.Main(0x4386a0, 0x10, 0x20, 0x1, 0x1, 0x41, 0x0, 0x0, 0x2a58dc, 0x10, ...)
	/home/boger/golang/link/go/src/cmd/link/internal/ld/main.go:301 +0x1078
main.main()
	/home/boger/golang/link/go/src/cmd/link/main.go:68 +0x228

I added print statements before the panic and the empty container was the .got.

@laboger
Copy link
Contributor

laboger commented Sep 9, 2020

@cherrymui I think I found out why the "initialize bounds" error message is happening in .plt.

In cmd/link/internal/ppc64/asm.go function addpltsym, when an entry is added to the plt using plt.Grow, that doesn't update the size. If I add a call to plt.SetSize to increase the plt size then I can get past this error.

By the way, is there a tool like readelf to see what relocations are in the runtime/cgo(.text) and runtime/cgo(.toc) sections? It looks like the problem is that each input file has a runtime/cgo with (.text) and (.toc) but then at the final link, the relocations assume a single TOC. So some of the r2 addresses and relocated offsets are wrong because of this.

@cherrymui
Copy link
Member

If I add a call to plt.SetSize to increase the plt size then I can get past this error.

That makes sense.

is there a tool like readelf to see what relocations are in the runtime/cgo(.text) and runtime/cgo(.toc) sections?

Sorry I'm not sure I understand what you mean. Why readelf itself doesn't do?

@laboger
Copy link
Contributor

laboger commented Sep 10, 2020

I was able to get readelf to work, I was trying it on the wrong type of file.

I'm trying to understand how the SELFGOT data should be processed because that's the problem I'm trying to resolve next.
In cmd/link/internal/loadelf/ldelf.go Load function is trying to load files and puts entries into the SELFGOT data. I added some prints here and see a few lines like this (the last number on the line is the localSymVersion).

### Reading file: /home/boger/golang/link/go/pkg/linux_ppc64le/runtime/cgo.a(_x004.o) ###
### Setting SELFGOT section: .toc for runtime/cgo(.toc) 250 ###
### Reading file: /home/boger/golang/link/go/pkg/linux_ppc64le/runtime/cgo.a(_x005.o) ###
### Setting SELFGOT section: .toc for runtime/cgo(.toc) 251 ###
### Reading file: /home/boger/golang/link/go/pkg/linux_ppc64le/runtime/cgo.a(_x009.o) ###
### Setting SELFGOT section: .toc for runtime/cgo(.toc) 255 ###

And then in cmd/link/internal/ld/data.go function allocateDataSections it tries to process the SELFGOT entries and there are a few problems with that code:

  • The .got is an entry in the SELFGOT but that has size 0 which causes a failure in the code below it. I added code to skip entries for the .got.
  • There is an expectation that runtime/cgo(.toc) is a symbol to be used as an interior symbol but that doesn't exist anywhere. I get an error saying that the interior symbol doesn't exist in the symtab. All I could find are places where the TOC symbols are created. Should these runtime/cgo(.toc) symbols be created? If not what should the interior symbol be for these.
  • There is an array called DotTOC in ctxt that has been filled in with .TOC. symbols for each symbol version but I don't see where or how those are to be used.
  • Once the program is created there is only one TOC symbol and all the relocations are based on a single TOC. But the code to load r2 and within the callstubs (plus maybe others?) are assuming the runtime/cgo(.toc) for the "version" it was generated. Maybe this is where the DotTOC values should be used but I don't see how they should be.

I don't know if the code in allocateDataSections for processing the SELFGOT for PPC64 is currently used for AIX or if it ever executed at all.

@gopherbot
Copy link

Change https://golang.org/cl/261837 mentions this issue: cmd/link: update plt size appropriately on ppc64

gopherbot pushed a commit that referenced this issue Oct 13, 2020
When attempting to enable internal linking with cgo on ppc64
it was discovered that the plt size was not being updated
after adding entries to it, which resulted in this error:

  .plt: initialize bounds (16 < 24)

This changes fixes that problem.

Updates #21961

Change-Id: Ie17539c329f5a4802e5defd93852dcdde19ded8c
Reviewed-on: https://go-review.googlesource.com/c/go/+/261837
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
@gopherbot
Copy link

Change https://golang.org/cl/304458 mentions this issue: cmd/link: rework .TOC. handling for ppc64le

@gopherbot
Copy link

Change https://golang.org/cl/304459 mentions this issue: cmd/link: enable internal linker in more cases for ppc64le

gopherbot pushed a commit that referenced this issue Sep 8, 2021
Setup .TOC. to point to the same place for all objects. Today, the linker
assumes all call relocations can use the local function entry point of
imported object files. This requires a consistent pointer across all
objects.

This intentionally computes the .TOC. pointer in all linking configurations.
In some cases the .TOC. is not used today (e.g linking position-dependent go
only code). It is harmless and simple to compute in all cases, so just
do it for easier maintenance.

Notably, .TOC. is used in some cases when static linking is requested on
ppc64le/linux:

* Position-independent C code using a PC-rel relocation against .TOC.. cgo
  generated C object files are usually compiled PIC even if the go binary
  itself is not.

* Anything which causes PLT stub generation. The stubs always generate
  a .TOC. relative relocation.

* The race detector. Today, this links in an externally compiled archive which
  contains position-independent object files.

Similarly, position-independent linking is always punted to the external
linker on ppc64 today.

Updates #21961
Fixes #15409

Change-Id: Ifd8294b9249e16ba8b92eaf876d15d162f9c61fd
Reviewed-on: https://go-review.googlesource.com/c/go/+/304458
Reviewed-by: Cherry Mui <cherryyz@google.com>
Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
@golang golang locked and limited conversation to collaborators Sep 9, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants