Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: Go c-archive doesn't work with new Xcode 15 beta ld-prime linker #60694

Closed
bradfitz opened this issue Jun 8, 2023 · 21 comments
Closed
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin OS-iOS GOOS=ios
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Jun 8, 2023

We found that Go libraries (-buildmode=c-archive) linked into iOS apps crash on startup (during the dynamic loader?) when using the new Xcode 15 beta's "ld-prime" linker.

The workaround for now is to force the old linker with LDFLAGS=-ld64, so filing this issue for people to find.

/cc @agottardo (who figured out the workaround and can probably provide more details)

@bradfitz bradfitz added the OS-iOS GOOS=ios label Jun 8, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jun 8, 2023
@prattmic
Copy link
Member

prattmic commented Jun 8, 2023

cc @golang/compiler

@prattmic
Copy link
Member

prattmic commented Jun 8, 2023

cc @golang/ios

(Does this only affect iOS apps, not macOS?)

@agottardo
Copy link

agottardo commented Jun 8, 2023

A screenshot of the crash on device (this is an iPhone 13 mini running iOS 17 Developer Beta 1, build number 21a5248v):

Screenshot 2023-06-06 at 09 33 46

@prattmic yes, this appears to be iOS-only for now, at least based on the Xcode 15 release notes:

A new linker has been written to significantly speed up static linking. It’s the default for all iOS binaries and anyone using the “Mergeable Libraries” feature. The classic linker can still be explicitly requested using -ld64, and will be removed in a future release. (108915312)

I noticed ld-prime throws this warning out during build time, it is the only warning I could see:

ld: warning: '/Users/angott/Library/Developer/Xcode/DerivedData/IPN-eedotjtokdxrvhhkykhmrwietcyi/Build/Products/Debug-iphoneos/libipn-go-arm64.a[arm64]2' has malformed LC_DYSYMTAB, expected 118 undefined symbols to start at index 20591, found 135 undefined symbols starting at index 69

Attached is the crash report from iOS: IPNExtension-crash.ips.zip

I'll try to upload a minimal repro Xcode project as soon as possible.

@cherrymui
Copy link
Member

Thanks for reporting.

From the stack trace it just jumps to runtime.text, which is likely the beginning of the text segment? That is certainly not expected. Maybe it failed to resolve some relocations. The warning below may be related.

ld: warning: '/Users/angott/Library/Developer/Xcode/DerivedData/IPN-eedotjtokdxrvhhkykhmrwietcyi/Build/Products/Debug-iphoneos/libipn-go-arm64.a[arm64][2](go.o)' has malformed LC_DYSYMTAB, expected 118 undefined symbols to start at index 20591, found 135 undefined symbols starting at index 69

This looks like the C linker doesn't like our dynamic symbol table? If possible, could you share the output of otool -l libipn-go-arm64.a? Thanks.

@agottardo
Copy link

otool output attached: otool_output.txt

Will get you a minimal repro project soon!

@cherrymui
Copy link
Member

Thanks.

            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 20574
     iextdefsym 20574
     nextdefsym 17
      iundefsym 20591
      nundefsym 118

This matches expected 118 undefined symbols to start at index 20591. Not sure where the found 135 undefined symbols starting at index 69 is coming from...

@cherrymui
Copy link
Member

diff --git a/src/cmd/link/internal/ld/macho.go b/src/cmd/link/internal/ld/macho.go
index 2536cc2db8..052ce80446 100644
--- a/src/cmd/link/internal/ld/macho.go
+++ b/src/cmd/link/internal/ld/macho.go
@@ -833,9 +833,9 @@ func asmbMacho(ctxt *Link) {
                ml.data[2] = uint32(linkoff + s1 + s2 + s3 + s4 + s5) /* stroff */
                ml.data[3] = uint32(s6)                               /* strsize */
 
-               machodysymtab(ctxt, linkoff+s1+s2)
-
                if ctxt.LinkMode != LinkExternal {
+                       machodysymtab(ctxt, linkoff+s1+s2)
+
                        ml := newMachoLoad(ctxt.Arch, LC_LOAD_DYLINKER, 6)
                        ml.data[0] = 12 /* offset to string */
                        stringtouint32(ml.data[1:], "/usr/lib/dyld")

Does this patch make any difference? (This is a patch to the Go linker. You'll need to reinstall the linker after applying the patch: go install cmd/link.) I'm not sure we need to emit LC_DYSYMTAB for c-archive.

@agottardo
Copy link

Thanks for looking into this. I'll give your diff a try soon.

Minimal repro app project attached here: LDPrimeRepro.zip

It's a SwiftUI app that upon app launch tries to call the SomeFunc() Go function. Install Xcode 15.0 beta 1, then build and run it onto an iPhone (you might need to do it twice). App will crash immediately upon launch.

@agottardo
Copy link

@cherrymui applied your patch and rebuilt Go. It got rid of the warning when building the repro project, but I am still hitting the same crash:

Screenshot 2023-06-08 at 14 55 51

@cherrymui
Copy link
Member

@agottardo Thanks for sharing. I'll look into it.

Also, I downloaded Xcode 15 beta, ran it on a macOS ARM64 machine using ld-prime with Go, and found a number of issues. While some of them are minor, I got SIGILL for the c-shared and plugin mode, which may also be related. I'll look into them. Thanks.

@dr2chase dr2chase added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 9, 2023
@cherrymui
Copy link
Member

cherrymui commented Jun 10, 2023

I think I understand it now. It seems ld-prime doesn't honor the initializer we set in __mod_init_func. For building c-archive and c-shared mode, we set an initializer to initialize the Go runtime. For an example (code at the end), we have

$ objdump -s -j __mod_init_func a.a 

a.a(go.o):	file format mach-o 64-bit x86-64
Contents of section __DATA,__mod_init_func:
 ce360 80520500 00000000                    .R......

which points to

$ nm a.a | grep 55280
0000000000055280 t __rt0_amd64_darwin_lib

which is the initializer entry symbol to initialize the runtime. (I'm using a macOS AMD64 machine, but the problem looks the same.)

Linking with ld64,

$ cc c.c a.a

$ objdump -s -j __init_offsets ./a.out

./a.out:	file format mach-o 64-bit x86-64
Contents of section __TEXT,__init_offsets:
 10005ad70 20770500                              w..

$ nm ./a.out | grep 57720
0000000100057720 t __rt0_amd64_darwin_lib

We have the same initializer symbol in __init_offsets, as expected. And it runs fine.

But linking with ld-prime,

$ cc -Wl,-ld_prime c.c a.a

$ objdump -s -j __init_offsets ./a.out

./a.out:	file format mach-o 64-bit x86-64
Contents of section __TEXT,__init_offsets:
 10005ad90 c0240000                             .$..

$ nm ./a.out | grep 0024c0
00000001000024c0 t _go:buildid
00000001000024c0 t _runtime.text

We have __init_offsets points to the start of the Go text, not the entry address we set. The dynamic linker will try to run the initializer there. But there go:buildid is actually the build ID, not a function, so it will usually SIGILL or SIGBUS.

I'll see if this is an intentional breaking behavior change or a bug for ld-prime. If we have to work around it, perhaps one possibility is putting the entry symbol (or a jump instruction to the entry symbol) at the start of the text section...

==========
Reproducer code:
a.go

package main

import "C"

func init() { println("Go init"); X = 123 }

func main() {}

//export F
func F() { println(X) }

var X int

c.c

#include "a.h"

int main() { F(); return 0; }

Build with

$ go build -buildmode=c-archive a.go
$ cc -Wl,-ld_prime c.c a.a

@cherrymui
Copy link
Member

cherrymui commented Jun 10, 2023

This hack seems to make it work. Basically, for __mod_init_func, we need to use symbol-targeted relocation, instead of section+offset-targeted relocation (ld-prime just drops the offset, making it point to the start of the section...)... This seems to work for both ld64 and ld-prime.

diff --git a/src/cmd/link/internal/amd64/asm.go b/src/cmd/link/internal/amd64/asm.go
index 7082c839ee..58b29fb6ea 100644
--- a/src/cmd/link/internal/amd64/asm.go
+++ b/src/cmd/link/internal/amd64/asm.go
@@ -469,7 +469,7 @@ func machoreloc1(arch *sys.Arch, out *ld.OutBuf, ldr *loader.Loader, s loader.Sy
        rs := r.Xsym
        rt := r.Type
 
-       if ldr.SymType(rs) == sym.SHOSTOBJ || rt == objabi.R_PCREL || rt == objabi.R_GOTPCREL || rt == objabi.R_CALL {
+       if ldr.SymType(rs) == sym.SHOSTOBJ || rt == objabi.R_PCREL || rt == objabi.R_GOTPCREL || rt == objabi.R_CALL || ldr.SymType(s) == sym.SINITARR {
                if ldr.SymDynid(rs) < 0 {
                        ldr.Errorf(s, "reloc %d (%s) to non-macho symbol %s type=%d (%s)", rt, sym.RelocName(arch, rt), ldr.SymName(rs), ldr.SymType(rs), ldr.SymType(rs))
                        return false
diff --git a/src/cmd/link/internal/arm64/asm.go b/src/cmd/link/internal/arm64/asm.go
index f2a2b32232..2c3c994169 100644
--- a/src/cmd/link/internal/arm64/asm.go
+++ b/src/cmd/link/internal/arm64/asm.go
@@ -580,7 +580,7 @@ func machoreloc1(arch *sys.Arch, out *ld.OutBuf, ldr *loader.Loader, s loader.Sy
        if ldr.SymType(rs) == sym.SHOSTOBJ || rt == objabi.R_CALLARM64 ||
                rt == objabi.R_ARM64_PCREL_LDST8 || rt == objabi.R_ARM64_PCREL_LDST16 ||
                rt == objabi.R_ARM64_PCREL_LDST32 || rt == objabi.R_ARM64_PCREL_LDST64 ||
-               rt == objabi.R_ADDRARM64 || rt == objabi.R_ARM64_GOTPCREL {
+               rt == objabi.R_ADDRARM64 || rt == objabi.R_ARM64_GOTPCREL || ldr.SymType(s) == sym.SINITARR {
                if ldr.SymDynid(rs) < 0 {
                        ldr.Errorf(s, "reloc %d (%s) to non-macho symbol %s type=%d (%s)", rt, sym.RelocName(arch, rt), ldr.SymName(rs), ldr.SymType(rs), ldr.SymType(rs))
                        return false
diff --git a/src/cmd/link/internal/ld/data.go b/src/cmd/link/internal/ld/data.go
index d651e2e346..891d7c7358 100644
--- a/src/cmd/link/internal/ld/data.go
+++ b/src/cmd/link/internal/ld/data.go
@@ -368,7 +368,7 @@ func (st *relocSymState) relocsym(s loader.Sym, P []byte) {
                                                o = 0
                                        }
                                } else if target.IsDarwin() {
-                                       if ldr.SymType(rs) != sym.SHOSTOBJ {
+                                       if ldr.SymType(rs) != sym.SHOSTOBJ && ldr.SymType(s) != sym.SINITARR {
                                                o += ldr.SymValue(rs)
                                        }
                                } else if target.IsWindows() {

@agottardo
Copy link

Great! I’ll try your patch as soon as I get a chance :)

@gopherbot
Copy link

Change https://go.dev/cl/502617 mentions this issue: cmd/link: don't generate DYSYMTAB when external linking on Mach-O

@gopherbot
Copy link

Change https://go.dev/cl/502616 mentions this issue: cmd/link: use symbol-targeted relocation for initializers on Mach-O

@mknyszek mknyszek added this to the Go1.22 milestone Jun 14, 2023
@gopherbot
Copy link

Change https://go.dev/cl/503935 mentions this issue: cmd/link: always use symbol-targeted relocations on Mach-O

@cherrymui cherrymui changed the title runtime, cmd/compile: Go doesn't work with new Xcode 15 beta ld-prime linker cmd/link: Go c-archive doesn't work with new Xcode 15 beta ld-prime linker Jul 12, 2023
@cherrymui
Copy link
Member

FYI, I filed #61229 for other issues with the new Apple linker, and included CLs for workarounds there. (This issue is currently specifically c-archive mode.)

gopherbot pushed a commit that referenced this issue Jul 27, 2023
When external linking, the external linker will generate it.

Updates #60694.
For #61229.

Change-Id: I086a7628dd9baa84b46315641746fc3640473f2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/502617
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot
Copy link

Change https://go.dev/cl/513757 mentions this issue: [release-branch.go1.21] cmd/link: don't generate DYSYMTAB when external linking on Mach-O

gopherbot pushed a commit that referenced this issue Jul 27, 2023
…al linking on Mach-O

When external linking, the external linker will generate it.

Updates #60694.
For #61229.

Change-Id: I086a7628dd9baa84b46315641746fc3640473f2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/502617
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
(cherry picked from commit f55e7e1)
Reviewed-on: https://go-review.googlesource.com/c/go/+/513757
gopherbot pushed a commit that referenced this issue Jul 31, 2023
In Mach-O object files, there are two kinds of relocations:
"external" relocation, which targets a symbol, and "non-external"
relocation, which targets a section. For targeting symbols not in
the current object, we must use symbol-targeted relocations. For
targeting symbols defined in the current object, for some
relocation types, both kinds can be used. We currently use
section-targeted relocations for R_ADDR targeting locally defined
symbols.

Modern Apple toolchain seems to prefer symbol-targeted relocations.
Also, Apple's new linker, ld-prime, seems to not handle section-
targeted relocations well in some cases. So this CL switches to
always generate symbol-targeted relocations. This also simplifies
the code.

One exception is that DWARF tools seem to handle only section-
targeted relocations. So generate those in DWARF sections.

This CL supersedes CL 502616.

Fixes #60694.
For #61229.

Change-Id: I3b74df64f21114635061bcd89114392b3a2d588b
Reviewed-on: https://go-review.googlesource.com/c/go/+/503935
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jul 31, 2023
@gopherbot
Copy link

Change https://go.dev/cl/514535 mentions this issue: [release-branch.go1.21] cmd/link: use symbol-targeted relocation for initializers on Mach-O

bradfitz pushed a commit to tailscale/go that referenced this issue Aug 2, 2023
…al linking on Mach-O

When external linking, the external linker will generate it.

Updates golang#60694.
For golang#61229.

Change-Id: I086a7628dd9baa84b46315641746fc3640473f2b
Reviewed-on: https://go-review.googlesource.com/c/go/+/502617
Reviewed-by: Than McIntosh <thanm@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
(cherry picked from commit f55e7e1)
Reviewed-on: https://go-review.googlesource.com/c/go/+/513757
@albertoAround
Copy link

So actually which is the way to apply this fix? Do I have to wait for a major release or is there a way to use a specific commit where this is already fixed?

gopherbot pushed a commit that referenced this issue Aug 2, 2023
…initializers on Mach-O

Apple's new linker, ld-prime from Xcode 15 beta, when handling
initializers in __mod_init_func, drops the offset in the data,
resolving the relocation to the beginning of the section. The
latest version of ld-prime rejects non-zero addend. We need to use
symbol-targeted "external" relocations, so that it doesn't need
an addend and can be resolved correctly. This also works fine with
ld64.

Fixes #60694.
For #61229.

Change-Id: Ida2be6aa4c91bfcd142b755e2ec63aabfbbd77a6
Reviewed-on: https://go-review.googlesource.com/c/go/+/502616
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
(cherry picked from commit bad9ca8)
Reviewed-on: https://go-review.googlesource.com/c/go/+/514535
@cherrymui
Copy link
Member

@albertoAround the fix will be included in Go 1.21, which is to be released next week. You could also checkout the master branch and build the Go toolchain (you could also cherry-pick CL https://golang.org/cl/502616 but it is probably better and simpler to just use the master branch). Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. OS-Darwin OS-iOS GOOS=ios
Projects
None yet
Development

No branches or pull requests

9 participants