cmd/link: the value stored in pcHeader in runtime.firstmoduledata is different from runtime.pclntab when using external links on macOS arm64 cgo #69428

Closed
Zxilly opened this issue Sep 12, 2024 · 12 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@Zxilly
Member

Zxilly commented Sep 12, 2024

Go version

go version devel go1.24-2a10a5351b Wed Aug 14 12:32:08 2024 +0800 windows/amd64

Output of go env in your module/workspace:

set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\zxilly\AppData\Local\go-build
set GOENV=C:\Users\zxilly\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\zxilly\go\pkg\mod
set GONOPROXY=1
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\zxilly\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=E:/Temp/go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLCHAIN=auto
set GOTOOLDIR=E:\Temp\go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=devel go1.24-2a10a5351b Wed Aug 14 12:32:08 2024 +0800
set GODEBUG=
set GOTELEMETRY=local
set GOTELEMETRYDIR=C:\Users\zxilly\AppData\Roaming\go\telemetry
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=C:\Users\zxilly\AppData\Local\Temp\go-build119807883=/tmp/go-build -gno-record-gcc-switches

What did you do?

I'm trying to extract the moduledata.pcHeader value from binaries, and my integration tests have shown this error since the Go 1.21 release.

Prior to go 1.21, the runtime.text symbol would still be present in the binary even if the -s flag was passed, masking the problem.

What did you see happen?

The value stored in pcHeader in runtime.firstmoduledata differs from the address of runtime.pclntab when using external linking on macOS arm64 with cgo.

Can be reproduced with the following code:

package main

import (
	"debug/macho"
	"encoding/binary"
	"fmt"
)

func main() {
	f, err := macho.Open("bin-darwin-1.23-arm64-cgo")
	if err != nil {
		panic(err)
	}

	pclntabAddr := uint64(0)
	moduledataAddr := uint64(0)

	for _, s := range f.Symtab.Syms {
		if s.Name == "runtime.firstmoduledata" {
			moduledataAddr = s.Value
		}
		if s.Name == "runtime.pclntab" {
			pclntabAddr = s.Value
		}
	}

	if moduledataAddr == 0 {
		panic("runtime.firstmoduledata not found")
	}
	if pclntabAddr == 0 {
		panic("runtime.pclntab not found")
	}

	// read the first 8 bytes of runtime.firstmoduledata, i.e. the stored pcHeader field
	data := make([]byte, 8)
	for _, prog := range f.Sections {
		if prog.Addr <= moduledataAddr && moduledataAddr+8 <= prog.Addr+prog.Size {
			if _, err := prog.ReadAt(data, int64(moduledataAddr-prog.Addr)); err != nil {
				panic(err)
			}
			break
		}
	}

	// encode pclntabAddr as a little-endian byte array; this is the value
	// pcHeader is expected to hold
	pclntabAddrBinary := make([]byte, 8)
	binary.LittleEndian.PutUint64(pclntabAddrBinary, pclntabAddr)

	fmt.Println("runtime.firstmoduledata.pcHeader ptr:")
	for _, b := range pclntabAddrBinary {
		fmt.Printf("%02x ", b)
	}
	fmt.Println()
	// print the 8 bytes read from runtime.firstmoduledata (the pcHeader value
	// actually stored in the file) as hex
	fmt.Println("runtime.pclntab:")
	for _, b := range data {
		fmt.Printf("%02x ", b)
	}
	fmt.Println()
}

This results in:

runtime.firstmoduledata.pcHeader ptr:
40 26 2a 00 01 00 00 00 
runtime.pclntab:
40 26 2a 00 00 00 10 00 

A script shows a similar problem on the most recent major Go releases:

// format
// Go minor version
// pclntab addr value
// generated search pattern (pclntab addr, little endian)
// real value in the binary

19
pclntabAddr 4295817152
c0 f7 0c 00 01 00 00 00 
c0 f7 0c 00 00 00 10 00

20
pclntabAddr 4295805984
20 cc 0c 00 01 00 00 00 
20 cc 0c 00 00 00 10 00

21
pclntabAddr 4298061728
a0 37 2f 00 01 00 00 00 
a0 37 2f 00 00 00 10 00

22
mdAddr 4298416864 mdSect __noptrdata off 6112
pclntabAddr 4297689728
80 8a 29 00 01 00 00 00 
80 8a 29 00 00 00 10 00

23
pclntabAddr 4297729600
40 26 2a 00 01 00 00 00 
40 26 2a 00 00 00 10 00

This only happens on macOS arm64 with external linking.

Here are some binaries to use as input:

https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.22-arm64-strip-cgo
https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.22-arm64-strip-pie-cgo
https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.23-arm64-strip-cgo
https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.23-arm64-strip-pie-cgo

What did you expect to see?

These values should match, as they do on other architectures.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Sep 12, 2024
@Zxilly
Member Author

Zxilly commented Sep 12, 2024

This is an advisory issue rather than a bug, as it doesn't seem to affect the operation of the Go binary.

@ianlancetaylor
Member

CC @cherrymui @golang/runtime

@timothy-king timothy-king added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 12, 2024
@cherrymui
Member

I think the moduledata references pcheader with a dynamic relocation (or the equivalent on Mach-O, a "bind" or "rebase" entry). The data you read directly from moduledata is without the relocation applied. You'll need to decode and apply the relocation.

It is possible that the Go linker and the C linker (or even different versions of the C linker) use different relocation mechanisms, and the pre-relocated data may or may not be meaningful. They are semantically equivalent, so at program run time it always points to the right data.

I'm not sure if there is anything we can do.

@Zxilly
Member Author

Zxilly commented Sep 12, 2024

If that's the reason, I wonder why switching buildmode between exe and pie doesn't change this behaviour. I looked at some other issues. Is it because Apple doesn't support generating non-pie binaries anymore either?
Also, this behaviour is not observed on amd64.

@cherrymui
Member

is it because Apple doesn't support generating non-pie binaries anymore either?

That is correct.

Again, the un-relocated data varies depending on architecture, link mode, linker version, etc. It may happen to be useful, or not.

@Zxilly
Member Author

Zxilly commented Sep 13, 2024

I investigated the issue further, and it doesn't seem to be a relocation issue. Please check the attached images: the sections containing moduledata and pclntab both have an Nreloc of 0.

[Two attached images: section headers for moduledata and pclntab showing Nreloc 0]

This can also be verified with llvm-otool:

bin-darwin-1.23-arm64-pie-cgo:
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
 0xfeedfacf 16777228          0  0x00           2    21       3440 0x00200085
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __PAGEZERO
   vmaddr 0x0000000000000000
   vmsize 0x0000000100000000
  fileoff 0
 filesize 0
  maxprot 0x00000000
 initprot 0x00000000
   nsects 0
    flags 0x0
Load command 1
      cmd LC_SEGMENT_64
  cmdsize 632
  segname __TEXT
   vmaddr 0x0000000100000000
   vmsize 0x0000000000268000
  fileoff 0
 filesize 2523136
  maxprot 0x00000005
 initprot 0x00000005
   nsects 7
    flags 0x0
Section
  sectname __text
   segname __TEXT
      addr 0x0000000100002080
      size 0x0000000000239340
    offset 8320
     align 2^4 (16)
    reloff 0
    nreloc 0
     flags 0x80000400
 reserved1 0
 reserved2 0
Section
  sectname __stubs
   segname __TEXT
      addr 0x000000010023b3c0
      size 0x0000000000000594
    offset 2339776
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x80000408
 reserved1 0 (index into indirect symbol table)
 reserved2 12 (size of stubs)
Section
  sectname __rodata
   segname __TEXT
      addr 0x000000010023b960
      size 0x000000000001c28c
    offset 2341216
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __const
   segname __TEXT
      addr 0x0000000100257bf0
      size 0x0000000000006bb0
    offset 2456560
     align 2^4 (16)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __cstring
   segname __TEXT
      addr 0x000000010025e7a0
      size 0x0000000000008129
    offset 2484128
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000002
 reserved1 0
 reserved2 0
Section
  sectname __unwind_info
   segname __TEXT
      addr 0x00000001002668cc
      size 0x0000000000001670
    offset 2517196
     align 2^2 (4)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __eh_frame
   segname __TEXT
      addr 0x0000000100267f40
      size 0x00000000000000b8
    offset 2522944
     align 2^3 (8)
    reloff 0
    nreloc 0
     flags 0x6800000b
 reserved1 0
 reserved2 0
Load command 2
      cmd LC_SEGMENT_64
  cmdsize 552
  segname __DATA_CONST
   vmaddr 0x0000000100268000
   vmsize 0x00000000000e8000
  fileoff 2523136
 filesize 950272
  maxprot 0x00000003
 initprot 0x00000003
   nsects 6
    flags 0x10
Section
  sectname __got
   segname __DATA_CONST
      addr 0x0000000100268000
      size 0x00000000000003c8
    offset 2523136
     align 2^3 (8)
    reloff 0
    nreloc 0
     flags 0x00000006
 reserved1 119 (index into indirect symbol table)
 reserved2 0
Section
  sectname __const
   segname __DATA_CONST
      addr 0x00000001002683c8
      size 0x0000000000002ce0
    offset 2524104
     align 2^3 (8)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __rodata
   segname __DATA_CONST
      addr 0x000000010026b0c0
      size 0x0000000000036820
    offset 2535616
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __typelink
   segname __DATA_CONST
      addr 0x00000001002a18e0
      size 0x0000000000000b88
    offset 2758880
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __itablink
   segname __DATA_CONST
      addr 0x00000001002a2480
      size 0x00000000000001a8
    offset 2761856
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __gopclntab
   segname __DATA_CONST
      addr 0x00000001002a2640
      size 0x00000000000abd88
    offset 2762304
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Load command 3
      cmd LC_SEGMENT_64
  cmdsize 552
  segname __DATA
   vmaddr 0x0000000100350000
   vmsize 0x0000000000040000
  fileoff 3473408
 filesize 98304
  maxprot 0x00000003
 initprot 0x00000003
   nsects 6
    flags 0x0
Section
  sectname __data
   segname __DATA
      addr 0x0000000100350000
      size 0x0000000000008e00
    offset 3473408
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __go_buildinfo
   segname __DATA
      addr 0x0000000100358e00
      size 0x00000000000002c0
    offset 3509760
     align 2^4 (16)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __noptrdata
   segname __DATA
      addr 0x00000001003590c0
      size 0x000000000000be80
    offset 3510464
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __bss
   segname __DATA
      addr 0x0000000100364f40
      size 0x0000000000026ce8
    offset 0
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000001
 reserved1 0
 reserved2 0
Section
  sectname __noptrbss
   segname __DATA
      addr 0x000000010038bc40
      size 0x0000000000003760
    offset 0
     align 2^5 (32)
    reloff 0
    nreloc 0
     flags 0x00000001
 reserved1 0
 reserved2 0
Section
  sectname __common
   segname __DATA
      addr 0x000000010038f3a0
      size 0x0000000000000020
    offset 0
     align 2^3 (8)
    reloff 0
    nreloc 0
     flags 0x00000001
 reserved1 0
 reserved2 0
Load command 4
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __LINKEDIT
   vmaddr 0x0000000100390000
   vmsize 0x000000000005d542
  fileoff 5865472
 filesize 382274
  maxprot 0x00000001
 initprot 0x00000001
   nsects 0
    flags 0x0
Load command 5
      cmd LC_DYLD_CHAINED_FIXUPS
  cmdsize 16
  dataoff 5865472
 datasize 2352
Load command 6
      cmd LC_DYLD_EXPORTS_TRIE
  cmdsize 16
  dataoff 5867824
 datasize 7184
Load command 7
     cmd LC_SYMTAB
 cmdsize 24
  symoff 5884664
   nsyms 7451
  stroff 6004840
 strsize 194336
Load command 8
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 6935
     iextdefsym 6935
     nextdefsym 383
      iundefsym 7318
      nundefsym 133
         tocoff 0
           ntoc 0
      modtaboff 0
        nmodtab 0
   extrefsymoff 0
    nextrefsyms 0
 indirectsymoff 6003880
  nindirectsyms 240
      extreloff 0
        nextrel 0
      locreloff 0
        nlocrel 0
Load command 9
          cmd LC_LOAD_DYLINKER
      cmdsize 32
         name /usr/lib/dyld (offset 12)
Load command 10
     cmd LC_UUID
 cmdsize 24
    uuid 7537C0A0-53D9-B259-D966-C77D22FB7B3B
Load command 11
       cmd LC_BUILD_VERSION
   cmdsize 32
  platform 1
       sdk 14.5
     minos 14.0
    ntools 1
      tool 3
   version 1053.12
Load command 12
      cmd LC_SOURCE_VERSION
  cmdsize 16
  version 0.0
Load command 13
       cmd LC_MAIN
   cmdsize 24
  entryoff 474800
 stacksize 0
Load command 14
          cmd LC_LOAD_DYLIB
      cmdsize 104
         name /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (offset 24)
   time stamp 2 Thu Jan  1 08:00:02 1970
      current version 2503.1.0
compatibility version 150.0.0
Load command 15
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libresolv.9.dylib (offset 24)
   time stamp 2 Thu Jan  1 08:00:02 1970
      current version 1.0.0
compatibility version 1.0.0
Load command 16
          cmd LC_LOAD_DYLIB
      cmdsize 56
         name /usr/lib/libSystem.B.dylib (offset 24)
   time stamp 2 Thu Jan  1 08:00:02 1970
      current version 1345.120.2
compatibility version 1.0.0
Load command 17
      cmd LC_FUNCTION_STARTS
  cmdsize 16
  dataoff 5875008
 datasize 9656
Load command 18
      cmd LC_DATA_IN_CODE
  cmdsize 16
  dataoff 5884664
 datasize 0
Load command 19
      cmd LC_CODE_SIGNATURE
  cmdsize 16
  dataoff 6199184
 datasize 48562
Load command 20
      cmd LC_SEGMENT_64
  cmdsize 1032
  segname __DWARF
   vmaddr 0x0000000000000000
   vmsize 0x0000000000000000
  fileoff 3571712
 filesize 2283036
  maxprot 0x00000007
 initprot 0x00000000
   nsects 12
    flags 0x0
Section
  sectname __zdebug_line
   segname __DWARF
      addr 0x00000001003dc000
      size 0x0000000000066f7e
    offset 3571712
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zdebug_ranges
   segname __DWARF
      addr 0x0000000100442f7e
      size 0x000000000000fd8e
    offset 3993470
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zdebug_loc
   segname __DWARF
      addr 0x0000000100452d0c
      size 0x000000000007cb4a
    offset 4058380
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zdebug_aranges
   segname __DWARF
      addr 0x00000001004cf856
      size 0x0000000000000508
    offset 4569174
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zdebug_info
   segname __DWARF
      addr 0x00000001004cfd5e
      size 0x00000000000d39f2
    offset 4570462
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zdebug_frame
   segname __DWARF
      addr 0x00000001005a3750
      size 0x000000000000a7f6
    offset 5437264
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zdebug_abbrev
   segname __DWARF
      addr 0x00000001005adf46
      size 0x0000000000000531
    offset 5480262
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zdebug_str
   segname __DWARF
      addr 0x00000001005ae477
      size 0x000000000001a212
    offset 5481591
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zapple_names
   segname __DWARF
      addr 0x00000001005c8689
      size 0x0000000000033da1
    offset 5588617
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __apple_namespac
   segname __DWARF
      addr 0x00000001005fc42a
      size 0x0000000000000024
    offset 5801002
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __zapple_types
   segname __DWARF
      addr 0x00000001005fc44e
      size 0x000000000000d1aa
    offset 5801038
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __apple_objc
   segname __DWARF
      addr 0x00000001006095f8
      size 0x0000000000000024
    offset 5854712
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0

@cherrymui
Member

Mach-O doesn't use relocations in that sense. They have "rebase" and "bind" tables instead. Try objdump --macho --bind and objdump --macho --rebase with the macOS system objdump (from Xcode).

@cherrymui
Member

For a binary I built, I have

$ nm x | grep firstmoduledata
00000001000f7b00 s _runtime.firstmoduledata
$ nm x | grep runtime.pclntab
000000010009f820 s _runtime.pclntab
$ objdump -m --rebase ./x | grep 0x1000F7B00
...
__DATA   __noptrdata        0x1000F7B00  rebase ptr   0x10009F820

The rebase entry does point to the right address.

As you mentioned above, this is not a bug. And I don't think there is much we can do. Thanks.

@cherrymui cherrymui closed this as not planned Sep 13, 2024
@Zxilly
Member Author

Zxilly commented Sep 14, 2024

Thanks for the suggestion. I tried llvm-objdump --macho --bind and llvm-objdump --macho --rebase on my samples, and both tables come back empty for these files, yet the address discrepancy is still there.
Specifically, I am referring to this file:

https://github.com/Zxilly/go-testdata/releases/download/latest/bin-darwin-1.23-arm64-strip-pie-cgo

Could you suggest some other directions?

I got

PS T:\> llvm-objdump --macho --rebase .\bin-darwin-1.23-arm64-strip-pie-cgo
.\bin-darwin-1.23-arm64-strip-pie-cgo:

Rebase table:
segment  section            address     type
PS T:\> llvm-objdump --macho --bind .\bin-darwin-1.23-arm64-strip-pie-cgo
.\bin-darwin-1.23-arm64-strip-pie-cgo:

Bind table:
segment  section            address    type       addend dylib            symbol

on this file.

@Zxilly
Member Author

Zxilly commented Sep 14, 2024

I guess it is related to chained fixups.

@cherrymui
Member

You're probably right that chained fixups are related. That is yet another way of expressing dynamic relocations in Mach-O.
