Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

debug/pe: TestInternalLinkerDWARF fail in windows #64200

Closed
qiulaidongfeng opened this issue Nov 16, 2023 · 12 comments
Closed

debug/pe: TestInternalLinkerDWARF fail in windows #64200

qiulaidongfeng opened this issue Nov 16, 2023 · 12 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. OS-Windows release-blocker
Milestone

Comments

@qiulaidongfeng
Copy link
Contributor

qiulaidongfeng commented Nov 16, 2023

What version of Go are you using (go version)?

$ go version
go version devel go1.22-d6ef98b8 Thu Nov 16 05:53:55 2023 +0000 windows/amd64

Does this issue reproduce with the latest release?

yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=auto
set GOARCH=amd64
set GOBIN=
set GOCACHE=D:\tmp\go-build
set GOENV=C:\Users\26454\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=D:\file\gofile\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=D:\file\gofile
set GOPRIVATE=
set GOPROXY=https://goproxy.cn,direct
set GOROOT=C:\Users\26454\.go\current
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLCHAIN=local
set GOTOOLDIR=C:\Users\26454\.go\current\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=devel go1.22-d6ef98b8 Thu Nov 16 05:53:55 2023 +0000
set GCCGO=gccgo
set GOAMD64=v3
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=C:\Users\26454\AppData\Local\Temp\go-build186784699=/tmp/go-build -gno-record-gcc-switches

What did you do?

go test debug/pe

What did you expect to see?

test pass.

What did you see instead?

--- FAIL: TestInternalLinkerDWARF (0.99s)
    file_test.go:389: building test executable for linktype 2 failed: exit status 1 # command-line-arguments
        panic: runtime error: index out of range [10] with length 10

        goroutine 1 [running]:
        cmd/link/internal/loader.(*Relocs).At(...)
                C:/Users/26454/.go/current/src/cmd/link/internal/loader/loader.go:1885
        cmd/link/internal/loadpe.findHandlerInXDataAMD64(0xc00009a000, 0x1812c, 0x44)
                C:/Users/26454/.go/current/src/cmd/link/internal/loadpe/seh.go:108 +0x2b2
        cmd/link/internal/loadpe.processSEHAMD64(0xc00009a000, 0x1812d)
                C:/Users/26454/.go/current/src/cmd/link/internal/loadpe/seh.go:54 +0x1a5
        cmd/link/internal/loadpe.processSEH(0xc00009a000?, 0xc000018144?, 0x29a2e40?, 0xc0?)
                C:/Users/26454/.go/current/src/cmd/link/internal/loadpe/seh.go:33 +0x79
        cmd/link/internal/loadpe.Load(0xc00009a000, 0x77ed20, 0x71, 0xc0023eae50, {0x581b85, 0x4}, 0x5fe8d0?, {0xc0023eeaf0, 0x64})
                C:/Users/26454/.go/current/src/cmd/link/internal/loadpe/ldpe.go:605 +0x18cd
        cmd/link/internal/ld.ldobj.func3(0xc00014e000, 0xc0023eae50?, {0x581b85?, 0x0?}, 0xc0023eeaf0?, {0xc0023eeaf0?, 0xc0023eeaf0?})
                C:/Users/26454/.go/current/src/cmd/link/internal/ld/lib.go:2223 +0x6b
        cmd/link/internal/ld.hostObject(0xc00014e000, {0x581b85, 0x4}, {0xc0023eeaf0, 0x64})
                C:/Users/26454/.go/current/src/cmd/link/internal/ld/lib.go:2397 +0x378
        cmd/link/internal/ld.loadWindowsHostArchives(0xc00014e000)
                C:/Users/26454/.go/current/src/cmd/link/internal/ld/lib.go:695 +0x10d
        cmd/link/internal/ld.(*Link).loadlib(0xc00014e000)
                C:/Users/26454/.go/current/src/cmd/link/internal/ld/lib.go:641 +0x7c5
        cmd/link/internal/ld.Main(_, {0x20, 0x20, 0x1, 0x7, 0x10, 0x0, {0xc0000086b9, 0x1, 0x1}, ...})
                C:/Users/26454/.go/current/src/cmd/link/internal/ld/main.go:347 +0x122b
        main.main()
                C:/Users/26454/.go/current/src/cmd/link/main.go:72 +0xdfb
FAIL
FAIL    debug/pe        9.143s
FAIL
@bcmills bcmills added OS-Windows compiler/runtime Issues related to the Go compiler and/or runtime. labels Nov 16, 2023
@bcmills
Copy link
Contributor

bcmills commented Nov 16, 2023

(CC @golang/windows)

@qmuntal
Copy link
Contributor

qmuntal commented Nov 16, 2023

Bug introduced in CL 534555. This logic is wrong:

idx := sort.Search(xrels.Count(), func(i int) bool {
return int64(xrels.At(i).Off()) >= targetOff
})
if idx == 0 {
return 0
}

sort.Search will return idx == xrels.Count() if no symbol is found, but the code is checking if idx == 0 instead. While working on this I couldn't trigger this code path, and the Go builders don't complain neither, I supose that it depends on the GCC toolchain.

Would be good to understand how is that no relocation was found, maybe a malformed .pdata section? Either way, let's first fix the bug, we can investigate it later. @dagood could you help here submitting the fix? Thanks.

@bcmills bcmills added this to the Go1.22 milestone Nov 16, 2023
@gopherbot
Copy link

Change https://go.dev/cl/543076 mentions this issue: cmd/link/internal/loadpe: fix xrels search "not found" detection

@dagood
Copy link
Contributor

dagood commented Nov 16, 2023

Submitted the simple fix at https://go-review.googlesource.com/c/go/+/543076. (No extra tests.)

@qiulaidongfeng, what C compiler and version are you using?

It sounds like this would be useful information to be able to reproduce the issue and perhaps figure out a test case. I also haven't seen this before.

@qiulaidongfeng
Copy link
Contributor Author

The test still failed
--- FAIL: TestInternalLinkerDWARF (3.05s)
file_test.go:389: building test executable for linktype 2 failed: exit status 1 # command-line-arguments
crt2(.xdata): unhandled relocation for __C_specific_handler (type 46 (SDYNIMPORT) rtype 88 (R_PEIMAGEOFF))
crt2(.xdata): unhandled relocation for __C_specific_handler (type 46 (SDYNIMPORT) rtype 88 (R_PEIMAGEOFF))
FAIL
FAIL debug/pe 28.196s

gcc --version
gcc (MinGW-W64 x86_64-msvcrt-posix-seh, built by Brecht Sanders) 12.1.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Get from here: https://winlibs.com/

@qmuntal
Copy link
Contributor

qmuntal commented Nov 17, 2023

Hmm, this error means that the __C_specific_handler function is only reachable via a SEH handler, so the Go linker will not add it to the final binary unless it is found by loadpe.findHandlerInXDataAMD64, in which case the handler is made reachable if the function being guarded is also reachable (this happens here).

We should understand why loadpe.findHandlerInXDataAMD64 can't find the handler when calling sort.Search, as it should be there. @dagood can you try to reproduce the issue and fix it? Else we can just revert CL 534555 and posrpone this improvement till go1.23.

@dagood
Copy link
Contributor

dagood commented Nov 17, 2023

Confirmed with winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64msvcrt-10.0.0-r3:

=== RUN   TestInternalLinkerDWARF
    file_test.go:389: building test executable for linktype 2 failed: exit status 1 # command-line-arguments
        crt2(.xdata): unhandled relocation for __C_specific_handler (type 46 (SDYNIMPORT) rtype 88 (R_PEIMAGEOFF))
        crt2(.xdata): unhandled relocation for __C_specific_handler (type 46 (SDYNIMPORT) rtype 88 (R_PEIMAGEOFF))
--- FAIL: TestInternalLinkerDWARF (1.88s)

I tested a few other versions using MSVCRT (12.2.0, 12.3.0, 13.2.0) and got the same result.

Using UCRT on the same 12.1.0 version runs successfully.
winlibs-x86_64-posix-seh-gcc-12.1.0-llvm-14.0.6-mingw-w64ucrt-10.0.0-r3

(Side note: these builds are listed on https://www.mingw-w64.org/downloads/, so they aren't really obscure. This is something to consider when setting up more CI to validate more C compilers and/or a list of what we're testing.)

My understanding is that UCRT is newer, and MSVCRT is useful to target older versions of Windows, so swapping might not be possible in some situations.

I'm planning to look into this further. Advice is of course quite welcome!

@dagood
Copy link
Contributor

dagood commented Nov 21, 2023

An update on what I've found so far. It looks like the first different thing about the MSVCRT vs. UCRT build is that when building with the MSVCRT MinGW, Go includes crt2.o because this condition is met:

// Link crt2.o (if present) to resolve "atexit" when
// using LLVM-based compilers.
isunresolved := symbolsAreUnresolved(ctxt, []string{"atexit"})
if isunresolved[0] {
if p := ctxt.findLibPath("crt2.o"); p != "none" {
hostObject(ctxt, "crt2", p)
}
}

This is where the failing search happens.

MinGW, providing crt2.o, is unfortunately not able to parse its xdata all that well. .\bin\objdump.exe -x .\x86_64-w64-mingw32\lib\crt2.o shows this:

...
Dump of .xdata
 0000000000000000 (rva: 00000000): 0000000000000000 - 0000000000000017
warning: xdata section corrupt
 0000000000000008 (rva: 00000008): 0000000000000017 - 000000000000009a
warning: xdata section corrupt
 0000000000000014 (rva: 00000014): 000000000000009a - 00000000000000f6
warning: xdata section corrupt
 0000000000000020 (rva: 00000020): 00000000000000f6 - 0000000000000125
        Version: 1, Flags: UNW_FLAG_EHANDLER
        Nbr codes: 3, Prologue size: 0x08, Frame offset: 0x0, Frame reg: rbp
          pc+0x08: alloc small area: rsp = rsp - 0x30
          pc+0x04: FPReg: rbp = rsp + 0x0 (info = 0x0)
          pc+0x01: push rbp
        Handler: 0000000000000000.
        User data:
          000: 01 00 00 00 05 01 00 00 1b 01 00 00 00 00 00 00
          010: 1b 01 00 00
 0000000000000044 (rva: 00000044): 0000000000000125 - 0000000000000154
        Version: 1, Flags: UNW_FLAG_EHANDLER
        Nbr codes: 3, Prologue size: 0x08, Frame offset: 0x0, Frame reg: rbp
          pc+0x08: alloc small area: rsp = rsp - 0x30
          pc+0x04: FPReg: rbp = rsp + 0x0 (info = 0x0)
          pc+0x01: push rbp
        Handler: 0000000000000000.
        User data:
          000: 01 00 00 00 34 01 00 00 4a 01 00 00 00 00 00 00
          010: 4a 01 00 00
 0000000000000068 (rva: 00000068): 0000000000000154 - 000000000000047f
warning: xdata section corrupt
 0000000000000074 (rva: 00000074): 000000000000047f - 0000000000000583
warning: xdata section corrupt
 0000000000000080 (rva: 00000080): 0000000000000583 - 000000000000068a
warning: xdata section corrupt
 000000000000008c (rva: 0000008c): 000000000000068a - 00000000000006b9
warning: xdata section corrupt

...

So, one part of the puzzle: is this .o file not conforming to what this Go code expects? (Still looking deeper.)


Another part is that some of the Go logic doesn't seem to match the docs and .NET CoreCLR:

codes := data[3]
if codes%2 != 0 {
// There are always an even number of unwind codes, even if the last one is unused.
codes += 1
}

In https://learn.microsoft.com/en-us/cpp/build/exception-handling-x64?view=msvc-170, the number of codes is byte 3 of UNWIND_INFO, taking into account the bit field:

Size Value
UBYTE: 3 Version
UBYTE: 5 Flags
UBYTE Size of prolog
UBYTE Count of unwind codes
... ...

Changing it to data[2] makes the value match what's in the xdata dump.

.NET CoreCLR matches that at https://github.com/dotnet/runtime/blob/02812350ddacde36e7f05f0aabd7c43b3095bc65/src/coreclr/inc/win64unwind.h#L99-L106.

I'm not able to track down a source for the if codes%2 != 0 { check. The xdata dump claims that there are 3 codes, but according to this logic there are actually 4 codes. I hoped looking at raw xdata dump data would clear it up, but it's not yet clear to me what this situation would look like.

Both of these affect the targetOff so I would expect some effect on the search. However, so far I haven't seen any change in the end results after making these changes. (Still looking deeper.)


I also noticed this:

UNW_FLAG_EHANDLER = 1 << 3
UNW_FLAG_UHANDLER = 2 << 3
UNW_FLAG_CHAININFO = 3 << 3

https://learn.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-rtlvirtualunwind says these should be 1, 2, 4, because they're flags. (To be clear, the bitshift is fine, it's how this code deals with the bit field.)

4 isn't being used in these entries and the CHAININFO execution path doesn't get hit in my repro, so I don't think it's related to this issue.

@qmuntal
Copy link
Contributor

qmuntal commented Nov 21, 2023

Changing it to data[2] makes the value match what's in the xdata dump.

Good catch. I suppose the code works even with this bug because the relocation search doesn't look for an exact offset match. It will certainly find the wrong relocation sometimes.

I'm not able to track down a source for the if codes%2 != 0 { check. The xdata dump claims that there are 3 codes, but according to this logic there are actually 4 codes

There are 3 logical codes, but 4 codes in disk, as that array should contain an even number of entries. See this part of the docs:

For alignment purposes, this array always has an even number of entries, and the final entry is potentially unused. In that case, the array is one longer than indicated by the count of unwind codes field.

@qmuntal
Copy link
Contributor

qmuntal commented Nov 21, 2023

@dagood It looks like I got confused counting bytes. You will have to also fix targetOff and unwStaticDataSize (this last one doesn't make much sense). The right one is targetOff := add + 4 + 2*int64(codes). This might solve the whole issue.

@dagood
Copy link
Contributor

dagood commented Nov 22, 2023

Thanks! Yep, it seems to work now with any MinGW version I throw at it.

I think unwStaticDataSize still makes sense as 4, for targetOff := add + unwStaticDataSize + 2*int64(codes) and the earlier bounds check. In fact, I think that 2 could also use a name, like unwCodeSize.

I'll put up a CL shortly.

@gopherbot
Copy link

Change https://go.dev/cl/544415 mentions this issue: cmd/link/internal/loadpe: fix .xdata unwind info parsing

qiulaidongfeng pushed a commit to qiulaidongfeng/go that referenced this issue Nov 22, 2023
Unwind info in .xdata was being parsed incorrectly, causing targetOff to
be incorrect and miss finding data in .xdata that it should have found.
This causes a linker issue when using the MinGW MSVCRT compiler.

Contains several fixes based on the exception handling docs: the offset
used to get the number of unwind codes, the calculation of the target
offset based on the dynamic size of the unwind data, and the
UNW_FLAG_CHAININFO flag's value.

Fixes golang#64200

Change-Id: I6483d921b2bf8a2512a95223bf3c8ce8bc63dc4a
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. OS-Windows release-blocker
Projects
None yet
Development

No branches or pull requests

5 participants