debug/elf: DWARF relocations should not always be applied #46673

Closed
vikmik opened this issue Jun 9, 2021 · 2 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
@vikmik
Contributor

vikmik commented Jun 9, 2021

What version of Go are you using (go version)?

$ go version
go version go1.16.4 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/vic/.cache/go-build"
GOENV="/home/vic/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/vic/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/vic/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.16"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.16/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.4"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/vic/go/src/github.com/golang/go/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3524548244=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Download and extract https://www.rpmfind.net/linux/fedora/linux/updates/34/Everything/x86_64/debug/Packages/k/kernel-debuginfo-5.12.9-300.fc34.x86_64.rpm

(To extract it into the current directory with rpm2cpio: rpm2cpio kernel-debuginfo-5.12.9-300.fc34.x86_64.rpm | cpio -idmv)

Then run the following code (change the path as needed):

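package main

import (
    "debug/dwarf"
    "debug/elf"
    "fmt"
)

func main() {
    e, err := elf.Open("usr/lib/debug/lib/modules/5.12.9-300.fc34.x86_64/vmlinux")
    if err != nil {
        fmt.Println(err)
        return
    }
    defer e.Close()

    // As of go1.16.4, this step applies the .rela.debug_* relocations.
    dw, err := e.DWARF()
    if err != nil {
        fmt.Println(err)
        return
    }

    // Jump directly to the DIE at .debug_info offset 0xc24858.
    reader := dw.Reader()
    reader.Seek(0xc24858)
    entry, err := reader.Next()
    if err != nil {
        fmt.Println(err)
        return
    }
    if entry == nil {
        fmt.Println("no entry at offset 0xc24858")
        return
    }

    // Val returns nil if the entry has no DW_AT_name attribute.
    name, _ := entry.Val(dwarf.AttrName).(string)
    fmt.Println(name)
}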

What did you expect to see?

arch_local_irq_restore

Why? This is what llvm-dwarfdump prints for the DW_AT_name attribute for the DIE at offset 0x00c24858:

$ llvm-dwarfdump usr/lib/debug/lib/modules/5.12.9-300.fc34.x86_64/vmlinux --debug-info=0x00c24858 | grep DW_AT_name
              DW_AT_name	("arch_local_irq_restore")

What did you see instead?

CK__tp_func_mce_record430

What seems to be the problem:

Note: this is a kernel debug file. It is linked with --emit-relocs because of KASLR, and as a result it carries relocations for every section, including the .debug_* sections.

The apparent problem is that the relocations for the .debug_str offsets in .rela.debug_info do not match the corresponding values in .debug_info. Let's illustrate that with the following:

$ llvm-dwarfdump usr/lib/debug/lib/modules/5.12.9-300.fc34.x86_64/vmlinux --debug-info=0x00c24858 --verbose
usr/lib/debug/lib/modules/5.12.9-300.fc34.x86_64/vmlinux:	file format elf64-x86-64

.debug_info contents:

0x00c24858: DW_TAG_subprogram [78] *
              DW_AT_name [DW_FORM_strp]	( .debug_str[0x001dd207] = "arch_local_irq_restore")
              DW_AT_decl_file [DW_FORM_data1]	("/usr/src/debug/kernel-5.12.9/linux-5.12.9-300.fc34.x86_64/./arch/x86/include/asm/irqflags.h")
              DW_AT_decl_line [DW_FORM_data1]	(142)
              DW_AT_decl_column [DW_FORM_data1]	(0x1d)
              DW_AT_prototyped [DW_FORM_flag_present]	(true)
              DW_AT_inline [DW_FORM_data1]	(DW_INL_declared_inlined)
              DW_AT_sibling [DW_FORM_ref4]	(cu + 0x1351d => {0x00c24872})

Notice the DW_AT_name value, which points to .debug_str + 0x001dd207.

Now let's look at the relocation for this attribute in .rela.debug_info. The DIE starts with a one-byte abbreviation code, so the DW_AT_name value sits at 0x00c24858 + 1 = 0x00c24859 in .debug_info:

$ readelf --relocs usr/lib/debug/lib/modules/5.12.9-300.fc34.x86_64/vmlinux | grep '^000000c24859' 
000000c24859  002a0000000a R_X86_64_32       0000000000000000 .debug_str + 883e
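
The "+ 1" above comes from the size of the DIE's abbreviation code, which DWARF encodes as an unsigned LEB128 value. A minimal sketch of that arithmetic follows; uleb128Len and firstAttrOffset are hypothetical helper names, not part of debug/dwarf, and the sketch assumes the attribute of interest is the first one in the DIE (as DW_AT_name is in the dump above).

```go
package main

import "fmt"

// uleb128Len returns the number of bytes v occupies when encoded as
// unsigned LEB128 (7 payload bits per byte).
func uleb128Len(v uint64) int {
	n := 1
	for v >>= 7; v != 0; v >>= 7 {
		n++
	}
	return n
}

// firstAttrOffset returns the .debug_info offset of the first attribute
// value of a DIE: the DIE offset plus the size of its abbreviation code.
// Hypothetical helper for illustration only.
func firstAttrOffset(dieOffset, abbrevCode uint64) uint64 {
	return dieOffset + uint64(uleb128Len(abbrevCode))
}

func main() {
	// DIE offset and abbreviation code [78] taken from the llvm-dwarfdump output.
	fmt.Printf("%#x\n", firstAttrOffset(0xc24858, 78)) // prints 0xc24859
}
```

Abbreviation codes of 128 or more take two or more bytes, so the "+ 1" shortcut only holds for small codes like this one.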

=> After relocations are applied, the DW_AT_name attribute points to .debug_str + 0x883e.
Dumping the contents of .debug_str at that offset, we indeed find the nonsensical output (CK__tp_func_mce_record430):

$ readelf --debug-dump=str ./usr/lib/debug/lib/modules/5.12.9-300.fc34.x86_64/vmlinux | grep -A2 0x00008830
  0x00008830 64647265 73736162 6c655f5f 5f53434b ddressable___SCK
  0x00008840 5f5f7470 5f66756e 635f6d63 655f7265 __tp_func_mce_re
  0x00008850 636f7264 34333000 5f5f554e 49515545 cord430.__UNIQUE

The most relevant discussion I found on the internet is the following thread:
https://gcc.gnu.org/pipermail/gcc/2020-December/234392.html

I have two takeaways from this:

  • Some (but not all!) distro kernels ship broken DWARF relocations in their debug packages. I have confirmed this on CentOS/Fedora (old and new releases), but not on Debian kernels, for example.
  • Even if one sees this as a distro packaging bug, Go's behavior is the outlier here. Tools like dwarfdump, llvm-dwarfdump and various symbolizers do not apply these relocations.
    For example, addr2line -f -e ./usr/lib/debug/lib/modules/5.12.9-300.fc34.x86_64/vmlinux 0xffffffff8106bcf1 looks up the same DW_AT_name attribute and correctly outputs "arch_local_irq_restore".

So it seems that the recommendation from this is "do not apply DWARF rela relocations when the binary is ET_EXEC".

I just want to discuss whether you think this is a reasonable way forward (@ianlancetaylor, you seem to be the one looking at these parts of the code these days? :) ). I'll be happy to contribute a fix after discussion here.

@ianlancetaylor ianlancetaylor added the NeedsInvestigation label Jun 10, 2021
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Jun 10, 2021
@ianlancetaylor
Contributor

I think it would be fine to skip the apply-relocations step for an ET_EXEC file.

vikmik added a commit to vikmik/go that referenced this issue Jun 11, 2021
Some ET_EXEC binaries might have relocations for non-loadable sections
like .debug_info. These relocations must not be applied, because:
* They may be incorrect
* The correct relocations were already applied at link time

Binaries in Linux Kernel debug packages like Fedora/Centos kernel-debuginfo
are such examples. Relocations for .debug_* sections are included in the
final binaries because they are compiled with --emit-relocs, but the resulting
relocations are incorrect and shouldn't be used when reading DWARF sections.

Fixes golang#46673
@gopherbot

Change https://golang.org/cl/327009 mentions this issue: debug/elf: don't apply DWARF relocations for ET_EXEC binaries

@golang golang locked and limited conversation to collaborators Jun 14, 2022