cmd/link: dynamic linking cannot be stepped through using gdb #38378

Closed
WangLeonard opened this issue Apr 11, 2020 · 8 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@WangLeonard
Contributor

What version of Go are you using (go version)?

$ go version
go version go1.13.3 linux/amd64

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/py3/Project/GOPATH"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/root/py3/Project/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/root/py3/Project/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build442453090=/tmp/go-build -gno-record-gcc-switches"

What did you do?

A simple demo

package main

import (
	"fmt"
	"os"
	"time"
)

var a = 0

//go:noinline
func TestFunc() {
	a++
	time.Sleep(time.Millisecond * 100)
}

func main() {
	fmt.Println(os.Getpid())
	for i := 0; i < 100000; i++ {
		TestFunc()
	}
}

go install -buildmode=shared -linkshared std

go build -linkshared main.go

./main

Then I attach gdb to the running process:

gdb attach 3839

When I single-step with s, gdb reports that the function "has no line number information".

What did you expect to see?

Can be debugged step by step.

What did you see instead?

Type "apropos word" to search for commands related to "word"...
attach: No such file or directory.
Attaching to process 3839
[New LWP 3840]
[New LWP 3841]
[New LWP 3842]
[New LWP 3843]
[New LWP 3844]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fb989b424b3 in runtime.futex ()
   from /root/py3/Project/go/pkg/linux_amd64_dynlink/libstd.so
(gdb) b main.TestFunc
Breakpoint 1 at 0x55563d1f77a0
(gdb) c
Continuing.

Thread 1 "main" hit Breakpoint 1, 0x000055563d1f77a0 in main.TestFunc ()
(gdb) s
Single stepping until exit from function main.TestFunc,
which has no line number information.


Thread 1 "main" hit Breakpoint 1, 0x000055563d1f77a0 in main.TestFunc ()
(gdb) s
Single stepping until exit from function main.TestFunc,
which has no line number information.
0x00007fb989b2c680 in time.Sleep ()
   from /root/py3/Project/go/pkg/linux_amd64_dynlink/libstd.so

Even if I use go build -gcflags='-N -l' -linkshared main.go, I still get the same error.

When I link statically with go build main.go, single-stepping works normally.

My analysis so far:

Rebuilding with the -x option shows that -w is passed when the linker is invoked:

root/py3/Project/go/pkg/tool/linux_amd64/link -o $WORK/b001/exe/a.out -importcfg $WORK/b001/importcfg.link -installsuffix dynlink -buildmode=exe -buildid=0NYYkm8PmmuyphRyKmYC/4b2T8RfBMeRtkph2OZPF/9y-yYjcMEG_-re1WmH11/0NYYkm8PmmuyphRyKmYC -linkshared -w -extld=gcc /root/.cache/go-build/9d/9d922fc73643e08291edcd472d78431a941d02ea4a67c48a58d6e240635f2934-d
/root/py3/Project/go/pkg/tool/linux_amd64/buildid -w $WORK/b001/exe/a.out # internal

In the linker source code, when the -w option is set, DWARF generation is skipped, so the .debug_line information is not generated:

func dwarfGenerateDebugSyms(ctxt *Link) {
	if !dwarfEnabled(ctxt) {
		return
	}
	……

	// Write per-package line and range tables and start their CU DIEs.
	debugLine := ctxt.Syms.Lookup(".debug_line", 0)
	debugLine.Type = sym.SDWARFSECT
	debugRanges := ctxt.Syms.Lookup(".debug_ranges", 0)
	debugRanges.Type = sym.SDWARFRANGE
	debugRanges.Attr |= sym.AttrReachable
	syms = append(syms, debugLine)
	……
}
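One way to confirm this on a built binary is to look for the line-table section directly. The following is a minimal sketch, not part of the original report: the helper names hasLineTable and sectionNames are hypothetical, and accepting the ".zdebug_line" spelling assumes a linker that emits compressed DWARF under ".zdebug_*" names.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// hasLineTable reports whether the given ELF section names include a DWARF
// line table. Both ".debug_line" and the compressed ".zdebug_line" spelling
// are accepted (the latter is an assumption of this sketch).
func hasLineTable(sections []string) bool {
	for _, name := range sections {
		if name == ".debug_line" || name == ".zdebug_line" {
			return true
		}
	}
	return false
}

// sectionNames returns the names of all sections in the ELF file at path.
func sectionNames(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	names := make([]string, 0, len(f.Sections))
	for _, s := range f.Sections {
		names = append(names, s.Name)
	}
	return names, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: checkdwarf <elf-binary>")
		return
	}
	names, err := sectionNames(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("line table present: %v\n", hasLineTable(names))
}
```

Running it against the -linkshared binary and against the statically linked one should show whether the line table is what differs between the two builds.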

I guess -w is added here, next to a long-standing TODO:

func buildModeInit() {
	gccgo := cfg.BuildToolchainName == "gccgo"
	……
	if cfg.BuildLinkshared {
		if gccgo {
			codegenArg = "-fPIC"
		} else {
			switch platform {
			case "linux/386", "linux/amd64", "linux/arm", "linux/arm64", "linux/ppc64le", "linux/s390x":
				forcedAsmflags = append(forcedAsmflags, "-D=GOBUILDMODE_shared=1")
			default:
				base.Fatalf("-linkshared not supported on %s\n", platform)
			}
			codegenArg = "-dynlink"
			// TODO(mwhudson): remove -w when that gets fixed in linker.
			forcedLdflags = append(forcedLdflags, "-linkshared", "-w")
		}
	}
}

My question: is the inability to single-step dynamically linked programs in gdb a deliberate design decision, or is it a bug?

Is there a plan to support this?

This scenario is very important to me. I look forward to your reply, thank you.

@andybons
Member

@jeremyfaller

@andybons andybons changed the title dynamic linking cannot be stepped through using gdb cmd/link: dynamic linking cannot be stepped through using gdb Apr 13, 2020
@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 13, 2020
@andybons andybons added this to the Unplanned milestone Apr 13, 2020
@thanm
Contributor

thanm commented Apr 13, 2020

I suspect that supporting this scenario is not something that can be added easily or with tiny tweaks to the Go linker; some work will be required (possibly quite a lot of work).

First a bit of a digression on DWARF type generation in general:

DWARF type generation (e.g. creation of the part of DWARF .debug_info that relates to types of variables, functions, etc.) has an implicit assumption that the compiler has a complete picture of the type in question. Inside a DWARF type DIE (the unit of DWARF that describes a variable/value/function type) you can have references to other types, but those references have to be to some other well-defined DWARF type DIE -- you can't say something like "this struct type ABC has a field F that refers to type X, where X is defined in some other shared library, please wait and look for it later".

As a result, most compilers handle this by emitting DWARF for a translation unit that is a complete picture of everything the compiler has seen referenced as part of the compile. So for example let's say you are compiling a C++ source file with a type like

  typedef struct {
     int somefield;
     SomeStruct content;
     std::set<int> visited;
  } Mumble;

For C/C++ it is a given that at the point we see a type T, we have seen textual definitions of all other types referenced by T. For the type above, that means we've seen the entire definition of "SomeStruct" (including its dependencies) and all the complex stuff that goes into "std::set".

When a C++ compiler emits DWARF for an object file whose source file refers to the type above, it creates DWARF types for everything referenced by that type. [Note: some C++ compilers will omit DWARF for types that are entirely unreferenced, but this is something of a grey area.] This generally results in huge amounts of duplicate DWARF info, since many source files include the same large set of headers. The result is that you get huge object files, long compile times, and giant amounts of duplicated DWARF info in the final executable.

The Go compiler, in contrast, emits DWARF type info only for types defined in the package being compiled. For example, when compiling this package:

 package p

 <imports>
 
 type Q struct {
   x int64
   mu sync.Mutex
   b bytes.Buffer
   ...
 }

The Go compiler will not emit DWARF type info for "sync.Mutex" (since this type is not defined in package p). It will emit a mostly-complete DWARF type unit for "Q", but with references (via relocations) to the other types that it uses (in contrast to C/C++, where the compiler would emit DWARF type records for the transitive closure of the referents of Q). The Go compiler relies on the Go linker to combine together all of the type DIEs into a single consistent unit and resolve the references between them.

Doing things this way (as opposed to the C/C++ "emit DWARF for everything you have ever seen" strategy) is one of the reasons why Go builds are lightning fast, especially as compared with building a C/C++ program of equivalent size.

Inside the Go linker, a reachability analysis (also called "dead code elimination") is performed at an early stage to collect only those functions and variables that are actually used (transitively reachable from main.main). The linker then examines the set of reachable funcs/variables to see which types they use, and finally it emits DWARF info for only those types. Consider this example:

 package mumble

 type Used struct {
   field AlsoUsed
 }

 type AlsoUsed struct {
   B bool
   X []byte
 }

 type NotUsed struct {
   something interface{}
 }

 func Hallo() Used {
   ...
 }

Here there are three types declared, but only two are referenced by some reachable function (let's assume that no other function in the program refers to the type "mumble.NotUsed"). The linker will pick up on this fact during DWARF generation, and will not generate any DWARF for the unused type.
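The pruning described above can be illustrated with a toy model. This is only a sketch of the idea, not the linker's actual implementation: the fn type, the typesToEmit function, and the map-based call graph are all hypothetical stand-ins for linker symbols.

```go
package main

import (
	"fmt"
	"sort"
)

// fn is a toy stand-in for a function symbol: the functions it calls and
// the types it mentions. All names in this sketch are hypothetical.
type fn struct {
	calls []string
	uses  []string
}

// typesToEmit walks the call graph from root and returns, sorted, every type
// used by a reachable function, plus the types those types refer to.
func typesToEmit(funcs map[string]fn, typeRefs map[string][]string, root string) []string {
	seenFn := map[string]bool{}
	seenTy := map[string]bool{}

	var markType func(string)
	markType = func(t string) {
		if seenTy[t] {
			return
		}
		seenTy[t] = true
		for _, r := range typeRefs[t] {
			markType(r)
		}
	}

	var visit func(string)
	visit = func(name string) {
		if seenFn[name] {
			return
		}
		seenFn[name] = true
		f := funcs[name]
		for _, t := range f.uses {
			markType(t)
		}
		for _, c := range f.calls {
			visit(c)
		}
	}

	visit(root)
	out := make([]string, 0, len(seenTy))
	for t := range seenTy {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

func main() {
	funcs := map[string]fn{
		"main.main":    {calls: []string{"mumble.Hallo"}},
		"mumble.Hallo": {uses: []string{"mumble.Used"}},
	}
	typeRefs := map[string][]string{
		"mumble.Used":    {"mumble.AlsoUsed"},
		"mumble.NotUsed": nil, // declared, but no reachable function uses it
	}
	fmt.Println(typesToEmit(funcs, typeRefs, "main.main"))
}
```

In this model mumble.NotUsed never appears in the result, because nothing reachable from main.main mentions it.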

This results in DWARF type info that is much more compact than what you would get from (for example) a C++ compiler, which is a major benefit for Go users. So with the Go toolchain you get two huge benefits: A) super fast builds, and B) nice compact binaries.

Getting back to the -linkshared case:

When you run "go build -linkshared main.go", the compiler emits DWARF type information for things in main.go, and then hands things off to the Go linker. At that point however the linker is presented with an incomplete picture with respect to DWARF types -- there will be references/relocations in the main.go object file to types that are defined in the standard library... but the standard library has already been compiled and linked.

Thus in order for things to work properly, we would have to invent some way for the linker to dig into the contents of libstd.so's generated type info to find the set of stdlib types referenced by the transitive closure of types reachable from main.go, then emit that into the resulting executable. DWARF type generation would need to be changed so that any time one type refers to another, we can handle the case where the thing being referred to has to be excavated from an already-compiled shared library.
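To make the gap concrete, here is a toy model of the reference-resolution step, assuming type DIEs can be represented as a map from a type name to the names it refers to. The unresolved function and this representation are hypothetical and only illustrate why references dangle when the standard library's DIEs live in an already-linked libstd.so.

```go
package main

import "fmt"

// unresolved returns the type references that have no definition among the
// type DIEs visible to the linker. defined maps a type name to the types it
// refers to; a missing key means the linker has no DIE for that type.
func unresolved(defined map[string][]string, roots []string) []string {
	var missing []string
	seen := map[string]bool{}

	var walk func(string)
	walk = func(t string) {
		if seen[t] {
			return
		}
		seen[t] = true
		refs, ok := defined[t]
		if !ok {
			missing = append(missing, t)
			return
		}
		for _, r := range refs {
			walk(r)
		}
	}

	for _, r := range roots {
		walk(r)
	}
	return missing
}

func main() {
	// Static link: DIEs for p.Q and for everything it refers to are visible.
	static := map[string][]string{
		"p.Q":          {"sync.Mutex", "bytes.Buffer"},
		"sync.Mutex":   nil,
		"bytes.Buffer": nil,
	}
	// -linkshared: only the DIEs from the package being linked are visible.
	shared := map[string][]string{
		"p.Q": {"sync.Mutex", "bytes.Buffer"},
	}
	fmt.Println("static:", unresolved(static, []string{"p.Q"}))
	fmt.Println("shared:", unresolved(shared, []string{"p.Q"}))
}
```

In the static case nothing is missing; in the shared case both standard library types are reported, which corresponds to the references the linker would have to excavate from the shared library.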

This seems possible, but would require some doing.

It is also worth noting that -linkshared is a pretty unusual case for most Go users, meaning that fixing problems like this has not been a big priority.

@cherrymui
Member

Maybe a workaround is to generate DWARF line tables but not type info? The original issue is about single-stepping, so this will probably make it work.

@thanm
Contributor

thanm commented Apr 14, 2020

That's certainly one way to go. Clang/LLVM supports that I believe (e.g. the -gline-tables-only option).

@WangLeonard
Contributor Author

@thanm Thank you for your detailed reply.
I need to take a moment to understand this.
I found that the Go plugin mechanism generates debug information correctly.
As far as I know, plugins are also dynamically linked. Is there any difference between how plugins and -linkshared are handled?

@cherrymui
Member

@thanm I guess we don't want to introduce another flag. We could do something like the following:

  • let cmd/go stop passing -w to the linker
  • if linkshared, the linker suppresses DWARF type generation but still generates DWARF line tables
  • if -w is explicitly requested, still disable all DWARF generation

Then the default case would work.
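The three rules above amount to a small decision table. A minimal sketch follows, assuming two boolean inputs; dwarfPlan and planDWARF are hypothetical names for illustration and do not exist in cmd/link or cmd/go.

```go
package main

import "fmt"

// dwarfPlan records which kinds of DWARF a link would produce.
// Hypothetical type, used only to illustrate the proposal.
type dwarfPlan struct {
	lineTables bool
	typeInfo   bool
}

// planDWARF models the proposed behavior: an explicit -w disables all DWARF,
// -linkshared alone keeps line tables but suppresses type info, and an
// ordinary link produces both.
func planDWARF(linkshared, explicitW bool) dwarfPlan {
	switch {
	case explicitW:
		return dwarfPlan{}
	case linkshared:
		return dwarfPlan{lineTables: true}
	default:
		return dwarfPlan{lineTables: true, typeInfo: true}
	}
}

func main() {
	cases := []struct{ linkshared, explicitW bool }{
		{false, false},
		{true, false},
		{false, true},
		{true, true},
	}
	for _, c := range cases {
		fmt.Printf("linkshared=%v -w=%v -> %+v\n", c.linkshared, c.explicitW, planDWARF(c.linkshared, c.explicitW))
	}
}
```

Under this model, single-stepping would work in the default -linkshared build because line tables survive, while variable and type inspection would remain unavailable.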

@cherrymui
Member

@WangLeonard the difference is that plugins are self-contained. When building a plugin, you have the full transitive dependencies.

@seankhliao
Member

Obsoleted by #47788
