cmd/internal/dwarf: incorrect or missing dwarf information in libstd.so on ppc64le, amd64 #20328

Closed
laboger opened this issue May 11, 2017 · 18 comments
Labels: Debugging, FrozenDueToAge, NeedsFix (The path to resolution is known, but the work has not been done.)

@laboger
Contributor

laboger commented May 11, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go tip

What operating system and processor architecture are you using (go env)?

Ubuntu 16.04 ppc64le

What did you do?

go install -buildmode=shared std
go build -linkshared hello.go
gdb ./hello
break main
run

What did you expect to see?

Normal debugging with gdb as with other Go programs.

What did you see instead?

gdb reported errors and I was unable to debug the program at all. The following error message prevented breakpoints from being set:
....
Error in re-setting breakpoint 1: Dwarf Error: Cannot find DIE at 0x0 referenced from DIE at 0xc8 [in module /home/boger/golang/base/go/pkg/linux_ppc64le_dynlink/libstd.so]

@laboger laboger changed the title cmd/internal/dwarf: incorrect or missing dwarf information in libstd.so on ppc64le cmd/internal/dwarf: incorrect or missing dwarf information in libstd.so on ppc64le, amd64 Jun 1, 2017
@laboger
Contributor Author

laboger commented Jun 1, 2017

Same failure happens on x86.

Here is the output from objdump for the dwarf info:

<1>: Abbrev Number: 2 (DW_TAG_subprogram)
DW_AT_name : sync/atomic.(*Value).Load
DW_AT_low_pc : 0x9eb610
DW_AT_high_pc : 0x9eb680
DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa)
DW_AT_external : 1
<2>: Abbrev Number: 5 (DW_TAG_formal_parameter)
DW_AT_name : v
DW_AT_location : 4 byte block: 9c 11 20 22 (DW_OP_call_frame_cfa; DW_OP_consts: 32; DW_OP_plus)
DW_AT_type : <0x0>
<2>: Abbrev Number: 5 (DW_TAG_formal_parameter)
DW_AT_name : x
DW_AT_location : 4 byte block: 9c 11 28 22 (DW_OP_call_frame_cfa; DW_OP_consts: 40; DW_OP_plus)
<101> DW_AT_type : <0x0>

This is only happening with DW_TAG_formal_parameter entries. I can see that DW_AT_type should not be 0 and that is what gdb doesn't like. There are lots of formal parameters that are correct in the dwarf output.

If I build hello as a program using external linking but not -linkshared then the dwarf is correct and gdb works fine.

@ianlancetaylor
Contributor

CC @heschik

@ianlancetaylor ianlancetaylor added this to the Go1.9Maybe milestone Jun 1, 2017
@heschi
Contributor

heschi commented Jun 1, 2017

Ian, you scared me a little before I realized this bug was 3 weeks old :) Probably not my fault then.

I'm not too familiar with this, but it looks to me like debugging this program is simply a non-starter. I agree that the type offsets for many of the formal_parameters in libstd.so are bogus, but that's the least of the problems here. The actual program has no debug information at all:

WARNING: no debugging symbols found in helloworld.
Either the binary was compiled without debugging information
or the debugging information was removed (e.g., with strip or strip -g).
Debugger capabilities will be very limited.
...
Breakpoint 1, 0x0000000000400da0 in main.main ()
(gdb) next
Single stepping until exit from function main.main,
which has no line number information.

Hopefully someone more familiar with the dynamically linked modes can comment on whether this is expected. Otherwise I can take a look at it.

@ianlancetaylor
Contributor

CC @mwhudson

@mwhudson
Contributor

mwhudson commented Jun 2, 2017

Yes, we can't generate dwarf for things that are linked against other shared libraries. We should be able to generate dwarf for libstd.so (as that doesn't link to any go shared libraries) but as you can see that's buggy. The lack of DWARF at all for the executables makes this somehow less important-seeming though.

IIRC the problem with DWARF for things that link against Go was things like how to represent a global variable of a type that is defined in another shared library. I don't know enough about DWARF to know the answer, do you need to duplicate the description or can you refer to DWARF in another .so somehow?

@laboger
Contributor Author

laboger commented Jun 2, 2017

I hadn't noticed that the hello program built with -linkshared has no debug information, because I mainly do low-level debugging anyway, so its absence doesn't matter to me. But that seems like a bug, because the output from 'go tool compile -h' says that debug information is provided by default. I have verified that the go.o file sent to the linker doesn't contain the dwarf if I use -linkshared, so it is not the linker stripping it out.

As a result of this discussion I realized that I can just strip out the debug info from libstd.so and do the debugging I need so that is a good enough workaround for what I am trying to do now. The problem with the bad dwarf information was that gdb didn't work at all.

In case you are curious here is the dwarf output from a program created not using -linkshared, displaying the same symbol as above:

<1><74>: Abbrev Number: 2 (DW_TAG_subprogram)
<75> DW_AT_name : sync/atomic.(*Value).Store
<90> DW_AT_low_pc : 0x57e50
<98> DW_AT_high_pc : 0x57fa0
DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa)
DW_AT_external : 1
<2>: Abbrev Number: 5 (DW_TAG_formal_parameter)
DW_AT_name : v
DW_AT_location : 4 byte block: 9c 11 20 22 (DW_OP_call_frame_cfa; DW_OP_consts: 32; DW_OP_plus)
DW_AT_type : <0x2eb86>

<1><2eb86>: Abbrev Number: 19 (DW_TAG_pointer_type)
<2eb87> DW_AT_name : *sync/atomic.Value
<2eb9a> DW_AT_type : <0x2e316>
<2eb9e> Unknown AT value: 2900: 22

@laboger
Contributor Author

laboger commented Jun 2, 2017

FYI... even if I do this, the debug symbols are not in the hello executable:

go build -linkshared -gcflags '-dwarf' hello.go

@heschi
Contributor

heschi commented Jun 2, 2017

> I don't know enough about DWARF to know the answer, do you need to duplicate the description or can you refer to DWARF in another .so somehow?

Yes to the latter. http://www.dwarfstd.org/doc/DWARF4.pdf#page=163 is the relevant part of the DWARF4 spec. AIUI you just emit a relocation to a named symbol, same as if you were calling a function, and then the debugger is responsible for resolving that relocation when it reads the debug information. My guess at the implementation in the Go linker would be:

  • Change dwarf.newdie to put type symbols into the symbol table (don't set AttrNotInSymbolTable) if the output is a shared library. This will make them available for the program to reference.
  • In writelines, change the if BuildmodeShared block to emit a standard relocation type (R_ADDR?) that can be left unresolved.
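The idea behind these two steps, a type reference left as a named symbol and resolved later against whichever module defines it, can be illustrated with a toy model. This is only a sketch of the concept as described above and not the Go linker's or gdb's actual code: the module and typeRef types, the resolve function, the symbol name and the offset are all hypothetical.

```go
package main

import "fmt"

// module is a toy stand-in for one loaded object. typeSym maps exported
// type-DIE symbol names to offsets in that module's debug info.
// (Hypothetical model, not a linker data structure.)
type module struct {
	name    string
	typeSym map[string]uint64
}

// typeRef is an unresolved reference to a type DIE, identified only by
// symbol name, much like an unresolved call to an external function.
type typeRef struct {
	sym string
}

// resolve searches the loaded modules in order and returns the module
// and offset that define the referenced type, if any module does.
func resolve(ref typeRef, loaded []module) (string, uint64, bool) {
	for _, m := range loaded {
		if off, ok := m.typeSym[ref.sym]; ok {
			return m.name, off, true
		}
	}
	return "", 0, false
}

func main() {
	// The library exports its type symbol; the executable only refers to it.
	libstd := module{
		name:    "libstd.so",
		typeSym: map[string]uint64{"type.*sync/atomic.Value": 0x2eb86}, // sample values
	}
	hello := module{name: "hello", typeSym: map[string]uint64{}}

	mod, off, ok := resolve(typeRef{sym: "type.*sync/atomic.Value"}, []module{hello, libstd})
	fmt.Printf("resolved=%v module=%s offset=0x%x\n", ok, mod, off)
}
```

In this model the first step corresponds to populating typeSym for the shared library, and the second to emitting a typeRef rather than a fixed offset.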

None of this explains why libstd doesn't reference its own types, of course.

Maybe we should open a separate bug to investigate adding the debug sections to shared mode executables? As you say, there really isn't much point in worrying about this until that's done.

@laboger
Contributor Author

laboger commented Jun 2, 2017

Oh I see now that -w (disable dwarf) is passed into the linker when -linkshared is used. This code in
cmd/go/internal/work/build.go:

                    // TODO(mwhudson): remove -w when that gets fixed in linker.
                    cfg.BuildLdflags = append(cfg.BuildLdflags, "-linkshared", "-w")
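One way to confirm the effect of that -w flag on a built binary is to list its ELF sections and look for DWARF ones. The sketch below uses the standard debug/elf package. The helper name debugSections is made up for this example, and matching on the .debug_ and .zdebug_ name prefixes is a simplifying assumption about how the sections are named.

```go
package main

import (
	"debug/elf"
	"fmt"
	"log"
	"os"
	"strings"
)

// debugSections returns the section names that look like DWARF sections,
// either plain (.debug_*) or compressed (.zdebug_*).
// (Hypothetical helper for illustration.)
func debugSections(names []string) []string {
	var out []string
	for _, n := range names {
		if strings.HasPrefix(n, ".debug_") || strings.HasPrefix(n, ".zdebug_") {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: hasdwarf path/to/binary")
		return
	}
	f, err := elf.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Collect all section names from the ELF section header table.
	var names []string
	for _, s := range f.Sections {
		names = append(names, s.Name)
	}

	found := debugSections(names)
	if len(found) == 0 {
		fmt.Println("no DWARF sections found (consistent with linking with -w)")
		return
	}
	fmt.Println("DWARF sections:", strings.Join(found, " "))
}
```

Comparing the output for a hello binary built with and without -linkshared should show whether the debug sections are dropped in the shared case.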

@mwhudson
Contributor

mwhudson commented Jun 5, 2017

> > I don't know enough about DWARF to know the answer, do you need to duplicate the description or can you refer to DWARF in another .so somehow?
>
> Yes to the latter. http://www.dwarfstd.org/doc/DWARF4.pdf#page=163 is the relevant part of the DWARF4 spec. AIUI you just emit a relocation to a named symbol, same as if you were calling a function, and then the debugger is responsible for resolving that relocation when it reads the debug information. My guess at the implementation in the Go linker would be:

That's good to know. For some reason, I've never gotten to the point of absorbing a good mental model of how DWARF works...

>   • Change dwarf.newdie to put type symbols into the symbol table (don't set AttrNotInSymbolTable) if the output is a shared library. This will make them available for the program to reference.
>   • In writelines, change the if BuildmodeShared block to emit a standard relocation type (R_ADDR?) that can be left unresolved.

+1

> None of this explains why libstd doesn't reference its own types, of course.

FWIW, this is new-ish: libstd.so used to have semi-reasonable debug data (i.e. in 1.7, probably).

> Maybe we should open a separate bug to investigate adding the debug sections to shared mode executables? As you say, there really isn't much point in worrying about this until that's done.

I think this would make sense, yes.

(I've debugged shared libraries a whole bunch with gdb of course, but I guess like Lynn, it's mostly been in the "single stepping assembly" sort of fashion)

@laboger
Contributor Author

laboger commented Jun 5, 2017

I tried out a go 1.7 toolchain and verified that the problem with the missing DIE in libstd.so does not happen there. It does happen in 1.8 and master.

@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Jul 20, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe Jul 20, 2017
@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017
@jcajka
Contributor

jcajka commented Dec 12, 2017

I'm also hitting this issue on AArch64 with go1.9 and master (on Fedora 27/28). I'm actually debugging another issue that arises with shared binaries on arm64, some segfaulting (issue pending). It is unfortunate that this breaks gdb: for me it hangs on Dwarf Error: Cannot find DIE at 0x0 referenced from DIE at 0xab [in module /usr/lib/golang/pkg/linux_arm64_dynlink/libstd.so] while stepping with si. (I'm looking into whether this is also a bug in gdb; IMHO it shouldn't hang on incorrect/mangled/missing debug info.)

I have played a bit with dwarf_test in cmd/link, and it detects this issue when executed with the shared flags. IMHO it would be good to enable it in the long term (once dwarf generation is fixed or disabled for dynamic libs).

Is there any progress on this issue? Is there any way I can help?

@heschi
Contributor

heschi commented Dec 12, 2017

No progress that I'm aware of, and it's too late for 1.10 at this point. Note that if you just don't want gdb to freeze, @laboger said running the strip command on libstd.so was enough to fix that.

@laboger
Contributor Author

laboger commented Dec 13, 2017

I found this regression was introduced by this commit:

commit 795ad07b3be3cb51e07d502409f815f7d1f97305
Author: Michael Matloob <matloob@golang.org>
Date:   Thu Jul 28 13:04:41 2016 -0400

    cmd: generate DWARF for functions in compile instead of link.
    
    This is a copy of golang.org/cl/22092 by Ryan Brown.
    
    Here's his original comment:
    On my machine this increases the average time for 'go build cmd/go' from
    2.25s to 2.36s. I tried to measure compile and link separately but saw
    no significant change.
    
    Change-Id: If0d2b756d52a0d30d4eda526929c82794d89dd7b
    Reviewed-on: https://go-review.googlesource.com/25311
    Run-TryBot: Michael Matloob <matloob@golang.org>
    TryBot-Result: Gobot Gobot <gobot@golang.org>
    Reviewed-by: David Crawshaw <crawshaw@golang.org>

@gopherbot gopherbot modified the milestones: Go1.11, Unplanned May 23, 2018
@laboger
Contributor Author

laboger commented Jul 6, 2018

This error no longer occurs on ppc64le on latest.

If someone can verify that it works on arm64, this can be closed.

@heschi
Contributor

heschi commented Jul 9, 2018

Really? I can't seem to cross-compile a shared library for ppc64le, but at least on amd64, there's still no DWARF in a -linkshared hello world binary. If it's fixed I would be curious to know what fixed it.

@laboger
Contributor Author

laboger commented Jul 9, 2018

In my previous post I meant that I no longer see the error message and failure from gdb when trying to debug a program built with -linkshared on latest.

The dwarf is still not being generated in the main program when built with -linkshared and I see no way to do that, so that part has not been fixed.

@seankhliao
Member

Obsoleted by #47788

@golang golang locked and limited conversation to collaborators Oct 27, 2022

9 participants