Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: inaccurate DWARF location information -> misleading profile #39667

Open
alandonovan opened this issue Jun 17, 2020 · 1 comment
Open
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@alandonovan
Copy link
Contributor

alandonovan commented Jun 17, 2020

During a recent profiling session, I noticed spurious non-zero samples against lines I knew to be unreachable in the test scenario. It appears that the compiler's DWARF information is incorrect. I don't know to what extent the rest of the profiler's information is reliable.

I'm using today's tip:

$ go version
go version devel +8b98498a58 Wed Jun 17 19:48:45 2020 +0000 linux/amd64

It's easy to reproduce deterministically, though the test case is far from minimal. The profiled application is go.starlark.net/cmd/starlark. The executable file is here called 'x'.

$ cd $GOROOT/src/go.starlark.net
$ git log | head -n 1
commit c6daab680f283fdf16a79c99d8e86a1562400761
$ go build -o x ./cmd/starlark

There are 49 program counter locations that purport to belong to interp.go:333. One of them is address 0x593311.

$ objdump --dwarf=decodedline x | grep 'starlark/interp.go .* 333 ' | wc -l 
49
adonovan2:go.starlark.net% objdump --dwarf=decodedline x | grep 'starlark/interp.go .* 333 ' | grep 0x593311
/go.starlark.net/starlark/interp.go          333            0x593311        

The actual code at interp.go:333 (https://github.com/google/starlark-go/blob/c6daab680f283fdf16a79c99d8e86a1562400761/starlark/interp.go#L333) is a case in a byte code interpreter that handles the ITERPUSH bytecode, which is not reached in the profiled scenario. (I confirmed this by replacing it with a panic statement; it did not panic.)

$ nl -ba starlark/interp.go | grep  -A1 -B8 333
   325			case compile.ITERPUSH:
   326				x := stack[sp-1]
   327				sp--
   328				iter := Iterate(x)
   329				if iter == nil {
   330					err = fmt.Errorf("%s value is not iterable", x.Type())
   331					break loop
   332				}
   333				iterstack = append(iterstack, iter)  # <-- spuriously associated with 0x593311
   334	

But disassembling the program shows that address 0x593311 is in fact in code for the body of the loop at lines 96-104 (https://github.com/google/starlark-go/blob/c6daab680f283fdf16a79c99d8e86a1562400761/starlark/interp.go#L96). (The add 7 instruction is the giveaway.) I've shown the whole basic block.

$ objdump -S x
...
  593305:       48 83 c1 07             add    $0x7,%rcx    # <-- a clue
  593309:       4c 8b 84 24 d0 02 00    mov    0x2d0(%rsp),%r8
  593310:       00 
  593311:       4c 8b bc 24 30 01 00    mov    0x130(%rsp),%r15     # <-- our instruction
  593318:       00 
  593319:       48 89 8c 24 d0 00 00    mov    %rcx,0xd0(%rsp)
  593320:       00 
  593321:       89 74 24 70             mov    %esi,0x70(%rsp)
  593325:       48 8b 8c 24 78 03 00    mov    0x378(%rsp),%rcx
  59332c:       00 
  59332d:       48 8b b4 24 a0 01 00    mov    0x1a0(%rsp),%rsi
  593334:       00 
  593335:       4d 39 cb                cmp    %r9,%r11
  593338:       0f 86 1d 4b 00 00       jbe    597e5b <go.starlark.net/starlark.(*Function).CallInternal+0x4fdb>

Here's the corresponding source:

$ nl -ba starlark/interp.go | grep -A7 ' 96'
    96				for s := uint(0); ; s += 7 {
    97					b := code[pc]
    98					pc++
    99					arg |= uint32(b&0x7f) << s
   100					if b < 0x80 {
   101						break
   102					}
   103				}

Running the program in gdb and breakpointing that instruction appears to confirm this inference. Picking some of the other instructions associated with interp.go:333, several appear to be logic for the binary tree in the control flow graph for 'switch op'.

All the affected PC addresses appear to be loads of 0x130(SP). Perhaps the liveness analysis is using that stack slot as a temporary as well as for 'iterstack', and somehow location information is leaking between the two variables assigned to the same slot?

I realize a perfect debug view is basically impossible to achieve, but I wonder if there is an easy fix in this case.

@heschi
Copy link
Contributor

heschi commented Jun 17, 2020

@dr2chase owns line numbering these days.

@heschi heschi removed their assignment Jun 17, 2020
@dr2chase dr2chase self-assigned this Jun 17, 2020
@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 23, 2022
@cagedmantis cagedmantis added this to the Backlog milestone Jun 23, 2022
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: Triage Backlog
Development

No branches or pull requests

5 participants