runtime: TestCgoPprof fails on Ubuntu 16.10 #15714

Closed
mwhudson opened this issue May 17, 2016 · 17 comments
Labels: FrozenDueToAge, NeedsInvestigation

@mwhudson
Contributor

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?

go version devel +495e3c6 Tue May 17 04:02:11 2016 +0000 linux/amd64

  2. What operating system and processor architecture are you using (go env)?

Ubuntu 16.10

  3. What did you do?

go test runtime -run TestCgoPprof

  4. What did you expect to see?

ok runtime 1.061s

  5. What did you see instead?
--- FAIL: TestCgoPprof (1.06s)
    crash_cgo_test.go:260: 20ms of 20ms total (  100%)
              flat  flat%   sum%        cum   cum%
              20ms   100%   100%       20ms   100%  [testprogcgo.exe]
    crash_cgo_test.go:263: missing cpuHog in pprof output
FAIL
FAIL    runtime 1.068s

I'm not at all sure what's going on here. Ubuntu 16.10 has a very new toolchain, so that might be the difference?

@ianlancetaylor ianlancetaylor added this to the Go1.7 milestone May 17, 2016
@ianlancetaylor
Contributor

The test assumes that cmd/pprof can decode the debug info of the executable, to map the PC value to the name cpuHog. What does that debug info look like?

@mwhudson
Contributor Author

I'm no expert at DWARF, but it looks OK to me?

(gdb) disassemble cpuHog+10
Dump of assembler code for function cpuHog:
   0x0000000000109703 <+0>: push   %rbp
   0x0000000000109704 <+1>: mov    %rsp,%rbp
   0x0000000000109707 <+4>: lea    0x2c722a(%rip),%rax        # 0x3d0938 <salt1>
   0x000000000010970e <+11>:    mov    (%rax),%eax
   0x0000000000109710 <+13>:    mov    %eax,-0x8(%rbp)
   0x0000000000109713 <+16>:    movl   $0x0,-0x4(%rbp)
   0x000000000010971a <+23>:    jmp    0x109741 <cpuHog+62>
   0x000000000010971c <+25>:    cmpl   $0x0,-0x8(%rbp)
   0x0000000000109720 <+29>:    jle    0x10972e <cpuHog+43>
   0x0000000000109722 <+31>:    mov    -0x8(%rbp),%eax
   0x0000000000109725 <+34>:    imul   -0x8(%rbp),%eax
   0x0000000000109729 <+38>:    mov    %eax,-0x8(%rbp)
   0x000000000010972c <+41>:    jmp    0x10973d <cpuHog+58>
   0x000000000010972e <+43>:    mov    -0x8(%rbp),%eax
   0x0000000000109731 <+46>:    lea    0x1(%rax),%edx
   0x0000000000109734 <+49>:    mov    -0x8(%rbp),%eax
   0x0000000000109737 <+52>:    imul   %edx,%eax
   0x000000000010973a <+55>:    mov    %eax,-0x8(%rbp)
   0x000000000010973d <+58>:    addl   $0x1,-0x4(%rbp)
   0x0000000000109741 <+62>:    cmpl   $0x1869f,-0x4(%rbp)
   0x0000000000109748 <+69>:    jle    0x10971c <cpuHog+25>
   0x000000000010974a <+71>:    lea    0x2c71c7(%rip),%rax        # 0x3d0918 <salt2>
   0x0000000000109751 <+78>:    mov    -0x8(%rbp),%edx
   0x0000000000109754 <+81>:    mov    %edx,(%rax)
   0x0000000000109756 <+83>:    nop
   0x0000000000109757 <+84>:    pop    %rbp
   0x0000000000109758 <+85>:    retq   
End of assembler dump.

I've uploaded the file here: http://people.canonical.com/~mwh/testprogcgo if you want a look.

Building testprogcgo on 16.10, copying it to 16.04, running and looking at the profile there doesn't work. More surprisingly, building it on 16.04, copying to 16.10, running and looking at the profile there doesn't work, but then building it on 16.04, running it there and then copying the binary and profile to 16.10 also doesn't work, so maybe there is more than one problem here.

@ianlancetaylor
Contributor

Can you also upload the profile file?

@mwhudson
Contributor Author

I got confused about my executables so I built another one, made a profile,
and uploaded both to http://people.canonical.com/~mwh/issue-15714/

On 19 May 2016 at 02:17, Ian Lance Taylor notifications@github.com wrote:

Can you also upload the profile file?



@quentinmit quentinmit added the NeedsInvestigation label May 26, 2016
@quentinmit
Contributor

@ianlancetaylor Have you had a chance to look at the uploaded profile? Is this a 1.7 blocker, given that it doesn't seem to be our regression?

@ianlancetaylor
Contributor

I looked now. The problem is that the binary was linked as a PIE. The profiling data records the PC values for the executable as it was run, which have nothing to do with the addresses in the binary itself. For this to work, the pprof tool needs to know the base address of the executable when it is run. Or, the profiler needs to record PC values as an offset from the load address. This must be a known problem, but offhand I don't know what the usual solution is.
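
To make the address mismatch concrete, here is a minimal sketch of the second option, recording a PC as an offset from the load address. The load base 0x555555554000 is an assumed value picked for illustration; 0x109703 is the cpuHog entry address from the gdb dump earlier in the thread. normalizePC is a hypothetical helper, not part of the runtime or of cmd/pprof.

```go
package main

import "fmt"

// normalizePC converts a PC sampled in a running PIE process into the
// address that appears in the on-disk binary by subtracting the load base.
// It reports false when the PC lies below the base, in which case it cannot
// belong to the main executable's mapping.
func normalizePC(pc, loadBase uint64) (uint64, bool) {
	if pc < loadBase {
		return 0, false
	}
	return pc - loadBase, true
}

func main() {
	const loadBase = 0x555555554000 // assumed load address, for illustration only
	const cpuHogStatic = 0x109703   // cpuHog entry as shown in the gdb dump

	// A PC 25 bytes into cpuHog, as the profiler would see it at run time.
	sampled := uint64(loadBase + cpuHogStatic + 25)

	static, ok := normalizePC(sampled, loadBase)
	fmt.Printf("sampled=%#x static=%#x ok=%v\n", sampled, static, ok)
}
```

Without the subtraction, a symbolizer would look up the sampled value in a symbol table that only covers the low static addresses, which would be consistent with the test output showing only the binary name.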

@mwhudson For 1.7 do you want to try changing buildTestProg to pass -extldflags=-fno-pie? Unfortunately we can't do that unconditionally; we can only do it if it works.
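
One way to "only do it if it works" is to probe the C compiler first. The following is a rough sketch under assumptions, not the check that exists in the Go tree: ccAcceptsFlag is a hypothetical helper, the default compiler name gcc is assumed, and the flag strings tried in main are only examples.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// ccAcceptsFlag reports whether the C compiler cc can compile and link a
// trivial program when given flag. Any failure (missing compiler, rejected
// flag, link error) is treated as "not supported".
func ccAcceptsFlag(cc, flag string) bool {
	dir, err := os.MkdirTemp("", "ccflag")
	if err != nil {
		return false
	}
	defer os.RemoveAll(dir)

	src := filepath.Join(dir, "probe.c")
	prog := []byte("int main(void) { return 0; }\n")
	if err := os.WriteFile(src, prog, 0o644); err != nil {
		return false
	}

	out := filepath.Join(dir, "probe")
	return exec.Command(cc, flag, "-o", out, src).Run() == nil
}

func main() {
	cc := os.Getenv("CC")
	if cc == "" {
		cc = "gcc" // assumed default compiler name
	}
	for _, flag := range []string{"-no-pie", "-fno-pie"} {
		fmt.Printf("%s %s: %v\n", cc, flag, ccAcceptsFlag(cc, flag))
	}
}
```

Some compilers only warn about unknown flags and still exit successfully, so a stricter probe might also need to inspect the compiler's stderr.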

@rsc
Contributor

rsc commented May 27, 2016

Switching to the new profile format might help but that's not going to happen for Go 1.7. Is there some way to know that we're running a PIE binary? The runtime could subtract the base addresses from the profile, maybe.

@rsc rsc modified the milestones: Go1.7Maybe, Go1.7 May 27, 2016
@mwhudson
Contributor Author

I'll try fixing the test, but it'll be Monday next week before I get to it.
There are only two copies of the "is -no-pie accepted" check in the code, must be
room for one more...

I don't know how to find the base address for the binary off the top of my
head.

On 27 May 2016 at 13:57, Russ Cox notifications@github.com wrote:

Switching to the new profile format might help but that's not going to
happen for Go 1.7. Is there some way to know that we're running a PIE
binary? The runtime could subtract the base addresses from the profile,
maybe.



@ianlancetaylor
Contributor

You could elf.Open /proc/self/exe, look up a variable in the symbol table, and subtract the value in the symbol table from the actual address of the variable. Doesn't sound like 1.7 material to me.
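
A minimal sketch of that idea follows, with some liberties that are assumptions of this example rather than anything proposed in the thread: it uses a function (anchor) instead of a variable because its symbol name is predictable, it obtains the run-time address through reflect, and the helper names symbolValue and loadBias are hypothetical. It needs an unstripped binary; if the symbol table has been omitted, the lookup reports an error instead.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
	"reflect"
)

// anchor exists only so that its symbol can be looked up by name.
//
//go:noinline
func anchor() {}

// symbolValue returns the value recorded in the ELF symbol table for name.
func symbolValue(f *elf.File, name string) (uint64, error) {
	syms, err := f.Symbols()
	if err != nil {
		return 0, err
	}
	for _, s := range syms {
		if s.Name == name {
			return s.Value, nil
		}
	}
	return 0, fmt.Errorf("symbol %q not found", name)
}

// loadBias is the difference between where a symbol actually lives at run
// time and where the symbol table says it lives.
func loadBias(runtimeAddr, symAddr uint64) uint64 {
	return runtimeAddr - symAddr
}

func main() {
	f, err := elf.Open("/proc/self/exe") // Linux-only path to the running binary
	if err != nil {
		fmt.Fprintln(os.Stderr, "open:", err)
		return
	}
	defer f.Close()

	static, err := symbolValue(f, "main.anchor")
	if err != nil {
		fmt.Fprintln(os.Stderr, "lookup:", err)
		return
	}

	runtimeAddr := uint64(reflect.ValueOf(anchor).Pointer())
	fmt.Printf("load bias: %#x\n", loadBias(runtimeAddr, static))
}
```

For a binary that is not position-independent, the bias would be expected to come out as zero.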

@ianlancetaylor
Contributor

Keeping the issue at 1.7maybe in hopes of fixing the test.

@rsc
Contributor

rsc commented May 27, 2016

Which of these cases is true?

  1. On Ubuntu 16.10, Go profiling is broken only for programs linked with the host linker and an explicit buildmode flag overriding the default build mode (or some other kind of explicit flag).
  2. On Ubuntu 16.10, Go profiling is broken by default for every program linked with the host linker (so any program using cgo outside the standard library).

If it's (1), no big deal, it can wait for Go 1.8. If it's (2) and there is an easy fix, that seems more serious. I was reading the original report as if (2) were the case; maybe I misunderstood.

I will leave the 1.7 decision to @ianlancetaylor since I am about to go away for the summer.

@ianlancetaylor
Contributor

@mwhudson Can you see if https://golang.org/cl/23525 fixes the problem?

@gopherbot

CL https://golang.org/cl/23525 mentions this issue.

@Limdi

Limdi commented May 28, 2016

On Arch Linux I get the same test failure at tip using all.bash and compiling with 1.6.2.
CL 23525 fixes this for me.

ok runtime 12.990s
ok runtime/debug 0.057s
ok runtime/internal/atomic 0.159s
ok runtime/internal/sys 0.002s
ok runtime/pprof 1.303s
ok runtime/trace 2.529s

@codesenberg
Contributor

codesenberg commented Jan 30, 2017

Recently, I encountered a similar problem when I tried to build Go from source on my Ubuntu VM:

Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.1 LTS
Release:	16.04
Codename:	xenial

Go version used:
go version devel +4cffe2b Sun Jan 29 23:31:20 2017 +0000 linux/amd64
Bootstrapped with:
go version go1.8rc3 linux/amd64

So far I have seen three subtly different ways in which tests failed: first, second and third one.

Here are binaries and corresponding profiles.

I ran all.bash several times and it failed consistently. Looks like those tests are somewhat flaky.
Should I file a new issue?

@bradfitz
Contributor

@codesenberg, commenting on closed issues is not effective: we don't track closed issues. If you want somebody to look at it, file a new bug.

@codesenberg
Contributor

@bradfitz got it.

@golang golang locked and limited conversation to collaborators Jan 30, 2018