Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestCgoCCodeSIGPROF fails on RHEL7.2 7.2 3.10.0-313.el7.ppc64le #14732

Closed
laboger opened this issue Mar 9, 2016 · 22 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@laboger
Copy link
Contributor

laboger commented Mar 9, 2016

I'm seeing this build failure only on RHEL 7.2 3.10.0-313.el7.ppc64le. I've successfully built on another RHEL7.2 with a later kernel as well as other distros without the error. I know there was a restriction on certain Linux kernel versions but couldn't find where that is documented.

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    Failure starts with this commit and continues to happen with latest
    go version devel +1716162 Fri Mar 4 21:06:31 2016 +0000 linux/ppc64le
  2. What operating system and processor architecture are you using (go env)?
    RHEL 7.2 ppc64le
    3.10.0-313.el7.ppc64le cmd/cgo: fails with gcc 4.4.1 #1 SMP Tue Sep 1 00:35:49 EDT 2015 ppc64le ppc64le ppc64le GNU/Linux
    Note that I am only seeing this failure on this system. I've built on various Ubuntu systems, a SLES12 and even another RHEL 7.2 system with a different Linux version without the failure.
  3. What did you do?
    Pulled the golang master source from this commit and tried to build using src/all.bash.
  4. What did you expect to see?
    Everything building and all tests finished, ending with:
    ALL TESTS PASSED
  5. What did you see instead?
    panic: test timed out after 3m0s
goroutine 22506 [running]:
panic(0x264b00, 0xc820422030)
        /home/boger/golang/plain/go/src/runtime/panic.go:500 +0x3a8 fp=0xc820033ef0 sp=0xc820033e58
testing.startAlarm.func1()                                                                         
        /home/boger/golang/plain/go/src/testing/testing.go:724 +0x14c fp=0xc820033f90 sp=0xc820033ef0
runtime.goexit()                                                                                     
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc820033f90 sp=0xc820033f90
created by time.goFunc                                                                                
        /home/boger/golang/plain/go/src/time/sleep.go:129 +0x4c                                       

goroutine 1 [chan receive, 2 minutes, locked to thread]:
runtime.gopark(0x366720, 0xc8201633d8, 0x335a93, 0xc, 0x17, 0x3)
        /home/boger/golang/plain/go/src/runtime/proc.go:262 +0x19c fp=0xc8200459f8 sp=0xc8200459b8
runtime.goparkunlock(0xc8201633d8, 0x335a93, 0xc, 0x3fffb7d20017, 0x3)                            
        /home/boger/golang/plain/go/src/runtime/proc.go:268 +0x58 fp=0xc820045a48 sp=0xc8200459f8 
runtime.chanrecv(0x2594c0, 0xc820163380, 0xc820045c80, 0xc820045b01, 0x90000)                     
        /home/boger/golang/plain/go/src/runtime/chan.go:474 +0x560 fp=0xc820045ae8 sp=0xc820045a48
runtime.chanrecv1(0x2594c0, 0xc820163380, 0xc820045c80)                                           
        /home/boger/golang/plain/go/src/runtime/chan.go:357 +0x30 fp=0xc820045b30 sp=0xc820045ae8 
testing.RunTests(0x366108, 0x468ac0, 0xb4, 0xb4, 0xdf101)                                         
        /home/boger/golang/plain/go/src/testing/testing.go:582 +0x828 fp=0xc820045cd0 sp=0xc820045b30
testing.(*M).Run(0xc820045ec8, 0xd56e8)                                                              
        /home/boger/golang/plain/go/src/testing/testing.go:515 +0xb4 fp=0xc820045d78 sp=0xc820045cd0 
runtime_test.TestMain(0xc820045ec8)                                                                  
        /home/boger/golang/plain/go/src/runtime/crash_test.go:24 +0x30 fp=0xc820045e10 sp=0xc820045d78
main.main()                                                                                           
        runtime/_test/_testmain.go:960 +0x104 fp=0xc820045f18 sp=0xc820045e10                         
runtime.main()                                                                                        
        /home/boger/golang/plain/go/src/runtime/proc.go:188 +0x314 fp=0xc820045f80 sp=0xc820045f18    
runtime.goexit()                                                                                      
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc820045f80 sp=0xc820045f80

goroutine 17 [syscall, 2 minutes, locked to thread]:
runtime.goexit()                                    
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc820034fb0 sp=0xc820034fb0

goroutine 2 [force gc (idle), 2 minutes]:
runtime.gopark(0x366720, 0x46bf00, 0x32eeaa, 0xf, 0x14, 0x1)
        /home/boger/golang/plain/go/src/runtime/proc.go:262 +0x19c fp=0xc820020708 sp=0xc8200206c8
runtime.goparkunlock(0x46bf00, 0x32eeaa, 0xf, 0xc820000114, 0x1)                                  
        /home/boger/golang/plain/go/src/runtime/proc.go:268 +0x58 fp=0xc820020758 sp=0xc820020708 
runtime.forcegchelper()                                                                           
        /home/boger/golang/plain/go/src/runtime/proc.go:229 +0xdc fp=0xc8200207a0 sp=0xc820020758 
runtime.goexit()                                                                                  
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc8200207a0 sp=0xc8200207a0
created by runtime.init.4                                                                             
        /home/boger/golang/plain/go/src/runtime/proc.go:218 +0x38                                     

goroutine 3 [GC sweep wait]:
runtime.gopark(0x366720, 0x46c140, 0x32d79a, 0xd, 0x72e14, 0x1)
        /home/boger/golang/plain/go/src/runtime/proc.go:262 +0x19c fp=0xc820020ef8 sp=0xc820020eb8
runtime.goparkunlock(0x46c140, 0x32d79a, 0xd, 0x14, 0x1)                                          
        /home/boger/golang/plain/go/src/runtime/proc.go:268 +0x58 fp=0xc820020f48 sp=0xc820020ef8 
runtime.bgsweep(0xc820048000)                                                                     
        /home/boger/golang/plain/go/src/runtime/mgcsweep.go:79 +0x19c fp=0xc820020f98 sp=0xc820020f48
runtime.goexit()                                                                                     
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc820020f98 sp=0xc820020f98
created by runtime.gcenable                                                                           
        /home/boger/golang/plain/go/src/runtime/mgc.go:191 +0x5c                                      

goroutine 4 [finalizer wait]:
runtime.gopark(0x366720, 0x4875d0, 0x32e44f, 0xe, 0x14, 0x1)
        /home/boger/golang/plain/go/src/runtime/proc.go:262 +0x19c fp=0xc8200216c8 sp=0xc820021688
runtime.goparkunlock(0x4875d0, 0x32e44f, 0xe, 0x2000000014, 0x1)                                  
        /home/boger/golang/plain/go/src/runtime/proc.go:268 +0x58 fp=0xc820021718 sp=0xc8200216c8 
runtime.runfinq()                                                                                 
        /home/boger/golang/plain/go/src/runtime/mfinal.go:158 +0xc0 fp=0xc8200217a0 sp=0xc820021718
runtime.goexit()                                                                                   
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc8200217a0 sp=0xc8200217a0
created by runtime.createfing                                                                         
        /home/boger/golang/plain/go/src/runtime/mfinal.go:139 +0x84                                   

goroutine 5 [timer goroutine (idle)]:
runtime.gopark(0x366720, 0x46c240, 0x3354d8, 0x16, 0x367b14, 0x1)
        /home/boger/golang/plain/go/src/runtime/proc.go:262 +0x19c fp=0xc820021eb0 sp=0xc820021e70
runtime.goparkunlock(0x46c240, 0x3354d8, 0x16, 0x14, 0x1)                                         
        /home/boger/golang/plain/go/src/runtime/proc.go:268 +0x58 fp=0xc820021f00 sp=0xc820021eb0 
runtime.timerproc()                                                                               
        /home/boger/golang/plain/go/src/runtime/time.go:202 +0x150 fp=0xc820021fa0 sp=0xc820021f00
runtime.goexit()                                                                                  
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc820021fa0 sp=0xc820021fa0
created by runtime.addtimerLocked                                                                     
        /home/boger/golang/plain/go/src/runtime/time.go:116 +0x1cc                                    

goroutine 22504 [syscall, 2 minutes]:
syscall.Syscall6(0x72, 0xc33e, 0xc8203c0a78, 0x0, 0xc820071050, 0x0, 0x0, 0x0, 0xc8201633e0, 0x16c044)
        /home/boger/golang/plain/go/src/syscall/asm_linux_ppc64x.s:41 +0xc fp=0xc8203c09a0 sp=0xc8203c0980
syscall.wait4(0xc33e, 0xc8203c0a78, 0x0, 0xc820071050, 0x90, 0x0, 0x0)                                    
        /home/boger/golang/plain/go/src/syscall/zsyscall_linux_ppc64le.go:172 +0x70 fp=0xc8203c0a20 sp=0xc8203c09a0                                                                                                         
syscall.Wait4(0xc33e, 0xc8203c0ad8, 0x0, 0xc820071050, 0xc820424da0, 0x0, 0x0)                                
        /home/boger/golang/plain/go/src/syscall/syscall_linux.go:256 +0x54 fp=0xc8203c0a80 sp=0xc8203c0a20    
os.(*Process).wait(0xc820424da0, 0x2, 0x0, 0x0)                                                               
        /home/boger/golang/plain/go/src/os/exec_unix.go:22 +0xe8 fp=0xc8203c0b48 sp=0xc8203c0a80              
os.(*Process).Wait(0xc820424da0, 0x0, 0x0, 0x0)                                                               
        /home/boger/golang/plain/go/src/os/doc.go:49 +0x34 fp=0xc8203c0b88 sp=0xc8203c0b48                    
os/exec.(*Cmd).Wait(0xc82016b180, 0x0, 0x0)                                                                   
        /home/boger/golang/plain/go/src/os/exec/exec.go:396 +0x208 fp=0xc8203c0c70 sp=0xc8203c0b88            
os/exec.(*Cmd).Run(0xc82016b180, 0x0, 0x0)                                                                    
        /home/boger/golang/plain/go/src/os/exec/exec.go:262 +0x70 fp=0xc8203c0cb8 sp=0xc8203c0c70             
os/exec.(*Cmd).CombinedOutput(0xc82016b180, 0x0, 0x0, 0x0, 0x0, 0x0)                                          
        /home/boger/golang/plain/go/src/os/exec/exec.go:453 +0x330 fp=0xc8203c0d60 sp=0xc8203c0cb8            
runtime_test.runTestProg(0xc820070fc0, 0x32ca31, 0xb, 0x331b41, 0xf, 0x0, 0x0)                                
        /home/boger/golang/plain/go/src/runtime/crash_test.go:68 +0x1b8 fp=0xc8203c0e50 sp=0xc8203c0d60       
runtime_test.TestCgoCCodeSIGPROF(0xc820070fc0)                                                                
        /home/boger/golang/plain/go/src/runtime/crash_cgo_test.go:206 +0x58 fp=0xc8203c0f30 sp=0xc8203c0e50   
testing.tRunner(0xc820070fc0, 0x468d78)                                                                       
        /home/boger/golang/plain/go/src/testing/testing.go:473 +0xdc fp=0xc8203c0f80 sp=0xc8203c0f30          
runtime.goexit()                                                                                              
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc8203c0f80 sp=0xc8203c0f80        
created by testing.RunTests                                                                                   
        /home/boger/golang/plain/go/src/testing/testing.go:581 +0x7fc                                         

goroutine 157 [GC worker (idle)]:
runtime.gopark(0xc820422000, 0xc8200d0f88, 0x32f56c, 0x10, 0x265714, 0x0)
        /home/boger/golang/plain/go/src/runtime/proc.go:262 +0x19c fp=0xc8200d0f00 sp=0xc8200d0ec0
runtime.gcBgMarkWorker(0xc820018000)                                                              
        /home/boger/golang/plain/go/src/runtime/mgc.go:1403 +0xec fp=0xc8200d0f98 sp=0xc8200d0f00 
runtime.goexit()                                                                                  
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc8200d0f98 sp=0xc8200d0f98
created by runtime.gcBgMarkStartWorkers                                                               
        /home/boger/golang/plain/go/src/runtime/mgc.go:1330 +0xa0                                     

goroutine 22505 [syscall, 2 minutes]:
syscall.Syscall(0x3, 0x4, 0xc82017d600, 0x200, 0x0, 0xc82042814b, 0x5f1e4)
        /home/boger/golang/plain/go/src/syscall/asm_linux_ppc64x.s:17 +0xc fp=0xc820032b18 sp=0xc820032af8
syscall.read(0x4, 0xc82017d600, 0x200, 0x200, 0xc8204236b0, 0x0, 0x0)                                     
        /home/boger/golang/plain/go/src/syscall/zsyscall_linux_ppc64le.go:783 +0x70 fp=0xc820032b70 sp=0xc820032b18                                                                                                         
syscall.Read(0x4, 0xc82017d600, 0x200, 0x200, 0x0, 0x0, 0x0)                                                  
        /home/boger/golang/plain/go/src/syscall/syscall_unix.go:161 +0x4c fp=0xc820032bc8 sp=0xc820032b70     
os.(*File).read(0xc820024260, 0xc82017d600, 0x200, 0x200, 0xc82017d600, 0x0, 0x0)                             
        /home/boger/golang/plain/go/src/os/file_unix.go:228 +0x54 fp=0xc820032c20 sp=0xc820032bc8             
os.(*File).Read(0xc820024260, 0xc82017d600, 0x200, 0x200, 0x25c9a0, 0x0, 0x0)                                 
        /home/boger/golang/plain/go/src/os/file.go:95 +0x84 fp=0xc820032c90 sp=0xc820032c20                   
bytes.(*Buffer).ReadFrom(0xc8200488c0, 0x3fffb74e4518, 0xc820024260, 0x0, 0x0, 0x0)                           
        /home/boger/golang/plain/go/src/bytes/buffer.go:176 +0x23c fp=0xc820032d58 sp=0xc820032c90            
io.copyBuffer(0x3fffb74e46e0, 0xc8200488c0, 0x3fffb74e4518, 0xc820024260, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)       
        /home/boger/golang/plain/go/src/io/io.go:374 +0x138 fp=0xc820032e30 sp=0xc820032d58                   
io.Copy(0x3fffb74e46e0, 0xc8200488c0, 0x3fffb74e4518, 0xc820024260, 0xc82001ff10, 0x0, 0x0)                   
        /home/boger/golang/plain/go/src/io/io.go:350 +0x5c fp=0xc820032ea0 sp=0xc820032e30                    
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0x0)                                                               
        /home/boger/golang/plain/go/src/os/exec/exec.go:236 +0xbc fp=0xc820032f38 sp=0xc820032ea0             
os/exec.(*Cmd).Start.func1(0xc82016b180, 0xc820424b60)                                                        
        /home/boger/golang/plain/go/src/os/exec/exec.go:344 +0x34 fp=0xc820032f80 sp=0xc820032f38             
runtime.goexit()                                                                                              
        /home/boger/golang/plain/go/src/runtime/asm_ppc64x.s:1056 +0x4 fp=0xc820032f80 sp=0xc820032f80        
created by os/exec.(*Cmd).Start                                                                               
        /home/boger/golang/plain/go/src/os/exec/exec.go:345 +0x8b4                         
@bradfitz bradfitz added this to the Unplanned milestone Apr 10, 2016
@laboger
Copy link
Contributor Author

laboger commented May 5, 2016

Not sure what happens here, I haven't seen this error in a while. It seemed like a flaky problem, probably something different in the OS. We can close this if you want and if it happens again I can open a new one. It only happened on RHEL 7.2.

@mikioh mikioh changed the title runtime: regression in crash_test.go on RHEL7.2 7.2 3.10.0-313.el7.ppc64le runtime: TestCgoCCodeSIGPROF fails on RHEL7.2 7.2 3.10.0-313.el7.ppc64le Jun 15, 2016
@mikioh
Copy link
Contributor

mikioh commented Jun 15, 2016

I think this happens on other platforms. I just tested tip (go version devel +0ec6256) w/ https://go-review.googlesource.com/24110/ on openbsd/386 and got the following:

--- FAIL: TestCgoCCodeSIGPROF (60.12s)
        crash_test.go:105: testprogcgo CgoCCodeSIGPROF exit status: exit status 2
        crash_cgo_test.go:213: expected "OK\n" got SIGQUIT: quit
                PC=0x1c056c27 m=4

                goroutine 0 [idle]:
                runtime.thrsleep(0x4c3249cc, 0x3, 0x0, 0x0, 0x4c3249cc, 0x0, 0x0, 0x3c036fdc, 0x1c0675c2, 0x7c37bf5f, ...)
                        /go/src/runtime/sys_openbsd_386.s:332 +0x7
                runtime.semasleep(0xffffffff, 0xffffffff, 0x4c324780)
                        /go/src/runtime/os_openbsd.go:119 +0x175
                runtime.notesleep(0x4c324828)
                        /go/src/runtime/lock_sema.go:166 +0xc1
                runtime.stopm()
                        /go/src/runtime/proc.go:1567 +0xd7
                runtime.startlockedm(0x4c3003c0)
                        /go/src/runtime/proc.go:1744 +0xa3
                runtime.schedule()
                        /go/src/runtime/proc.go:2106 +0x236
                runtime.park_m(0x4c3000f0)
                        /go/src/runtime/proc.go:2156 +0x142
                runtime.mcall(0x0)
                        /go/src/runtime/asm_386.s:265 +0x43

                goroutine 1 [chan receive]:
                main.CgoCCodeSIGPROF()
                        /go/src/runtime/testdata/testprogcgo/aprof.go:51 +0xd1
                main.main()
                        /go/src/runtime/testdata/testprogcgo/main.go:34 +0x175

                goroutine 17 [runnable, locked to thread]:
                runtime.goexit()
                        /go/src/runtime/asm_386.s:1612 +0x1

                goroutine 5 [syscall]:
                os/signal.signal_recv(0x0)
                        /go/src/runtime/sigqueue.go:116 +0x118
                os/signal.loop()
                        /go/src/os/signal/signal_unix.go:22 +0x14
                created by os/signal.init.1
                        /go/src/os/signal/signal_unix.go:28 +0x30

                goroutine 6 [runnable]:
                main._Cfunc_GoNop()
                        _/go/src/runtime/testdata/testprogcgo/_obj/_cgo_gotypes.go:116
                main.CgoCCodeSIGPROF.func1(0x4c3160c0)
                        /go/src/runtime/testdata/testprogcgo/aprof.go:42 +0xe9
                created by main.CgoCCodeSIGPROF
                        /go/src/runtime/testdata/testprogcgo/aprof.go:46 +0x50

                goroutine 7 [syscall]:
                runtime.CPUProfile(0x0, 0x0, 0x0)
                        /go/src/runtime/cpuprof.go:447 +0x2b
                runtime/pprof.profileWriter(0x3c0842a0, 0x4c35a000)
                        /go/src/runtime/pprof/pprof.go:618 +0x14
                created by runtime/pprof.StartCPUProfile
                        /go/src/runtime/pprof/pprof.go:612 +0x11d

                goroutine 18 [syscall, locked to thread]:
                runtime.goexit()
                        /go/src/runtime/asm_386.s:1612 +0x1

                eax    0x4
                ebx    0x0
                ecx    0x4c300f00
                edx    0xffffffff
                edi    0x4c318000
                esi    0x7c07f000
                ebp    0x4c324780
                esp    0x7c37bef8
                eip    0x1c056c27
                eflags 0x206
                cs     0x2b
                fs     0x7c54005b
                gs     0x4190063
FAIL

Without https://go-review.googlesource.com/24110/, it prints "panic: test timed out after 3m0s" and similar stack trace.

@mikioh mikioh modified the milestones: Go1.7Maybe, Unplanned Jun 15, 2016
@mikioh
Copy link
Contributor

mikioh commented Jun 15, 2016

Perhaps this issue is a duplicate of #14809.

@ianlancetaylor
Copy link
Contributor

@mikioh Can you try to recreate this with the environment variable GODEBUG=schedtrace=100,scheddetail=1 ?

@mikioh
Copy link
Contributor

mikioh commented Jun 15, 2016

@ianlancetaylor Here you are, a 15Meg plain text file.
TestCgoCCodeSIGPROF.txt; go version devel +cdfffc6 w/o CL 24110

@ianlancetaylor
Copy link
Contributor

Oh, bother, I'm sorry. That is the scheduling trace for the runtime.test program, but I need the scheduling trace for the program that it runs. And it's not there because testEnv in runtime/crash_test.go is careful to not pass down the GODEBUG variable.

How about this: change testEnv to set GODEBUG=schedtrace=100,scheddetail=1 and go test -test.run=TestCgoCCodeSIGPROF. Sorry for the wasted effort.

@ianlancetaylor
Copy link
Contributor

Looking at the test code I realized that there is a useless loop that could potentially cause an infinite loop. Please see if the problem still reproduces with https://golang.org/cl/24151. Thanks.

@gopherbot
Copy link

CL https://golang.org/cl/24151 mentions this issue.

@mikioh
Copy link
Contributor

mikioh commented Jun 16, 2016

@ianlancetaylor, CL 24151 doesn't fix the issue.

go test -short
--- FAIL: TestCgoCCodeSIGPROF (60.10s)
        crash_test.go:105: testprogcgo CgoCCodeSIGPROF exit status: exit status 2
        crash_cgo_test.go:213: expected "OK\n" got SIGQUIT: quit
                PC=0x1c056c27 m=4
(snip)

So far it's likely that https://go-review.googlesource.com/20219 is closely related to the issues because both #14732 and #14809 go versions include CL 20219.

@ianlancetaylor
Copy link
Contributor

Thanks for checking. CL 24151 still seems like a good idea anyhow.

20219 is where the failing test was added, so there is a sense in which it is the cause of this issue. It doesn't really help us solve the problem, though. I'm confident that the non-test change of 20219 is not causing any trouble.

gopherbot pushed a commit that referenced this issue Jun 16, 2016
I verified that the test fails if I undo the change that it tests for.

Updates #14732.

Change-Id: Ib30352580236adefae946450ddd6cd65a62b7cdf
Reviewed-on: https://go-review.googlesource.com/24151
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
@ianlancetaylor
Copy link
Contributor

@mikioh How easy is it to reproduce the problem on OpenBSD/386? I don't see any failures on the builder. I have not been able to reproduce this myself on any system.

Any luck getting the schedule trace of the child program per the comment above?

@mikioh mikioh changed the title runtime: TestCgoCCodeSIGPROF fails on RHEL7.2 7.2 3.10.0-313.el7.ppc64le runtime: TestCgoCCodeSIGPROF fails on RHEL7.2 7.2 3.10.0-313.el7.ppc64le Jun 18, 2016
@mikioh
Copy link
Contributor

mikioh commented Jun 23, 2016

@mikioh How easy is it to reproduce the problem on OpenBSD/386?

Pretty much on my openbsd-386 vm.

real mem  = 1073168384 (1023MB)
avail mem = 1040044032 (991MB)

cpu0: Intel(R) Core(TM) i7-3635QM CPU @ 2.40GHz ("GenuineIntel" 686-class) 3.23 GHz
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16

OpenBSD openbsd-386.local 5.9 GENERIC.MP#1616 i386

Any luck getting the schedule trace of the child program per the comment above?

Will try this weekend.

@mikioh
Copy link
Contributor

mikioh commented Jun 24, 2016

Here you go; TestCgoCCodeSIGPROF.txt

@ianlancetaylor
Copy link
Contributor

Thanks for the scheduling trace. It shows that GOMAXPROCS is 1. The single P is in status Psyscall. It has no associated M.

There are 7 M's. 6 have no associated P. M1 is associated with P0. M1 is locked to goroutine 17. The stack trace at the end shows only

goroutine 17 [runnable, locked to thread]:
runtime.goexit()
    /go/src/runtime/asm_386.s:1612 +0x1

This is probably the goroutine started by CgoCCodeSIGPROF making a cgo call (we might get more information if we had set GOTRACEBACK=system).

Note that although goroutine 17 is shown above as runnable, up until the very end of the trace its state is Gsyscall, suggesting that it is in the middle of the cgo call.

So P0.m == -1 while M1.p == 0. M1 is locked to G17. P0 and G17 are in syscall state. P0's syscalltick field increases steadily.

I may see part of the problem. The program has a loop that just calls into C. Because it keeps making calls into C, syscalltick keeps going up. Because of that, the sysmon thread never sees the thread as blocked. But almost every time the sysmon thread runs, the P is in syscall state, so the sysmon thread also never sees the thread as having run for too long.

Before https://golang.org/cl/24151 I think that would have been enough to hang the program. Before that CL, the program was in an infinite loop that will not get preempted. On a system with GOMAXPROCS == 1, that could cause this failure.

However, you are seeing this problem even after CL 24151, and I don't have an explanation for that. After that CL the program should not hang, because it should not run for more than 1 second.

Does it fix the problem if you add a call runtime.GOMAXPROCS(2) at the start of the function CgoCCodeSIGPROF in runtime/testdata/testprogcgo/aprof.go?

@mikioh
Copy link
Contributor

mikioh commented Jun 29, 2016

Does it fix the problem if you...

Sure, looks like export GOMAXPROCS=2 fixes the issue on single-core CPU hosts.

@ianlancetaylor
Copy link
Contributor

Still don't know what is happening here. Postponing to 1.8.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.8Early, Go1.7Maybe Jul 7, 2016
@YoshikiShibata
Copy link

Sometimes I have a similar problem when I rebuild from the tip, on Mac OS X (10.11.5) on MacBook 12:

GOMAXPROCS=2 runtime -cpu=1,2,4

--- FAIL: TestCgoCCodeSIGPROF (0.72s)
crash_test.go:105: testprogcgo CgoCCodeSIGPROF exit status: exit status 2
crash_cgo_test.go:213: expected "OK\n" got runtime: invalid pc-encoded table f=runtime.sigtramp pc=0x405e52d targetpc=0x405e52d tab=[0/0]0x0
value=0 until pc=0x405e504
value=40 until pc=0x405e52d
fatal error: invalid runtime symbol table

    goroutine 0 [idle]:
    invalid spdelta runtime.sigtramp 0x405e500 0x405e52d 0x32b2a -1
    runtime.throw(0x40e0c58, 0x1c)
        /Users/yoshiki/tools/go/src/runtime/panic.go:566 +0x95
    runtime.pcvalue(0x411f1d0, 0x32b2a, 0x405e52d, 0xc420031480, 0x1, 0x98)
        /Users/yoshiki/tools/go/src/runtime/symtab.go:472 +0x4cd
    runtime.funcspdelta(0x411f1d0, 0x405e52d, 0xc420031480, 0x98)
        /Users/yoshiki/tools/go/src/runtime/symtab.go:511 +0x4f
    runtime.gentraceback(0x405e465, 0x7000000807b0, 0x0, 0xc4200004e0, 0x0, 0xc4200316b0, 0x40, 0x0, 0x0, 0x6, ...)
        /Users/yoshiki/tools/go/src/runtime/traceback.go:264 +0x163b
    runtime.sigprof(0x405e465, 0x7000000807b0, 0x0, 0xc4200004e0, 0xc420028000)
        /Users/yoshiki/tools/go/src/runtime/proc.go:3116 +0x17e
    runtime.sighandler(0xc40000001b, 0xc420031e80, 0xc420031ee8, 0xc4200004e0)
        /Users/yoshiki/tools/go/src/runtime/signal_amd64x.go:49 +0xb3
    runtime.sigtrampgo(0x4041820, 0x1b0000001e, 0xc420031e80, 0xc420031ee8)
        /Users/yoshiki/tools/go/src/runtime/signal_darwin.go:91 +0x281
    invalid spdelta runtime.sigtramp 0x405e500 0x405e52d 0x32b2a -1
    runtime.sigtramp(0x1b0000001e00, 0xc420031e8000, 0xc420031ee800, 0x7000000807f800, 0x0, 0x0, 0x0, 0x0, 0x1490000008500, 0x1089ea00002, ...)
        /Users/yoshiki/tools/go/src/runtime/sys_darwin_amd64.s:246 +0x2d

    goroutine 17 [syscall, locked to thread]:
    runtime.goexit()
        /Users/yoshiki/tools/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc420036fb8 sp=0xc420036fb0

    goroutine 1 [chan receive]:
    main.CgoCCodeSIGPROF()
        /Users/yoshiki/tools/go/src/runtime/testdata/testprogcgo/aprof.go:49 +0xf3
    main.main()
        /Users/yoshiki/tools/go/src/runtime/testdata/testprogcgo/main.go:34 +0x1f1

    goroutine 19 [syscall]:
    os/signal.signal_recv(0x0)
        /Users/yoshiki/tools/go/src/runtime/sigqueue.go:116 +0x157
    os/signal.loop()
        /Users/yoshiki/tools/go/src/os/signal/signal_unix.go:22 +0x22
    created by os/signal.init.1
        /Users/yoshiki/tools/go/src/os/signal/signal_unix.go:28 +0x41

    goroutine 20 [runnable]:
    main._Cfunc_GoNop()
        _/Users/yoshiki/tools/go/src/runtime/testdata/testprogcgo/_obj/_cgo_gotypes.go:118
    main.CgoCCodeSIGPROF.func1(0xc42005c060)
        /Users/yoshiki/tools/go/src/runtime/testdata/testprogcgo/aprof.go:41 +0xaa
    created by main.CgoCCodeSIGPROF
        /Users/yoshiki/tools/go/src/runtime/testdata/testprogcgo/aprof.go:44 +0x61

    goroutine 21 [syscall]:
    runtime.CPUProfile(0x0, 0x0, 0x0)
        /Users/yoshiki/tools/go/src/runtime/cpuprof.go:447 +0x2d
    runtime/pprof.profileWriter(0x4145240, 0xc420084000)
        /Users/yoshiki/tools/go/src/runtime/pprof/pprof.go:615 +0x26
    created by runtime/pprof.StartCPUProfile
        /Users/yoshiki/tools/go/src/runtime/pprof/pprof.go:609 +0x122

    goroutine 34 [syscall, locked to thread]:
    runtime.goexit()
        /Users/yoshiki/tools/go/src/runtime/asm_amd64.s:2086 +0x1

FAIL
FAIL runtime 61.228s
2016/07/20 21:09:05 Failed: exit status 1

@ianlancetaylor
Copy link
Contributor

@YoshikiShibata That is a different problem. Could you open a new issue for it? Thanks.

@quentinmit quentinmit added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 28, 2016
@quentinmit
Copy link
Contributor

@ianlancetaylor We're halfway through Go 1.8 now; have you been able to investigate this bug yet?

@rsc rsc added this to the Go1.8 milestone Oct 20, 2016
@rsc rsc removed this from the Go1.8Early milestone Oct 20, 2016
@rsc rsc modified the milestones: Go1.9, Go1.8 Nov 11, 2016
@ianlancetaylor
Copy link
Contributor

I still don't see any way this could possibly happen.

@mikioh If you can still recreate this, can you get an strace -f of the failing program? You should be able to recreate it using

cd runtime/testdata/testprogcgo
go build
strace -f ./testprogcgo CgoCCodeSIGPROF

@mikioh
Copy link
Contributor

mikioh commented Nov 20, 2016

With either GOMAXPROCS=1 or greater than 1, CgoCCodeSIGPROF with tip now passes on openbsd/386.

@ianlancetaylor
Copy link
Contributor

Thanks! I'm going to close this unless and until someone reports a reproduction on tip.

@golang golang locked and limited conversation to collaborators Nov 22, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

8 participants