runtime: SIGSEGV in gentraceback during SIGPROF handling of cgo callback #50936
Labels
FrozenDueToAge
NeedsFix
release-blocker
When running a goroutine, `mp.curg` points to the user goroutine, and `mp.curg.m` points back to `mp`.

The transition points are:

- `execute`, which sets `mp.curg` and then `mp.curg.m`.
- `dropg`, which clears `mp.curg.m` and then `mp.curg`.

Thus, there are two small critical sections that may have `mp.curg != nil` but `mp.curg.m == nil`.

When handling a SIGPROF during cgo execution, we pass `mp.curg` to `gentraceback`, specifically with the condition `mp.ncgo > 0 && mp.curg != nil && mp.curg.syscallpc != 0 && mp.curg.syscallsp != 0`. `gentraceback` then assumes `gp.m != nil` for `_TraceJumpStack`, and would crash on a `curg` in one of these critical sections.

This is reachable during `cgocallback`, which calls `exitsyscall` without decrementing `ncgo` [1]. If no Ps are available, `exitsyscall` calls `exitsyscall0`, which calls both `dropg` and (eventually) `execute`. Thus if a SIGPROF lands in the critical sections, `gentraceback` will crash [2].

I have managed to reproduce this locally with a program that:
(3) makes this difficult for me to make a regression test, but I may be able to eliminate this and just get a lower failure rate.
[1] Standard `cgocall` decrements `ncgo` prior to `exitsyscall`, though reentrant calls would also make this condition reachable, as `ncgo` will still be > 0.

[2] Interestingly, this also meets the qualifications for `cgoSigtramp` to call the C traceback function, which will actually end up doing a traceback of Go code. This isn't a bug, just an odd edge case.

cc @cherrymui @aclements @mknyszek