Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: misc/cgo/test.TestSetgid panic with unexpected return pc for runtime.sigpanic on linux-arm64-packet since 2022-06-04 #53374

Closed
bcmills opened this issue Jun 14, 2022 · 20 comments
Assignees
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Jun 14, 2022

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x2 addr=0xffff537fd898 pc=0xffff537fd898]

runtime stack:
runtime.throw({0xaaaabb5aaf4f?, 0x8?})
	/workdir/go/src/runtime/panic.go:1047 +0x40 fp=0xffff537fd860 sp=0xffff537fd830 pc=0xaaaabb4836d0
runtime: g 0: unexpected return pc for runtime.sigpanic called from 0xffff537fd898
stack: frame={sp:0xffff537fd860, fp:0xffff537fd890} stack=[0xffff52ffdd00,0xffff537fd900)
0x0000ffff537fd760:  0x0100aaaabb483c90  0x000000000000000a 
0x0000ffff537fd770:  0x000000000000001f  0x0000ffff537fd898 
0x0000ffff537fd780:  0x0000ffff537fd898  0x0000000000000002 
0x0000ffff537fd790:  0x0000aaaabb4836d0 <runtime.throw+0x0000000000000040>  0x0000ffff537fd830 
0x0000ffff537fd7a0:  0x0000aaaabb5a811e  0x0000ffff537fd7e8 
0x0000ffff537fd7b0:  0x0000aaaabb483968 <runtime.fatalthrow+0x0000000000000058>  0x00000040001021a0 
0x0000ffff537fd7c0:  0x0000aaaabb484f18 <runtime.printunlock+0x0000000000000048>  0x0000000000000001 
0x0000ffff537fd7d0:  0x0000ffff537fd830  0x0000aaaabb4836d0 <runtime.throw+0x0000000000000040> 
0x0000ffff537fd7e0:  0x00000040001021a0  0x0000ffff537fd828 
0x0000ffff537fd7f0:  0x0000aaaabb4836d0 <runtime.throw+0x0000000000000040>  0x0000ffff537fd808 
0x0000ffff537fd800:  0x00000040001021a0  0x0000aaaabb483980 <runtime.fatalthrow.func1+0x0000000000000000> 
0x0000ffff537fd810:  0x00000040001021a0  0x0000aaaabb4836d0 <runtime.throw+0x0000000000000040> 
0x0000ffff537fd820:  0x0000ffff537fd830  0x0000ffff537fd858 
0x0000ffff537fd830:  0x0000aaaabb49ac24 <runtime.sigpanic+0x00000000000001e4>  0x0000ffff537fd840 
0x0000ffff537fd840:  0x0000aaaabb4836e0 <runtime.throw.func1+0x0000000000000000>  0x0000aaaabb5aaf4f 
0x0000ffff537fd850:  0x000000000000002a  0x0000ffff537fd898 
0x0000ffff537fd860: <0x0000ffff537fd898  0x0000aaaabb5aaf4f 
0x0000ffff537fd870:  0x0000000000000008  0x0000000000000002 
0x0000ffff537fd880:  0x0000000000000000  0x0000ffff537fd898 
0x0000ffff537fd890: >0x0000ffff537fd898  0x0000aaaabb4805ac <runtime.minit+0x000000000000001c> 
0x0000ffff537fd8a0:  0x0000aaaabb48894c <runtime.mstart1+0x000000000000004c>  0x0000ffff812f7000 
0x0000ffff537fd8b0:  0x0000ffff58b270d0  0x0000ffff537fd8e8 
0x0000ffff537fd8c0:  0x0000aaaabb4888d8 <runtime.mstart0+0x0000000000000068>  0x0000ffff812f7000 
0x0000ffff537fd8d0:  0x0000000000000000  0x0000ffff812c7a98 
0x0000ffff537fd8e0:  0x00000040001021a0  0x0000ffff537fd918 
0x0000ffff537fd8f0:  0x0000aaaabb4b3a00 <runtime.mstart+0x0000000000000010>  0x000000770000006e 
runtime.sigpanic()
	/workdir/go/src/runtime/signal_unix.go:819 +0x1e4 fp=0xffff537fd890 sp=0xffff537fd860 pc=0xaaaabb49ac24

goroutine 1 [chan receive]:
runtime.gopark(0x4000091a68?, 0xaaaabb4527c4?, 0x78?, 0x1a?, 0xaaaabb45a20c?)
	/workdir/go/src/runtime/proc.go:363 +0xe4 fp=0x40000919f0 sp=0x40000919d0 pc=0xaaaabb4860a4
runtime.chanrecv(0x4000128070, 0x4000091aff, 0x1)
	/workdir/go/src/runtime/chan.go:583 +0x454 fp=0x4000091a80 sp=0x40000919f0 pc=0xaaaabb453b14
runtime.chanrecv1(0x4000112060?, 0x1?)
	/workdir/go/src/runtime/chan.go:442 +0x14 fp=0x4000091ab0 sp=0x4000091a80 pc=0xaaaabb4536b4
testing.(*T).Run(0x40001024e0, {0xaaaabb5a1d6a?, 0x92716c0d48a5e?}, 0xaaaabb62cd10)
	/workdir/go/src/testing/testing.go:1494 +0x314 fp=0x4000091b50 sp=0x4000091ab0 pc=0xaaaabb516ed4
testing.runTests.func1(0x0?)
	/workdir/go/src/testing/testing.go:1846 +0x70 fp=0x4000091ba0 sp=0x4000091b50 pc=0xaaaabb518b40
testing.tRunner(0x40001024e0, 0x4000091cb8)
	/workdir/go/src/testing/testing.go:1446 +0x10c fp=0x4000091bf0 sp=0x4000091ba0 pc=0xaaaabb51623c
testing.runTests(0x400011e000?, {0xaaaabb71b600, 0x60, 0x60}, {0x4000091d28?, 0xa0aaaabb4635e8?, 0xaaaabb735c00?})
	/workdir/go/src/testing/testing.go:1844 +0x3f0 fp=0x4000091ce0 sp=0x4000091bf0 pc=0xaaaabb518a00
testing.(*M).Run(0x400011e000)
	/workdir/go/src/testing/testing.go:1726 +0x4f0 fp=0x4000091ee0 sp=0x4000091ce0 pc=0xaaaabb5177a0
main.main()
	_testmain.go:243 +0x1d0 fp=0x4000091f70 sp=0x4000091ee0 pc=0xaaaabb59bc90
runtime.main()
	/workdir/go/src/runtime/proc.go:250 +0x24c fp=0x4000091fd0 sp=0x4000091f70 pc=0xaaaabb485cdc
runtime.goexit()
	/workdir/go/src/runtime/asm_arm64.s:1155 +0x4 fp=0x4000091fd0 sp=0x4000091fd0 pc=0xaaaabb4b60e4

goroutine 2 [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workdir/go/src/runtime/proc.go:363 +0xe4 fp=0x4000042fa0 sp=0x4000042f80 pc=0xaaaabb4860a4
runtime.goparkunlock(...)
	/workdir/go/src/runtime/proc.go:369
runtime.forcegchelper()
	/workdir/go/src/runtime/proc.go:302 +0xac fp=0x4000042fd0 sp=0x4000042fa0 pc=0xaaaabb485f3c
runtime.goexit()
	/workdir/go/src/runtime/asm_arm64.s:1155 +0x4 fp=0x4000042fd0 sp=0x4000042fd0 pc=0xaaaabb4b60e4
created by runtime.init.6
	/workdir/go/src/runtime/proc.go:290 +0x24

goroutine 3 [GC sweep wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
	/workdir/go/src/runtime/proc.go:363 +0xe4 fp=0x4000043770 sp=0x4000043750 pc=0xaaaabb4860a4
runtime.goparkunlock(...)
	/workdir/go/src/runtime/proc.go:369
runtime.bgsweep(0x0?)
	/workdir/go/src/runtime/mgcsweep.go:278 +0xa4 fp=0x40000437b0 sp=0x4000043770 pc=0xaaaabb4714e4
runtime.gcenable.func1()
	/workdir/go/src/runtime/mgc.go:178 +0x28 fp=0x40000437d0 sp=0x40000437b0 pc=0xaaaabb465af8
runtime.goexit()
	/workdir/go/src/runtime/asm_arm64.s:1155 +0x4 fp=0x40000437d0 sp=0x40000437d0 pc=0xaaaabb4b60e4
created by runtime.gcenable
	/workdir/go/src/runtime/mgc.go:178 +0x74

goroutine 4 [GC scavenge wait]:
runtime.gopark(0x400006c000?, 0xaaaabb5dc9f0?, 0x1?, 0x0?, 0x0?)
	/workdir/go/src/runtime/proc.go:363 +0xe4 fp=0x4000043f50 sp=0x4000043f30 pc=0xaaaabb4860a4
runtime.goparkunlock(...)
	/workdir/go/src/runtime/proc.go:369
runtime.(*scavengerState).park(0xaaaabb735c80)
	/workdir/go/src/runtime/mgcscavenge.go:389 +0x5c fp=0x4000043f80 sp=0x4000043f50 pc=0xaaaabb46f4cc
runtime.bgscavenge(0x0?)
	/workdir/go/src/runtime/mgcscavenge.go:617 +0x44 fp=0x4000043fb0 sp=0x4000043f80 pc=0xaaaabb46fa44
runtime.gcenable.func2()
	/workdir/go/src/runtime/mgc.go:179 +0x28 fp=0x4000043fd0 sp=0x4000043fb0 pc=0xaaaabb465a98
runtime.goexit()
	/workdir/go/src/runtime/asm_arm64.s:1155 +0x4 fp=0x4000043fd0 sp=0x4000043fd0 pc=0xaaaabb4b60e4
created by runtime.gcenable
	/workdir/go/src/runtime/mgc.go:179 +0xb8

goroutine 5 [finalizer wait]:
runtime.gopark(0xaaaabb736440?, 0x40000036c0?, 0x0?, 0x0?, 0x1?)
	/workdir/go/src/runtime/proc.go:363 +0xe4 fp=0x4000042580 sp=0x4000042560 pc=0xaaaabb4860a4
runtime.goparkunlock(...)
	/workdir/go/src/runtime/proc.go:369
runtime.runfinq()
	/workdir/go/src/runtime/mfinal.go:180 +0x120 fp=0x40000427d0 sp=0x4000042580 pc=0xaaaabb464d20
runtime.goexit()
	/workdir/go/src/runtime/asm_arm64.s:1155 +0x4 fp=0x40000427d0 sp=0x40000427d0 pc=0xaaaabb4b60e4
created by runtime.createfing
	/workdir/go/src/runtime/mfinal.go:157 +0x84

goroutine 18 [runnable]:
runtime.gopark(0x400003e6c8?, 0x2?, 0xb8?, 0xe5?, 0x400003e6bc?)
	/workdir/go/src/runtime/proc.go:363 +0xe4 fp=0x400003e560 sp=0x400003e540 pc=0xaaaabb4860a4
runtime.selectgo(0x400003e6c8, 0x400003e6b8, 0xaaaabb6586be?, 0x0, 0x51b?, 0x1)
	/workdir/go/src/runtime/select.go:328 +0x684 fp=0x400003e680 sp=0x400003e560 pc=0xaaaabb496d24
misc/cgo/test.runTestSetgid()
	/workdir/go/misc/cgo/test/setgid_linux.go:30 +0xb4 fp=0x400003e6f0 sp=0x400003e680 pc=0xaaaabb592d34
misc/cgo/test.testSetgid(0x4000102680)
	/workdir/go/misc/cgo/test/setgid_linux.go:40 +0x20 fp=0x400003e740 sp=0x400003e6f0 pc=0xaaaabb592de0
misc/cgo/test.TestSetgid(0x0?)
	/workdir/go/misc/cgo/test/cgo_linux_test.go:16 +0x1c fp=0x400003e760 sp=0x400003e740 pc=0xaaaabb55e31c
testing.tRunner(0x4000102680, 0xaaaabb62cd10)
	/workdir/go/src/testing/testing.go:1446 +0x10c fp=0x400003e7b0 sp=0x400003e760 pc=0xaaaabb51623c
testing.(*T).Run.func1()
	/workdir/go/src/testing/testing.go:1493 +0x2c fp=0x400003e7d0 sp=0x400003e7b0 pc=0xaaaabb516f7c
runtime.goexit()
	/workdir/go/src/runtime/asm_arm64.s:1155 +0x4 fp=0x400003e7d0 sp=0x400003e7d0 pc=0xaaaabb4b60e4
created by testing.(*T).Run
	/workdir/go/src/testing/testing.go:1493 +0x300
exit status 2
FAIL	misc/cgo/test	0.029s
2022/06/08 16:27:39 Failed: exit status 1

greplogs -l -e '(?ms)unexpected return pc for runtime\.sigpanic.*exit status 2\nFAIL\s+misc/cgo/test'
2022-06-08T15:47:58-4afb0b9/linux-arm64-packet
2022-06-04T16:11:54-fc66cae/linux-arm64-packet

@bcmills
Copy link
Contributor Author

bcmills commented Jun 14, 2022

(attn @golang/runtime)

Marking as release-blocker for Go 1.19 because this appears to be a regression on linux/arm64, which is a first-class port.

@bcmills bcmills added this to the Go1.19 milestone Jun 14, 2022
@bcmills bcmills added release-blocker NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 14, 2022
@prattmic
Copy link
Member

It looks like what happened here is that the stack frame was somehow messed up and we ended up returning to a stack address instead of a valid PC. This caused a SIGSEGV, and while trying to print a traceback we encounted the same messed up frame and reported the "unexpected return pc".

I don't think it was actually sigpanic that made the bad return. I think that is just the frame we injected after the fault occurred.

@aclements
Copy link
Member

aclements commented Jun 15, 2022

I agree with @prattmic 's assessment.

Picking apart 2022-06-04T16:11:54-fc66cae/linux-arm64-packet in detail:

0x0000ffff537fd830: 0x0000aaaabb49ac24 <runtime.sigpanic+0x00000000000001e4> Saved LR from sigpanic => throw("unexpected signal during runtime execution")
0x0000ffff537fd838: 0x0000ffff537fd840 Might be a saved BP
0x0000ffff537fd840: 0x0000aaaabb4836e0 <runtime.throw.func1+0x0000000000000000> Probably an argument to throw => systemstack
0x0000ffff537fd848: 0x0000aaaabb5aaf4f
0x0000ffff537fd850: 0x000000000000002a
0x0000ffff537fd858: 0x0000ffff537fd898 Looks like the saved BP from minit
0x0000ffff537fd860: <0x0000ffff537fd898 Looks like the saved BP from minit
0x0000ffff537fd868: 0x0000aaaabb5aaf4f
0x0000ffff537fd870: 0x0000000000000008
0x0000ffff537fd878: 0x0000000000000002
0x0000ffff537fd880: 0x0000000000000000
0x0000ffff537fd880: 0x0000ffff537fd898 Looks like the saved BP from minit
0x0000ffff537fd890: >0x0000ffff537fd898 Looks like the saved BP from minit

minit frame:
0x0000ffff537fd898: 0x0000aaaabb4805ac <runtime.minit+0x000000000000001c> Should be saved BP from minit, but instead saved LR from minit => minitSignals
0x0000ffff537fd8a0: 0x0000aaaabb48894c <runtime.mstart1+0x000000000000004c> Saved LR from mstart1 => minit; minit leaves RSP pointing here
0x0000ffff537fd8a8: 0x0000ffff812f7000
0x0000ffff537fd8b0: 0x0000ffff58b270d0

mstart1 frame:
0x0000ffff537fd8b8: 0x0000ffff537fd8e8 Saved BP from mstart1
0x0000ffff537fd8c0: 0x0000aaaabb4888d8 <runtime.mstart0+0x0000000000000068> Saved LR from mstart0 => mstart1
0x0000ffff537fd8c8: 0x0000ffff812f7000
0x0000ffff537fd8d0: 0x0000000000000000
0x0000ffff537fd8d8: 0x0000ffff812c7a98
0x0000ffff537fd8e0: 0x00000040001021a0

mstart0 frame:
0x0000ffff537fd8e8: 0x0000ffff537fd918 Saved BP from mstart0
0x0000ffff537fd8f0: 0x0000aaaabb4b3a00 <runtime.mstart+0x0000000000000010> Saved LR from mstart => mstart0
0x0000ffff537fd8f8: 0x000000770000006e

So the mstart0 => mstart1 => minit chain makes sense, but I lose the thread. It looks like we may have called minitSignals, but things aren't quite where I would expect for that.

@aclements
Copy link
Member

Based on the fact that this is happening in TestSetgid, I wonder if the a SIGSETXID came in as a new thread was starting up and we received it the moment minitSignalMask set the signal mask. I haven't yet pieced together if that could actually cause what we're seeing.

@cherrymui
Copy link
Member

I was thinking on that, too, but I also couldn't think of how it could happen. We set up the signal stack before setting the signal mask. So a signal shouldn't clobber the g0 stack.

@aclements
Copy link
Member

It's extra weird because this is a cgo binary, so we should just be using libc's setxid signal handler for all of this. I wonder if the integration with libc on this is messed up on early thread init?

@cherrymui
Copy link
Member

Good point. We start a thread with signals blocked https://cs.opensource.google/go/go/+/master:src/runtime/cgo/gcc_linux_arm64.c;l=28 .
But apparently glibc doesn't really block SIGSETXID? https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/pthread_sigmask.c;h=20f811cc6b844969eb0e5436659c0ef0655301ea;hb=HEAD#l29

So we can get a signal early on before the signal stack is set?...

@ianlancetaylor
Copy link
Contributor

@cherrymui You're right. That seems entirely possible. Argh.

@aclements
Copy link
Member

Oh no! That does seem plausible. I'm surprised the two failure stacks are so identical, but maybe that just reflects when we made some syscall and the kernel checked the pending signals.

We're running on the g0 stack at this point, so it should be okay to take the signal, but maybe something bad happens in our signal path before or after we hand off to libc?

@gopherbot
Copy link

Change https://go.dev/cl/412474 mentions this issue: cmd/internal/obj/arm64: save LR after decrementing SP

@aclements
Copy link
Member

Cherry, could we test your hypothesis about the signal frame clobbering the stack by doing a self-kill in, say, minit?

@cherrymui
Copy link
Member

The CL above makes it not rely on the LR saved below the SP (even temporarily). Maybe that will fix it.

Or, maybe we can call the actual syscall to block all the signals for thread creation? We'll restore the old signal mask anyway? That may race with a SIGSETXID signal delivering to the parent thread, though, so it may miss the signal and deadlock?

@cherrymui
Copy link
Member

@aclements Maybe. If it is really what the CL above is trying to fix, the signal has to land between the instruction storing the LR and the instruction decrementing the SP, so perhaps it has to be written ... in assembly (or have the assembler insert it)?

@aclements
Copy link
Member

Oh good point. Maybe a stress test where we constantly create new threads and bombard the process with SIGSETXID?

@cherrymui
Copy link
Member

package main

/*
#include <sys/types.h>
#include <unistd.h>
*/
import "C"
import "runtime"

var c = make(chan int, 1000)

func main() {
	for i := 0; i < 1000; i++ {
		go F(i)
	}
	for i := 0; i < 1000; i++ {
		<-c
	}
}

func F(i int) {
	runtime.LockOSThread() // so every goroutine uses a thread
	C.setgid(C.uint(i))
	c <- 1
}

Not sure this is the best way to stress this, but running this on thelinux-arm64-packet builder it has a not-too-small rate to fail with seg fault. Plain seg fault with no extra output, not the unwinder error like above, though.

5s: 12 runs so far, 0 failures
10s: 33 runs so far, 0 failures

/tmp/go-stress-20220615T200050-1378220594


ERROR: signal: segmentation fault


/tmp/go-stress-20220615T200050-1649915091


ERROR: signal: segmentation fault

15s: 50 runs so far, 2 failures (4.00%)

With the CL above, no failure in 1000 run.

I'll see if I can make it more likely to fail.

@cherrymui
Copy link
Member

If I add this to the runtime

diff --git a/runtime/os_linux.go b/tmp/os_linux.go
index 25aea6522d..4328d4371d 100644
--- a/runtime/os_linux.go
+++ b/tmp/os_linux.go
@@ -390,6 +390,10 @@ func gettid() uint32
 // Called to initialize a new m (including the bootstrap m).
 // Called on the new thread, cannot allocate memory.
 func minit() {
+       for i := 0; i < 1000; i++ {
+               dummy()
+       }
+
        minitSignals()
 
        // Cgo-created threads and the bootstrap m are missing a
@@ -398,6 +402,12 @@ func minit() {
        getg().m.procid = uint64(gettid())
 }
 
+//go:noinline
+func dummy() { dummy2() }
+
+//go:noinline
+func dummy2() {}
+
 // Called from dropm to undo the effect of an minit.
 //
 //go:nosplit

The program above fails 100%. I think this is consistent with the theory above.

@prattmic prattmic added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 16, 2022
@cherrymui
Copy link
Member

I found that this doesn't fail with Go 1.18, even with the runtime hack above. I think this is due to https://go-review.googlesource.com/c/go/+/379075 changes the prologue. Previously, for small frames it is a single MOVD.W instruction that saves the LR and decrements the SP (followed by an instruction that saves the frame pointer, which isn't really matter here). Now it is two separate instructions. And the functions from mstart to setting the signal stack are probably all have small frames.

So another possibility is to undo CL 379075 for small frames. It is two instructions either way. (Previously MOVD.W storing LR, then MOVD storing FP; now SUB decrementing SP, then STP storing LR and FP.) And make sure all functions we call up to sigaltstack have small frames. (Then maybe we don't need to change the large frame cases.)

@aclements
Copy link
Member

We chatted about this and the plan is to return to the old prologue for small frames as a stop-gap for 1.19, confirm by hand that this generates signal-safe prologues for the handful of functions that run before we have a signal stack, and look into enhancing the traceback metadata for 1.20 to capture ranges where the frame is non-empty but the LR should be retrieved from the LR register instead of the stack.

@gopherbot
Copy link

Change https://go.dev/cl/413428 mentions this issue: cmd/internal/obj/mips,s390x: save LR after decrementing SP

gopherbot pushed a commit that referenced this issue Jun 29, 2022
Following CL 412474, for the rest of the LR architectures. On
MIPS(32/64), S390X, and RISCV, there is no single instruction that
saves the LR and decrements the SP, so we need to insert an
instruction to save the LR after decrementing the SP.

On ARM(32) and PPC64 we already use a single instruction to save
the LR and decrement the SP.

Updates #53374.

Change-Id: I5a2e211026d95edb0e0f7d084ddb784f8077b86d
Reviewed-on: https://go-review.googlesource.com/c/go/+/413428
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/416154 mentions this issue: cmd/internal/obj/loong64: save LR after decrementing SP

jproberts pushed a commit to jproberts/go that referenced this issue Aug 10, 2022
Following CL 412474, for the rest of the LR architectures. On
MIPS(32/64), S390X, and RISCV, there is no single instruction that
saves the LR and decrements the SP, so we need to insert an
instruction to save the LR after decrementing the SP.

On ARM(32) and PPC64 we already use a single instruction to save
the LR and decrement the SP.

Updates golang#53374.

Change-Id: I5a2e211026d95edb0e0f7d084ddb784f8077b86d
Reviewed-on: https://go-review.googlesource.com/c/go/+/413428
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
jproberts pushed a commit to jproberts/go that referenced this issue Aug 10, 2022
…rames

When we create a thread with signals blocked. But glibc's
pthread_sigmask doesn't really allow us to block SIGSETXID. So we
may get a signal early on before the signal stack is set. If we
get a signal on the current stack, it will clobber anything below
the SP. This CL makes it to save LR and decrement SP in a single
MOVD.W instruction for small frames, so we don't write below the
SP.

We used to use a single MOVD.W instruction before CL 379075.
CL 379075 changed to use an STP instruction to save the LR and FP,
then decrementing the SP. This CL changes it back, just this part
(epilogues and large frame prologues are unchanged). For small
frames, it is the same number of instructions either way.

This decreases the size of a "small" frame from 0x1f0 to 0xf0.
For frame sizes in between, it could benefit from using an
STP instruction instead of using the prologue for the "large"
frame case. We don't bother it for now as this is a stop-gap
solution anyway.

This only addresses the issue with small frames. Luckily, all
functions from thread entry to setting up the signal stack have
samll frames.

Other possible ideas:
- Expand the unwind info metadata, separate SP delta and the
  location of the return address, so we can express "SP is
  decremented but the return address is in the LR register". Then
  we can always create the frame first then write the LR, without
  writing anything below the SP (except the frame pointer at SP-8,
  which is minor because it doesn't really affect program
  execution).
- Set up the signal stack immediately in mstart in assembly.

For Go 1.19 we do this simple fix. We plan to do the metadata fix
in Go 1.20 ( golang#53609 ).

Other LR architectures are addressed in CL 413428.

Fix golang#53374.

Change-Id: I9d6582ab14ccb06ac61ad43852943d9555e22ae5
Reviewed-on: https://go-review.googlesource.com/c/go/+/412474
Run-TryBot: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Eric Fang <eric.fang@arm.com>
xen0n pushed a commit to xen0n/go that referenced this issue Mar 17, 2023
Refer to CL 413428 and 412474, for loong64, like mips, s390x and riscv, there
is no single instruction that saves the LR and decrements the SP, so we also
need to insert an instruction to save the LR after decrementing the SP.

Fixes golang#56623.
Updates golang#53374.

Change-Id: I3de040792f0a041d3d2a98ea89c23a2dd2f4ad10
gopherbot pushed a commit that referenced this issue Mar 28, 2023
Refer to CL 413428 and 412474, for loong64, like mips, s390x and riscv, there
is no single instruction that saves the LR and decrements the SP, so we also
need to insert an instruction to save the LR after decrementing the SP.

Fixes #56623.
Updates #53374.

Change-Id: I3de040792f0a041d3d2a98ea89c23a2dd2f4ad10
Reviewed-on: https://go-review.googlesource.com/c/go/+/416154
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
johanbrandhorst pushed a commit to Pryz/go that referenced this issue Mar 28, 2023
Refer to CL 413428 and 412474, for loong64, like mips, s390x and riscv, there
is no single instruction that saves the LR and decrements the SP, so we also
need to insert an instruction to save the LR after decrementing the SP.

Fixes golang#56623.
Updates golang#53374.

Change-Id: I3de040792f0a041d3d2a98ea89c23a2dd2f4ad10
Reviewed-on: https://go-review.googlesource.com/c/go/+/416154
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
xen0n pushed a commit to xen0n/go that referenced this issue May 1, 2023
Refer to CL 413428 and 412474, for loong64, like mips, s390x and riscv, there
is no single instruction that saves the LR and decrements the SP, so we also
need to insert an instruction to save the LR after decrementing the SP.

Fixes golang#56623.
Updates golang#53374.

Change-Id: I3de040792f0a041d3d2a98ea89c23a2dd2f4ad10
Reviewed-on: https://go-review.googlesource.com/c/go/+/416154
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Cherry Mui <cherryyz@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: WANG Xuerui <git@xen0n.name>
Reviewed-by: xiaodong liu <teaofmoli@gmail.com>
Run-TryBot: Cherry Mui <cherryyz@google.com>
Reviewed-by: Meidan Li <limeidan@loongson.cn>
(cherry picked from commit 3ed8a1e)
@golang golang locked and limited conversation to collaborators Jul 6, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
Development

No branches or pull requests

6 participants