cmd/internal/obj: rework stack unwind metadata on LR machines #53609

cherrymui · 2022-06-29T18:41:43Z

On LR machines, our function prologue generally looks like (stack bounds check and frame pointer handling omitted): store the LR at -frame_size(SP), then decrement SP to SP-frame_size (if they cannot be done on the same instruction, either due to the lack of the instruction on the architecture or the stack frame too large).

If we decrement the SP first, if a signal arrives immediately after the instruction before saving the LR, the runtime will see the junk value at the LR slot and will fail to unwind the stack. So we store the LR first then decrement the SP.

This generally works. But in some cases if the signal stack is not set (e.g. #53374 , also iOS doesn't support sigaltstack), the signal will arrive on the current stack and the kernel will push the signal frame immediately below the SP, clobbering our saved LR. In those cases we have to store the LR again after decrementing the SP. This makes our function prologue a bit inefficient.

We can avoid the problem if we expand the stack unwind metadata to express "the SP has decremented but the return address is still in the LR register". Currently, the metadata is just about SP delta and the runtime always assumes the return address can be found at 0(SP) (for non-leaf functions). With the expanded metadata, we can decrement SP and then store the LR.

This also makes our prologue more similar to C functions.

I plan to do it in Go 1.20.

The text was updated successfully, but these errors were encountered:

gopherbot · 2022-06-29T18:48:34Z

Change https://go.dev/cl/412474 mentions this issue: cmd/internal/obj/arm64: save LR and SP in one instruction for small frames

…rames When we create a thread with signals blocked. But glibc's pthread_sigmask doesn't really allow us to block SIGSETXID. So we may get a signal early on before the signal stack is set. If we get a signal on the current stack, it will clobber anything below the SP. This CL makes it to save LR and decrement SP in a single MOVD.W instruction for small frames, so we don't write below the SP. We used to use a single MOVD.W instruction before CL 379075. CL 379075 changed to use an STP instruction to save the LR and FP, then decrementing the SP. This CL changes it back, just this part (epilogues and large frame prologues are unchanged). For small frames, it is the same number of instructions either way. This decreases the size of a "small" frame from 0x1f0 to 0xf0. For frame sizes in between, it could benefit from using an STP instruction instead of using the prologue for the "large" frame case. We don't bother it for now as this is a stop-gap solution anyway. This only addresses the issue with small frames. Luckily, all functions from thread entry to setting up the signal stack have samll frames. Other possible ideas: - Expand the unwind info metadata, separate SP delta and the location of the return address, so we can express "SP is decremented but the return address is in the LR register". Then we can always create the frame first then write the LR, without writing anything below the SP (except the frame pointer at SP-8, which is minor because it doesn't really affect program execution). - Set up the signal stack immediately in mstart in assembly. For Go 1.19 we do this simple fix. We plan to do the metadata fix in Go 1.20 ( #53609 ). Other LR architectures are addressed in CL 413428. Fix #53374. Change-Id: I9d6582ab14ccb06ac61ad43852943d9555e22ae5 Reviewed-on: https://go-review.googlesource.com/c/go/+/412474 Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Eric Fang <eric.fang@arm.com>

…rames When we create a thread with signals blocked. But glibc's pthread_sigmask doesn't really allow us to block SIGSETXID. So we may get a signal early on before the signal stack is set. If we get a signal on the current stack, it will clobber anything below the SP. This CL makes it to save LR and decrement SP in a single MOVD.W instruction for small frames, so we don't write below the SP. We used to use a single MOVD.W instruction before CL 379075. CL 379075 changed to use an STP instruction to save the LR and FP, then decrementing the SP. This CL changes it back, just this part (epilogues and large frame prologues are unchanged). For small frames, it is the same number of instructions either way. This decreases the size of a "small" frame from 0x1f0 to 0xf0. For frame sizes in between, it could benefit from using an STP instruction instead of using the prologue for the "large" frame case. We don't bother it for now as this is a stop-gap solution anyway. This only addresses the issue with small frames. Luckily, all functions from thread entry to setting up the signal stack have samll frames. Other possible ideas: - Expand the unwind info metadata, separate SP delta and the location of the return address, so we can express "SP is decremented but the return address is in the LR register". Then we can always create the frame first then write the LR, without writing anything below the SP (except the frame pointer at SP-8, which is minor because it doesn't really affect program execution). - Set up the signal stack immediately in mstart in assembly. For Go 1.19 we do this simple fix. We plan to do the metadata fix in Go 1.20 ( golang#53609 ). Other LR architectures are addressed in CL 413428. Fix golang#53374. Change-Id: I9d6582ab14ccb06ac61ad43852943d9555e22ae5 Reviewed-on: https://go-review.googlesource.com/c/go/+/412474 Run-TryBot: Cherry Mui <cherryyz@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com> Reviewed-by: Eric Fang <eric.fang@arm.com>

gopherbot · 2022-08-16T17:07:48Z

Change https://go.dev/cl/424254 mentions this issue: runtime: drop function context from traceback

Currently, gentraceback tracks the closure context of the outermost frame. This used to be important for "unstarted" calls to reflect function stubs, where "unstarted" calls are either deferred functions or the entry-point of a goroutine that hasn't run. Because reflect function stubs have a dynamic argument map, we have to reach into their closure context to fetch to map, and how to do this differs depending on whether the function has started. This was discovered in issue #25897. However, as part of the register ABI, "go" and "defer" were made much simpler, and any "go" or "defer" of a function that takes arguments or returns results gets wrapped in a closure that provides those arguments (and/or discards the results). Hence, we'll see that closure instead of a direct call to a reflect stub, and can get its static argument map without any trouble. The one case where we may still see an unstarted reflect stub is if the function takes no arguments and has no results, in which case the compiler can optimize away the wrapper closure. But in this case we know the argument map is empty: the compiler can apply this optimization precisely because the target function has no argument frame. As a result, we no longer need to track the closure context during traceback, so this CL drops all of that mechanism. We still have to be careful about the unstarted case because we can't reach into the function's locals frame to pull out its context (because it has no locals frame). We double-check that in this case we're at the function entry. I would prefer to do this with some in-code PCDATA annotations of where to find the dynamic argument map, but that's a lot of mechanism to introduce for just this. It might make sense to consider this along with #53609. Finally, we beef up the test for this so it more reliably forces the runtime down this path. It's fundamentally probabilistic, but this tweak makes it better. Scheduler testing hooks (#54475) would make it possible to write a reliable test for this. For #54466, but it's a nice clean-up all on its own. Change-Id: I16e4f2364ba2ea4b1fec1e27f971b06756e7b09f Reviewed-on: https://go-review.googlesource.com/c/go/+/424254 Run-TryBot: Austin Clements <austin@google.com> Reviewed-by: Michael Pratt <mpratt@google.com> Auto-Submit: Austin Clements <austin@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com>

cherrymui · 2022-12-05T20:23:34Z

Didn't get chance to do it in 1.20. Will do in 1.21.

cherrymui self-assigned this Jun 29, 2022

cherrymui added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 29, 2022

cherrymui added this to the Go1.20 milestone Jun 29, 2022

gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022

cherrymui modified the milestones: Go1.20, Go1.21 Dec 5, 2022

cherrymui modified the milestones: Go1.21, Go1.22 Jun 1, 2023

cherrymui modified the milestones: Go1.22, Go1.23 Nov 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/internal/obj: rework stack unwind metadata on LR machines #53609

cmd/internal/obj: rework stack unwind metadata on LR machines #53609

cherrymui commented Jun 29, 2022

gopherbot commented Jun 29, 2022

gopherbot commented Aug 16, 2022

cherrymui commented Dec 5, 2022

cmd/internal/obj: rework stack unwind metadata on LR machines #53609

cmd/internal/obj: rework stack unwind metadata on LR machines #53609

Comments

cherrymui commented Jun 29, 2022

gopherbot commented Jun 29, 2022

gopherbot commented Aug 16, 2022

cherrymui commented Dec 5, 2022