runtime: SIGSEGV in mstart #47441

Closed
prattmic opened this issue Jul 28, 2021 · 4 comments · Fixed by ferrmin/go#117
Labels: FrozenDueToAge, NeedsInvestigation, release-blocker

@prattmic
Member

On some Google internal workloads on 1.17rc1, we are seeing SIGSEGVs of the form:

#4  <C signal handler>
#5  0x000056222cf1049d in runtime.sigfwd () at /root/gc/src/runtime/sys_linux_amd64.s:327
#6  0x00007f596f4bf6c0 in ?? ()
#7  0x000056222cef2174 in runtime.sigfwdgo (sig=11, info=0x7f596f4bf8f0, ctx=0x7f596f4bf7c0) at /root/gc/src/runtime/signal_unix.go:1032
#8  0x000056222cef0927 in runtime.sigtrampgo (sig=11, info=0x7f596f4bf8f0, ctx=0x56222ee116ad <sys_gettid+13>) at /root/gc/src/runtime/signal_unix.go:418
#9  0x000056222cf10ff0 in runtime.sigtrampgo (sig=11, info=0x7f596f4bf8f0, ctx=0x7f596f4bf7c0) at <autogenerated>:1
#10 0x000056222cf104fd in runtime.sigtramp () at /root/gc/src/runtime/sys_linux_amd64.s:344
#11 <signal handler called>
#12 0x00007f596f4c0500 in ?? ()
#13 0x000056222cf0c665 in runtime.mstart () at /root/gc/src/runtime/asm_amd64.s:248
#14 0x000056222cf10e45 in runtime.mstart () at <autogenerated>:1
#15 0x000056222dd24440 in crosscall_amd64 () at gcc_amd64.S:40
#16 0x0000000000000000 in ?? ()

0x00007f596f4c0500 seems to be an address on the C stack calling mstart, indicating we jumped into the stack.

We don't have many details yet; this is still being investigated. If anyone else has seen similar crashes on 1.17, we'd love to hear about them.

cc @ianlancetaylor @cherrymui @aclements

@prattmic added the NeedsInvestigation and release-blocker labels Jul 28, 2021
@prattmic added this to the Go1.17 milestone Jul 28, 2021
@ianlancetaylor
Contributor

@cherrymui found the problem: a preemption can occur after the call to unlockOSThread in unwindm before g.m.incgo = true in cgocallbackg. That can cause the G to move to a new M, which breaks the assumptions of cgocallbackg.
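
The window described above can be pictured with a toy model. This is only a sketch under stated assumptions: `toyM`, `toyG`, `preempt`, and `returnToC` are hypothetical names invented for illustration and are not runtime internals. It shows only that once the G is no longer pinned, a preemption before the `incgo` write lets that write land on a different M than the one the callback entered on.

```go
package main

import "fmt"

// Toy stand-ins for the runtime's M (OS thread) and G (goroutine).
// These are hypothetical types for illustration, not runtime internals.
type toyM struct {
	id    int
	incgo bool
}

type toyG struct {
	m      *toyM
	locked bool // true while the G is pinned to its M
}

// preempt models a preemption point: an unpinned G may resume on another M.
func (g *toyG) preempt(other *toyM) {
	if !g.locked {
		g.m = other
	}
}

// returnToC models the tail of a C-to-Go callback. If preemptInWindow is
// set, a preemption happens after unlocking and before incgo is set.
// It returns the id of the M that ends up marked incgo.
func returnToC(preemptInWindow bool) int {
	m0, m1 := &toyM{id: 0}, &toyM{id: 1}
	g := &toyG{m: m0, locked: true} // the callback entered on m0

	g.locked = false // models unlockOSThread in unwindm
	if preemptInWindow {
		g.preempt(m1)
	}
	g.m.incgo = true // models g.m.incgo = true in cgocallbackg
	return g.m.id
}

func main() {
	fmt.Println("no preemption, incgo set on M", returnToC(false))
	fmt.Println("preemption in window, incgo set on M", returnToC(true))
}
```

In the model, the second call marks M 1 even though the callback entered on M 0, which is the kind of mismatch cgocallbackg does not expect.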

The preemption point is the first defer in cgocallbackg1. When that deferred function runs (after the second defer, of unwindm), it can be preempted. I believe that this is due to the regabidefer experiment, which wraps most defer functions. Those wrappers are separate functions, and as such are preemption points.
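
The ordering mentioned above follows from Go's rule that deferred calls run in last-in, first-out order, so the defer registered first runs after the one registered second. A minimal sketch, with an illustrative function name (`deferOrder`) that does not appear in the runtime:

```go
package main

import "fmt"

// deferOrder registers two deferred functions and records the order in
// which they actually run. The name is illustrative only.
func deferOrder() (order []string) {
	// Registered first, so it runs last.
	defer func() { order = append(order, "first defer") }()
	// Registered second, so it runs first.
	defer func() { order = append(order, "second defer") }()
	return nil
}

func main() {
	fmt.Printf("%q\n", deferOrder())
}
```

In cgocallbackg1 this means the first defer's function runs after unwindm has already executed, which is why a preemption point inside it falls in the unsafe window.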

Here is a test case that reproduces the observed problem. It calls runtime.SetCgoTraceback so that the first defer in cgocallbackg1 is executed.

foo1.go:

package main

/*
extern void cgoTraceback(void* p);
extern void cgoSymbolizer(void* p);
extern void cgoContext(void* p);
extern void GoFunction(int);

static void callGo(int i) {
	GoFunction(i);
}
*/
import "C"

import (
	"fmt"
	"runtime"
	"sync"
	"unsafe"
)

//export GoFunction
func GoFunction(i C.int) {
	fmt.Sprintf("%d\n", i)
}

func main() {
	runtime.SetCgoTraceback(0, unsafe.Pointer(C.cgoTraceback), unsafe.Pointer(C.cgoContext), unsafe.Pointer(C.cgoSymbolizer))
	const funcs = 1e3
	const calls = 1e5
	var wg sync.WaitGroup
	wg.Add(funcs) // one Done call per goroutine started below
	for i := 0; i < funcs; i++ {
		go func(i int) {
			defer wg.Done()
			for j := 0; j < calls; j++ {
				C.callGo(C.int(i*calls + j))
			}
		}(i)
	}
	wg.Wait()
}

foo2.c:

#include <stdio.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static void crash(int signum) {
	printf("caught SIGSEGV\n");
	abort();
}

__attribute__((constructor))
void setSignalHandler() {
	struct sigaction sa;

	memset(&sa, 0, sizeof sa);
	if (sigfillset(&sa.sa_mask) != 0) {
		abort();
	}
	sa.sa_handler = crash;
	if (sigaction(SIGSEGV, &sa, NULL) != 0) {
		abort();
	}
}

void cgoTraceback(void* p) {}
void cgoSymbolizer(void* p) {}

struct contextArg {
	uintptr_t context;
};

void cgoContext(struct contextArg* p) {
	if (p->context == 0) {
		p->context = 1;
	}
}

@gopherbot

Change https://golang.org/cl/338197 mentions this issue: runtime: avoid possible preemption when returning from Go to C

@gopherbot

Change https://golang.org/cl/338270 mentions this issue: cmd/compile: mark defer wrapper nosplit for runtime and nosplit callee

@gopherbot

Change https://golang.org/cl/359796 mentions this issue: runtime: add always-preempt maymorestack hook

gopherbot pushed a commit that referenced this issue Nov 5, 2021
This adds a maymorestack hook that forces a preemption at every
possible cooperative preemption point. This would have helped us catch
several recent preemption-related bugs earlier, including #47302,
#47304, and #47441.

For #48297.

Change-Id: Ib82c973589c8a7223900e1842913b8591938fb9f
Reviewed-on: https://go-review.googlesource.com/c/go/+/359796
Trust: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: David Chase <drchase@google.com>
@golang locked and limited conversation to collaborators Oct 30, 2022