1.13.5 version, sometimes run subprocess hang #39249

Closed
yutongprogram opened this issue May 26, 2020 · 6 comments
Labels
FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@yutongprogram

yutongprogram commented May 26, 2020

What version of Go are you using (go version)?

1.13.5

What did you do?

cmd := exec.CommandContext(ctx, "bash", "-c", "xxx")
cmd.Run() // sometimes hangs

What did you expect to see?

image

gdb on the parent process: all threads are in futex at /usr/local/src/runtime/sys_linux_amd64.s:536
image

strace of the parent process's thread
image

@davecheney
Contributor

Can you please

  1. Provide a short, runnable, code sample that exhibits the problem.
  2. Assert this issue is present with the latest version of go 1.14

@davecheney davecheney added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label May 26, 2020
@yutongprogram
Author

yutongprogram commented May 26, 2020

I can't reproduce the bug reliably. I have checked syslog and the hardware and found nothing useful.
Maybe someone else has run into this problem.
Any ideas?

It's a daemon process that runs the user's command.
Code demo:

package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"os/exec"
	"os/signal"
	"strings"
	"syscall"
)

func main() {
	ctx, cancel := context.WithCancel(context.TODO())

	// Join the arguments with spaces so "echo" "hi" becomes "echo hi", not "echohi".
	cmdStr := strings.Join(os.Args[1:], " ")

	cmd := exec.CommandContext(ctx, "bash", "-c", cmdStr)

	exitCh := make(chan int)
	go runJob(cmd, cancel, exitCh)

	var exitCode = 0
	select {
	case exitCode = <-exitCh:
		log.Println("Exit Code: ", exitCode)
	}

	// If pid is negative, but not -1, sig shall be sent to all processes (excluding an unspecified set of system processes)
	// whose process group ID is equal to the absolute value of pid, and for which the process has permission to send a signal.
	err := syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)
	if err != nil {
		fmt.Println("Warning sending SIGKILL to customer processes group, err: ", err.Error())
	}

	os.Exit(exitCode)
}

func runJob(cmd *exec.Cmd, cancel context.CancelFunc, exitCh chan<- int) {
	defer func() {
		if r := recover(); r != nil {
			// Recovered from panic. This is just a safe net.
			log.Println("Panic when running job! ", r)
		}
		cancel()
	}()

	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	cmd.Stderr = os.Stderr
	cmd.Stdout = os.Stdout

	go signalDeamon(cmd)

	if err := cmd.Run(); err != nil {
		log.Println("Run cmd error:", err.Error())
	}

	success := cmd.ProcessState != nil && cmd.ProcessState.Success()
	defer func(ok bool) {
		var exitCode = 0
		w := cmd.ProcessState.Sys().(syscall.WaitStatus)
		if w.Exited() {
			exitCode = w.ExitStatus()
		} else if w.Signaled() {
			exitCode = int(w.Signal())
		} else {
			exitCode = 1
		}

		exitCh <- exitCode
	}(success)

}

func signalDeamon(cmd *exec.Cmd) {
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGHUP, syscall.SIGINT, syscall.SIGTERM)
	s := <-c

	_ = cmd.Process.Signal(s)
}

@davecheney
Contributor

davecheney commented May 26, 2020

Thank you for providing a sample. There are a few things that can probably be used to cut this reproduction down to the core issue.

Can you do away with the signalDeamon code? If you don't want the parent process to handle those signals, you can do something like this:

c := make(chan os.Signal, 1)
signal.Notify(c, syscall.SIGHUP, syscall.SIGINT, syscall.SIGTERM)
go func() { <-c }()

The child will still respond to those signals and w.Signaled() will report true.

@davecheney
Contributor

Also consider returning the exit code directly from runJob.

	go runJob(cmd, cancel, exitCh)

	var exitCode = 0
	select {
	case exitCode = <-exitCh:
		log.Println("Exit Code: ", exitCode)
	}

Is equivalent to

exitCode := runJob(cmd, cancel)

because the main goroutine cannot move past the select block until something is sent over exitCh.

@yutongprogram
Author

I'll try, thanks. If you have any ideas, please mention @yutongprogram. Thank you.

@mvdan
Member

mvdan commented Jun 15, 2021

Closing old issues that still have the WaitingForInfo label where enough details to investigate weren't provided. Feel free to leave a comment with more details and we can reopen.

Also, see https://golang.org/wiki/Questions.

@mvdan mvdan closed this as completed Jun 15, 2021
@golang golang locked and limited conversation to collaborators Jun 15, 2022