os/exec: cmd.CombinedOutput seems stuck on an io.Copy under s390x #33704
Comments
Hello @q2683252, thank you for filing this issue and welcome to the Go project! Since there isn't Delve support on s390x, it would be great to know the process ID (PID) of each process and then, while it is seemingly stuck waiting to copy, get a stack trace of what is happening. The output of each of those can then show you what's going on, so:
--- orig.go 2019-08-18 23:30:51.000000000 -0600
+++ main.go 2019-08-18 23:30:14.000000000 -0600
@@ -13,8 +13,21 @@
bin := filepath.Join(cur_dir, name)
conf := filepath.Join(cur_dir, fmt.Sprintf("../config/%s.yml", name))
cmd := exec.Command(bin, "-c", conf)
+ buf := new(bytes.Buffer)
+ cmd.Stdout = buf
+ cmd.Stderr = buf
oss_logger.Infof("Athena start %v", cmd)
- cmdOutput, err := cmd.CombinedOutput()
+ if err := cmd.Start(); err != nil {
+ oss_logger.Errorf("%s service start error: %v", name, err)
+ return
+ }
+
+ // This is the PID of the subcommand that we shall
+ // send "SIGQUIT" to if it stalls for a long time.
+ oss_logger.Infof("%s service PID: %d", name, cmd.Process.Pid)
+
+ err = cmd.Wait()
+ cmdOutput := buf.String()
if err != nil {
oss_logger.Errorf("%s service out put %s", name, cmdOutput)
oss_logger.Errorf("%s service failed:%s,service restart", name, err.Error())
So once you have the PID printed out, run:
kill -s QUIT XXXX
In the meantime I shall ping an s390x expert, @mundaym, about this bug.
Once you have the PID, if the process is stuck, send it a signal that causes it to produce a core dump; that will show you what's stopping progress on the forked process. I showed that in my initial comment, part d) #33704 (comment), but here it is again:
kill -s QUIT XXXX
where XXXX is the printed PID of the process that's stuck.
Maybe I didn't express the problem clearly.
It's A that got stuck, not B, so I don't see the point in killing B, which no longer exists.
Hi @q2683252, I'm currently on vacation, so unfortunately I can't experiment myself. Do you how your process (e.g. Can you try Go 1.13rc1? In Go 1.13 we use vfork on s390x (which is what amd64 does), so that might make a difference.
I will try Go 1.13, though it may take a long time for the problem to reappear.
This is #23019. Closing as dup.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
I use Go to supervise a service: my program restarts the service process when it exits.
Here is the code.
What did you expect to see?
What did you see instead?
Using ps, I found that the process had already exited, but it was not restarted. It looks like it got stuck.
This did not happen on x86, so I think it might be specific to the s390x platform.
I looked at the goroutine traces and found this.
It looks like it is waiting on a channel for io.Copy to finish.
Unfortunately, when I wanted to use Delve to dig deeper, I found that it does not support s390x, and gdb with Go support can't inspect variables in goroutines that are not running and has its own problems on s390x.