Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/signal: spurious timeout in TestTerminalSignal #48906

Closed
dilyanpalauzov opened this issue Oct 11, 2021 · 11 comments
Closed

os/signal: spurious timeout in TestTerminalSignal #48906

dilyanpalauzov opened this issue Oct 11, 2021 · 11 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.

Comments

@dilyanpalauzov
Copy link

I want go compile go 1.17 (commit 2ac3bdf) with “go version go1.16.5 gccgo (GCC) 11.2.1 20211010 linux/amd” on a relatively slow system. I call

$ GOROOT_BOOTSTRAP=/usr/local GO_TEST_TIMEOUT_SCALE=100 ./all.bash

which prints:

ok      net/textproto   0.024s
ok      net/url 0.015s
ok      os      2.817s
ok      os/exec 2.127s
--- FAIL: TestTerminalSignal (511.74s)
    signal_cgo_test.go:147: "PS1='prompt> '\r\n"
    signal_cgo_test.go:147: "bash-4.4# PS1='prompt> '\r\n"
    signal_cgo_test.go:147: "prompt> \r<08050/b968/signal.test -test.run=TestTerminalSignal\r\n"
    signal_cgo_test.go:147: "test program entering read\r\n"
    signal_cgo_test.go:147: "^Z\r\n"
    signal_cgo_test.go:147: "[1]+  Stopped                 GO_TEST_TERMINAL_SIGNALS=1 /tmp/go-build3237808050/b968/signal.test -test.run=TestTerminalSignal\r\n"
    signal_cgo_test.go:147: "prompt> fg\r\n"
    signal_cgo_test.go:147: "GO_TEST_TERMINAL_SIGNALS=1 /tmp/go-build3237808050/b968/signal.test -test.run=TestTerminalSignal\r\n"
    signal_cgo_test.go:147: "\r\n"
    signal_cgo_test.go:147: "read newline\r\n"
    signal_cgo_test.go:230: timed out waiting for shell prompt
FAIL
FAIL    os/signal       661.464s
ok      os/user 0.006s
ok      path    0.021s
ok      path/filepath   0.052s

Any idea how can I fix this?

@bcmills
Copy link
Contributor

bcmills commented Oct 11, 2021

See previously #22845. The race is probably the one here:

// Give the process time to restart.
// This is potentially racy: if the process does not restart
// quickly enough then the byte we send will go to bash rather
// than the program. Unfortunately there isn't anything we can
// look for to know that the program is running again.
// bash will print the program name, but that happens before it
// restarts the program.
time.Sleep(10 * pause)

The failure is logged here:

// Wait for the program to exit.
select {
case <-sawPrompt:
case <-time.After(wait):
t.Fatal("timed out waiting for shell prompt")
}

It's not obvious to me why it inherently needs to be racy, though: why can't the pty.Write after the program resumes occur in a loop instead of just once?

@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure. labels Oct 11, 2021
@bcmills bcmills changed the title test os/signal fails with timeout os/signal: spurious timeout in TestTerminalSignal Oct 11, 2021
@dilyanpalauzov
Copy link
Author

If I run at commit 1242f43 just

bin/go test -v src/os/signal/signal_cgo_test.go 
=== RUN   TestTerminalSignal
=== PAUSE TestTerminalSignal
=== CONT  TestTerminalSignal
    signal_cgo_test.go:147: "PS1='prompt> '\r\n"
    signal_cgo_test.go:147: "bash: __git_ps1: command not found\r\n"
    signal_cgo_test.go:147: "\x1b[0;33mrt\x1b[37m@\x1b[32mmail\x1b[37m:\x1b[31m/git/goroot/src/os/signal\x1b[36m# \x1b[0mPS1='prompt> '\r\n"
    signal_cgo_test.go:147: "prompt> GO_TEST_TERMINAL_SIGNALS=1 /tmp/go-build469228520/b001/signal.test -test.\r.run=TestTerminalSignal\r\n"
    signal_cgo_test.go:147: "test program entering read\r\n"
    signal_cgo_test.go:147: "^Z\r\n"
    signal_cgo_test.go:147: "[1]+  Stopped                 GO_TEST_TERMINAL_SIGNALS=1 /tmp/go-build469228520/b001/signal.test -test.run=TestTerminalSignal\r\n"
    signal_cgo_test.go:147: "prompt> fg\r\n"
    signal_cgo_test.go:147: "GO_TEST_TERMINAL_SIGNALS=1 /tmp/go-build469228520/b001/signal.test -test.run=TestTerminalSignal\r\n"
    signal_cgo_test.go:147: "\r\n"
    signal_cgo_test.go:147: "read newline\r\n"
    signal_cgo_test.go:147: "prompt> exit $?\r\n"
    signal_cgo_test.go:147: "exit\r\n"
--- PASS: TestTerminalSignal (0.13s)
PASS
ok      command-line-arguments  (cached)

The \x1b[0;33 sequences are in my PS1 variable.

If run instead strace -s2048 -f bin/go test -v src/os/signal/signal_cgo_test.go &> strace.log the output is in strace.log.gz. Let me add, that I have complied bash myself.

Buf if I run instead cd src &&GOROOT_BOOTSTRAP=/usr GO_TEST_TIMEOUT_SCALE=100 ./all.bash then it goes into timeout.

@dilyanpalauzov
Copy link
Author

gdb backtrace, while the process waits for something:

bt f
#0  runtime.futex () at /git/goroot/src/runtime/sys_linux_amd64.s:520
No locals.
#1  0x00000000004312f6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4616803) at /git/goroot/src/runtime/os_linux.go:44
        ts = <optimized out>
#2  0x000000000040cf67 in runtime.notesleep (n=0xfffffffffffffe00) at /git/goroot/src/runtime/lock_futex.go:160
        gp = <optimized out>
        ns = -1
#3  0x000000000043a6ea in runtime.mPark () at /git/goroot/src/runtime/proc.go:1441
        g = 0xd8bc80 <runtime.g0>
#4  0x000000000043c365 in runtime.stoplockedm () at /git/goroot/src/runtime/proc.go:2602
        _g_ = 0xd8bc80 <runtime.g0>
        status = <optimized out>
        ~R0 = <optimized out>
        ~R0 = <optimized out>
#5  0x000000000043de3d in runtime.schedule () at /git/goroot/src/runtime/proc.go:3299
        _g_ = 0xd8bc80 <runtime.g0>
        pp = <optimized out>
        gp = <optimized out>
        inheritTime = <optimized out>
        tryWakeP = <optimized out>
#6  0x000000000043e58d in runtime.park_m (gp=0xc00010ba00) at /git/goroot/src/runtime/proc.go:3516
        _g_ = 0xd8bc80 <runtime.g0>
#7  0x0000000000463243 in runtime.mcall () at /git/goroot/src/runtime/asm_amd64.s:307
No locals.
#8  0x00000000004679c9 in runtime.newproc (siz=4600261, fn=0x46317b <runtime.rt0_go+315>) at <autogenerated>:1
No locals.
#9  0x0000000000000000 in ?? ()
No symbol table info available.

@dilyanpalauzov
Copy link
Author

dilyanpalauzov commented Dec 25, 2021

ps -eF|grep go prints:

root 1782885 1775890 0 324438 4924 7 21:38 pts/0 00:00:00 /tmp/go-build4124209783/b968/signal.test -test.testlogfile=/tmp/go-build4124209783/b968/testlog.txt -test.paniconexit0 -test.short=true -test.timeout=15h0m0s

If I run signal.test; echo $? from /tmp/go-build4124209783/b968/signal.test it prints PASS 0.

# bin/go test -short=true -tags= -timeout=20s -gcflags=all= src/os/signal/signal_cgo_test.go 
ok      command-line-arguments  0.131s

so this test fails apparently only in combination with other tests on a Linux from Scratch system.

@dilyanpalauzov
Copy link
Author

The change below does not help, and the test fails always on my system.

diff --git a/src/os/signal/signal_cgo_test.go b/src/os/signal/signal_cgo_test.go
--- a/src/os/signal/signal_cgo_test.go
+++ b/src/os/signal/signal_cgo_test.go
@@ -215,7 +215,7 @@ func TestTerminalSignal(t *testing.T) {
        // look for to know that the program is running again.
        // bash will print the program name, but that happens before it
        // restarts the program.
-       time.Sleep(10 * pause)
+       time.Sleep(10 * pause + 10 * time.Second)
 
        // Write some data for the program to read,
        // which should cause it to exit.

@ianlancetaylor
Copy link
Contributor

This line in the output is not promising:

signal_cgo_test.go:147: "bash: __git_ps1: command not found\r\n"

That is the response to sending this line on the pseudo-terminal:

PS1='prompt> '\r\n

We execute bash with --norc and --noprofile. I don't understand why it would not recognize setting the PS1 variable. Do you have an unusual version of bash in your PATH?

@dilyanpalauzov
Copy link
Author

# echo $PATH
/usr/local/sbin:/usr/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/sbin:/git/goroot/bin

My login shell is /usr/local/bin/bash . I rename /bin/bash and /usr/local/bin/bash to bin_bash.txt and usr_local_bin_bash.txt and attach them.

@dilyanpalauzov
Copy link
Author

After calling the test several times on 1.16 and 1.17 branch, I dare to say that TestTerminalSignal fails very rarely on 1.16 and very often on 1.17.

I forgot to add, that I run the (terminal) tests in a TMUX session.

@ianlancetaylor
Copy link
Contributor

Thanks for the binaries, but I'm not going to run them.

When you run bash what happens if you type

PS1=prompt>

?

@dilyanpalauzov
Copy link
Author

ABC:~# PS1=prompt>
-bash: syntax error near unexpected token `newline'
ABC:~# PS1='prompt> '
prompt> echo abc
abc
prompt> 

@dilyanpalauzov
Copy link
Author

After deleting all files reported by git status --ignored: bin/, pkg/, src/cmd/cgo/zdefaultcc.go, src/cmd/go/internal/cfg/zdefaultcc.go, src/cmd/go/internal/cfg/zosarch.go, src/cmd/internal/objabi/zbootstrap.go, src/go/build/zcgo.go, src/internal/buildcfg/zbootstrap.go, src/runtime/internal/sys/zversion.go, left in the directories from go1.16 in-tree build, and then re-compiling, all tests pass.

@golang golang locked and limited conversation to collaborators Dec 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

4 participants