Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TestCtrlHandler fails when run manually on a windows-amd64-2016 gomote #51602

Closed
bcmills opened this issue Mar 10, 2022 · 16 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows release-blocker
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Mar 10, 2022

What version of Go are you using (go version)?

A clean build from source at commit 914195c on a windows-amd64-2016 gomote.

Does this issue reproduce with the latest release?

I have no reason to believe otherwise.

What operating system and processor architecture are you using (go env)?

go env Output
gopher@SERVER-2016-V7- C:\workdir\go\src>..\bin\go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\gopher\AppData\Local\go-build
set GOENV=C:\Users\gopher\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\gopher\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\gopher\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=C:\workdir\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=C:\workdir\go\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=devel gomote.XXXXX
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\workdir\go\src\go.mod
set GOWORK=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\gopher\AppData\Local\Te
mp\go-build1381519030=/tmp/go-build -gno-record-gcc-switches

What did you do?

gopher@SERVER-2016-V7- C:\workdir\go\src>set PATH=%PATH%;C:\godep\gcc64\bin

gopher@SERVER-2016-V7- C:\workdir\go\src>set GOROOT_BOOTSTRAP=C:\workdir\go1.4

gopher@SERVER-2016-V7- C:\workdir\go\src>make.bat
Building Go cmd/dist using C:\workdir\go1.4
Building Go toolchain1 using C:\workdir\go1.4.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for windows/amd64.
---
Installed Go for windows/amd64 in C:\workdir\go
Installed commands in C:\workdir\go\bin

gopher@SERVER-2016-V7- C:\workdir\go\src>..\bin\go version
go version devel gomote.XXXXX windows/amd64

gopher@SERVER-2016-V7- C:\workdir\go\src>..\bin\go test runtime -run=TestCtrlHandler

What did you expect to see?

The test should either pass or skip itself.

What did you see instead?

--- FAIL: TestCtrlHandler (0.56s)
    signal_windows_test.go:125: failed to kill: exit status 1
FAIL
FAIL    runtime 0.589s
FAIL

The test also fails in the same way when run by dist test:

gopher@SERVER-2016-V7- C:\workdir\go\src>..\bin\go tool dist test go_test:runtime

##### Test execution environment.
# GOARCH: amd64
# CPU: Intel(R) Xeon(R) CPU @ 2.20GHz
# GOOS: windows
# OS Version: 10.0.14393

##### Testing packages.
--- FAIL: TestCtrlHandler (0.98s)
    signal_windows_test.go:125: failed to kill: exit status 1
FAIL
FAIL    runtime 53.691s
FAIL
go tool dist: Failed: exit status 1

I updated the test to log the output from the failing taskkill.exe invocation, and saw:

--- FAIL: TestCtrlHandler (0.51s)
    signal_windows_test.go:126: C:\Windows\system32\taskkill.exe /pid 1508: exit status 1
        ERROR: The process with PID 1508 could not be terminated.
        Reason: This process can only be terminated forcefully (with /F option).

This test was added in CL 261057 for #41884 (CC @ncruce @alexbrainman @zx2c). For some reason it is not failing on the windows-amd64-longtest builder, but the gomote environment is substantially different from the environment actually used by the buildlet (see #32430).

@bcmills bcmills added OS-Windows NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels Mar 10, 2022
@bcmills bcmills added this to the Go1.19 milestone Mar 10, 2022
@thepudds
Copy link
Contributor

CC @zx2c4

@gopherbot
Copy link

Change https://go.dev/cl/391799 mentions this issue: runtime: add logging in TestCtrlHandler and skip it

@ncruces
Copy link
Contributor

ncruces commented Mar 14, 2022

I'm out of ideas on how to properly "fix" the test.

The test case works by starting the sub-process in a new console window, and then taskkilling it, which basically looks for the console window that hosts the process, and sends WM_CLOSE to that window. That error message (usually?) comes up when a process is not attached to a console window.

Not sure what's special about gomote that makes this not work? I'd never heard of gomote before today, tbh.

https://go.dev/cl/261057 mostly iterated on variations of the test. Maybe these can guide a look for alternatives? I'd need to be able to replicate the bug myself to try.

@bcmills
Copy link
Contributor Author

bcmills commented Mar 14, 2022

Not sure what's special about gomote that makes this not work?

I am able to reproduce the test failure locally (on a Windows desktop machine) via the following steps:

  1. Install (and start) OpenSSH-Server per https://docs.microsoft.com/en-us/windows-server/administration/openssh/openssh_install_firstuse#install-openssh-using-windows-settings.
  2. Connect to the local machine via SSH over a loopback connection.
  3. From %GOROOT%\src, run ..\bin\go test runtime -run=TestCtrlHandler -v

@ncruces
Copy link
Contributor

ncruces commented Mar 14, 2022

OK, thanks. I'll look into it, but from a quick test it doesn't look good.

I get the same error message if I try to taskkill a command started from an SSH shell with start /wait cmd. So maybe this just doesn't work in this setup.

@bcmills
Copy link
Contributor Author

bcmills commented Mar 14, 2022

FWIW, it appears that in this case the program is running under a pseudo-console:
https://devblogs.microsoft.com/commandline/windows-command-line-introducing-the-windows-pseudo-console-conpty/

I haven't yet found a way (other than the taskkill failure) to programmatically determine whether the test is running in a pseudo-console. But maybe it suffices to just skip the test if taskkill fails?

@ncruces
Copy link
Contributor

ncruces commented Mar 14, 2022

That link was very instructive @bcmills.
I didn't find a way of determining that we're running in a pseudo-console either.
But it does occur to me that, if pseudo-console APIs are available, we might start our test process in a new pseudo-console (instead of in a new console) and then call ClosePseudoConsole (rather than using taskkill).

@bcmills
Copy link
Contributor Author

bcmills commented Mar 14, 2022

If ClosePseudoConsole works, then I'm all for it!

But my suspicion (based on the failure mode here) is that perhaps ClosePseudoConsole doesn't produce a CTRL_CLOSE_EVENT at all. 😕

@ncruces
Copy link
Contributor

ncruces commented Mar 14, 2022

I think it does.

But neither exec nor syscall.CreateProcess are geared to allow doing what InitializeStartupInfoAttachedToConPTY is doing in https://devblogs.microsoft.com/commandline/windows-command-line-introducing-the-windows-pseudo-console-conpty/

@alexbrainman
Copy link
Member

But neither exec nor syscall.CreateProcess are geared to allow doing what InitializeStartupInfoAttachedToConPTY is doing in https://devblogs.microsoft.com/commandline/windows-command-line-introducing-the-windows-pseudo-console-conpty/

If you want to use InitializeProcThreadAttributeList in syscall package, you can do that. We already use InitializeProcThreadAttributeList to implement SysProcAttr.ParentProcess. We can add more code there.

But I am not sure it is worth the trouble.

@bcmills is it possible for runtime.TestCtrlHandler test to determine that it is running inside of gomote? If not, then can we modify gomote to pass an environment variable or something that can be read by the runtime.TestCtrlHandler test?

Thank you.

Alex

@ncruces
Copy link
Contributor

ncruces commented Mar 15, 2022

It's definitely possible, but I agree it's not worth the trouble, unless we explicitly want to support launching a process in a new pseudo-console.

But that's an entire feature request, not a test bug fix.
Python has a bug tracking this: https://bugs.python.org/issue41151

@bcmills
Copy link
Contributor Author

bcmills commented Mar 15, 2022

@alexbrainman, I don't know of any way for the test to determine that it's running under a gomote, and even if there were that's not necessarily the only circumstance in which it would be connected to a pseudo-console (which it seems to me is the real condition we want to check for).

@ncruces
Copy link
Contributor

ncruces commented Mar 15, 2022

@bcmills I have an alternative fix that allows the test to run under SSH.
Can you verify with gomote?

I can make any necessary fixes, including adding the timeout from https://go.dev/cl/391799

@gopherbot
Copy link

Change https://go.dev/cl/392874 mentions this issue: runtime: allow TestCtrlHandler to run in ConPTY

@ncruces
Copy link
Contributor

ncruces commented Mar 15, 2022

I added some commits to #51681, but @gopherbot doesn't seem to be picking them up.

@bcmills
Copy link
Contributor Author

bcmills commented Mar 15, 2022

The bot is badly backlogged. I'll keep an eye on it and send the TryBots for another spin on the Gerrit CL once it's back.

@golang golang locked and limited conversation to collaborators Mar 18, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows release-blocker
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants