Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Windows service timeout during system startup #23479

Open
kneden opened this issue Jan 18, 2018 · 48 comments
Open

runtime: Windows service timeout during system startup #23479

kneden opened this issue Jan 18, 2018 · 48 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Milestone

Comments

@kneden
Copy link

kneden commented Jan 18, 2018

What version of Go are you using (go version)?

go version go1.9.2 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Users\XXXXX\d\go
set GORACE=
set GOROOT=C:\Go
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\XXXXX\AppData\Local\Temp\go-build933228923=/tmp/go-build -gno-record-gcc-switches
set CXX=g++
set CGO_ENABLED=1
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config

What did you do?

Configured our Go language application to start as a Windows Service, with start type of 'Automatic'. The service is expected to start when the system boots up. Sometimes the service process will not be running after the system has booted up and the 'Windows Logs / 'System' logs in Event Viewer show two logs:

Event ID: 7009
Level: Error
Text:
A timeout was reached (30000 milliseconds) while waiting for the <service name> service to connect.
Event ID: 7000
Level: Error
Text:
The <service name> service failed to start due to the following error: 
The service did not respond to the start or control request in a timely fashion.

These events do not appear in the logs when the service successfully starts.

It appears that Windows terminates the service process when the service fails to respond to the Windows Service start event within the 30 s time frame. Our application contains the Windows Service code based on the sample code from sys/windows/svc/example.

This timeout issue occurs both with 64-bit and 32-bit versions of our executable.

What did you expect to see?

The Go language Windows service start reliably every time the system boots up.

What did you see instead?

The Go language service does not start reliably on some systems on boot-up.

Some observations from systems where the service often fails to start during boot-up:

  1. Manually starting the service once the system is up and running has never failed to start correctly. Timestamps from the logs show the process starting quickly (no more than a few seconds).
  2. Setting the service's start type to 'Automatic (Delayed Start)', where Windows starts the service 120 s after the 'Automatic' services, works without fail. The user is able to login to the desktop well before the service has been started. It is possible to change this global Windows setting to a lower value (eg 60 s) to make it less noticeable to the user, but because it will affect all delayed start services, it is not an acceptable solution. Some search results claim a per-service delay value can be specified, but we were not able to make this work.
  3. Changing the 30000 ms (30 s) Windows service timeout value to larger values (60 s or 90 s) allowed the service to startup. However, this is a global Windows setting and it is not possible to set the timeout per-service. Again, this is not an acceptable solution.
  4. We experimented with using the Windows Service "Recovery" settings to restart our service in an attempt to mitigate this problem. However, it appears that these recovery settings have no effect for services that fail to start in time.
  5. We instrumented our service with timestamped logs when the first imported package is initialized and when main() starts. We found that main() was often not reached when the timeout occurred. Sometimes even the first imported package's initialization was not reached.
  6. A stripped-down Windows service that just logged would often fail to start, but it was less likely to happen compared to our full-fledged service.
  7. Using a simple C++ Windows Service (code here) modified to launch our executable (running as a 'normal' application, not as a Windows Service) always launched without fail (no timeouts) during boot-up. Logs within our executable, as well as in the C++ service (just prior to the CreateProcess() call that started our executable) often showed large time deltas. The logs indicated it often took a long time for the Go executable to get launched, and also to reach the main().

While we have some systems (VirtualBox VMs) that reliably show these startup problems, the issue is harder to reproduce on other systems. We have VirtualBox VMs where the problem occurs only occasionally and we have yet to see the timeout occur on non-virtual Windows 10 installations. Timestamp-instrumented loads on those systems sometimes show time deltas of up to 10 s during startup. While this is not long enough to trigger the 30 s timeout on a powerful real system, the concern is that in virtualized environments and/or slower systems the timeout will be reached and the service will not start.

Sample Programs

Sample code can be found in this GitHub repository, including instructions for building and installing (README.md).

Both the launchserv.exe and winserv.exe programs contain the Go Windows Service code (again, based one the sample code). The Go Windows Service sample code was combined into a single file in both cases in order to 'flatten' the main package for tracking imported package initialization.

It is intended that only one of the two should run as a Windows Service. The recommended configuration is to install launchserv.exe as a Windows Service ("my service"), and it will in turn launch winserv.exe as a child process (winserv.exe will run as a 'normal' program, not a service). launchserv.exe writes a timestamped log just prior to starting winserv.exe, which allows measuring the time it takes for winserv.exe to start.

winserv.exe contains a large number of standard package imports, to mimic a real program. These imports are marked as 'unused' and as such only their initializations run (variable initialization and init()).

The VirtualBox version of the Win10 MSEdge test VM can be downloaded from Microsoft here to run these test programs. The following changes may help to increase the likelihood of seeing the issue with this, or other systems:

  • Increase the number of CPU cores from 1 (to say 2 or 4). This might increase the concurrent activity when the system is starting up.
  • Install the Guest Additions.
  • Install other software that starts services when the system boots. For example, one VM that shows the issue very reliably has two VNC server packages, OpenVPN client and an SSL VPN client (Array Networks) (as well as other software) installed.

Multiple boot-ups of the system may be required to see the problem. In many cases, a VM may sometimes demonstrate the problem, but then not do so at other times. YMMV.

When delays are seen, the two most common places are:

  1. Launching winserv.exe; that is, between the last launchserv.exe log and the first winserv.exe log. Time deltas as high as 26 s have been seen (and as low as less than 400 ms) have been seen on an Win10/MSEdge VM.
  2. The 'lots of imports' in winserv.exe main.go, between the 'b' and 'c' logs. These are the large number of unused standard package imports. Times as high as 9 s (and less than 100 ms on the low end) have been seen on the same Win10/MSEdge VM.

This service is an important component of our product; we are adding more functionality all the time. As the amount of code increases, it seems even more likely that this problem will occur. If there is anything else we can do to help resolve this issue (experiments, measurements, debugging, etc), please let us know.

@rasky
Copy link
Member

rasky commented Jan 18, 2018

Thanks for the report. Can you try instrumenting the runtime with timestamped logs, following the code being executed before main?

Just compile Go from source, and use the produced binary to compile your application. Then put print calls in the runtime code and just recompile your software and test. Within the runtime, you can use println() to print without going through fmt, and nanotime() or walltime() to get a timer reading to print. Start from runtime.main() and runtime.init() and check if you can find where time is being lost.

@bradfitz bradfitz added OS-Windows NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jan 20, 2018
@bradfitz bradfitz added this to the Go1.11 milestone Jan 20, 2018
@alexbrainman
Copy link
Member

I cannot reproduce it here (I am running it on my WIndows 10 computer) - your service starts and stops fine. I do not have any bright ideas.

Alex

When delays are seen, the two most common places are:
...
The 'lots of imports' in winserv.exe main.go, between the 'b' and 'c' logs.

Maybe some code that runs during std packages init functions is broken when running as a service. We don't test code you import to run as a service. Maybe try removing some imports and see if it improve your situation.

Alex

@alexbrainman
Copy link
Member

@kneden

and you are not checking for errors here

https://github.com/Openera/winserv/blob/bdc3a73826f0b7e5d32838f8981aa7a67e2b1d65/cmd/launchserv/launchserv.go#L257

and when you Kill and Wait for process to exit. Not that it matter for your problem.

Alex

@kneden
Copy link
Author

kneden commented Jan 23, 2018

I rebuilt 1.9.2 from source with instrumentation in the main() in src/runtime/proc.go, using time_now() (wallclock() calls failed to compile) to get the epoch time and println() to output the values to stdout (in seconds and milliseconds). The winserv.exe Go executable was modified to send all of its timestamped logs to stdout and rebuilt it using this instrumented Go runtime.

I also reduced the number of unused standard package imports in winserv.exe from the original number to those that our Go language service uses throughout our code base. In the future, we expect this list of imports to grow, not shrink, but it does reflect our application's current usage. The updated code can be found on the stdout branch in the GitHub repo.

The simple C++ Windows service (mentioned previously) was modified to do the following:

  1. Open the log file.
  2. Write a timestamped log to the log file.
  3. Start winserv.exe in 'nonservice' mode, using CreateProcess(). stdout & stderr for this child process were set to the log file.

The C++ service was used instead of the Go launchserv.exe to remove the effects of a second Go runtime in these measurements.

There are thus three types of timestamped logs in the log file when the C++ service starts and launches the Go application:

  • Simple C++ service log immediately before starting winserve.exe child process.
  • Go runtime main() logs at the following points:
    • main() entry (runtime/proc.go, not the application's main()).
    • Just before and after runtime_init() call.
    • Just before and after main_init() call.
    • Just before main_main() call.
  • winserv.exe logs as before, mostly delineating the progress through the main package (imports, start of main(), etc). As mentioned, the unused standard package imports (between 'b' and 'c') was reduced to match what our service currently uses.

The results of a set of runs can be found in the following Google Sheet. There is one manual start of the service (ie system already running), followed by 15 boot-ups. The fourth column is the time delta between that log and the previous one, in seconds. It has been conditionally formatted to highlight any value >= 1 second.

For the manual start, there were no significant delays. A separate set of 10 manual service starts on a running system (data not shown) also showed no significant delays between any of the logs.

For the boot-up runs, there are cases where there was no significant delays (such as boot 2 & 7) and others where there were noticeable delays. These delays tend to be seen in several places:

  • Between CreateProcess() and entry point of runtime main() (boot 1 & 3).
  • Within the winserv.exe package imports, especially the unused standard packages.
  • Only once (boot 5) was there a noticeable delay within the runtime main(): between runtime_init() done and the main_init() start logs, though there were many other delays in this run.

There are runs where there was only one significant delay and others where there were many.

For the delay between the CreateProcess() and the runtime main() entry logs, it isn't clear where the time is lost. Is it possible to instrument the Go runtime even earlier than this point?

The current hypothesis is that when Windows is starting services during boot-up, there are often times when resources available to the processes (possibly disk and/or CPU?) are limited. These resource constraints might affect a Go application more than others due to its size, as a larger executable requires more disk I/O and CPU time to load into memory. launchserv.exe is 2.5M vs. C++ service at 21K, for two programs with similar functionality. winserv.exe is 5.6M in its current form (and was 7M when it had more imports).

If the application doesn't hit any significant resource constraints, it starts without delay. If it does, the delay(s) depend when the constraints occur.

@alexbrainman
Copy link
Member

For the delay between the CreateProcess() and the runtime main() entry logs, it isn't clear where the time is lost. Is it possible to instrument the Go runtime even earlier than this point?

I do not know. Maybe @aclements knows. Austin, can you suggest something here to understand why it is so slow to start?

The current hypothesis is that when Windows is starting services during boot-up, there are often times when resources available to the processes (possibly disk and/or CPU?) are limited.

It sounds reasonable to me. Windows is trying to get system up and running as soon as possible. It would delay anything that it consider might affect that goal.

These resource constraints might affect a Go application more than others due to its size, as a larger executable requires more disk I/O and CPU time to load into memory. launchserv.exe is 2.5M vs. C++ service at 21K, for two programs with similar functionality. winserv.exe is 5.6M in its current form (and was 7M when it had more imports).

I don't think executable size maters here. Maybe memory allocations affect things. Go reserves large amount of address space upfront, but it is not expensive for a normal process. And, as you have discovered, simple Go service does not suffers delays. Have you tried to build simple Go service with many inits (not starting Go executable with many inits from a service)?

I really don't have any good ideas. Sorry.

Alex

@kneden
Copy link
Author

kneden commented Jan 29, 2018

@alexbrainman:

We have seen simple Go service experience delays during startup, though the probability of it happening seem to be less compared to one with one with many imports.

A Go service with many imports behaves similar to one that runs as a 'normal' executable launched by a service, often suffering start-up delays. In fact, we started debugging with this first and moved to having the Go executable (in 'non-service' mode) launched by a service (C++ or Go) to get a measure of the time for the program to be launched. When the delay issues occur during Windows start-up, it doesn't seem to make much difference whether the Go executable is running as the service itself, or launched by the another service.

@alexbrainman
Copy link
Member

When the delay issues occur during Windows start-up, it doesn't seem to make much difference whether the Go executable is running as the service itself, or launched by the another service.

Thank you for explaining.

Just an idea - I believe Docker is running as a Windows service. If that service executable is written in Go, how come they don't have problems like yours?

Alex

@kneden
Copy link
Author

kneden commented Jan 30, 2018

Just an idea - I believe Docker is running as a Windows service. If that service executable is written in Go, how come they don't have problems like yours?

Good point. When I get a chance, I'll try installing Docker on a problematic VM and see how it behaves.

@aclements
Copy link
Member

For the delay between the CreateProcess() and the runtime main() entry logs, it isn't clear where the time is lost. Is it possible to instrument the Go runtime even earlier than this point?

It is. Runtime initialization calls a few functions before runtime.main. You can see the CALLs instructions in rt0_go in asm_amd64.s, but here are the ones you can practically annotate: check and args in runtime1.go, osinit in os_windows.go, and schedinit in proc.go. Quite a bit happens in osinit and schedinit, so it would be interesting to see if the time goes there.

@kneden
Copy link
Author

kneden commented Feb 5, 2018

I instrumented the earlier runtime initialization calls, with more focus on the middle of schedinit as some preliminary runs pointed to there. Since calls to println didn't work in the earlier calls, all of the timestamps up to and including those in schedinit are output together at the end of that function (after the schedinit exit timestamp is recorded). The 'schedinit() print time' captures how long it takes for these 15 println calls, in case that introduced any significant delay.

The results are in the second tab of the Google Sheets spreadsheet. Six of the boot-ups had significant times in the newly instrumented code:

  • boot 5: osinit, schedinit/mcommoninit, & schedinit/itabsinit
  • boot 6: schedinit/itabsinit
  • boot 18: schedinit after gcinit (until end of function)
  • boot 23: schedinit/itabsinit
  • boot 31: schedinit/mcommoninit
  • boot 35: schedinit/mcommoninit, schedinit/modulesinit, & schedinit/itabsinit

Other boot-ups either had delays elsewhere, or none of significance.

Seems similar to the previous results, with there being locations in the code where delays are more likely to happen than others, but they don't happen in these places all the time.

@alexbrainman
Copy link
Member

Seems similar to the previous results

Indeed.

I have nothing else that helps you. Sorry.

Alex

@rasky
Copy link
Member

rasky commented Feb 7, 2018

Does your binary use cgo? If so, can you try commenting out anything that depends on cgo and see if you still those random delays? Also, if that's not the case, you can still try dropping parts of your binary (like, entire packages commenting out functionalities) and see if you can bisect what causes the initial delay as side effect. I'm assuming this is something specific to your binary and not all Windows go binaries of course.

Which DLLs your binary depend on? You can use a tool like DependencyWalker to see a complete list of dependencies.

@tsafin
Copy link

tsafin commented Feb 14, 2018

Which DLLs your binary depend on? You can use a tool like DependencyWalker to see a complete list of dependencies.

That's good question - if your service executable is depending on anything network related then you might experience such delays (because of complexities how TCP-stack is getting up'n'running). Either mark your service as "Delayed start" or remove TCP-related code from the stages which send Windows control service signal that it's up.

@kneden
Copy link
Author

kneden commented Feb 15, 2018

Our actual application does not use CGo. Nor does the sample winserv program I have put on GitHub (basic Windows service with a lot of unused imports and some timestamped logging).

Using DependencyWalker on winserv takes almost 2 minutes to analyze the binary and shows a large tree with three top-level DLLS: WINMM.DLL, WS2_32.DLL, & KERNEL32.DLL. Feeding it a simple Go Hello World (just fmt.Println()) gives a similar result, at least time terms of analysis time, top-level DLLs, and functions listed. There are also a number of errors reported for both executables, including being unable to open many DLL files and the following (this is from the gohello.exe, but is the same for winserv.exe):

Error: At least one module has an unresolved import due to a missing export function in an implicitly dependent module.
Error: Modules with different CPU types were found.
Error: A circular dependency was detected.
Warning: At least one delay-load dependency module was not found.
Warning: At least one module has an unresolved import due to a missing export function in a delay-load dependent module.

Is there anything specific in the DependencyWalker results I should be looking for?

When I use dumpbin /imports (VS2015) for those two executables, it shows the same three top-level DLLs and functions as DependencyWalker. The output for the two programs is the same, except for the addresses:

Copyright (C) Microsoft Corporation.  All rights reserved.

Dump of file c:\Users\liveqos\d\gohello\gohello.exe

File Type: EXECUTABLE IMAGE

  Section contains the following imports:

    winmm.dll
                52A000 Import Address Table
                5DD2E6 Import Name Table
                     0 time date stamp
                     0 Index of first forwarder reference

                           0 timeEndPeriod
                           0 timeBeginPeriod

    ws2_32.dll
                52A018 Import Address Table
                5DD2FE Import Name Table
                     0 time date stamp
                     0 Index of first forwarder reference

                           0 WSAGetOverlappedResult

    kernel32.dll
                52A028 Import Address Table
                5DD30E Import Name Table
                     0 time date stamp
                     0 Index of first forwarder reference

                           0 WriteFile
                           0 WriteConsoleW
                           0 WaitForSingleObject
                           0 VirtualFree
                           0 VirtualAlloc
                           0 SwitchToThread
                           0 SetWaitableTimer
                           0 SetUnhandledExceptionFilter
                           0 SetProcessPriorityBoost
                           0 SetEvent
                           0 SetErrorMode
                           0 SetConsoleCtrlHandler
                           0 LoadLibraryA
                           0 LoadLibraryW
                           0 GetSystemInfo
                           0 GetStdHandle
                           0 GetQueuedCompletionStatus
                           0 GetProcessAffinityMask
                           0 GetProcAddress
                           0 GetEnvironmentStringsW
                           0 GetConsoleMode
                           0 FreeEnvironmentStringsW
                           0 ExitProcess
                           0 DuplicateHandle
                           0 CreateThread
                           0 CreateIoCompletionPort
                           0 CreateEventA
                           0 CloseHandle
                           0 AddVectoredExceptionHandler

  Summary

       32000 .data
        1000 .debug_abbrev
        1000 .debug_aranges
       13000 .debug_frame
        1000 .debug_gdb_scripts
       47000 .debug_info
       11000 .debug_line
        A000 .debug_pubnames
        9000 .debug_pubtypes
        1000 .idata
       23000 .symtab
      129000 .text

@KedarB
Copy link

KedarB commented Feb 18, 2018

Hi,
Are you building the exe in debug mode or in release mode? Please try building in release mode.

Thanks & Regards,
Kedar B.

@derekwbrown
Copy link

I am having the exact same problem using Go to generate a windows service.
I can reproduce the problem at will; and, in fact, since my executable can be run as a service or standard console application, I have some additional findings

  1. Delay appears only in the first execution. After that, whatever is the bottleneck appears to be cached? 2nd through Nth execution, application intializes/runs in ~1sec
  2. Delay appears while disk is being heavily used. If I just wait for the machine to boot, and then wait a few minutes for the disk activity to die away, and then attempt the first execution, it runs fine
  3. starting immediately results in a variable delay of between 45 sec and 2 minutes.
  4. same result building in release mode.

@kneden
Copy link
Author

kneden commented Feb 28, 2018

I assume 'debug' and 'release' refer to with and without debug symbols (go build -ldflags "-s -w" for the latter). If that is the case, then all Go binaries I was building were with debug symbols.

I rebuilt our application with those link flags, which resulted in a Windows executable that was about 1/3 smaller. However, it behaved the same as before, timing out when starting as a normal Windows service at boot-up.

@fkollmann
Copy link

I think the correct way to solve this would be to build the service support into the runtime.

A Windows service can be pending starting without be cancelled after sending a SERVICE_START_PENDING message:

A ServiceMain function first calls the RegisterServiceCtrlHandlerEx function to get the service's SERVICE_STATUS_HANDLE. Then it immediately calls the SetServiceStatus function to notify the service control manager that its status is SERVICE_START_PENDING. During initialization, the service can provide updated status to indicate that it is making progress but it needs more time. ... see reference

The next step could be to write a proposal for this. However having these kind of changes is probably hard getting approved and it won't release before Feb 2019.

On the other hand, there a quite a lot good service wrappers out there.

@derekwbrown
Copy link

Part of the problem won't be solved by building service support into the runtime, at least not without some very specific modifications.
Since, in Go, each package can have an init function which is called prior to main, if the init functions take long enough, then Windows will give up before the application could ever send SERVICE_START_PENDING.

@fkollmann
Copy link

fkollmann commented Apr 3, 2018

@derekwbrown I think the point here would be to hook into the service manager before firing up the runtime... before "runtime/proc.go main()" but inside the system's entry point (main/WinMain).

More details: https://msdn.microsoft.com/en-us/library/windows/desktop/ms685477(v=vs.85).aspx

@derekwbrown
Copy link

@fkollmann I understand that; however, even if your service hooked into the service manager during its own init, the fact that any vendored dependencies inits can run for an arbitrary amount of time makes it problematic.

Unless you're talking about changing the language to have a service-init() that occurs before init()...

@mattn
Copy link
Member

mattn commented Apr 3, 2018

As far as I know, This problem should be fixed by calling SetServiceStatus on regular intervals. And the interval can be changed on registory.

@fkollmann
Copy link

fkollmann commented Apr 3, 2018

I think it should be implemented directly into the OS main() function. You need specific functions anyway (see servicemain()).

I would suggest the following steps:

  1. Accept the constrain for SERVICE_WIN32_OWN_PROCESS services, only (no SERVICE_WIN32_SHARE_PROCESS support, which is barely used and I think not even fully supported by current Go/x)

  2. Connect to service manager during main() by calling StartServiceCtrlDispatcher() and set status to SERVICE_START_PENDING for 2 minutes by calling SetServiceStatus()

  • If the program is not running as a service, StartServiceCtrlDispatcher() will return ERROR_FAILED_SERVICE_CONTROLLER_CONNECT
  1. All status updates will be tracked in a global variable during that time (basically a service cannot receive any updates during startup, anyway)

  2. Update status to SERVICE_START_PENDING by 5 minutes by calling SetServiceStatus() after runtime initialization. See again @kneden 's spreadheet

  3. Have the developer call the service Run() code and pick-up the starting start and replace the handler (the change should be easy to make)

  • If the developer does not call service Run(), Windows will kill the process after 5 minutes

  • The service won't be receiving any device events before calling service Run()

@fkollmann
Copy link

@mattn I don't think that is a good idea. Microsoft explicitly discourages from doing this as it can keep the service stuck forever. I also think it's not required, as this is no heartbeat mechanism. You simply have to set an appropriate timeout.

@alexbrainman
Copy link
Member

@kneden @derekwbrown can you try https://golang.org/cl/110160 to see if it helps with your problem. Thank you.

Alex

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@alexbrainman
Copy link
Member

I am seeing exactly this issue(A timeout was reached (90000ms) while waiting for the Docker Engine service to connect​) for docker on our machines now and then, and the timeout in our case is 90s.

I had some free time today. And I tried running this service golang.org/x/sys/windows/svc/example on my Windows 10 64 bit computer. I rebooted it 5 times, and service started without any problems every single time.

I can try and debug this issue again, but I need to be able to reproduce it on my computer. Does anyone knows of a Go service that I can build myself to reproduce this issue? I don't know what Docker Engine service is and how to build it.

Thank you.

Alex

@adjackura
Copy link

@alexbrainman, this issue only appears to show up (for me) if you are heavily disk I/O constrained.

@alexbrainman
Copy link
Member

this issue only appears to show up (for me) if you are heavily disk I/O constrained.

@adjackura

Do you have any suggestions on how I can reproduce it? What version of Go do you use? What is your program?

Grasping for straws.

Thank you.

Alex

@adjackura
Copy link

From this thread it looks like even the simplest program will exhibit this, currently I am trying to repro myself from scratch in GCE. I have a repo (I didn't build the image though so not sure what was done) so I know its related to disk speed, but vanilla images don't have the issue. I think additional services need to be installed to slow boot down.

@alexbrainman
Copy link
Member

currently I am trying to repro myself from scratch in GCE.

That would be nice, if you can repro on GCE. What version of Go is your service built with?

I think additional services need to be installed to slow boot down.

What is your environment where this error happen to you? Does the error happens consistently?

Alex

@adjackura
Copy link

I was unable to repro the timeout, but I was able to see the issue using Windows Performance Recorder and measuring boot performance. One a really low specced system (1CPU, really low IO limit) my service took 27s for container init time. One a system with higher specs it was only 1s. I also installed lots of random windows server features to further slow down the boot process.

Here is a link that talks about further digging into why service start is slow:
https://famellee.wordpress.com/2013/05/10/use-windows-performance-toolkit-to-trackdown-slow-service-start-problem/

@alexbrainman
Copy link
Member

I just submitted https://golang.org/cl/246317 change that can possibly fix this bug. Please, check, if you still can reproduce this bug after CL 246317 ?

Thank you.

Alex

@alexbrainman
Copy link
Member

Since no one replied to my comment last week, I will assume that

https://golang.org/cl/246317

fixes this. So I will close this issue.

Please, let me know, if I am wrong.

Thank you.

Alex

@Jonathan727
Copy link

This issue is frustrating and I've seen it on busy Windows Server Core machines. One workaround for this issue is to edit the ServicesPipeTimeout timeout in Windows registry.

Another option for services is to change the startup type to Automatic (Delayed).

@Sigurthorb
Copy link

@alexbrainman can we reopen this issue?

@aclements
Copy link
Member

@Jonathan727 , @Sigurthorb , have you been able to test with Alex's CL 246317?

@Sigurthorb
Copy link

Yes, we are seeing this issue even with Alex's CL 246317 fix in our vendor-ed sys package

@aclements
Copy link
Member

Thanks. Reopening.

@aclements aclements reopened this Aug 10, 2021
@shalieshsoni
Copy link

We're facing the same issue in our environment, could share a permanent workaround.

@machooo-x
Copy link

I encountered this problem, putting all DLL related initialization actions into init method solves my problem.

@machooo-x
Copy link

machooo-x commented Jul 4, 2022

Thanks for this shirou/gopsutil#570

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Projects
Status: Triage Backlog
Development

No branches or pull requests