Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: CreateThread fails with ERROR_ACCESS_DENIED #18253

Closed
bradfitz opened this issue Dec 8, 2016 · 31 comments
Closed

runtime: CreateThread fails with ERROR_ACCESS_DENIED #18253

bradfitz opened this issue Dec 8, 2016 · 31 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Dec 8, 2016

Windows flake:

windows-386-gce at 1ea60c1

https://build.golang.org/log/f1487f7fe66b893aa63406d0c181dbdc1c72b3ae

##### ../test
# go run run.go -- fixedbugs/bug093.go
incorrect output
runtime: failed to create new OS thread (have 5 already; errno=5)
fatal error: runtime.newosproc

FAIL	fixedbugs\bug093.go	0.310s

Any ideas? @ianlancetaylor @aclements @alexbrainman

@bradfitz bradfitz added the Testing An issue that has been verified to require only test changes, not just a test failure. label Dec 8, 2016
@bradfitz bradfitz added this to the Go1.8Maybe milestone Dec 8, 2016
@ianlancetaylor
Copy link
Contributor

Error code 5 is ERROR_ACCESS_DENIED. This doesn't have anything to do with test bug093.go, which doesn't even start any goroutines. I don't know why _CreateThread would return ERROR_ACCESS_DENIED. I don't know why the program would already have 5 threads running. Basically this failure mode seems impossible.

@aclements
Copy link
Member

This seems impossible, but isn't isolated:

$ greplogs -dashboard -E 'failed to create new OS thread.*have [1-9] ' -md
...
2016-08-30T00:10:47-67ea710/windows-386-gce:

# go run run.go -- syntax/topexpr.go

Unmatched Errors:
runtime: failed to create new OS thread (have 5 already; errno=5)
fatal error: runtime.newosproc

FAIL	syntax\topexpr.go

2016-10-17T18:53:22-0ba3c60/windows-386-gce:

# go run run.go -- fixedbugs/issue7316.go
incorrect output
runtime: failed to create new OS thread (have 5 already; errno=5)
fatal error: runtime.newosproc

FAIL	fixedbugs\issue7316.go

2016-11-11T21:00:11-10d2efd/windows-amd64-gce:

# go run run.go -- fixedbugs/issue9604.go
incorrect output
runtime: failed to create new OS thread (have 5 already; errno=
FAIL	fixedbugs\issue9604.go

2016-11-23T00:44:01-50c4dbc/windows-amd64-gce:

# go run run.go -- chan/sieve2.go
incorrect output
runtime: failed to create new OS thread (have 5 already; errno=5)

FAIL	chan\sieve2.go

2016-12-07T16:59:25-1ea60c1/windows-386-gce:

# go run run.go -- fixedbugs/bug093.go
incorrect output
runtime: failed to create new OS thread (have 5 already; errno=5)
fatal error: runtime.newosproc

FAIL	fixedbugs\bug093.go

So, it certainly seems to be unrelated to the particular test. It's interesting that it's always "have 5 already; errno=5".

@alexbrainman
Copy link
Member

I do not have any bright ideas.

CreateThread returns ERROR_ACCESS_DENIED makes me think that some pointers we pass to CreateThread might be invalid. Which leaves funcPC(tstart_stdcall) and 'uintptr(unsafe.Pointer(mp))'. These needs to live in memory that is readable+executable and readable+writable correspondently.

It's interesting that it's always "have 5 already; errno=5".

I don't see why this is interesting - 5 is, probable, what Go program that does not starts goroutines have (at this moment).

Alex

@aclements
Copy link
Member

Which leaves funcPC(tstart_stdcall) and 'uintptr(unsafe.Pointer(mp))'. These needs to live in memory that is readable+executable and readable+writable correspondently.

MSDN says "Note that CreateThread may succeed even if lpStartAddress points to data, code, or is not accessible." which suggests that it's not a problem with the start PC, though it does only say "may". I'm surprised it would care about the lpParameter. I figured it would just pass that value through to the start function. Does it really care that that's a valid pointer to RW memory?

I don't see why this is interesting - 5 is, probable, what Go program that does not starts goroutines have (at this moment).

Possibly. It looks like GOMAXPROCS is 4 on these builders, so 5 is not an unreasonable number of threads. But something had to start them, and if the test only has a single goroutine, the only thing I can think that would encourage more threads to start would be the garbage collector. If that's the case, this means these tests have all gotten at least as far as the first GC cycle (versus crashing really early in execution).

@ianlancetaylor
Copy link
Contributor

Looking at https://golang.org/test/fixedbugs/bug093.go you can see that it is very unlikely that any garbage collection occurred.

@alexbrainman
Copy link
Member

MSDN says "Note that CreateThread may succeed even if lpStartAddress points to data, code, or is not accessible." which suggests that it's not a problem with the start PC, though it does only say "may". I'm surprised it would care about the lpParameter. I figured it would just pass that value through to the start function. Does it really care that that's a valid pointer to RW memory?

You are right, I am wrong. lpParameter is just a uintptr - Windows does not care what is in it. I also tried calling CreateThread with lpStartAddress set to 0 and 123, and CreateThread succeeds regardless.

@alexbrainman
Copy link
Member

But something had to start them, and if the test only has a single goroutine, the only thing I can think that would encourage more threads to start would be the garbage collector. If that's the case, this means these tests have all gotten at least as far as the first GC cycle (versus crashing really early in execution).

This is what I see on my computer:

C:\dev\go\test\fixedbugs>go version
go version devel +0716fef Mon Dec 12 01:31:50 2016 +0000 windows/386

C:\dev\go\test\fixedbugs>git diff
diff --git a/src/runtime/os_windows.go b/src/runtime/os_windows.go
index 0db57f8..4c28ce7 100644
--- a/src/runtime/os_windows.go
+++ b/src/runtime/os_windows.go
@@ -522,6 +522,11 @@ func newosproc(mp *m, stk unsafe.Pointer) {
                print("runtime: failed to create new OS thread (have ", mcount())
                throw("runtime.newosproc")
        }
+
+       println("mcount=", mcount())
+       if mcount() == 2 {
+               panic("ALEX")
+       }
 }

 // Used by the C library build mode. On Linux this function would allocate a

C:\dev\go\test\fixedbugs>go run bug093.go
mcount= 2
panic: ALEX
fatal error: panic on system stack

runtime stack:
runtime.throw(0x45749e, 0x15)
        c:/dev/go/src/runtime/panic.go:596 +0x7c
panic(0x44c760, 0x10b44010)
        c:/dev/go/src/runtime/panic.go:420 +0x675
runtime.newosproc(0x10b5e000, 0x0)
        c:/dev/go/src/runtime/os_windows.go:528 +0xfa
runtime.newm(0x45a0e4, 0x0)
        c:/dev/go/src/runtime/proc.go:1598 +0xe7
runtime.main.func1()
        c:/dev/go/src/runtime/proc.go:126 +0x2b
runtime.systemstack(0x480040)
        c:/dev/go/src/runtime/asm_386.s:337 +0x5e
runtime.mstart()
        c:/dev/go/src/runtime/proc.go:1102

goroutine 1 [running]:
runtime.systemstack_switch()
        c:/dev/go/src/runtime/asm_386.s:291 fp=0x10b5afc8 sp=0x10b5afc4
runtime.main()
        c:/dev/go/src/runtime/proc.go:127 +0x69 fp=0x10b5aff0 sp=0x10b5afc8
runtime.goexit()
        c:/dev/go/src/runtime/asm_386.s:1629 +0x1 fp=0x10b5aff4 sp=0x10b5aff0
exit status 2

C:\dev\go\test\fixedbugs>git diff
diff --git a/src/runtime/os_windows.go b/src/runtime/os_windows.go
index 0db57f8..29312d1 100644
--- a/src/runtime/os_windows.go
+++ b/src/runtime/os_windows.go
@@ -522,6 +522,11 @@ func newosproc(mp *m, stk unsafe.Pointer) {
                print("runtime: failed to create new OS thread (have ", mcount())
                throw("runtime.newosproc")
        }
+
+       println("mcount=", mcount())
+       if mcount() == 3 {
+               panic("ALEX")
+       }
 }

 // Used by the C library build mode. On Linux this function would allocate a

C:\dev\go\test\fixedbugs>go run bug093.go
mcount= 2
mcount= 3
fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x409d70]

runtime stack:
runtime.throw(0x45949c, 0x2a)
        c:/dev/go/src/runtime/panic.go:596 +0x7c
runtime.sigpanic()
        c:/dev/go/src/runtime/signal_windows.go:155 +0x164
runtime.mallocgc(0x8, 0x44c760, 0x1, 0x0)
        c:/dev/go/src/runtime/malloc.go:669 +0x5c0
runtime.newobject(0x44c760, 0x4214c5)
        c:/dev/go/src/runtime/malloc.go:801 +0x2c
runtime.convT2E(0x44c760, 0x6fdb8, 0x20000, 0x440d10)
        c:/dev/go/src/runtime/iface.go:198 +0x2a
runtime.newosproc(0x10b5e280, 0x0)
        c:/dev/go/src/runtime/os_windows.go:528 +0xe6
runtime.newm(0x0, 0x10b52000)
        c:/dev/go/src/runtime/proc.go:1598 +0xe7
runtime.startm(0x10b52000, 0x423b00)
        c:/dev/go/src/runtime/proc.go:1668 +0x141
runtime.handoffp(0x10b52000)
        c:/dev/go/src/runtime/proc.go:1695 +0x49
runtime.stoplockedm()
        c:/dev/go/src/runtime/proc.go:1763 +0x137
runtime.schedule()
        c:/dev/go/src/runtime/proc.go:2150 +0x290
runtime.goschedImpl(0x10b54000)
        c:/dev/go/src/runtime/proc.go:2270 +0xd8
runtime.gopreempt_m(0x10b54000)
        c:/dev/go/src/runtime/proc.go:2285 +0x2c
runtime.newstack(0x0)
        c:/dev/go/src/runtime/stack.go:1081 +0x262
runtime.morestack()
        c:/dev/go/src/runtime/asm_386.s:409 +0x77

goroutine 1 [runnable, locked to thread]:
runtime.init()
        c:/dev/go/src/runtime/zcallback_windows.go:6 +0x9c
runtime.main()
        c:/dev/go/src/runtime/proc.go:141 +0xef
runtime.goexit()
        c:/dev/go/src/runtime/asm_386.s:1629 +0x1
exit status 2

C:\dev\go\test\fixedbugs>git diff
diff --git a/src/runtime/os_windows.go b/src/runtime/os_windows.go
index 0db57f8..cbe0b94 100644
--- a/src/runtime/os_windows.go
+++ b/src/runtime/os_windows.go
@@ -522,6 +522,11 @@ func newosproc(mp *m, stk unsafe.Pointer) {
                print("runtime: failed to create new OS thread (have ", mcount()
                throw("runtime.newosproc")
        }
+
+       println("mcount=", mcount())
+       if mcount() == 4 {
+               panic("ALEX")
+       }
 }

 // Used by the C library build mode. On Linux this function would allocate a

C:\dev\go\test\fixedbugs>go run bug093.go
mcount= 2
mcount= 3
mcount= 4
fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x409d70]

runtime stack:
runtime.throw(0x45949c, 0x2a)
        c:/dev/go/src/runtime/panic.go:596 +0x7c
runtime.sigpanic()
        c:/dev/go/src/runtime/signal_windows.go:155 +0x164
runtime.mallocgc(0x8, 0x44c760, 0x1, 0x0)
        c:/dev/go/src/runtime/malloc.go:669 +0x5c0
runtime.newobject(0x44c760, 0x4214c5)
        c:/dev/go/src/runtime/malloc.go:801 +0x2c
runtime.convT2E(0x44c760, 0x6fe70, 0x20000, 0x440d10)
        c:/dev/go/src/runtime/iface.go:198 +0x2a
runtime.newosproc(0x10b5e500, 0x0)
        c:/dev/go/src/runtime/os_windows.go:528 +0xe6
runtime.newm(0x0, 0x10b52000)
        c:/dev/go/src/runtime/proc.go:1598 +0xe7
runtime.startm(0x10b52000, 0x423b00)
        c:/dev/go/src/runtime/proc.go:1668 +0x141
runtime.handoffp(0x10b52000)
        c:/dev/go/src/runtime/proc.go:1695 +0x49
runtime.stoplockedm()
        c:/dev/go/src/runtime/proc.go:1763 +0x137
runtime.schedule()
        c:/dev/go/src/runtime/proc.go:2150 +0x290
runtime.park_m(0x10b54000)
        c:/dev/go/src/runtime/proc.go:2255 +0x95
runtime.mcall(0x6ff28)
        c:/dev/go/src/runtime/asm_386.s:279 +0x47

goroutine 1 [runnable, locked to thread]:
runtime.gopark(0x45a094, 0x10b4a1f0, 0x45684a, 0xc, 0x10b4c117, 0x3)
        c:/dev/go/src/runtime/proc.go:261 +0x109
runtime.goparkunlock(0x10b4a1f0, 0x45684a, 0xc, 0x10b4a117, 0x3)
        c:/dev/go/src/runtime/proc.go:267 +0x4c
runtime.chanrecv(0x44a3c0, 0x10b4a1c0, 0x0, 0x10b6ff01, 0x40ea85)
        c:/dev/go/src/runtime/chan.go:513 +0x2ec
runtime.chanrecv1(0x44a3c0, 0x10b4a1c0, 0x0)
        c:/dev/go/src/runtime/chan.go:395 +0x24
runtime.gcenable()
        c:/dev/go/src/runtime/mgc.go:210 +0x73
runtime.main()
        c:/dev/go/src/runtime/proc.go:151 +0x11f
runtime.goexit()
        c:/dev/go/src/runtime/asm_386.s:1629 +0x1
exit status 2

C:\dev\go\test\fixedbugs>

Alex

@aclements
Copy link
Member

lpParameter is just a uintptr - Windows does not care what is in it. I also tried calling CreateThread with lpStartAddress set to 0 and 123, and CreateThread succeeds regardless.

Interesting. Thanks. Too bad, though; that would have been a plausible explanation. :)

This is what I see on my computer:

Huh. That's a lot more threads than I expected. But at least it explains why mcount is 5.

@bradfitz
Copy link
Contributor Author

@aclements, how far back do your logs go? Is this failure new to Go 1.8?

@aclements
Copy link
Member

@bradfitz
Copy link
Contributor Author

@jstarks, any idea why _CreateThread might sometimes return ERROR_ACCESS_DENIED?

@ianlancetaylor
Copy link
Contributor

Looking at Wine suggests that _CreateThread can return ERROR_ACCESS_DENIED if it is called while the process is exiting. That is, one thread calls _ExitProcess and then some other thread, before the system stops it, calls _CreateThread. That CreateThread call will fail with ERROR_ACCESS_DENIED.

If that theory is correct, then one fix would to change exit in os_windows.go to set a new variable exiting and have newosproc simply sleep if exiting is true. os_plan9.go does something slightly similar in https://golang.org/cl/21135.

It might be interesting if somebody with a Windows system could see how often this problem occurs, by simply running bug093 in a loop, so we can see whether that change actually fixes the problem.

@aclements aclements changed the title test: fixedbugs/bug093.go flake runtime: CreateThread fails with ERROR_ACCESS_DENIED Dec 20, 2016
@bradfitz
Copy link
Contributor Author

bradfitz commented Dec 20, 2016

I built fixedbugs/bug093.go on Windows 10 (with CGO_ENABLED=0, though, if that matters) and ran the binary in a loop via another binary using os/exec.Command("./bug093.exe").CombinedOutput() in a loop.

It's now been thousands of iterations, and no failure. But maybe Windows 10 doesn't have the problem. Our builders are much older. (Windows 2008 Server)

Edit: it ran for about 100,000 iterations without failure overnight, before the computer went to sleep on its own. Still can't reproduce. Maybe I'll find time to try on a Windows 2008 Server VM next.

@alexbrainman
Copy link
Member

somebody with a Windows system could see how often this problem occurs

I am travelling until after New Year. I only have my Chromebook with me.

Alex

@gopherbot
Copy link

CL https://golang.org/cl/34616 mentions this issue.

gopherbot pushed a commit that referenced this issue Dec 21, 2016
On Windows, CreateThread occasionally fails with ERROR_ACCESS_DENIED.
We're not sure why this is, but the Wine source code suggests that
this can happen when there's a concurrent CreateThread and ExitProcess
in the same process.

Fix this by setting a flag right before calling ExitProcess and
halting if CreateThread fails and this flag is set.

Updates #18253 (might fix it, but we're not sure this is the issue and
can't reproduce it on demand).

Change-Id: I1945b989e73a16cf28a35bf2613ffab07577ed4e
Reviewed-on: https://go-review.googlesource.com/34616
TryBot-Result: Gobot Gobot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@bradfitz
Copy link
Contributor Author

I'm going to close this until we see it happen again.

@bradfitz bradfitz modified the milestones: Go1.9, Go1.8Maybe Mar 16, 2017
@bradfitz bradfitz reopened this Mar 16, 2017
@bradfitz
Copy link
Contributor Author

Reopening. Apparently Bazel is running into this occasionally. I'll let them add more details here.

/cc @johnsonj

@bradfitz bradfitz added help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows labels Mar 16, 2017
@jstarks
Copy link

jstarks commented Mar 16, 2017

I can confirm that ERROR_ACCESS_DENIED will be returned if you call CreateThread while the process is terminating. I don't immediately see another reason this would be returned.

@gunan
Copy link

gunan commented Mar 16, 2017

TensorFlow sometimes need buildozer from this repository on windows:
https://github.com/bazelbuild/buildifier
However, we run into this same issue occasionally.

cc @yifeif

@bradfitz
Copy link
Contributor Author

@gunan, can you provide any more details or repro steps?

@gunan
Copy link

gunan commented Mar 16, 2017

We will work on a smaller opensource example.

@aclements
Copy link
Member

@gunan, just to confirm: you're encountering this with a binary built with Go 1.8?

@gunan
Copy link

gunan commented Mar 16, 2017

@aclements yes, we built buildifier and buildozer with go 1.8

@johnsonj
Copy link
Contributor

Happy to take a look at a repro

@gunan
Copy link

gunan commented Mar 21, 2017

Sorry for the delay, I was not able to get to creating a minimal example.
Will try to look into this later this week.

@aclements
Copy link
Member

@gunan, any progress on a minimal example?

@gunan
Copy link

gunan commented Jun 6, 2017

So far I have not been able to isolate this in buildozer.
Still working on it.

@aclements
Copy link
Member

Recent failures:

$ greplogs -dashboard -E "failed to create new OS thread" -l -md | grep windows | tail -n 10
2015-08-06T23:11:34-b14b8d6/windows-amd64-gce
2016-04-21T16:01:05-7c6b48f/windows-amd64-race
2016-08-30T00:10:47-67ea710/windows-386-gce
2016-10-12T02:48:31-460d112/windows-amd64-gce
2016-10-17T18:53:22-0ba3c60/windows-386-gce
2016-11-11T21:00:11-10d2efd/windows-amd64-gce
2016-11-23T00:44:01-50c4dbc/windows-amd64-gce
2016-12-07T16:59:25-1ea60c1/windows-386-gce
2017-01-13T23:06:22-e48919b/windows-amd64-gce
2017-05-18T23:43:24-ca598e3/windows-amd64-2008

Interestingly, several of these are cut off part way through the message, which sure smells like we're still having issues with calling CreateThread concurrently with the process exiting.

@alexbrainman
Copy link
Member

we're still having issues with calling CreateThread concurrently with the process exiting.

Looks like we are calling Windows ExitProcess in syscall too:

# grep ExitProcess * -rn                                                     
Binary file debug/pe/testdata/gcc-386-mingw-exec matches                             
Binary file debug/pe/testdata/gcc-386-mingw-no-symbols-exec matches                  
Binary file runtime/race/race_windows_amd64.syso matches                             
runtime/os_windows.go:24://go:cgo_import_dynamic runtime._ExitProcess ExitProcess%1 "kernel32.dll"                                                                        
runtime/os_windows.go:68:       _ExitProcess,                                        
runtime/os_windows.go:439:      stdcall1(_ExitProcess, uintptr(code))                
runtime/os_windows.go:626:                      // concurrently with ExitProcess. If this                                                                                 
syscall/syscall_windows.go:145://sys    ExitProcess(exitcode uint32)                 
syscall/syscall_windows.go:239:func Exit(code int) { ExitProcess(uint32(code)) }     
syscall/zsyscall_windows.go:56: procExitProcess                        = modkernel32.NewProc("ExitProcess")                                                               
syscall/zsyscall_windows.go:288:func ExitProcess(exitcode uint32) {                  
syscall/zsyscall_windows.go:289:        Syscall(procExitProcess.Addr(), 1, uintptr(exitcode), 0, 0)                                                                       
# grep syscall.Exit * -rn                                                    
os/proc.go:54:  syscall.Exit(code)                                                   
#                                                                            

That can easily account for this failure. I suppose we should just replace syscall.Exit with call to runtime.exit. No?

Alex

@gopherbot
Copy link

CL https://golang.org/cl/45115 mentions this issue.

@aclements
Copy link
Member

@alexbrainman, good call! In fact, it looks like the compiler always explicitly os.Exit's, even on success. Other build tools may, too. That could explain why we almost always see this in the test directory (it does a huge number of builds) and why it's scattered across different tests (it doesn't matter which test you're building, just the timing of when the compiler exits).

@golang golang locked and limited conversation to collaborators Jun 8, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

8 participants