runtime: add test for syscall failing to create new OS thread during syscall.Exec #20822

Open
jvshahid opened this issue Jun 28, 2017 · 14 comments
Labels
NeedsFix: The path to resolution is known, but the work has not been done.
Testing: An issue that has been verified to require only test changes, not just a test failure.

@jvshahid
Contributor

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.8 linux/amd64 (same behavior with 1.8.3)

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"          
GOBIN=""                
GOEXE=""                
GOHOSTARCH="amd64"      
GOHOSTOS="linux"        
GOOS="linux"            
GOPATH="/home/jvshahid/codez/gocodez"           
GORACE=""               
GOROOT="/home/jvshahid/.gvm/gos/go1.8"          
GOTOOLDIR="/home/jvshahid/.gvm/gos/go1.8/pkg/tool/linux_amd64"                                  
GCCGO="gccgo"           
CC="gcc"                
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build588313748=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"               
CGO_ENABLED="1"         
PKG_CONFIG="pkg-config" 
CGO_CFLAGS="-g -O2"     
CGO_CPPFLAGS=""         
CGO_CXXFLAGS="-g -O2"   
CGO_FFLAGS="-g -O2"     
CGO_LDFLAGS="-g -O2"    

What did you do?

Run this app in a while loop, e.g. while true; do go run main.go; done
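The linked app is not reproduced in this thread. The following is only a minimal sketch of what such a program might look like, assuming it replaces itself with a shell that prints the working directory (a later comment mentions echo $PWD). The /bin/sh path and the execArgs helper are assumptions made for illustration, not the reporter's actual code.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// execArgs builds the program path and argv for running a command through
// the shell. The helper name and the use of /bin/sh are assumptions.
func execArgs(command string) (string, []string) {
	return "/bin/sh", []string{"sh", "-c", command}
}

func main() {
	path, argv := execArgs("echo $PWD")
	// syscall.Exec replaces the current process image, so it only returns
	// when the exec itself fails.
	if err := syscall.Exec(path, argv, os.Environ()); err != nil {
		fmt.Fprintln(os.Stderr, "exec failed:", err)
		os.Exit(1)
	}
}
```

Saved as main.go, this could be run in the shell loop from the report.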

What did you expect to see?

/path/to/pwd
/path/to/pwd
/path/to/pwd
/path/to/pwd
/path/to/pwd
/path/to/pwd

What did you see instead?

runtime: failed to create new OS thread (have 5 already; errno=11)                               
runtime: may need to increase max user processes (ulimit -u)                                     
fatal error: newosproc                                                                           

Kernel version (uname -a)

Linux amun 4.4.0-81-generic #104-Ubuntu SMP Wed Jun 14 08:17:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

There are a few issues that were opened in the past with the same error message. The most relevant comment I found in all of them is this comment, which suggests that this could be a kernel issue and whose author was looking for a way to reproduce the problem. Some interesting notes:

  1. Setting GOMAXPROCS to 1 makes the problem hard to reproduce (and may even eliminate it).
  2. The Go runtime usually gets a chance to run for a while before the process's threads are killed. That means the process will sometimes exec successfully and exit 0, and will sometimes exit with a non-zero status code after panicking.
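For anyone who wants to try note 1 without changing the environment, here is a small sketch of setting the same limit from code. The limitToOneP helper is a hypothetical name. Running with the environment variable GOMAXPROCS=1 should have the same effect without touching the program.

```go
package main

import (
	"fmt"
	"runtime"
)

// limitToOneP caps the number of OS threads that may execute Go code
// simultaneously at one and reports the previous and resulting settings.
// The helper name is hypothetical.
func limitToOneP() (prev, cur int) {
	prev = runtime.GOMAXPROCS(1) // sets the limit, returns the old value
	cur = runtime.GOMAXPROCS(0)  // an argument below 1 only queries
	return prev, cur
}

func main() {
	prev, cur := limitToOneP()
	fmt.Println("GOMAXPROCS was", prev, "and is now", cur)
}
```

Note that GOMAXPROCS limits threads running Go code at the same time, not the total number of threads the runtime may create, which may be why it only makes the failure rarer.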
@jvshahid
Contributor Author

/cc @ianlancetaylor since I referenced his comment

@ianlancetaylor
Contributor

I'm not surprised that this fails, and I don't think it's a bug. Running go run main.go means starting the Go tool, which will look at main.go, check that all the imports are up to date, run the compiler, run the linker, and only then run your (simple) program. While it is doing that, your shell loop has plenty of time to loop around and start another instance of go run main.go. The number of go run main.go builds running in parallel will steadily increase, especially as the load on the system increases and each one takes longer and longer to complete. Soon you will hit your process limit (which you can check by running ulimit -u) and you will get the error you are reporting.

If you want to show a real problem, run go build main.go and then run ./main in a loop. Then you will be running a very simple program where there is a realistic possibility that the program can complete in the time it takes the shell to loop around. Even then I expect they will tend to stack up, but it should take a lot longer.

@jvshahid
Contributor Author

This while loop is running go run main.go synchronously, i.e. it will wait for it to exit. A simple way to verify that is to replace the echo $PWD with echo before && sleep 10 && echo after.

@jvshahid
Contributor Author

jvshahid commented Jun 28, 2017

Also worth noting: this is reproducible after a few runs (10 or 20 runs). It is not consuming all the PIDs on the system.

@ianlancetaylor
Contributor

Ah, OK, sorry.

What does ulimit -u print on your system?

Immediately after the loop fails, what does ps print?

@jvshahid
Contributor Author

$ ulimit -u
62821

It is really hard to make the loop fail, but I currently have 377 threads running. I don't imagine this loop adding enough processes and/or threads to exceed the limit:

$ ps -elF | wc -l
377

@jvshahid
Contributor Author

Here are the system-wide limits:

$ cat /proc/sys/kernel/pid_max
32768
jvshahid@amun [~/codez/gocodez/src/github.com/jvshahid/testexec]
$ cat /proc/sys/kernel/threads-max
125642

I really doubt this has anything to do with limits
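The limits quoted above can also be collected from inside a Go program on Linux. This is a sketch under the assumption that /proc is mounted and that /proc/self/limits uses its usual column layout. The parseMaxProcesses and readTrimmed helpers are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseMaxProcesses extracts the soft and hard "Max processes" limits (the
// value that ulimit -u reports) from the text of /proc/<pid>/limits.
func parseMaxProcesses(limits string) (soft, hard string, ok bool) {
	for _, line := range strings.Split(limits, "\n") {
		if !strings.HasPrefix(line, "Max processes") {
			continue
		}
		fields := strings.Fields(strings.TrimPrefix(line, "Max processes"))
		if len(fields) >= 2 {
			return fields[0], fields[1], true
		}
	}
	return "", "", false
}

// readTrimmed returns a file's contents without surrounding whitespace, or a
// short note if the file cannot be read.
func readTrimmed(path string) string {
	data, err := os.ReadFile(path)
	if err != nil {
		return "unavailable: " + err.Error()
	}
	return strings.TrimSpace(string(data))
}

func main() {
	if soft, hard, ok := parseMaxProcesses(readTrimmed("/proc/self/limits")); ok {
		fmt.Println("max user processes (soft/hard):", soft, hard)
	}
	fmt.Println("pid_max:", readTrimmed("/proc/sys/kernel/pid_max"))
	fmt.Println("threads-max:", readTrimmed("/proc/sys/kernel/threads-max"))
}
```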

@bradfitz changed the title from "failed to create new OS thread during syscall.Exec" to "runtime: failed to create new OS thread during syscall.Exec" on Jun 28, 2017
@bradfitz added the NeedsInvestigation label (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.) on Jun 28, 2017
@bradfitz
Contributor

Any difference with Go 1.9beta2?

@ianlancetaylor
Contributor

Ah, you're right. This is #18146 for a program that doesn't use cgo. Sorry for forgetting about that.

@ianlancetaylor added this to the Go1.10 milestone on Jun 28, 2017
@ianlancetaylor added the NeedsFix label and removed the NeedsInvestigation label on Jun 28, 2017
@jvshahid
Contributor Author

@bradfitz yes, go1.9beta2 fixes the issue. I'm guessing it is 91139b8 that fixed it by introducing a lock. I was also curious whether you think setting GOMAXPROCS to 1 is a reasonable workaround in the meantime?
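To illustrate why a lock could help here, the sketch below shows one general pattern: thread creation takes a shared (read) lock and exec takes the exclusive (write) lock, so no thread is being created while the exec is in flight. This is not the code from the commit. The execGate type, its methods, and the use of sync.RWMutex are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// execGate is a hypothetical stand-in for a lock that keeps thread creation
// and exec from overlapping.
type execGate struct {
	mu       sync.RWMutex
	creating int32 // thread creations currently in progress
}

// createThread runs create under the shared lock: creations may overlap with
// each other, but never with an exec.
func (g *execGate) createThread(create func()) {
	g.mu.RLock()
	defer g.mu.RUnlock()
	atomic.AddInt32(&g.creating, 1)
	defer atomic.AddInt32(&g.creating, -1) // runs before RUnlock
	create()
}

// exec runs doExec under the exclusive lock and returns how many creations
// were in progress at that moment. A real exec does not return on success,
// so the unlock only matters on the failure path.
func (g *execGate) exec(doExec func()) int32 {
	g.mu.Lock()
	defer g.mu.Unlock()
	inFlight := atomic.LoadInt32(&g.creating)
	doExec()
	return inFlight
}

// simulate races n creations against one exec and returns the number of
// creations the exec observed.
func simulate(n int) int32 {
	var g execGate
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.createThread(func() {})
		}()
	}
	observed := g.exec(func() {})
	wg.Wait()
	return observed
}

func main() {
	fmt.Println("creations in flight during exec:", simulate(100))
}
```

Because the counter is only changed while the read lock is held, the exec side can never observe a creation in progress, whatever the scheduling order.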

@bradfitz
Contributor

Good to hear. So I guess what this bug needs now is a test.

I think we'd prefer you use go1.9beta2 as your fix rather than GOMAXPROCS=1 as a workaround. Go 1.9 has no known bugs compared to Go 1.8.

@bradfitz added the Testing label on Jun 28, 2017
@bradfitz changed the title from "runtime: failed to create new OS thread during syscall.Exec" to "runtime: add test for syscall failing to create new OS thread during syscall.Exec" on Jun 28, 2017
@jvshahid
Contributor Author

@bradfitz do you think converting the bash loop into a Go test would be OK to merge? I'm concerned that this might be a flaky test. What do you think?

@bradfitz
Contributor

Ideally the test should execute pretty quickly. And flaky tests are no good, but I don't see why this one would be flaky. Rather than expect a failure in, say, 10,000 iterations, just do 10,000 iterations and pass if you don't get a failure. Assuming you used to generally get a failure in 10,000 iterations.
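The suggestion above (run a fixed number of iterations and pass only if none fails) might be sketched as follows. This is not an actual runtime test: the runIterations helper and the looptest name are hypothetical, and the binary to run is assumed to be a pre-built copy of the reproduction program passed on the command line.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// runIterations calls run up to n times and stops at the first failure. It
// returns the number of successful iterations and the first error, if any.
func runIterations(n int, run func() error) (int, error) {
	for i := 0; i < n; i++ {
		if err := run(); err != nil {
			return i, fmt.Errorf("iteration %d: %w", i, err)
		}
	}
	return n, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: looptest <binary> [args...]")
		return
	}
	run := func() error {
		cmd := exec.Command(os.Args[1], os.Args[2:]...)
		cmd.Stderr = os.Stderr
		// Run waits for the child to exit, so iterations never overlap;
		// a non-zero exit status is reported as an error.
		return cmd.Run()
	}
	done, err := runIterations(10000, run)
	if err != nil {
		fmt.Fprintln(os.Stderr, "failed after", done, "successful runs:", err)
		os.Exit(1)
	}
	fmt.Println("ok:", done, "iterations")
}
```

Passing the run step in as a function keeps the loop itself easy to exercise without spawning processes.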

@rsc changed the milestone from Go1.10 to Go1.11 on Nov 22, 2017
@odeke-em
Member

Hello @jvshahid, might you be interested or available to submit a CL with the suggested test for Go1.11?

@ianlancetaylor changed the milestone from Go1.11 to Unplanned on Jul 9, 2018
5 participants