runtime, net: OS X 10.9 kernel dampens quicker spinning applications down by default #7582

Closed
gopherbot opened this issue Mar 19, 2014 · 15 comments

Comments

@gopherbot
Contributor

by jake.net:

I have the following code, and after it has been running for a certain amount of time it
crashes. Could this be a file descriptor exhaustion problem?

What does 'go version' print?
go version go1.2.1 darwin/386

Mac OS X 10.9

What steps reproduce the problem?
If possible, include a link to a program on play.golang.org.

http://play.golang.org/p/-74yJshfTk

package main

import (
    "log"
    "net/http"
    "sync"
)

const MaxOutstanding int = 2000

var semaphore = make(chan int, MaxOutstanding)
var wg sync.WaitGroup

func init() {
    for i := 0; i < MaxOutstanding; i++ {
        semaphore <- 1
    }
}

func main() {
    log.Println("start")
    for i := 0; i < 5000; i++ {
        wg.Add(1)
        go handle(i)
    }
    wg.Wait()
    log.Println("finish")
}

func handle(i int) {
    <-semaphore
    process(i)
    semaphore <- 1
}

func process(i int) {
    resp, err := http.Get("http://localhost:3000")
    panicIf(err)
    defer resp.Body.Close()

    log.Println("handle", i)
    wg.Done()
}

func panicIf(err error) {
    if err != nil {
        panic(err)
    }
}

What happened?
Sometimes it says: 
panic: Get http://localhost:3000: dial tcp 127.0.0.1:3000: connection reset by peer
panic: Get http://localhost:3000: dial tcp 127.0.0.1:3000: can't assign requested address

What should have happened instead?
It should not panic :)

Please provide any additional information below.
I have asked on the golang-nuts mailing list 
https://groups.google.com/forum/#!topic/golang-nuts/NY7NMx1jAVo
@davecheney
Contributor

Comment 1:

What does `ulimit -a` print on your system?

@ianlancetaylor
Member

Comment 2:

Labels changed: added release-go1.3, repo-main, os-macosx.

@gopherbot
Contributor Author

Comment 4 by jake.net:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 2560
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited

@davecheney
Contributor

Comment 5:

@jake you are opening 5,000 http connections and only have 2,560 file descriptors
available.

@gopherbot
Contributor Author

Comment 6 by jake.net:

No, even with `const MaxOutstanding int = 200` I can get that error.

@davecheney
Contributor

Comment 7:

Can you try to make a smaller example? If this is about DNS exhaustion you don't need
to bring in the entire HTTP package; just use net.Dial("tcp", "localhost:3000").

@gopherbot
Contributor Author

Comment 8 by jake.net:

I can still get the error just using net.Dial.
Every time I run it there is a different result. I can even set MaxOutstanding to 1 or
10 and still get an error.
package main
import (
    "log"
    "net"
    "sync"
)
const MaxOutstanding int = 300
var semaphore = make(chan int, MaxOutstanding)
var wg sync.WaitGroup
func init() {
    for i := 0; i < MaxOutstanding; i++ {
        semaphore <- 1
    }
}
func main() {
    log.Println("start")
    for i := 0; i < 10000; i++ {
        wg.Add(1)
        go handle(i)
    }
    wg.Wait()
    log.Println("finish")
}
func handle(i int) {
    <-semaphore
    process(i)
    semaphore <- 1
}
func process(i int) {
    conn, err := net.Dial("tcp", "localhost:3000")
    panicIf(err)
    defer conn.Close()
    log.Println("handle", i)
    wg.Done()
}
func panicIf(err error) {
    if err != nil {
        panic(err)
    }
}

@davecheney
Contributor

Comment 9:

I can't reproduce this:
http://play.golang.org/p/rqHasq8tbW
^ your example, slightly shortened. The default number of files on my system is 256;
setting MaxOutstanding to 200 causes the test to pass. Well, until panicIf complains
because there is nothing listening on port 3000.

@gopherbot
Contributor Author

Comment 10 by jake.net:

I spun up a Martini web server on port 3000.
Then I ran your example with const MaxOutstanding int = 100 and it works on the first run.
Then I run it again and it fails.

@davecheney
Contributor

Comment 11:

What error does it fail with?

@gopherbot
Contributor Author

Comment 12 by jake.net:

panic: dial tcp 127.0.0.1:3000: can't assign requested address

@crawshaw
Member

Comment 13:

The concurrency is not necessary to replicate this. Here is a minimal version:
package main
import (
    "fmt"
    "net"
)
func do() {
    for i := 0; i < 200; i++ {
        conn, err := net.Dial("tcp", ":3000")
        if err != nil {
            panic(err)
        }
        if err := conn.Close(); err != nil {
            panic(err)
        }
    }
}
func main() {
    for i := 0; i < 100; i++ {
        fmt.Println("loop", i)
        do()
    }
}
With tip
    go version devel +9eacb9c0d810 Thu Apr 24 12:24:22 2014 -0700 + darwin/amd64
and starting a server
    godoc -http=:3000
this regularly panics about half way through with:
panic: dial tcp :3000: can't assign requested address
goroutine 1 [running]:
runtime.panic(0xf0700, 0xc2100460c0)
    /usr/local/go/src/pkg/runtime/panic.c:266 +0xb6
main.do()
    /Users/crawshaw/junk2.go:12 +0xaa
main.main()
    /Users/crawshaw/junk2.go:24 +0xe2
exit status 2

@mikioh
Contributor

mikioh commented Apr 25, 2014

Comment 14:

This looks like a kind of resource exhaustion issue on the latest OS X, but it is not related
to the number of file/socket descriptors, so you don't need to tweak launchd. I just
tried to repro comment 13 on OS X and got the following:
/var/log/system.log:
process issue7582[3042] caught causing excessive wakeups. Observed wakeups rate (per
sec): 10143; Maximum permitted wakeups rate (per sec): 150; Observation period: 300
seconds; Task lifetime number of wakeups: 45005
So adding time.Sleep (with an appropriate value) to the for-loop certainly gives a
different result, but I'm not sure what we could do for quicker spinning applications on OS
X 10.9 and beyond.

Labels changed: removed release-go1.3.

Status changed to HelpWanted.

@ja30278

ja30278 commented Apr 28, 2014

Comment 15:

After some digging, I'm fairly certain that this is just ephemeral port exhaustion. OS X
only allocates 16k ports for dynamic use, as compared to ~28k on Linux.
Linux:
jonallie@foo:~$ cat /proc/sys/net/ipv4/ip_local_port_range
32768   61000
OSX
jonallie-macbookpro2:gophercon jonallie$ sysctl net.inet.ip.portrange
net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 49152
net.inet.ip.portrange.last: 65535
net.inet.ip.portrange.hifirst: 49152
net.inet.ip.portrange.hilast: 65535
Tellingly, the repro script runs ~100 loops that open 200 connections each, and usually
fails (for me) around loop 80. Increasing the port range via:
jonallie-macbookpro$ sudo sysctl -w net.inet.ip.portrange.first=32768
net.inet.ip.portrange.first: 49152 -> 32768
jonallie-macbookpro$ sudo sysctl -w net.inet.ip.portrange.hifirst=32768
net.inet.ip.portrange.hifirst: 49152 -> 32768
allows the test script to complete a full 100 loops, and increasing it to 200 loops
causes it to fail as expected.
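The numbers above can be checked with a little arithmetic. This is a rough sketch, assuming that a port used by a closed connection stays unavailable for the rest of the run (for example while it sits in TIME_WAIT), so every dial consumes a fresh ephemeral port. The function names are illustrative only.

```go
package main

import "fmt"

// portsInRange returns how many ports lie in the inclusive range [first, last].
func portsInRange(first, last int) int {
	if last < first {
		return 0
	}
	return last - first + 1
}

// loopsUntilExhausted estimates how many complete loops of connsPerLoop
// dials fit into the given number of ports, assuming no port is reused.
func loopsUntilExhausted(ports, connsPerLoop int) int {
	if connsPerLoop <= 0 {
		return 0
	}
	return ports / connsPerLoop
}

func main() {
	const connsPerLoop = 200 // from the repro in comment 13

	osx := portsInRange(49152, 65535)     // default OS X range shown above
	linux := portsInRange(32768, 61000)   // Linux range shown above
	widened := portsInRange(32768, 65535) // OS X after the sysctl change

	fmt.Println(osx, loopsUntilExhausted(osx, connsPerLoop))         // 16384 81
	fmt.Println(linux, loopsUntilExhausted(linux, connsPerLoop))     // 28233 141
	fmt.Println(widened, loopsUntilExhausted(widened, connsPerLoop)) // 32768 163
}
```

The estimate of about 81 loops for the default range matches the failure around loop 80, and about 163 loops for the widened range is consistent with 100 loops passing and 200 loops failing.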

@mikioh
Contributor

mikioh commented Apr 28, 2014

Comment 16:

Good catch!

Status changed to Retracted.
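Since each new outgoing connection takes a fresh ephemeral port, one way to reduce pressure on the port range is to reuse connections instead of dialing for every request. The following is only a sketch of that idea and is not something proposed in this thread. It assumes a modern Go release (io.Discard needs 1.16 or later), uses a throwaway net/http/httptest server in place of the server on port 3000, and relies on net/http reusing a keep-alive connection once the response body has been read to the end and closed. The name countNewConns is hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// countNewConns (hypothetical helper) sends n sequential GET requests to a
// local test server and returns how many TCP connections the server accepted.
func countNewConns(n int) (int64, error) {
	var accepted int64

	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "ok")
		}))
	// Count every freshly accepted connection on the server side.
	srv.Config.ConnState = func(c net.Conn, s http.ConnState) {
		if s == http.StateNew {
			atomic.AddInt64(&accepted, 1)
		}
	}
	srv.Start()
	defer srv.Close()

	client := srv.Client()
	for i := 0; i < n; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, err
		}
		// Drain and close the body so the connection can go back to the
		// client's idle pool and be reused by the next request.
		io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
	}
	return atomic.LoadInt64(&accepted), nil
}

func main() {
	n := 50
	conns, err := countNewConns(n)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d requests used %d TCP connection(s)\n", n, conns)
}
```

When the connection is reused, far fewer ephemeral ports are consumed than requests are made. The original program closes the response body without reading it, which can prevent this reuse.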

@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.
6 participants