syscall: Accept sometimes returns address family not supported by protocol family error on OS X #3849
Yes, you may be right, it's very strange indeed. I looked into xnu and I think there might be a race condition there somewhere. Basically, during accept it takes a malloc'ed sockaddr, which could be NULL, and there might be a chance (I haven't checked whether the locking is correct) for the socket accept to succeed but the sockaddr_in generation to fail (the result of in_setpeeraddr is not checked for errors) because the connection was reset in the meantime (the other possibility, malloc failing, is too unlikely). In short, it may be that on Mac OS X accept() can succeed without filling in the sockaddr address parameter, leaving it with garbage. :(
I've been running my server for a while now and found that this error is actually happening quite often (probably because Chinese IP addresses are constantly trying to scan me). I wrapped ListenAndServe in a for loop (since I got tired of restarting it), and here it is, less than an hour and two errors logged already:

    2012/07/24 00:05:11 Server running...
    2012/07/24 00:25:02 accept tcp 0.0.0.0:12345: address family not supported by protocol family
    2012/07/24 00:51:35 accept tcp 0.0.0.0:12345: address family not supported by protocol family
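For reference, the restart loop described above could look roughly like the sketch below. The names `restartLoop` and `failNTimes` are hypothetical and only for illustration; in the real server the `serve` argument would be a closure around `http.ListenAndServe`, which is replaced here by a fake so the example terminates.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// restartLoop (hypothetical helper) runs serve up to maxRuns times,
// restarting it whenever it returns an error. It returns the number
// of failed runs. In a real server, serve would be something like
//   func() error { return http.ListenAndServe(":12345", nil) }
func restartLoop(serve func() error, maxRuns int, pause time.Duration) int {
	failures := 0
	for i := 0; i < maxRuns; i++ {
		err := serve()
		if err == nil {
			return failures
		}
		failures++
		fmt.Printf("server stopped: %v (restarting)\n", err)
		time.Sleep(pause)
	}
	return failures
}

// failNTimes (hypothetical test double) returns a serve function that
// fails n times with the error text from this issue, then succeeds.
func failNTimes(n int) func() error {
	count := 0
	return func() error {
		if count < n {
			count++
			return errors.New("accept tcp 0.0.0.0:12345: address family not supported by protocol family")
		}
		return nil
	}
}

func main() {
	failures := restartLoop(failNTimes(2), 10, 0)
	fmt.Println("failed runs:", failures)
}
```

A loop like this only hides the symptom: every failed accept still tears down and recreates the listener.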
Sure, I added a simple Write(1, ...) to anyToSockaddr and here's what I got:

    anyToSockaddr: Addr.Len = 0 Addr.Family = 0
    anyToSockaddr failed, nfd = 3
    2012/07/25 00:07:49 accept tcp 0.0.0.0:12345: address family not supported by protocol family
    anyToSockaddr: Addr.Len = 0 Addr.Family = 0
    anyToSockaddr failed, nfd = 4
    2012/07/25 00:07:49 accept tcp 0.0.0.0:12345: address family not supported by protocol family
    anyToSockaddr: Addr.Len = 0 Addr.Family = 0
    anyToSockaddr failed, nfd = 4
    2012/07/25 00:07:55 accept tcp 0.0.0.0:12345: address family not supported by protocol family
    anyToSockaddr: Addr.Len = 0 Addr.Family = 0
    anyToSockaddr failed, nfd = 4
    2012/07/25 00:07:56 accept tcp 0.0.0.0:12345: address family not supported by protocol family

This is especially cool since I found how to reproduce it basically 100% of the time. With the server up on my machine (Mac OS X 10.7.4), I run this on my Linux box in London (port 80 is forwarded by my router to port 12345 on my machine):

    $ nmap -p 80 myhost

It works like a charm and reliably triggers this bug. :)

Here's the diff in pkg/syscall where I added that print:

    diff -r 5e806355a9e1 src/pkg/syscall/syscall_bsd.go
    --- a/src/pkg/syscall/syscall_bsd.go	Thu Jun 14 12:50:42 2012 +1000
    +++ b/src/pkg/syscall/syscall_bsd.go	Wed Jul 25 00:14:05 2012 +0400
    @@ -294,6 +294,7 @@
     		}
     		return sa, nil
     	}
    +	Write(1, []byte("anyToSockaddr: Addr.Len = " + itoa(int(rsa.Addr.Len)) + " Addr.Family = " + itoa(int(rsa.Addr.Family)) + "\n"))
     	return nil, EAFNOSUPPORT
     }

    @@ -306,6 +307,7 @@
     	}
     	sa, err = anyToSockaddr(&rsa)
     	if err != nil {
    +		Write(1, []byte("anyToSockaddr failed, nfd = " + itoa(nfd) + "\n"))
     		Close(nfd)
     		nfd = 0
     	}

It's probably just as I suspected: the socket is accepted, but due to a race in the kernel there's no in_pcb associated with it anymore, so there's no sockaddr and thus nothing is copied to user space, leaving all zeroes.
If you're interested, this is a possible call chain during accept:

http://fxr.watson.org/fxr/source/bsd/kern/kpi_socket.c?v=xnu-1699.24.8#L154
http://fxr.watson.org/fxr/source/bsd/netinet/tcp_usrreq.c?v=xnu-1699.24.8;im=10#L537
http://fxr.watson.org/fxr/source/bsd/netinet/in_pcb.c?v=xnu-1699.24.8;im=10#L1072

I'm not sure there's any sane way to fix (or even work around) it in Go though... it's just not supposed to work like that. :(
Oh wow, it's been right there... First, I was looking in the wrong file (it should be uipc_syscalls.c, not kpi_socket.c), but even there accept_nocancel() simply ignores the error result from soacceptlock(). There might not even be any race: ECONNABORTED simply gets ignored and never propagates to the caller. :-/

Is it feasible to convert such "successes", where Addr.Len is 0, to ECONNABORTED? A preliminary patch is something like this:

    diff -r 5e806355a9e1 src/pkg/syscall/syscall_bsd.go
    --- a/src/pkg/syscall/syscall_bsd.go	Thu Jun 14 12:50:42 2012 +1000
    +++ b/src/pkg/syscall/syscall_bsd.go	Wed Jul 25 01:25:41 2012 +0400
    @@ -304,6 +304,14 @@
     	if err != nil {
     		return
     	}
    +	if rsa.Addr.Len == 0 && rsa.Addr.Family == AF_UNSPEC {
    +		// Workaround for Darwin: xnu ignores errors from
    +		// soacceptlock, so ECONNABORTED is not returned
    +		// from accept syscall. Turn it into a correct
    +		// error here.
    +		Close(nfd)
    +		return 0, nil, ECONNABORTED
    +	}
     	sa, err = anyToSockaddr(&rsa)
     	if err != nil {
     		Close(nfd)

I'm not sure if this is entirely correct though.
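The check in that patch can be illustrated in isolation. The sketch below is not the syscall package code: `rawHeader`, `looksAborted` and `acceptError` are hypothetical names, and `rawHeader` only mimics the two leading fields (`Len`, `Family`) of a BSD-style sockaddr so that the example compiles on any platform.

```go
package main

import (
	"fmt"
	"syscall"
)

// rawHeader is a hypothetical stand-in for the first two fields of a
// BSD sockaddr as filled in by accept().
type rawHeader struct {
	Len    uint8
	Family uint8
}

// looksAborted reports whether the kernel returned a "successful"
// accept without writing any address, i.e. the header is all zeroes.
func looksAborted(h rawHeader) bool {
	return h.Len == 0 && h.Family == syscall.AF_UNSPEC
}

// acceptError maps an empty sockaddr to ECONNABORTED, mirroring the
// idea of the preliminary patch; otherwise it returns nil.
func acceptError(h rawHeader) error {
	if looksAborted(h) {
		return syscall.ECONNABORTED
	}
	return nil
}

func main() {
	empty := rawHeader{}                   // what the debug print in this thread showed
	inet := rawHeader{Len: 16, Family: 2} // illustrative values for an IPv4 peer

	fmt.Println("empty sockaddr:", acceptError(empty))
	fmt.Println("IPv4 sockaddr: ", acceptError(inet))
}
```

Returning ECONNABORTED rather than EAFNOSUPPORT matters because callers can reasonably treat an aborted connection as a per-connection failure and keep accepting.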
Oh, and just for reference, I managed to find what is probably the same bug, but only in FreeBSD 3: http://fxr.watson.org/fxr/source/kern/uipc_syscalls.c?v=FREEBSD3#L271 (in later FreeBSD versions it appears to be fixed)
This issue was closed by revision 5197fa8. Status changed to Fixed.
adg pushed a commit that referenced this issue on May 11, 2015:
««« backport 0eae95b0307a
syscall: workaround accept() bug on Darwin

Darwin kernels have a bug in accept() where error result from an internal call is not checked and socket is accepted instead of ECONNABORTED error. However, such sockets have no sockaddr, which results in EAFNOSUPPORT error from anyToSockaddr, making Go http servers running on Mac OS X easily susceptible to denial of service from simple port scans with nmap.

Fixes #3849.

R=golang-dev, adg, mikioh.mikioh
CC=golang-dev
https://golang.org/cl/6456045
»»»
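Once accept reports ECONNABORTED instead of EAFNOSUPPORT, a server can skip the aborted connection rather than shut down. The following is a minimal sketch of such an accept loop, assuming the error reaches the caller at all (the standard library may already retry it internally). `isAbortedAccept`, `abortedAcceptError` and `acceptLoop` are hypothetical helpers, not part of the net package.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// isAbortedAccept reports whether err wraps ECONNABORTED. errors.Is
// unwraps *net.OpError and *os.SyscallError to reach the errno.
func isAbortedAccept(err error) bool {
	return errors.Is(err, syscall.ECONNABORTED)
}

// abortedAcceptError builds an error shaped like the one a listener
// would return for an aborted connection (for demonstration only).
func abortedAcceptError() error {
	return &net.OpError{
		Op:  "accept",
		Net: "tcp",
		Err: os.NewSyscallError("accept", syscall.ECONNABORTED),
	}
}

// acceptLoop keeps accepting connections, skipping aborted ones and
// returning only on other errors.
func acceptLoop(ln net.Listener, handle func(net.Conn)) error {
	for {
		conn, err := ln.Accept()
		if err != nil {
			if isAbortedAccept(err) {
				continue // peer went away before accept completed
			}
			return err
		}
		go handle(conn)
	}
}

func main() {
	fmt.Println(isAbortedAccept(abortedAcceptError()))  // wrapped ECONNABORTED
	fmt.Println(isAbortedAccept(syscall.EAFNOSUPPORT)) // a different errno
}
```

With this shape, a port scan that resets connections early costs one loop iteration instead of stopping the server.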
This issue was closed.