net: dragonfly broken by Listen after close on same addr #13146

Closed
bradfitz opened this issue Nov 4, 2015 · 9 comments

@bradfitz
Contributor

bradfitz commented Nov 4, 2015

I seem to have broken the dragonfly builder after:

syscall: allow nacl's fake network code to Listen twice on the same address
https://go-review.googlesource.com/16650 (rev 8ee90fa)

... which added a test to the net package, which dragonfly can't pass.

Failure is:

http://build.golang.org/log/732b7056b17948b06073dbe399de931765307152

--- FAIL: TestListenCloseListen (0.00s)
    net_test.go:283: failed on try 1/10: listen tcp 127.0.0.1:52085: bind: address already in use
    net_test.go:283: failed on try 2/10: listen tcp 127.0.0.1:52093: bind: address already in use
    net_test.go:283: failed on try 3/10: listen tcp 127.0.0.1:52101: bind: address already in use
    net_test.go:283: failed on try 4/10: listen tcp 127.0.0.1:52109: bind: address already in use
    net_test.go:283: failed on try 5/10: listen tcp 127.0.0.1:52117: bind: address already in use
FAIL
FAIL    net 4.894s
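
For readers without the test source at hand, the sequence the failing test exercises can be sketched roughly as below. This is a minimal sketch, not the actual `TestListenCloseListen` code: the helper name `listenCloseListen` is hypothetical, and it makes a single attempt rather than the ten tries shown in the log.

```go
package main

import (
	"fmt"
	"net"
)

// listenCloseListen is a hypothetical helper (not part of the net package).
// It listens on an ephemeral loopback port, closes the listener, and then
// immediately listens again on the exact same address.
func listenCloseListen() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return fmt.Errorf("first listen: %w", err)
	}
	// Remember the concrete host:port the kernel picked.
	addr := ln.Addr().String()
	if err := ln.Close(); err != nil {
		return fmt.Errorf("close: %w", err)
	}
	// The step that fails with "address already in use" in the log above.
	ln, err = net.Listen("tcp", addr)
	if err != nil {
		return fmt.Errorf("second listen on %s: %w", addr, err)
	}
	return ln.Close()
}

func main() {
	if err := listenCloseListen(); err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println("ok")
}
```

On a platform where the second `Listen` succeeds, the helper returns nil; on the affected builder it would presumably surface the same bind error as the log.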

I see a recent failure: http://build.golang.org/log/c6e3c951e0398d21ff52b932bc261ab70003120b

--- FAIL: TestServerConnState (1.02s)
    serve_test.go:3041: Unexpected events.
        Got log: Conn 1: new active idle active closed 
        Conn 2: new active idle active 
        Conn 3: new active hijacked 
        Conn 4: new active hijacked 
        Conn 5: new closed 
        Conn 6: new active closed 
        Conn 7: new active idle closed 

           Want: Conn 1: new active idle active closed 
        Conn 2: new active idle active closed 
        Conn 3: new active hijacked 
        Conn 4: new active hijacked 
        Conn 5: new closed 
        Conn 6: new active closed 
        Conn 7: new active idle closed 

FAIL
FAIL    net/http    20.463s

... which suggests the net package, poller, or runtime are processing networking events in a different order than all the other platforms.

We also don't have a new-style builder for dragonfly, so people without dragonfly can't use gomote or trybots to debug.

I think a Dragonfly person needs to help.

@ianlancetaylor
Contributor

This suggests that there is something wrong with the way we are using SO_REUSEADDR on DragonFly. Unfortunately we use the same code on all the *BSD systems, so I have no idea why DragonFly would behave differently.

@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Nov 5, 2015
@mdempsky
Member

CC @fupjack @4a6f656c

@fupjack

fupjack commented Nov 14, 2015

I am working on tracking someone down who can solve it, who I will immediately refer to as NotMe.

@jorisgio

Hi, my understanding is that SO_REUSEADDR is meant to relax the rules on address collisions at bind time. The man page says:

SO_REUSEADDR indicates that the rules used in validating addresses supplied in a bind(2) call should allow reuse of local addresses.

This is not really clear, but it appears to mean that bindings may collide: for instance, you can bind to '0.0.0.0:80' and then bind to '192.168.1.1:80', which is disallowed without this flag. But you cannot bind twice to the exact same (addr, port) tuple.

SO_REUSEPORT allows completely duplicate bindings by multiple processes

SO_REUSEPORT, on the other hand, allows this behavior: you can bind twice to the syntactically identical (addr, port) tuple.

I've tested your (bind/listen/close/bind/listen) example (in C) and indeed, it works with SO_REUSEPORT but returns EADDRINUSE with SO_REUSEADDR.

EDIT: this does work on FreeBSD with SO_REUSEADDR, so maybe dfly should match that behavior.

@ianlancetaylor
Contributor

SO_REUSEADDR permits a server to quickly bind to the same local address, without having to wait for the TCP TIME_WAIT state to expire. This is a long-standing feature of the socket network API. If it doesn't work on Dragonfly (I don't know whether it does or not), that has to be considered a bug in Dragonfly.

SO_REUSEPORT is much newer. It permits multiple servers to bind to the same local address. That is not what we want here.
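
To make the option concrete, here is a minimal sketch of setting SO_REUSEADDR explicitly on a Go listener through `net.ListenConfig`. It assumes a Unix-like system, where the descriptor handed to the control callback can be converted to an `int`, and the helper name `listenReuseAddr` is hypothetical. As far as I know the net package already sets this option on TCP listeners on Unix-like platforms, so the sketch only illustrates where the option is applied rather than something callers normally need to do.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"syscall"
)

// listenReuseAddr is a hypothetical helper that opens a TCP listener with
// SO_REUSEADDR set before bind(2) is called. Unix-only as written.
func listenReuseAddr(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		// Control runs after the socket is created and before it is bound.
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd),
					syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	ln, err := listenReuseAddr("127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("listening on", ln.Addr())
}
```

Swapping in SO_REUSEPORT here would change the semantics to the duplicate-binding behavior described above, which is not what the test needs.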

@jorisgio

This is indeed a bug in dfly.

@fupjack

fupjack commented Nov 15, 2015

Fixed here:

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/296c350d3c63a181744b80a4b7973dac5fc162a3

I've cleared the most recent breaks in the DragonFly builder and restarted them, and they are completing this test. (Albeit still failing with a different error that I suspect is related to the gold linker coming in to DragonFly.)

@mikioh
Contributor

mikioh commented Feb 10, 2016

Looks like dragonfly guys have finished working on handling of shared IP control blocks. So the latest DragonFly kernels, at least 4.4 and above, have no issues. Also we can re-enable TestDualStack{TCP,UDP}Listener in Go 1.7. Closing.

@mikioh mikioh closed this as completed Feb 10, 2016
@gopherbot

CL https://golang.org/cl/19406 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 25, 2016
It looks like the latest DragonFly BSD kernels, at least 4.4 and above,
have finished working on handling of shared IP control blocks. Let's
re-enable test cases referring to IP control blocks and see what
happens.

Updates #13146.

Change-Id: Icbe2250e788f6a445a648541272c99b598c3013d
Reviewed-on: https://go-review.googlesource.com/19406
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@golang golang locked and limited conversation to collaborators Feb 28, 2017