net/http: tls handshake EOF on new connections when passing connection between goroutines #10685

Closed
bbangert opened this issue May 4, 2015 · 9 comments


bbangert commented May 4, 2015

When using SSL with websockets (via either Gorilla's or golang.org's websocket library), a moderate connection load results in clients failing to connect, and the server logs this message:

2015/05/04 13:31:58 http: TLS handshake error from 127.0.0.1:5253: EOF
2015/05/04 13:31:58 http: TLS handshake error from 127.0.0.1:5254: EOF
2015/05/04 13:31:58 http: TLS handshake error from 127.0.0.1:5255: EOF
2015/05/04 13:31:58 http: TLS handshake error from 127.0.0.1:5256: EOF
2015/05/04 13:31:58 http: TLS handshake error from 127.0.0.1:5257: EOF

This doesn't happen in my basic example (located here: https://github.com/bbangert/ssl-ram-testing/blob/master/Go/golang.org/main.go) when it uses io.Copy to implement the echo server. When it instead uses another goroutine to write the echoed data back, the TLS handshake errors appear.

I observed this using the latest Go, 1.4.2. We have observed it in prior versions of Go in production, but this is the first time I have had a minimal example of it (written for testing SSL RAM overhead per connection, which is very high in Go, but that's a separate issue).

mdempsky changed the title from "tls handshake EOF when using a goroutine in http/websockets" to "x/net/websocket: tls handshake EOF when using a goroutine" on May 4, 2015
Member

mdempsky commented May 4, 2015

Can you please provide instructions on how to reproduce the issue? It sounds like you said the main.go file you linked does not demonstrate the problem.

Author

bbangert commented May 4, 2015

The Go file I linked to has both types of echo in it. The default build exhibits the EOF issue; setting the environment variable USE_COPY=true runs the other echo handler, which uses io.Copy and does not exhibit the EOF.
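For readers who don't want to open the linked file, here is a rough sketch of what "both types of echo" selected by USE_COPY could look like. It is not the code from the repository: the function names, the `loopback` test type, and the exact way the variable is compared against "true" are all assumptions made for illustration, and it works on a plain io.ReadWriter rather than a real websocket connection.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"strings"
)

// echoCopy echoes everything on the handler's own goroutine via io.Copy.
func echoCopy(rw io.ReadWriter) error {
	_, err := io.Copy(rw, rw)
	return err
}

// echoViaGoroutine reads on the handler's goroutine but performs the
// writes from a second goroutine, as in the variant that showed the errors.
func echoViaGoroutine(rw io.ReadWriter) error {
	ch := make(chan []byte)
	errc := make(chan error, 1)
	go func() {
		var werr error
		for data := range ch { // keep draining even after a write error
			if werr == nil {
				_, werr = rw.Write(data)
			}
		}
		errc <- werr
	}()

	buf := make([]byte, 4096)
	var rerr error
	for {
		n, err := rw.Read(buf)
		if n > 0 {
			ch <- append([]byte(nil), buf[:n]...) // copy: buf is reused
		}
		if err != nil {
			if err != io.EOF {
				rerr = err
			}
			break
		}
	}
	close(ch) // lets the writer goroutine exit
	if werr := <-errc; werr != nil {
		return werr
	}
	return rerr
}

// pickHandler mirrors the USE_COPY switch (assumed comparison with "true").
func pickHandler(useCopy string) func(io.ReadWriter) error {
	if useCopy == "true" {
		return echoCopy
	}
	return echoViaGoroutine
}

// loopback is a hypothetical in-memory stand-in for a connection:
// reads come from a fixed input, writes are collected in a buffer.
type loopback struct {
	in  *strings.Reader
	out bytes.Buffer
}

func (l *loopback) Read(p []byte) (int, error)  { return l.in.Read(p) }
func (l *loopback) Write(p []byte) (int, error) { return l.out.Write(p) }

// run pushes msg through the selected handler and returns what was echoed.
func run(useCopy, msg string) (string, error) {
	lb := &loopback{in: strings.NewReader(msg)}
	err := pickHandler(useCopy)(lb)
	return lb.out.String(), err
}

func main() {
	out, err := run(os.Getenv("USE_COPY"), "hello")
	fmt.Println(out, err)
}
```

Both handlers produce the same echo on an in-memory stream; the difference reported in this thread only shows up with real TLS connections under load.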

README for the Go code: https://github.com/bbangert/ssl-ram-testing/tree/master/Go/golang.org

To run the client tester, the directions are here: https://github.com/bbangert/ssl-ram-testing

You will need to raise the open file descriptor limit for the shells running the testing client and the Go binary, because the tester attaches 1000 clients:

ulimit -n 4096

Then run the Go code in its directory with: USE_SSL=true ./run
And the tester can be run from the project root dir: USE_SSL=true ./tester/client

Author

bbangert commented May 4, 2015

I should note that the exact same error occurred when using Gorilla's websocket implementation, so I believe this error is in the http TLS lib itself, not x/net/websocket.

bbangert changed the title from "x/net/websocket: tls handshake EOF when using a goroutine" to "net/http: tls handshake EOF on new connections when passing connection between goroutines" on May 4, 2015
Author

bbangert commented May 4, 2015

I'm updating the title to reflect the least common denominator. Since this is reproducible using both gorilla's websocket library and code.google.com/p/go.net/websocket, it doesn't make sense to say that it's an error in the websocket library.

They both use the net/http library, however, and the behavior occurs whenever a goroutine other than the one initially spawned to handle the request writes to the connection object. I have tried several other variants to rule out race conditions; e.g., I tweaked it to use the channel in lock-step so that a read and a write never happen at the same time:

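func echoHandler(ws *websocket.Conn) {
    echoChan := make(chan string)
    defer close(echoChan) // ends the writer goroutine's range loop
    go func() {
        for data := range echoChan {
            websocket.Message.Send(ws, data)
            echoChan <- data // signal that the write has finished
        }
    }()

    var d string
    for {
        // Receive needs a pointer so it can store the message in d.
        if err := websocket.Message.Receive(ws, &d); err != nil {
            return
        }
        echoChan <- d // hand the message to the writer goroutine
        <-echoChan    // wait for the write before reading again
    }
}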

That doesn't help. However, if the new goroutine does not write the data out and instead just passes it back, so that the original goroutine handling the request does the write, the TLS handshake problems disappear.

My only theory at this point is that the connection has somehow mangled things in the system somewhere on being passed between the goroutines, such that new clients get TLS handshake errors.
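For comparison, the variant described above that avoids the errors (the second goroutine only hands the data back, and the goroutine that read the request does the write) might look roughly like the sketch below. It is written against a plain io.ReadWriter and exercised over net.Pipe so it runs without a websocket library; `echoOwnWrites` and `roundTrip` are hypothetical names, and the worker's "processing" is a no-op placeholder.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// echoOwnWrites reads from conn, passes each chunk through a worker
// goroutine, and writes the result back from the same goroutine that
// did the read. The worker never touches conn.
func echoOwnWrites(conn io.ReadWriter) error {
	work := make(chan []byte)
	done := make(chan []byte)
	defer close(work) // lets the worker exit when the handler returns

	go func() {
		for data := range work {
			done <- data // placeholder for real processing
		}
	}()

	buf := make([]byte, 4096)
	for {
		n, err := conn.Read(buf)
		if n > 0 {
			work <- append([]byte(nil), buf[:n]...) // copy: buf is reused
			reply := <-done
			if _, werr := conn.Write(reply); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

// roundTrip runs the handler on one end of an in-memory pipe, sends msg
// through the other end, and returns what comes back.
func roundTrip(msg string) (string, error) {
	client, server := net.Pipe()
	defer client.Close()
	go func() {
		defer server.Close()
		echoOwnWrites(server)
	}()

	if _, err := client.Write([]byte(msg)); err != nil {
		return "", err
	}
	reply := make([]byte, len(msg))
	if _, err := io.ReadFull(client, reply); err != nil {
		return "", err
	}
	return string(reply), nil
}

func main() {
	out, err := roundTrip("hello")
	fmt.Println(out, err)
}
```

The point of the shape is only that every Read and Write on the connection stays on one goroutine; whether that is what actually avoided the handshake errors is not established in this thread.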

Author

bbangert commented May 5, 2015

I just compiled Go tip, and the TLS handshake errors are gone. Anyone using websockets (whether golang.org/x/net/websocket or gorilla) should be aware that with Go 1.4.2 and earlier, using a connection from two goroutines will cause bad things to happen.

I was about to hit Comment on this, when I saw that the CPU usage had skyrocketed for the basic little websocket server, even though I dropped all the connections (and memory use climbed rapidly, about 150 MB of memory every 5 seconds, so I killed it quickly). So something still seems to be wrong in some way.

Any other versions of Go I should test this with?

Contributor

bradfitz commented May 5, 2015

Let's move this conversation to golang-nuts@ and come back here when there are concrete bugs to be worked on.

It's not clear what this issue is about anymore.

bradfitz closed this as completed on May 5, 2015
Author

bbangert commented May 5, 2015

@bradfitz Have you tried running it the way I documented clearly so you could see what happens?

Contributor

bradfitz commented May 5, 2015

No, because you said it was fixed. Although I don't recall any relevant fixes during the Go 1.5 cycle.

Please discuss this on golang-nuts@ where others might help identify a resource leak in your demo.

Author

bbangert commented May 5, 2015

Ah, sorry, yes, 1.5 has fixed it. I forgot to stop a goroutine when the connection breaks, so my example leaked very quickly. After fixing that, 1.5 seems to work properly, so having this closed is fine.
