Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/crypto/ssh: ssh.Dial can hang indefinitely #15113

Closed
a-robinson opened this issue Apr 4, 2016 · 6 comments
Closed

x/crypto/ssh: ssh.Dial can hang indefinitely #15113

a-robinson opened this issue Apr 4, 2016 · 6 comments

Comments

@a-robinson
Copy link

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    1.4.2
  2. What operating system and processor architecture are you using (go env)?
    linux, amd64
  3. What did you do?
    Called ssh.Dial()
  4. What did you expect to see?
    I'd expect ssh.Dial() to return either an opened client or an error within some reasonable amount of time, ideally no more than a few minutes.
  5. What did you see instead?
    ssh.Dial() has been hung for more than three hours.

We ran into this in Kubernetes. We aren't doing anything special--just SSHing to a known host and port over TCP with a specified user and public key. It appears as though the underlying problem is that a TCP connection was established (i.e. net.Dial() returned a connection), but then establishing an SSH connection using the underlying TCP connection hung. That means that #14941 wouldn't help here, since that just specifies a timeout for net.Dial(), not for establishing the client SSH connection after that.

Stack trace from the stuck goroutine:

goroutine 85851 [IO wait, 178 minutes]:
net.(*pollDesc).Wait(0xc20ae2e840, 0x72, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc20ae2e840, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).Read(0xc20ae2e7e0, 0xc20c38b2d0, 0x1, 0x1, 0x0, 0x7f20600e52a0, 0xc20c38b2d8)
    /usr/src/go/src/net/fd_unix.go:242 +0x40f
net.(*conn).Read(0xc20b7add18, 0xc20c38b2d0, 0x1, 0x1, 0x0, 0x0, 0x0)
    /usr/src/go/src/net/net.go:121 +0xdc
io.ReadAtLeast(0x7f20600eb598, 0xc20b7add18, 0xc20c38b2d0, 0x1, 0x1, 0x1, 0x0, 0x0, 0x0)
    /usr/src/go/src/io/io.go:298 +0xf1
io.ReadFull(0x7f20600eb598, 0xc20b7add18, 0xc20c38b2d0, 0x1, 0x1, 0x40, 0x0, 0x0)
    /usr/src/go/src/io/io.go:316 +0x6d
golang.org/x/crypto/ssh.readVersion(0x7f20600eb598, 0xc20b7add18, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/k8s.io/kubernetes/Godeps/_workspace/src/golang.org/x/crypto/ssh/transport.go:303 +0x167
golang.org/x/crypto/ssh.exchangeVersions(0x7f205ff485c0, 0xc20b7add18, 0xc20c38b2c0, 0xa, 0x10, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/k8s.io/kubernetes/Godeps/_workspace/src/golang.org/x/crypto/ssh/transport.go:287 +0x2f1
golang.org/x/crypto/ssh.(*connection).clientHandshake(0xc20cdb9100, 0xc20bdf6380, 0x11, 0xc20b56e1e0, 0x0, 0x0)
    /go/src/k8s.io/kubernetes/Godeps/_workspace/src/golang.org/x/crypto/ssh/client.go:91 +0x132
golang.org/x/crypto/ssh.NewClientConn(0x7f20600ea500, 0xc20b7add18, 0xc20bdf6380, 0x11, 0xc20b56e140, 0x0, 0x0, 0x0, 0xe, 0x0, ...)
    /go/src/k8s.io/kubernetes/Godeps/_workspace/src/golang.org/x/crypto/ssh/client.go:74 +0x140
golang.org/x/crypto/ssh.Dial(0x1d48080, 0x3, 0xc20bdf6380, 0x11, 0xc20b56e140, 0x11, 0x0, 0x0)
    /go/src/k8s.io/kubernetes/Godeps/_workspace/src/golang.org/x/crypto/ssh/client.go:176 +0xf9
k8s.io/kubernetes/pkg/ssh.(*SSHTunnel).Open(0xc20d0490e0, 0x0, 0x0)
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/ssh/ssh.go:114 +0xbc
k8s.io/kubernetes/pkg/ssh.(*SSHTunnelList).createAndAddTunnel(0xc208228360, 0xc2085c6720, 0xe)
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/ssh/ssh.go:433 +0x331
created by k8s.io/kubernetes/pkg/ssh.(*SSHTunnelList).removeAndReAdd
    /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/pkg/ssh/ssh.go:345 +0x3ae

cc @cjcullen

@hanwen
Copy link
Contributor

hanwen commented Apr 6, 2016

I don't think the SSH spec says anything about what constitutes "reasonable" timeout. The HTTP library also defaults to no timeouts.

ssh.Dial is a convenience method, and you can use NewClientConn with a TCP connection tweaked to your desire

I assume you know that the remote end is probably not an SSH server, or else it would hang on the version exchange.

@a-robinson
Copy link
Author

Thanks @hanwen.

Just to make sure I'm on the same page, if we open the connection ourselves and customize it, the io.ReadFull call will give up depending on how we configure it. Is setting a hard deadline on the connection with SetDeadline() rather than a timeout on creating the ClientConn the only option? Setting a hard deadline on a long-lived connection SSH connection feels wrong to me, but I'm probably misunderstanding something. Would the recommended pattern be to set the deadline before establishing the ClientConn, then unset it afterwards?

@hanwen
Copy link
Contributor

hanwen commented Apr 7, 2016

SSH has support for keepalive. OpenSSH sends them, and the Go client can also issue them. Just send a "keepalive@openssh.com" request, either global or per-channel every once in a while. This ensures somethign gets sent over the connection so it appears live.

@minusnine
Copy link

@hanwen Now that Go 1.7 is out, what do you think about having a new version of NewClientConn (and maybe Dial) support a context.Context?

The caller can do everything the library could do, but it's messy, and this would be a nice API for timing out/canceling session initialization.

@hanwen
Copy link
Contributor

hanwen commented Jul 22, 2016

Can you come up with a comprehensive list of things you want in this -ehrm- context? All of net/http/ goes through http.Request which provides a convenient place to add a context, but that is not the case for SSH, and I'd hate to have to duplicate all functions to provide Context flavors.

@dsnet
Copy link
Member

dsnet commented Mar 1, 2017

Apparently, this functionality has been adding in https://golang.org/cl/21136

@dsnet dsnet closed this as completed Mar 1, 2017
@golang golang locked and limited conversation to collaborators Mar 1, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants