x/crypto/ssh: ssh.Dial can hang indefinitely #15113
Comments
I don't think the SSH spec says anything about what constitutes a "reasonable" timeout. The HTTP library also defaults to no timeouts. ssh.Dial is a convenience method, and you can use NewClientConn with a TCP connection tweaked to your liking. I assume you know that the remote end is probably not an SSH server, or else it would hang on the version exchange.
Thanks @hanwen. Just to make sure I'm on the same page: if we open the connection ourselves and customize it, the io.ReadFull call will give up depending on how we configure it. Is setting a hard deadline on the connection with SetDeadline(), rather than a timeout on creating the ClientConn, the only option? Setting a hard deadline on a long-lived SSH connection feels wrong to me, but I'm probably misunderstanding something. Would the recommended pattern be to set the deadline before establishing the ClientConn, then unset it afterwards?
SSH has support for keepalive. OpenSSH sends them, and the Go client can also issue them. Just send a "keepalive@openssh.com" request, either global or per-channel, every once in a while. This ensures something gets sent over the connection so it appears live.
@hanwen Now that Go 1.7 is out, what do you think about having a new version of NewClientConn (and maybe Dial) support a context.Context? The caller can do everything the library could do, but it's messy, and this would be a nice API for timing out/canceling session initialization.
Can you come up with a comprehensive list of things you want in this -ehrm- context? All of net/http/ goes through http.Request which provides a convenient place to add a context, but that is not the case for SSH, and I'd hate to have to duplicate all functions to provide Context flavors.
Apparently, this functionality has been added in https://golang.org/cl/21136
Please answer these questions before submitting your issue. Thanks!
What version of Go are you using (go version)? 1.4.2
What operating system and processor architecture are you using (go env)? linux, amd64
Called ssh.Dial()
I'd expect ssh.Dial() to return either an opened client or an error within some reasonable amount of time, ideally no more than a few minutes.
ssh.Dial() has been hung for more than three hours.
We ran into this in Kubernetes. We aren't doing anything special--just SSHing to a known host and port over TCP with a specified user and public key. It appears as though the underlying problem is that a TCP connection was established (i.e. net.Dial() returned a connection), but then establishing an SSH connection using the underlying TCP connection hung. That means that #14941 wouldn't help here, since that just specifies a timeout for net.Dial(), not for establishing the client SSH connection after that.
Stack trace from the stuck goroutine:
cc @cjcullen