Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: transport.CloseIdleConnections() does not have any effect #14607

Closed
blixt opened this issue Mar 2, 2016 · 5 comments
Closed

x/net/http2: transport.CloseIdleConnections() does not have any effect #14607

blixt opened this issue Mar 2, 2016 · 5 comments
Milestone

Comments

@blixt
Copy link
Contributor

blixt commented Mar 2, 2016

I'm writing a service that delivers notifications to Apple's APNS service via their HTTP/2 API. For some reason APNS stops caring about an HTTP/2 connection after a certain amount of time of idling. The problem is that HTTP/2 connections are retained in the client pool, and they eventually become unusable because they are 100% guaranteed to time out once APNS stops caring about them. Despite this, they're not removed from the HTTP/2 client pool which means my service is rendered unusable.

Initially I was thinking of keeping the connection alive by pinging, but I'm unable to find a way to send PING frames with the Go HTTP library, so what I tried to do instead is to kill idle connections and reestablish them later (since this should be a fairly rare occurrence anyway). However, it appears that calling CloseIdleConnections() on the Transport instance does nothing. See example below with relevant code + log entries.

(Note that using an invalid device token in this example is intentional.)

// Set up an HTTP/2 client:
cert, err := tls.LoadX509KeyPair("secrets/cert.pem", "secrets/cert.key")
if err != nil {
    log.Fatalln(err)
}
config := &tls.Config{
    Certificates: []tls.Certificate{cert},
}
config.BuildNameToCertificate()
transport := &http.Transport{
    TLSClientConfig: config,
}
// Explicitly enable HTTP/2 as TLS-configured clients don't auto-upgrade.
// See: https://github.com/golang/go/issues/14275
if err := http2.ConfigureTransport(transport); err != nil {
    log.Fatalln(err)
}
client := &http.Client{
    Timeout:   3 * time.Second,
    Transport: transport,
}
// Perform a POST request to APNS:
client.Do(req)
…
2016/03/02 17:14:23 http2: Transport failed to get client conn for api.push.apple.com:443: http2: no cached connection was available
2016/03/02 17:14:23 http2: Transport creating client conn to 17.143.164.140:443
2016/03/02 17:14:23 Unhandled Setting: [HEADER_TABLE_SIZE = 4096]
2016/03/02 17:14:23 Unhandled Setting: [MAX_HEADER_LIST_SIZE = 8000]
2016/03/02 17:14:23 http2: Transport encoding header ":authority" = "api.push.apple.com"
2016/03/02 17:14:23 http2: Transport encoding header ":method" = "POST"
2016/03/02 17:14:23 http2: Transport encoding header ":path" = "/3/device/abc"
2016/03/02 17:14:23 http2: Transport encoding header ":scheme" = "https"
2016/03/02 17:14:23 http2: Transport encoding header "content-type" = "application/json"
2016/03/02 17:14:23 http2: Transport encoding header "accept-encoding" = "gzip"
2016/03/02 17:14:23 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2016/03/02 17:14:23 http2: Transport received SETTINGS flags=ACK len=0
2016/03/02 17:14:23 http2: Transport received HEADERS flags=END_HEADERS stream=1 len=1
2016/03/02 17:14:23 http2: Transport decoded header field ":status" = "400"
2016/03/02 17:14:23 http2: Transport received DATA flags=END_STREAM stream=1 len=27 data="{\"reason\":\"BadDeviceToken\"}"
…
// 30 seconds pass by, then this is called:
transport.CloseIdleConnections()
// Another 30 seconds pass by, note below that a connection is being reused.
…
2016/03/02 17:15:33 http2: Transport encoding header ":authority" = "api.push.apple.com"
2016/03/02 17:15:33 http2: Transport encoding header ":method" = "POST"
2016/03/02 17:15:33 http2: Transport encoding header ":path" = "/3/device/abc"
2016/03/02 17:15:33 http2: Transport encoding header ":scheme" = "https"
2016/03/02 17:15:33 http2: Transport encoding header "content-type" = "application/json"
2016/03/02 17:15:33 http2: Transport encoding header "accept-encoding" = "gzip"
2016/03/02 17:15:33 http2: Transport encoding header "user-agent" = "Go-http-client/2.0"
2016/03/02 17:15:33 http2: Transport received HEADERS flags=END_HEADERS stream=7 len=1
2016/03/02 17:15:33 http2: Transport decoded header field ":status" = "400"
2016/03/02 17:15:33 http2: Transport received DATA flags=END_STREAM stream=7 len=27 data="{\"reason\":\"BadDeviceToken\"}"

(I'm using go version go1.6 darwin/amd64)

@blixt
Copy link
Contributor Author

blixt commented Mar 3, 2016

I've investigated this further and when running a Docker instance on my computer (MacBook Pro) I can't reproduce the timeout issue, but it does occur regularly on my Kubernetes pod (on Google Container Engine), so it's probably because the Go sockets can't detect that they've been shut down by Google Cloud's keep-alive timeout of 10 minutes.

Something is still not right with transport.CloseIdleConnections() because even if the clients are not handling any requests, they are not being shut down. I'm calling the function every 10 seconds and it's clear that the old client is still being reused.

I've attached the full set of logs (kubelogs.txt) from my Kubernetes pod showing how requests work and then after a while start failing with timeouts (the client.Timeout is set to 3 seconds), but the same client keeps being reused nonetheless. Eventually, after a very long time, the client finally dies with Transport readFrame error: (*net.OpError) read tcp …->…: read: connection timed out after which a new client is created and everything works again (until it doesn't).

While this issue is for transport.CloseIdleConnections(), any hints on how to resolve the root cause would be appreciated (you can e-mail me at blixt@rogertalk.com to keep this issue from sidetracking).

@bradfitz bradfitz self-assigned this Apr 9, 2016
@bradfitz bradfitz added this to the Go1.7 milestone Apr 9, 2016
@bradfitz
Copy link
Contributor

Sorry for the delay. I see the problem. Looks like an easy fix.

@bradfitz
Copy link
Contributor

https://go-review.googlesource.com/23258

@gopherbot
Copy link

CL https://golang.org/cl/23258 mentions this issue.

gopherbot pushed a commit to golang/net that referenced this issue May 20, 2016
…together

Go 1.6+ wires up its http.Transport values to http2 with ConfigureTransport.
ConfigureTransport sets the connection pool to a "noDialClientConnPool"
type. But the Transport.CloseIdleConnections implementation was specifically
looking for a concrete type. Look for an interface instead.

Updates golang/go#14607 (needs bundle into the main repo before fix)

Change-Id: I6ad23b6facab5d3c2cbe71a1809a52794375d803
Reviewed-on: https://go-review.googlesource.com/23258
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>
@gopherbot
Copy link

CL https://golang.org/cl/23280 mentions this issue.

@golang golang locked and limited conversation to collaborators May 20, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants