Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: setting IdleConnTimeout to a low value on http.Transport is causing non-deterministic tls handshake failures #33891

Open
ahmetb opened this issue Aug 27, 2019 · 1 comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@ahmetb
Copy link

ahmetb commented Aug 27, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.9 darwin/amd64

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/ahmetb/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/ahmetb/go"
GOPROXY=""
GORACE=""
GOROOT="/Users/ahmetb/.homebrew/Cellar/go/1.12.9/libexec"
GOTMPDIR=""
GOTOOLDIR="/Users/ahmetb/.homebrew/Cellar/go/1.12.9/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/h7/n46zg3md4l57vzsgxcs355hw00dl55/T/go-build140871803=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Consider the following program that sets

  • IdleConnTimeout: 1 * time.Nanosecond
  • MaxIdleConns: 1

and makes GET https://example.com calls:

package main

import (
	"net/http"
	"time"
	"log"
	"net"
)

func main() {
	c := &http.Client{
		Transport: &http.Transport{
			IdleConnTimeout:       1 * time.Nanosecond,
			MaxIdleConns: 1,
			DialContext: (&net.Dialer{
				KeepAlive: -1,
			}).DialContext,
		},
	}

	for {
		req, _ := http.NewRequest(http.MethodGet, "https://www.example.com", nil)
		resp, err := c.Do(req)
		if err != nil {
			log.Fatal(err)
		}
		resp.Body.Close()
		log.Printf("request complete")
		time.Sleep(100*time.Millisecond)
	}
}

What did you expect to see?

Client uses tls connection fully. Puts it back, then the timeout starts counting.

What did you see instead?

It seems like there’s a race between this deadline and how the connection is used. After 1 nanosecond the connection expires and goes away from the pool, so I get non-deterministic tls handshake failures

Sometimes 1 or 2 requests work before tls handshake error hits. Sometimes it directly fails on the first request:

go run main.go
2019/08/27 16:48:17 request complete
2019/08/27 16:48:17 Get https://www.example.com: tls: use of closed connection
exit status 1
go run main.go
2019/08/27 16:48:21 Get https://www.example.com: tls: use of closed connection
exit status 1

Apparently the conn is actually being used for tls handshake, possibly because 1 nanosecond is too low so that when the connection is established, it goes away just before TLS handshake starts.

Not sure if this is expected.

@networkimprov
Copy link

cc @FiloSottile

@julieqiu julieqiu added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 29, 2019
@seankhliao seankhliao added this to the Unplanned milestone Aug 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants