Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http, crypto/tls: first HTTP request is consistently slower during TLS handshake in Go 1.17, but not 1.18 #50298

Closed
janivanecky opened this issue Dec 21, 2021 · 6 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@janivanecky
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.17.5 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.17.5"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/0h/wc_q919d4_9_f0wj1p4v4h840000gn/T/go-build4221745526=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I'm seeing the first http request using net/http package and Client to be consistently 100-200ms slower than the consecutive ones. This happens both when sending requests to one or multiple servers. The code to reproduce the issue:

package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptrace"
	"time"
)

func main() {
	request := func (client *http.Client, url string) {
		t := time.Now()
		fmt.Println("")
		clientTrace := &httptrace.ClientTrace{
			GetConn:      func(hostPort string) { s := time.Now(); fmt.Println(s.Sub(t), "starting to create conn ", hostPort); t = s },
			DNSStart:     func(info httptrace.DNSStartInfo) { s := time.Now(); fmt.Println(s.Sub(t), "starting to look up dns", info); t = s },
			DNSDone:      func(info httptrace.DNSDoneInfo) { s := time.Now(); fmt.Println(s.Sub(t), "done looking up dns", info); t = s },
			ConnectStart: func(network, addr string) { s := time.Now(); fmt.Println(s.Sub(t), "starting tcp connection", network, addr); t = s },
			ConnectDone:  func(network, addr string, err error) { s := time.Now(); fmt.Println(s.Sub(t), "tcp connection created", network, addr, err) ; t = s},
			GotConn:      func(info httptrace.GotConnInfo) { s := time.Now(); fmt.Println(s.Sub(t), "connection established", info.Reused); t = s },
			GotFirstResponseByte: func() { s := time.Now(); fmt.Println(s.Sub(t), "got first byte"); t = s },
			TLSHandshakeStart: func() { s := time.Now(); fmt.Println(s.Sub(t), "handshake start"); t = s },
			TLSHandshakeDone: func(tt tls.ConnectionState, e error) {s := time.Now(); fmt.Println(s.Sub(t), "handshake end"); t = s},
		}
		
		req, _ := http.NewRequest("GET", url, nil)
		req.Close = true
		clientTraceCtx := httptrace.WithClientTrace(req.Context(), clientTrace)
		req = req.WithContext(clientTraceCtx)
		response, _ := client.Do(req)
		response.Body.Close()
	}

	client := &http.Client{}
	request(client, "https://news.ycombinator.com")
	client = &http.Client{}
	request(client, "https://news.ycombinator.com")
	client = &http.Client{}
	request(client, "https://news.ycombinator.com")
}

What did you expect to see?

Roughly the same timings for all the requests.

What did you see instead?

145.166µs starting to create conn  news.ycombinator.com:443
276.459µs starting to look up dns {news.ycombinator.com}
58.451916ms done looking up dns {[{209.216.230.240 }] <nil> false}
616.542µs starting tcp connection tcp 209.216.230.240:443
190.6365ms tcp connection created tcp 209.216.230.240:443 <nil>
711.875µs handshake start
517.936208ms handshake end
620.167µs connection established false
194.723083ms got first byte

115.5µs starting to create conn  news.ycombinator.com:443
93.959µs starting to look up dns {news.ycombinator.com}
8.060583ms done looking up dns {[{209.216.230.240 }] <nil> false}
41.167µs starting tcp connection tcp 209.216.230.240:443
179.021541ms tcp connection created tcp 209.216.230.240:443 <nil>
224.875µs handshake start
362.071834ms handshake end
152.291µs connection established false
180.119625ms got first byte

91.333µs starting to create conn  news.ycombinator.com:443
129.208µs starting to look up dns {news.ycombinator.com}
2.597417ms done looking up dns {[{209.216.230.240 }] <nil> false}
74.833µs starting tcp connection tcp 209.216.230.240:443
181.5515ms tcp connection created tcp 209.216.230.240:443 <nil>
151.334µs handshake start
359.525791ms handshake end
157.042µs connection established false
179.980583ms got first byte

The slowdown for the first request happens for every server I tried and it's always only the first request, whether the Client is reused or recreated for every request.

Just to make sure that there's no confusion - I tried this for multiple servers, the first request is always slower. For server X, the average request duration is N if it's the first request and M if it's one of the consecutive requests. Duration M for consecutive requests is roughly the same across multiple runs regardless of whether the first request was to the server X or some other server. The duration M is always 100-200ms lower than N.

The logs I posted seem to indicate that the bulk of the slowdown is coming from TLS handshake. I couldn't find anything in the code that would point to some overhead/initialization happening exclusively for the first request.

I'm not necessarily saying there's a bug - this might be the expected behavior. In that case it would be great to update the documentation, I spend few hours researching this and found no explanation for this behavior.

Thank you for your time!

@dmitshur dmitshur changed the title net/http: First http request consistently slower. net/http: first HTTP request is consistently slower Dec 21, 2021
@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 21, 2021
@dmitshur
Copy link
Contributor

In the test performed, client := &http.Client{} looks suspicious. From https://pkg.go.dev/net/http#Client.Transport:

// Transport specifies the mechanism by which individual
// HTTP requests are made.
// If nil, DefaultTransport is used.
Transport RoundTripper

If all 3 instances of an HTTP client end up reusing the default transport, there may be something cached and reused for future connections.

Maybe this isn't a problem because of req.Close = true line, but still, I'd suggest avoiding reusing the transport to make the test more fair.

This doesn't appear to reproduce consistently for me with Go 1.18 Beta 1 and using my internet. I got a result that looks like:

$ go run . | grep "got first byte"
112.980708ms got first byte
93.152333ms got first byte
84.753375ms got first byte
$ go run . | grep "got first byte"
79.133125ms got first byte
85.475709ms got first byte
82.661083ms got first byte
$ go run . | grep "got first byte"
80.984625ms got first byte
78.987416ms got first byte
91.047583ms got first byte

So, it happened on first program execution, but not second and third.

I would imagine this can happen because of how the internet works: when making a request to a new server for the first time, routers take a bit longer to route packets the first time. For subsequent requests, those "first time" delays aren't incurred. For related discussion, see https://stackoverflow.com/questions/54078692/why-first-network-call-takes-more-time-than-subsequent-ones.

This is to say there may still be room for improvement in certain parts of net/http, but the provided test is too high level to find where the opportunities may be.

@dmitshur
Copy link
Contributor

consistently 100-200ms slower

Can you clarify how you're getting to that number? From your log, it looked like all 3 requests got first byte within 20 ms, not 100-200 ms.

@seankhliao seankhliao added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 21, 2021
@janivanecky
Copy link
Author

janivanecky commented Dec 21, 2021

I'd suggest avoiding reusing the transport to make the test more fair.

I tried running this with creating each client like this:

client := &http.Client{
	Transport: &http.Transport{
		Proxy: http.ProxyFromEnvironment,
		DialContext: (&net.Dialer{
			Timeout:   30 * time.Second,
			KeepAlive: 30 * time.Second,
		}).DialContext,
		ForceAttemptHTTP2:     true,
		MaxIdleConns:          100,
		IdleConnTimeout:       90 * time.Second,
		TLSHandshakeTimeout:   10 * time.Second,
		ExpectContinueTimeout: 1 * time.Second,
	},
}

and the issue persists.

Can you clarify how you're getting to that number? From your log, it looked like all 3 requests got first byte within 20 ms, not 100-200 ms.

Yes, I was not clear about this in my original post - the line to track is handshake end in the log. The timings are not cumulative.

janivanecky@MacBook-Pro blnk % go run example.go | grep "handshake end"
499.478584ms handshake end
358.794625ms handshake end
358.046667ms handshake end
janivanecky@MacBook-Pro blnk % go run example.go | grep "handshake end"
496.45375ms handshake end
363.463291ms handshake end
356.473209ms handshake end

I would imagine this can happen because of how the internet works: when making a request to a new server for the first time, routers take a bit longer to route packets the first time. For subsequent requests, those "first time" delays aren't incurred. For related discussion, see https://stackoverflow.com/questions/54078692/why-first-network-call-takes-more-time-than-subsequent-ones.

Makes sense, but I'm seeing the same thing when sending requests across different servers:

janivanecky@MacBook-Pro blnk % go run example.go | grep "handshake end"
487.623042ms https://news.ycombinator.com handshake end
33.04375ms https://twitter.com handshake end
janivanecky@MacBook-Pro blnk % go run example.go | grep "handshake end"
131.811708ms https://twitter.com handshake end
360.978583ms https://news.ycombinator.com handshake end

@dmitshur
Copy link
Contributor

dmitshur commented Dec 21, 2021

Thanks for clarifying.

I'm still not seeing the same difference you're describing on my machine, at least not with Go 1.18 Beta 1:

$ go run . | grep "handshake end"
184.652292ms handshake end
185.861708ms handshake end
232.005459ms handshake end
$ go run . | grep "handshake end"
206.014125ms handshake end
198.452334ms handshake end
171.361875ms handshake end
$ go run . | grep "handshake end"
181.764333ms handshake end
180.761042ms handshake end
176.7205ms handshake end

Edit: But I do see it with Go 1.17.5!

I'll let someone more familiar with net/http and crypto/tls take a look, possibly later on due to #50241.

@dmitshur dmitshur removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 21, 2021
@dmitshur dmitshur changed the title net/http: first HTTP request is consistently slower net/http, crypto/tls: first HTTP request is consistently slower during TLS handshake Dec 21, 2021
@janivanecky
Copy link
Author

Thank you anyways Dmitri!

I tried running this with Go 1.18 Beta 1 and I don't see the issue either. I think the absolute first request to the server is still the slowest, by a small margin, but the speed up persists across multiple runs.

@dmitshur dmitshur changed the title net/http, crypto/tls: first HTTP request is consistently slower during TLS handshake net/http, crypto/tls: first HTTP request is consistently slower during TLS handshake in Go 1.17, but not 1.18 Dec 21, 2021
@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
@seankhliao
Copy link
Member

Closing as 1.17 is no longer supported

@seankhliao seankhliao closed this as not planned Won't fix, can't repro, duplicate, stale Aug 20, 2022
@golang golang locked and limited conversation to collaborators Aug 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants