New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: investigate and fix uncaught allocations and regressions #25482
Comments
Are there any considerable regressions from Go 1.10 to 1.11? If not, it seems safest and easiest to do this when the tree reopens. |
First, somebody should benchmark the mux for Go 1.10 and tip. If it's really bad, we can investigate and do a minimal perf fix for Go 1.11. Otherwise this is all Go 1.12 material. I will assume it's Go 1.12 for now, but somebody should still benchmark it soon. |
@mvdan great question! So, so far what prompted this issue was #25383 which was manually noticed. Roger that, perhaps when the tree opens. Perhaps we could even make tooling that runs benchmarks as part of the Trybots and reports when there have been huge fluctuations say |8%| I have worked on some internal tool that does that, but I'll need to get the time to package it |
That is not a small project. It's been done twice and abandoned twice. First was #8930 (comment) and second (current) is #20552. |
Roger that, thanks for the cautionary tale @bradfitz! Alright, so for some results, I ran all the net/http benchmarks on both Go1.10 and tip devel 87a18c6 with the results at [1] My machine's information Darwin Emmanuels-MacBook-Pro-2.local 15.6.0 Darwin Kernel Version 15.6.0: Tue Jan 9 20:12:05 PST 2018; root:xnu-3248.73.5~1/RELEASE_X86_64 x86_64 i386 SynopsisThe ones that sorely stick out:
and the above matches the mux regression report, but with an addition ~5% regression on tip, so in general: ~15% regression since Go1.10, ~10% regression on tip since before 2fd1b52, at least on my machine
All results[1] with a full gist of results at https://gist.github.com/odeke-em/3a6c95c4aa74d15619ab95728337e615. $ benchstat benchmarks-go1.10.txt benchmarks-go-devel-87a18c6.txt
name old time/op new time/op delta
Client-4 130µs ± 5% 125µs ± 8% ~ (p=0.343 n=4+4)
ClientServer-4 104µs ± 7% 107µs ± 2% ~ (p=0.343 n=4+4)
ClientServerParallel4-4 1.26ms ±254% 0.42ms ±100% ~ (p=1.000 n=4+4)
ClientServerParallel64-4 147µs ±69% 197µs ±34% ~ (p=0.686 n=4+4)
ClientServerParallelTLS4-4 168µs ±30% 157µs ±22% ~ (p=0.886 n=4+4)
ClientServerParallelTLS64-4 58.9µs ±32% 62.4µs ±122% ~ (p=1.000 n=4+4)
CloseNotifier-4 220µs ±15% 209µs ± 5% ~ (p=0.486 n=4+4)
CookieString-4 951ns ± 1% 982ns ± 1% +3.31% (p=0.029 n=4+4)
HeaderWriteSubset-4 558ns ± 1% 528ns ± 2% -5.29% (p=0.029 n=4+4)
ReadCookies-4 3.61µs ± 1% 3.69µs ± 1% +2.29% (p=0.029 n=4+4)
ReadRequestApachebench-4 1.63µs ± 1% 1.66µs ± 1% +1.36% (p=0.029 n=4+4)
ReadRequestChrome-4 2.95µs ± 1% 2.92µs ± 1% ~ (p=0.343 n=4+4)
ReadRequestCurl-4 1.62µs ± 0% 1.65µs ± 1% +1.84% (p=0.029 n=4+4)
ReadRequestSiege-4 2.11µs ± 1% 2.13µs ± 0% ~ (p=0.343 n=4+4)
ReadRequestWrk-4 1.14µs ± 1% 1.17µs ± 1% +2.26% (p=0.029 n=4+4)
ReadSetCookies-4 3.24µs ± 1% 3.29µs ± 1% ~ (p=0.143 n=4+4)
ResponseStatusLine-4 35.1ns ± 2% 37.9ns ± 6% ~ (p=0.086 n=4+4)
ServeMux-4 78.9µs ± 2% 91.0µs ± 1% +15.37% (p=0.029 n=4+4)
Server-4 131µs ± 3% 131µs ± 0% ~ (p=0.343 n=4+4)
ServerFakeConnNoKeepAlive-4 18.9µs ± 1% 18.8µs ± 0% ~ (p=0.486 n=4+4)
ServerFakeConnWithKeepAlive-4 10.8µs ± 5% 10.3µs ±12% ~ (p=0.200 n=4+4)
ServerFakeConnWithKeepAliveLite-4 6.87µs ± 2% 6.83µs ± 1% ~ (p=0.800 n=4+4)
ServerHandlerNoHeader-4 6.47µs ± 2% 6.60µs ± 2% ~ (p=0.057 n=4+4)
ServerHandlerNoLen-4 7.75µs ± 4% 7.58µs ± 1% ~ (p=0.057 n=4+4)
ServerHandlerNoType-4 7.85µs ± 1% 7.82µs ± 1% ~ (p=0.200 n=4+4)
ServerHandlerTypeLen-4 8.35µs ± 2% 8.17µs ± 2% ~ (p=0.086 n=4+4)
ServerHijack-4 20.3µs ± 0% 20.5µs ± 1% +1.21% (p=0.029 n=4+4)
name old alloc/op new alloc/op delta
Client-4 3.49kB ± 0% 3.49kB ± 0% ~ (p=1.000 n=4+4)
ClientServer-4 4.98kB ± 0% 4.98kB ± 0% ~ (p=0.429 n=4+4)
ClientServerParallel4-4 7.92kB ± 7% 7.93kB ± 2% ~ (p=0.486 n=4+4)
ClientServerParallel64-4 4.66kB ±12% 4.51kB ± 3% ~ (p=0.343 n=4+4)
ClientServerParallelTLS4-4 18.3kB ±28% 20.4kB ±21% ~ (p=0.343 n=4+4)
ClientServerParallelTLS64-4 4.98kB ±11% 4.78kB ± 6% ~ (p=0.686 n=4+4)
CloseNotifier-4 3.32kB ± 0% 3.31kB ± 0% ~ (p=0.229 n=4+4)
CookieString-4 384B ± 0% 360B ± 0% -6.25% (p=0.029 n=4+4)
HeaderWriteSubset-4 0.00B ±NaN% 0.00B ±NaN% ~ (all samples are equal)
ReadCookies-4 2.01kB ± 0% 2.01kB ± 0% ~ (all samples are equal)
ReadRequestApachebench-4 883B ± 0% 883B ± 0% ~ (all samples are equal)
ReadRequestChrome-4 1.46kB ± 0% 1.46kB ± 0% ~ (all samples are equal)
ReadRequestCurl-4 880B ± 0% 880B ± 0% ~ (all samples are equal)
ReadRequestSiege-4 965B ± 0% 965B ± 0% ~ (all samples are equal)
ReadRequestWrk-4 832B ± 0% 832B ± 0% ~ (all samples are equal)
ReadSetCookies-4 960B ± 0% 960B ± 0% ~ (all samples are equal)
ResponseStatusLine-4 0.00B ±NaN% 0.00B ±NaN% ~ (all samples are equal)
ServeMux-4 19.9kB ± 0% 19.9kB ± 0% ~ (all samples are equal)
Server-4 2.24kB ± 0% 2.24kB ± 0% ~ (p=0.286 n=4+4)
ServerFakeConnNoKeepAlive-4 4.64kB ± 0% 4.59kB ± 0% -1.09% (p=0.029 n=4+4)
ServerFakeConnWithKeepAlive-4 2.44kB ± 0% 2.44kB ± 0% ~ (all samples are equal)
ServerFakeConnWithKeepAliveLite-4 1.30kB ± 0% 1.30kB ± 0% ~ (all samples are equal)
ServerHandlerNoHeader-4 1.30kB ± 0% 1.30kB ± 0% ~ (all samples are equal)
ServerHandlerNoLen-4 2.08kB ± 0% 2.08kB ± 0% ~ (all samples are equal)
ServerHandlerNoType-4 2.08kB ± 0% 2.08kB ± 0% ~ (all samples are equal)
ServerHandlerTypeLen-4 2.12kB ± 0% 2.12kB ± 0% ~ (all samples are equal)
ServerHijack-4 16.1kB ± 0% 16.0kB ± 0% -0.30% (p=0.029 n=4+4)
name old allocs/op new allocs/op delta
Client-4 49.0 ± 0% 49.0 ± 0% ~ (all samples are equal)
ClientServer-4 65.0 ± 0% 65.0 ± 0% ~ (all samples are equal)
ClientServerParallel4-4 79.5 ± 8% 76.5 ± 2% ~ (p=0.114 n=4+4)
ClientServerParallel64-4 64.5 ± 5% 63.2 ± 1% ~ (p=0.229 n=4+4)
ClientServerParallelTLS4-4 196 ±26% 211 ±19% ~ (p=0.686 n=4+4)
ClientServerParallelTLS64-4 69.2 ± 7% 66.8 ± 4% ~ (p=0.543 n=4+4)
CloseNotifier-4 54.0 ± 0% 50.0 ± 0% -7.41% (p=0.029 n=4+4)
CookieString-4 3.00 ± 0% 5.00 ± 0% +66.67% (p=0.029 n=4+4)
HeaderWriteSubset-4 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
ReadCookies-4 16.0 ± 0% 16.0 ± 0% ~ (all samples are equal)
ReadRequestApachebench-4 9.00 ± 0% 9.00 ± 0% ~ (all samples are equal)
ReadRequestChrome-4 14.0 ± 0% 14.0 ± 0% ~ (all samples are equal)
ReadRequestCurl-4 9.00 ± 0% 9.00 ± 0% ~ (all samples are equal)
ReadRequestSiege-4 11.0 ± 0% 11.0 ± 0% ~ (all samples are equal)
ReadRequestWrk-4 7.00 ± 0% 7.00 ± 0% ~ (all samples are equal)
ReadSetCookies-4 17.0 ± 0% 17.0 ± 0% ~ (all samples are equal)
ResponseStatusLine-4 0.00 ±NaN% 0.00 ±NaN% ~ (all samples are equal)
ServeMux-4 540 ± 0% 540 ± 0% ~ (all samples are equal)
Server-4 20.0 ± 0% 20.0 ± 0% ~ (all samples are equal)
ServerFakeConnNoKeepAlive-4 48.0 ± 0% 49.0 ± 0% +2.08% (p=0.029 n=4+4)
ServerFakeConnWithKeepAlive-4 22.0 ± 0% 22.0 ± 0% ~ (all samples are equal)
ServerFakeConnWithKeepAliveLite-4 12.0 ± 0% 12.0 ± 0% ~ (all samples are equal)
ServerHandlerNoHeader-4 12.0 ± 0% 12.0 ± 0% ~ (all samples are equal)
ServerHandlerNoLen-4 17.0 ± 0% 17.0 ± 0% ~ (all samples are equal)
ServerHandlerNoType-4 18.0 ± 0% 18.0 ± 0% ~ (all samples are equal)
ServerHandlerTypeLen-4 20.0 ± 0% 20.0 ± 0% ~ (all samples are equal)
ServerHijack-4 46.0 ± 0% 47.0 ± 0% +2.17% (p=0.029 n=4+4)
name old speed new speed delta
ReadRequestApachebench-4 50.2MB/s ± 1% 49.5MB/s ± 1% -1.33% (p=0.029 n=4+4)
ReadRequestChrome-4 207MB/s ± 1% 209MB/s ± 1% ~ (p=0.343 n=4+4)
ReadRequestCurl-4 48.1MB/s ± 0% 47.2MB/s ± 1% -1.81% (p=0.029 n=4+4)
ReadRequestSiege-4 71.6MB/s ± 1% 71.0MB/s ± 0% ~ (p=0.343 n=4+4)
ReadRequestWrk-4 35.0MB/s ± 1% 34.3MB/s ± 1% -2.22% (p=0.029 n=4+4) |
Alright, I've also just discovered another regression from CookieString where we switched to using strings.Builder in commit 48db2c0#diff-44567335da7512324c74b50e5fcb3def aka CL https://go-review.googlesource.com/102479 The results are wild for CookieString with both ns/op and allocs/op increasing but more radically allocs/op increasing by 40% BeforeBenchmarkCookieString-4 1000000 1055 ns/op 384 B/op 3 allocs/op AfterBenchmarkCookieString-4 1000000 1086 ns/op 360 B/op 5 allocs/op While I don't think Cookie.String is used a lot; for the niche cases of servers that log cookies per request, this is a huge incurred cost, thus I propose that for Go1.11 we should roll back the CookieString change. It worked well before and am not sure that the change to use strings.Builder provides any better readability improvements either. Paging @bradfitz and @ianlancetaylor as per the reported CL |
@quasilyte nice, but that required some specialized skills and scrutiny #LiamNeeson |
Change https://golang.org/cl/113996 mentions this issue: |
CL 96575 introduced concurrency protection for ServeMux.shouldRedirect with a read lock and deferred unlock. However, the change produced a noticeable regression. Instead add the suffix "RLocked" to the function name to declare that we should hold the read lock as a pre-requisite before calling it, hence avoiding the defer altogether. Benchmarks: name old time/op new time/op delta ServeMux-8 63.3µs ± 0% 54.6µs ± 0% -13.74% (p=0.000 n=9+9) ServeMux_SkipServe-8 41.4µs ± 2% 32.7µs ± 1% -21.05% (p=0.000 n=10+10) name old alloc/op new alloc/op delta ServeMux-8 17.3kB ± 0% 17.3kB ± 0% ~ (all equal) ServeMux_SkipServe-8 0.00B 0.00B ~ (all equal) name old allocs/op new allocs/op delta ServeMux-8 360 ± 0% 360 ± 0% ~ (all equal) ServeMux_SkipServe-8 0.00 0.00 ~ (all equal) Updates #25383 Updates #25482 Change-Id: I2ffa4eafe165faa961ce23bd29b5653a89facbc2 Reviewed-on: https://go-review.googlesource.com/113996 Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
I just made a fix for https://golang.org/issue/25383 with https://go-review.googlesource.com/c/go/+/113996 and that CL just puts a bandaid on the issue which was a regression that was manually noticed by @quasilyte.
The real issue as raised by @bradfitz is that there have been a bunch of allocations and regressions that have crept into net/http code over the past couple of months/years, that are more concerning.
This issue is to track the mentioned the problem raised. I am currently too swamped to comprehensively do investigative and performance work but if anyone would like to take on this issue or would like to work on this for Go1.11 or perhaps during Go1.12 or so, I would be very happy to help out whether pairing or as a "bounty", please feel free to reach out to me.
The text was updated successfully, but these errors were encountered: