x/net/http2: investigate Server speed regressions from more select case #20302
Comments
CL https://golang.org/cl/43034 mentions this issue.
From the profile linked in kubernetes/kubernetes#45216, this looks like lock contention.

runtime.selectgoImpl: 13.06s of 451.69s

Compare this to the profile from Go 1.7:

runtime.selectgoImpl: 4.65s of 12.69s
Pity--looks like mutex profiling doesn't include futex. I wonder how to detect which case is contended; if it is in fact lock contention, knowing which lock it is should give reproduction hints.
/cc @mdempsky @randall77 @aclements for any runtime help. This is a pretty serious speed regression for Kubernetes, but one we can't yet repro in a stand-alone benchmark. Any clues?
@bradfitz it looks like BenchmarkGets has no concurrency (at least according to the trace viewer), so the select statement isn't under any contention. So if the problem is lock contention, it won't show up. Maybe b.RunParallel to the rescue?
(I tried hacking it in myself but broke all sorts of things--obviously requires more domain expertise than I have to get it right.)
Here is a version of BenchmarkServerGets that adds concurrency. I cannot reproduce the bug though. The benchmark has the same runtime before and after commenting out the select cases listed in this comment. I wonder if there's something about their testbed setup that makes really long scheduling delays possible? Or maybe they're running in a VM that has a weird implementation of sys_futex?
@tombergan thanks. That benchmark patch doesn't apply cleanly for me (mail a CL next time?), but I applied it manually, and got:
and lots more errors. Anyway, from visually inspecting the patch, it does set up a concurrent get, but not N concurrent gets, so it doesn't appear that it would really introduce contention. You want something (waves hands) like:
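Roughly this shape, as a minimal sketch (assumptions: an httptest-based setup with EnableHTTP2 rather than the package's internal test harness; the handler, parallelism, and names are placeholders, not the patch under discussion):

```go
package http2bench_test

import (
	"io"
	"net/http"
	"net/http/httptest"
	"testing"
)

// BenchmarkServerGetsParallel is a hypothetical sketch: it spreads b.N GETs
// across GOMAXPROCS goroutines so many handlers are in flight at once. The
// HTTP/2 client multiplexes the requests onto a small number of TCP
// connections, so the server's per-connection serve loop sees concurrent
// streams.
func BenchmarkServerGetsParallel(b *testing.B) {
	ts := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, "hello")
	}))
	ts.EnableHTTP2 = true // negotiate h2 between the test server and its client
	ts.StartTLS()
	defer ts.Close()
	c := ts.Client()

	b.ReportAllocs()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			res, err := c.Get(ts.URL)
			if err != nil {
				b.Error(err)
				return
			}
			io.Copy(io.Discard, res.Body)
			res.Body.Close()
		}
	})
}
```

To push harder on a single connection, b.SetParallelism can raise the goroutine count beyond GOMAXPROCS.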
You can check for full concurrency using runtime tracing. Be sure to use a short benchtime to avoid overwhelming amounts of data.
You should see all procs active the whole time, not just one or two. If it is fully parallel, and there are contention effects, you should also be able to see the effect when you mess with -cpu:
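For example (illustrative invocations only; the benchmark name, package path, and benchtime are placeholders):

```sh
# capture an execution trace of a short benchmark run, then inspect it
go test -run=NONE -bench=ServerGets -benchtime=100ms -trace=trace.out ./http2
go tool trace trace.out

# re-run at several GOMAXPROCS values to look for contention effects
go test -run=NONE -bench=ServerGets -cpu=1,2,4,8 ./http2
```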
Sorry the patch didn't apply cleanly; not sure why that happened. HTTP/2 frames must be serialized onto a TCP connection. Whether or not the "gets" run concurrently on the client side is not relevant to server-side concurrency. Previously, the test forced serial execution on the server because it waited for a response immediately after sending each request. My patched test reads the responses in a background goroutine so the server can respond concurrently with the client sending new requests. The client still needs to send requests serially, since they all go out over the one connection. The server will serialize response frames, but each HTTP request handler should run concurrently; that serialization happens in the server's write path. Thanks for the tips about runtime tracing. Hopefully I have time later today to try that out.
Cool. Let me know if you have any problems with tracing; happy to help. Feel free to also ping me on some form of chat if interactive discussion would be helpful. Brad can give you my contact info.
@tombergan, yes, that's the sort of benchmark I was imagining: one connection, a bunch of concurrent requests streaming from long-running handlers doing Write+Flush.
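For the handler side, something like this sketch (the chunk size, loop count, and package/handler names are made up for illustration):

```go
package http2sketch

import "net/http"

// streamingHandler is a hypothetical long-running handler of the kind
// described above: it streams many chunks, calling Flush after each Write
// so every chunk takes a trip through the server's write path (and, for
// HTTP/2, the serverConn.serve loop).
func streamingHandler(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	chunk := make([]byte, 4<<10) // 4 KiB per write; arbitrary
	for i := 0; i < 100; i++ {   // 100 writes per request; arbitrary
		if _, err := w.Write(chunk); err != nil {
			return // client went away
		}
		flusher.Flush()
	}
}
```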
CL https://golang.org/cl/43230 mentions this issue.
For the diagnosis, see this comment onwards.
Depending on the fix, this might be for Go 1.8.2, since it causes serious problems and there isn't a clear workaround. Marking as Go 1.8.2, with the expectation that others may disagree and move it back.
CL https://golang.org/cl/43455 mentions this issue.
I filed a more general bug at #20351
gracefulShutdownCh is shared by all connections in a server. When a server accumulates many connections (e.g., 5000 in the kubemark-5000 benchmark), we have 5000 serverConn.serve goroutines selecting on this channel. This means 5000 goroutines hammer the channel's lock, which causes severe lock contention.

The fix in this CL is to make a local proxy for gracefulShutdownCh in each connection so that each connection selects on gracefulShutdownCh at most once per connection rather than once per serverConn.serve loop iteration.

This fix is intended to be backported quickly into Go 1.8.2. The downside of this fix is 2KB extra stack usage per connection. A better fix will be implemented in Go 1.9.

Unfortunately, I have been unable to reproduce this problem locally. This fix was verified by the kubernetes team. See: kubernetes/kubernetes#45216 (comment)

Updates golang/go#20302

Change-Id: I19ab19268a6ccab9b6e9dffa0cfbc89b8c7d0f19
Reviewed-on: https://go-review.googlesource.com/43455
Run-TryBot: Tom Bergan <tombergan@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
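The shape of that per-connection proxy, as a rough sketch (the type and method names here are assumed for illustration, not the literal CL):

```go
package http2sketch

// conn stands in for the package's per-connection state; doneServing is
// closed when the connection's serve loop exits. Both names are assumed
// for this sketch.
type conn struct {
	doneServing chan struct{}
}

// proxyShutdown returns a connection-local channel that is closed once the
// shared graceful-shutdown channel is closed. The connection's serve loop
// selects on the local channel, so the shared channel's lock is contended
// at most once per connection instead of once per loop iteration. The cost
// is one extra goroutine (roughly 2KB of stack) per connection, as the CL
// description notes.
func (c *conn) proxyShutdown(shared <-chan struct{}) <-chan struct{} {
	local := make(chan struct{})
	go func() {
		select {
		case <-shared:
			close(local)
		case <-c.doneServing:
			// Connection finished first; let the goroutine exit.
		}
	}()
	return local
}
```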
CL https://golang.org/cl/43459 mentions this issue.
This drops the number of select cases in serverConn.serve from 10 to 6. It was 7 in Go 1.7. It increased to 10 in Go 1.8.

* replace testing-only testHookCh with always-on grab-bag serveMsgCh to be used for both test purposes and infrequently used message types
* remove the settingsTimer.C case for the initial settings timer and use serveMsgCh and time.AfterFunc instead
* ... and do the same for the idle timeout timer.
* ... and for the shutdown timer.
* remove wantStartPushCh and just send the *startPushRequest to serveMsgCh too

I could go further with this (and plan to, later), but these are the safe and easy ones that don't require more work elsewhere.

The speed gets better the more the request/response go via the serverConn.serve loop (once per Request.Body.Read or ResponseWriter.Flush):

| name | old time/op | new time/op | delta |
|------|-------------|-------------|-------|
| ServerGets-4 | 138µs ± 3% | 134µs ± 2% | -2.54% (p=0.002 n=10+10) |
| ServerPosts-4 | 176µs ±27% | 154µs ± 2% | -12.62% (p=0.011 n=10+10) |

Updates kubernetes/kubernetes#45216
Updates golang/go#20302

Change-Id: I18019554089d7e3d76355d7137b5957e9597e803
Reviewed-on: https://go-review.googlesource.com/43034
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tom Bergan <tombergan@google.com>
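The general pattern behind that consolidation, sketched below (the type, field, and message names are assumed for illustration; the real serve loop carries more cases and message kinds):

```go
package http2sketch

import "time"

// Sketch of the consolidation described above: infrequent events are
// delivered as messages on a single grab-bag channel instead of each
// getting its own select case. Timers use time.AfterFunc to send onto that
// channel, so the hot serve-loop select stays small while rare events can
// still wake it up.
type settingsTimeoutMsg struct{}
type idleTimeoutMsg struct{}

type serveLoop struct {
	msgCh chan interface{} // read only by the serve loop
}

func (s *serveLoop) sendMsg(m interface{}) { s.msgCh <- m }

func (s *serveLoop) run(settingsTimeout, idleTimeout time.Duration) {
	// Each AfterFunc replaces a dedicated timer channel and select case.
	t1 := time.AfterFunc(settingsTimeout, func() { s.sendMsg(settingsTimeoutMsg{}) })
	defer t1.Stop()
	t2 := time.AfterFunc(idleTimeout, func() { s.sendMsg(idleTimeoutMsg{}) })
	defer t2.Stop()

	for {
		select {
		case m := <-s.msgCh:
			switch m.(type) {
			case settingsTimeoutMsg:
				// handle the initial-settings timeout
			case idleTimeoutMsg:
				// handle connection idleness
			}
			// ... the hot-path cases (frames read, frames to write, stream
			// events) would be the remaining cases of this select.
		}
	}
}
```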
…cefulShutdownCh

gracefulShutdownCh is shared by all connections in a server. When a server accumulates many connections (e.g., 5000 in the kubemark-5000 benchmark), we have 5000 serverConn.serve goroutines selecting on this channel. This means 5000 goroutines hammer the channel's lock, which causes severe lock contention.

The fix in this CL is to make a local proxy for gracefulShutdownCh in each connection so that each connection selects on gracefulShutdownCh at most once per connection rather than once per serverConn.serve loop iteration.

This fix is intended to be backported quickly into Go 1.8.2. The downside of this fix is 2KB extra stack usage per connection. A better fix will be implemented in Go 1.9.

Unfortunately, I have been unable to reproduce this problem locally. This fix was verified by the kubernetes team. See: kubernetes/kubernetes#45216 (comment)

Updates golang/go#20302

Change-Id: I19ab19268a6ccab9b6e9dffa0cfbc89b8c7d0f19
Reviewed-on: https://go-review.googlesource.com/43455
Run-TryBot: Tom Bergan <tombergan@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
(cherry picked from commit d3ede01)
Reviewed-on: https://go-review.googlesource.com/43459
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Chris Broadfoot <cbro@golang.org>
CL https://golang.org/cl/43996 mentions this issue.
…tdownCh lock contention slowdown

This updates the bundled x/net/http2 repo to git rev 186fd3fc (from the net repo's release-branch.go1.8) for:

[release-branch.go1.8] http2: fix lock contention slowdown due to gracefulShutdownCh
https://golang.org/cl/43459

Fixes #20302

Change-Id: Ia01a44c6749292de9c16ca330bdebe1e52458b18
Reviewed-on: https://go-review.googlesource.com/43996
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Chris Broadfoot <cbro@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
This will be used to allow http2 servers to register a shutdown function so that net/http.Server.Shutdown will work when the http2 server is configured via a manual call to http2.ConfigureServer. Currently, Shutdown only works when the http2 server is configured automatically by the net/http package.

Updates #20302
Updates #18471

Change-Id: Ifc2b5f3126126a106b49ea4a7e999279852b9cc9
Reviewed-on: https://go-review.googlesource.com/44003
Run-TryBot: Tom Bergan <tombergan@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
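For reference, the manually configured path that CL targets looks roughly like this (a usage sketch; the address, handler, certificate paths, and timeouts are placeholders):

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"time"

	"golang.org/x/net/http2"
)

func main() {
	srv := &http.Server{
		Addr: ":8443", // placeholder address
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Write([]byte("hello"))
		}),
	}

	// Configure HTTP/2 manually instead of relying on net/http's automatic
	// setup. With the CLs above, this path also hooks into
	// net/http.Server.RegisterOnShutdown so srv.Shutdown reaches the http2
	// server's connections.
	if err := http2.ConfigureServer(srv, &http2.Server{}); err != nil {
		log.Fatal(err)
	}

	go func() {
		// cert.pem and key.pem are placeholders.
		if err := srv.ListenAndServeTLS("cert.pem", "key.pem"); err != http.ErrServerClosed {
			log.Fatal(err)
		}
	}()

	// Wait for an interrupt, then shut down gracefully.
	stop := make(chan os.Signal, 1)
	signal.Notify(stop, os.Interrupt)
	<-stop

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Print(err)
	}
}
```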
Reopening for the Go 1.9 fixes.
CL https://golang.org/cl/44003 mentions this issue.
This is a better fix than https://golang.org/cl/43455. Instead of creating a separate goroutine to wait for the global shutdown channel, we reuse the new serveMsgCh, which was added in a prior CL. We also use the new net/http.Server.RegisterOnShutdown method to register a shutdown callback for each http2.Server.

Updates golang/go#20302
Updates golang/go#18471

Change-Id: Icf29d5e4f65b3779d1fb4ea92924e4fb6bdadb2a
Reviewed-on: https://go-review.googlesource.com/43230
Run-TryBot: Tom Bergan <tombergan@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Still TODO is to bundle https://golang.org/cl/44003 into net/http.
CL https://golang.org/cl/44006 mentions this issue.
@tombergan, I'm going to switch this back to Go 1.8.3 so it shows up in the GitHub search link from the Go 1.8.3 release notes. /cc @broady
Go bug for tracking kubernetes/kubernetes#45216
It appears that the new select cases in http2.serverConn.serve slowed things down.
But tip (~Go 1.9) has had a number of select improvements since Go 1.8, so the regression may not show up as strongly there. See comments and numbers in the linked bug.
/cc @tombergan