Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: Possible HTTP/2 busy loop regression in Go 1.17.3 #49615

Closed
lwithers opened this issue Nov 16, 2021 · 8 comments
Closed

net/http: Possible HTTP/2 busy loop regression in Go 1.17.3 #49615

lwithers opened this issue Nov 16, 2021 · 8 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance release-blocker
Milestone

Comments

@lwithers
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.17.3 linux/amd64

Does this issue reproduce with the latest release?

Go 1.17.3 is currently the latest release. The same binary compiled with Go 1.17.2 does not have the same issue.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/lwithers/.cache/go-build"
GOENV="/home/lwithers/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/lwithers/go/pkg/mod"
GONOPROXY=""
GONOSUMDB="github.com/lampkicking,go.internal.yoti.com,dev.go.internal.yoti.com"
GOOS="linux"
GOPATH="/home/lwithers/go"
GOPRIVATE=""
GOPROXY="https://athens.internal.yoti.com"
GOROOT="/opt/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build390959389=/tmp/go-build -gno-record-gcc-switches"

The issue

Wtih some of our production binaries, we have seen quiescent CPU usages between 67%–100% of one core, even though the binary is idle. We have not yet found a small reproducer. This issue was found on binaries that were recompiled without changes (i.e. identical source, identical go.mod) with Go 1.17.3 — it does not occur if we compile the same input with Go 1.17.2 or any earlier version that we have used.

Looking at pprof we see:
profile001
pprof.human-attribute-server.samples.cpu.003.pb.gz

Looking at the update to the HTTP/2 bundle in Go1.17.3, we see a new func (cs *http2clientStream) writeRequest which has the following loop at the end:

        for {
                select {
                case <-cs.peerClosed:
                        return nil
                case <-respHeaderTimer:
                        return http2errTimeout
                case <-respHeaderRecv:
                        respHeaderTimer = nil // keep waiting for END_STREAM
                case <-cs.abort:
                        return cs.abortErr
                case <-ctx.Done():
                        return ctx.Err()
                case <-cs.reqCancel:
                        return http2errRequestCanceled
                }
        }

The line number of this in h2_bundle.go matches the line number in the profile.

Without knowing much at all about the design or operation of the HTTP/2 code, it would seem that if the channel respHeaderRecv (which is earlier set to cs.respHeaderRecv) is closed, that this function would busy-wait. And indeed, the CL updating the bundle in Go 1.17.3 does introduce a new close(cs.respHeaderRecv) on line 8692. I'm not 100% clear if that is the issue, but it's as far as I got without familiarising myself with the HTTP/2 codebase.

@seankhliao seankhliao added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance labels Nov 16, 2021
@seankhliao seankhliao changed the title Possible HTTP/2 busy loop regression in Go 1.17.3 net/http: Possible HTTP/2 busy loop regression in Go 1.17.3 Nov 16, 2021
@seankhliao
Copy link
Member

cc @neild

@lwithers
Copy link
Author

lwithers commented Nov 16, 2021

We have found a reproducer, please find attached. It is related to some of the settings on the http.Client.Transport object:

https://play.golang.org/p/pNqYAo3WnX4

@bcmills bcmills added this to the Go1.17.4 milestone Nov 16, 2021
@neild neild self-assigned this Nov 16, 2021
@gopherbot
Copy link

Change https://golang.org/cl/364574 mentions this issue: http2: avoid busy loop when ResponseHeaderTimeout is set

@neild
Copy link
Contributor

neild commented Nov 16, 2021

Thats an unpleasant bug. Thanks for the nice analysis.

@neild
Copy link
Contributor

neild commented Nov 16, 2021

@gopherbot Please open backport issues for 1.16 and 1.17.

@gopherbot
Copy link

Backport issue(s) opened: #49623 (for 1.16), #49624 (for 1.17).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot
Copy link

Change https://golang.org/cl/368377 mentions this issue: [internal-branch.go1.17-vendor] http2: avoid busy loop when ResponseHeaderTimeout is set

@gopherbot
Copy link

Change https://golang.org/cl/368376 mentions this issue: [internal-branch.go1.16-vendor] http2: avoid busy loop when ResponseHeaderTimeout is set

gopherbot pushed a commit to golang/net that referenced this issue Dec 1, 2021
…eaderTimeout is set

Don't keep reading from respHeaderRecv after the response headers
are received.

For golang/go#49615.
Fixes golang/go#49624.

Change-Id: Ib8126c954930011ac09b2cbc70bbbce76531b7db
Reviewed-on: https://go-review.googlesource.com/c/net/+/364574
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
(cherry picked from commit 47ca1ff)
Reviewed-on: https://go-review.googlesource.com/c/net/+/368377
Reviewed-by: Michael Knyszek <mknyszek@google.com>
gopherbot pushed a commit to golang/net that referenced this issue Dec 1, 2021
…eaderTimeout is set

Don't keep reading from respHeaderRecv after the response headers
are received.

For golang/go#49615.
Fixes golang/go#49623.

Change-Id: Ib8126c954930011ac09b2cbc70bbbce76531b7db
Reviewed-on: https://go-review.googlesource.com/c/net/+/364574
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
(cherry picked from commit 47ca1ff)
Reviewed-on: https://go-review.googlesource.com/c/net/+/368376
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@mknyszek mknyszek modified the milestones: Go1.17.4, Go1.18 Dec 2, 2021
dteh pushed a commit to dteh/fhttp that referenced this issue Jun 22, 2022
Don't keep reading from respHeaderRecv after the response headers
are received.

Fixes golang/go#49615.

Change-Id: Ib8126c954930011ac09b2cbc70bbbce76531b7db
Reviewed-on: https://go-review.googlesource.com/c/net/+/364574
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
fedosgad pushed a commit to fedosgad/oohttp that referenced this issue Jun 22, 2022
…eaderTimeout is set

Don't keep reading from respHeaderRecv after the response headers
are received.

For golang/go#49615.
Fixes golang/go#49624.

Change-Id: Ib8126c954930011ac09b2cbc70bbbce76531b7db
Reviewed-on: https://go-review.googlesource.com/c/net/+/364574
Trust: Damien Neil <dneil@google.com>
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
(cherry picked from commit 47ca1ff31462b4c8e280b5f2cdba5050aa61e890)
Reviewed-on: https://go-review.googlesource.com/c/net/+/368377
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@rsc rsc unassigned neild Jun 23, 2022
@golang golang locked and limited conversation to collaborators Jun 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance release-blocker
Projects
None yet
Development

No branches or pull requests

6 participants