Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net,net/http/httputil: ReverseProxy attempts to log each http2 stream cancellation, causing concurrent streams to grow unbounded due to logger buildup #53257

Open
amandachow opened this issue Jun 6, 2022 · 1 comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@amandachow
Copy link

Summary: ReverseProxy backed up on mutex in go logger while cleaning up cancelled HTTP2 streams.

What version of Go are you using (go version)?

$ go version

go version go1.17.7 darwin/amd64

Does this issue reproduce with the latest release?

Yes - We weren't able to try this, but the httputil code in question has not changed.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/amandachow/Library/Caches/go-build"
GOENV="/Users/amandachow/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/amandachow/go/pkg/mod"
GOOS="darwin"
GOPATH="/Users/amandachow/go"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.17.7"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/_f/1tl8g2w95wx34b7zp53y0kqc0000gp/T/go-build69516959=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I have a service that uses the httputil ReverseProxy as a load balancer, proxying requests to upstream services. My service averages ~10k requests per second, nearly all of which are streams. The streams are short lived - The P95 stream duration (the duration of time that a finished stream was alive) is 400ms.

I upgraded the x/net version from v0.0.0-20210929193557-e81a3d93ecf6 to v0.0.0-20220531201128-c960675eff93. I confirmed that the change directly after the old version, http2: refactor request write flow, is the first version that introduces the following symptoms.

What did you expect to see?

No noticeable behavior change to my service.

What did you see instead?

My service's number of currently open streams began growing unbounded. When healthy, my service has ~60 concurrent streams - With the upgrade, within 3 minutes it had reached 300k open streams and was continuing to grow.

I took a goroutine dump and it showed nearly all goroutines were waiting on the go logger's mutex to log the stream cancellation. Example log: (Message is always identical, except for the stream ID)

2022/06/03 01:37:59 httputil: ReverseProxy read error during body copy: stream error: stream ID 1403; CANCEL

From the pprof:
Screen Shot 2022-06-03 at 7 27 10 PM

I do not set a custom ReverseProxy ErrorLogger. My theory is that due to the cleanupWriteRequest introduced in the transport.go change, a CANCELed stream will now cause http2's Transport to return an error in RoundTrip, which is then consumed by ReverseProxy. ReverseProxy attempts to log each instance of this. Since the service has a high workload (rate of stream requests proxied), the logger falls behind and causes the buildup of concurrent streams.

It looks like ReverseProxy attempts to filter out other request cancellation errors - io.EOF and context.Canceled - and not log in those cases. It seems like it should also filter out http2 stream cancellation?

To test this theory, I set the ReverseProxy ErrorLogger to a custom go Logger with a no-op io.Writer:

type noOpWriter struct {}

func (noOpWriter) Write(_ []byte) (n int, err error) {return 0, nil}
[...]
&httputil.ReverseProxy{
		ErrorLog: log.New(noOpWriter{}, "", 0),
		[...]
}

With this change, using x/net v0.0.0-20220531201128-c960675eff93, the service performed as expected - The concurrent streams no longer grew unbounded.

@gopherbot gopherbot added this to the Unreleased milestone Jun 6, 2022
@cherrymui cherrymui changed the title x/net,httputil: ReverseProxy attempts to log each http2 stream cancellation, causing concurrent streams to grow unbounded due to logger buildup x/net,net/http/httputil: ReverseProxy attempts to log each http2 stream cancellation, causing concurrent streams to grow unbounded due to logger buildup Jun 6, 2022
@cherrymui cherrymui added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 6, 2022
@cherrymui
Copy link
Member

cc @neild @bradfitz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

3 participants