Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2/hpack: lookup maps for the dynamic table leak despite evictions #19756

Closed
rodaine opened this issue Mar 29, 2017 · 5 comments
Closed
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@rodaine
Copy link

rodaine commented Mar 29, 2017

What version of Go are you using (go version)?

go version go1.8 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/chrisroche/go"
GORACE=""
GOROOT="/Users/chrisroche/.gimme/versions/go1.8.darwin.amd64"
GOTOOLDIR="/Users/chrisroche/.gimme/versions/go1.8.darwin.amd64/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/7s/kv_f_g112gq5gl7zt_lyt2zr0000gp/T/go-build079652742=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

Adding the following test to the current master of x/net/http2/hpack/tables_test.go:

func TestHeaderFieldTable_LookupMapEviction(t *testing.T) {
	table := &headerFieldTable{}
	table.init()
	table.addEntry(pair("key1", "value1-1"))
	table.addEntry(pair("key2", "value2-1"))
	table.addEntry(pair("key1", "value1-2"))
	table.addEntry(pair("key3", "value3-1"))
	table.addEntry(pair("key4", "value4-1"))
	table.addEntry(pair("key2", "value2-2"))

	// evict all pairs
	table.evictOldest(table.len())

	if l := table.len(); l > 0 {
		t.Errorf("table.len() = %d, want 0", l)
	}

	if l := len(table.byName); l > 0 {
		t.Errorf("len(table.byName) = %d, want 0", l)
	}

	if l := len(table.byNameValue); l > 0 {
		t.Errorf("len(table.byNameValue) = %d, want 0", l)
	}
}

And running go test -v -run TestHeaderFieldTable_LookupMapEviction from that directory

What did you expect to see?

PASS

What did you see instead?

=== RUN   TestHeaderFieldTable_LookupMapEviction
--- FAIL: TestHeaderFieldTable_LookupMapEviction (0.00s)
	tables_test.go:110: len(table.byName) = 4, want 0
	tables_test.go:114: len(table.byNameValue) = 6, want 0
FAIL
exit status 1
FAIL	golang.org/x/net/http2/hpack	0.006s

More context

When the load to a gRPC service was recently ramped up, we were alerted to a memory leak which took approximately 24hrs to consume all the RAM and die. Getting the heap profile off the running service, the vast majority of the objects were originating from hpack.(*dynamicTable).add and hpack.(*Decoder).readString:

(pprof) top100 -cum
718.97MB of 721.98MB total (99.58%)
Dropped 442 nodes (cum <= 3.61MB)
      flat  flat%   sum%        cum   cum%
         0     0%     0%   721.98MB   100%  runtime.goexit
         0     0%     0%   720.97MB 99.86%  github.com/example/pkg/vendor/golang.org/x/net/http2.(*Framer).ReadFrame
         0     0%     0%   720.97MB 99.86%  github.com/example/pkg/vendor/google.golang.org/grpc/transport.(*framer).readFrame
         0     0%     0%   720.47MB 99.79%  github.com/example/pkg/vendor/golang.org/x/net/http2.(*Framer).readMetaFrame
         0     0%     0%   720.47MB 99.79%  github.com/example/pkg/vendor/golang.org/x/net/http2/hpack.(*Decoder).Write
         0     0%     0%   720.47MB 99.79%  github.com/example/pkg/vendor/golang.org/x/net/http2/hpack.(*Decoder).parseHeaderFieldRepr
         0     0%     0%   719.97MB 99.72%  github.com/example/pkg/vendor/golang.org/x/net/http2/hpack.(*Decoder).parseFieldLiteral
         0     0%     0%   719.09MB 99.60%  github.com/example/pkg/vendor/google.golang.org/grpc.(*Server).handleRawConn
         0     0%     0%   719.09MB 99.60%  github.com/example/pkg/vendor/google.golang.org/grpc.(*Server).serveHTTP2Transport
         0     0%     0%   718.57MB 99.53%  github.com/example/pkg/vendor/google.golang.org/grpc.(*Server).serveStreams
         0     0%     0%   718.57MB 99.53%  github.com/example/pkg/vendor/google.golang.org/grpc/transport.(*http2Server).HandleStreams
  448.06MB 62.06% 62.06%   448.06MB 62.06%  github.com/example/pkg/vendor/golang.org/x/net/http2/hpack.(*Decoder).readString
  270.90MB 37.52% 99.58%   271.90MB 37.66%  github.com/example/pkg/vendor/golang.org/x/net/http2/hpack.(*dynamicTable).add

Digging into that code, there was a performance change made last month that improved lookup times considerably by caching the header IDs in byName and byNameValue maps. Unfortunately, it doesn't appear that the map entries are evicted, but instead set to zero:

if t.byName[f.Name] == id {
	t.byName[f.Name] = 0
}
if p := (pairNameValue{f.Name, f.Value}); t.byNameValue[p] == id {
	t.byNameValue[p] = 0
}

This eventually (but somewhat slowly) grows huge for variable headers like request IDs, timings, etc. Rolling back the package to the prior SHA immediately rectified the problem. I'm working on a patch to replace the above to use delete() instead. Curious if there is a reason to not?

@bradfitz bradfitz changed the title x/net/htt2/hpack: lookup maps for the dynamic table leak despite evictions x/net/http2/hpack: lookup maps for the dynamic table leak despite evictions Mar 29, 2017
@bradfitz bradfitz added this to the Go1.9 milestone Mar 29, 2017
@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 29, 2017
@bradfitz
Copy link
Contributor

@rodaine, thanks for the debugging!

If you want to send a fix, please do. Otherwise @tombergan or I can handle it too. Let us know.

@rodaine
Copy link
Author

rodaine commented Mar 29, 2017

No sweat! Pushing the patch momentarily :)

@bradfitz
Copy link
Contributor

I trust you've discovered https://golang.org/doc/contribute.html ?

(Sorry in advance for the wall of words. It tries to be comprehensive but it might be intimidating. There aren't many steps in the end.)

@rodaine
Copy link
Author

rodaine commented Mar 29, 2017

Yessir!

@gopherbot
Copy link

CL https://golang.org/cl/38781 mentions this issue.

@golang golang locked and limited conversation to collaborators Mar 29, 2018
c3mb0 pushed a commit to c3mb0/net that referenced this issue Apr 2, 2018
An earlier performance change, https://golang.org/cl/37406,
made headerFieldTable lookups O(1). The entries in the maps used to perform
these lookups were not evicted along with the original field elements, however
causing a gradual memory leak. This is most apparent when the headers have
highly variable content such as request IDs or timings.

Fixes golang/go#19756

Change-Id: Icdb51527eb671925216350ada15f2a1336ea3158
Reviewed-on: https://go-review.googlesource.com/38781
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Tom Bergan <tombergan@google.com>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

4 participants