Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: http2 requests blocked after timeout #57478

Open
stampy88 opened this issue Dec 27, 2022 · 6 comments
Open

net/http: http2 requests blocked after timeout #57478

stampy88 opened this issue Dec 27, 2022 · 6 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@stampy88
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.18.6 linux/amd64

Does this issue reproduce with the latest release?

Have not attempted yet as it is hard to reproduce. I am trying to attempt to write a reproducer.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/XXXX/.cache/go-build"
GOENV="/home/XXXX/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/XXXX/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/XXXX/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18.6"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3899516689=/tmp/go-build -gno-record-gcc-switches"

What did you do?

The application does HTTP POSTs to configured clients endpoints whenever an event is received. The HTTP client is configured with a 5 second timeout so that long running requests don't block an event consumer too long.

What did you expect to see?

When a timeout occurs, I expected all go routines that the std library creates to handle the HTTP request to be cleaned up.

What did you see instead?

The server the app is connecting to supports HTTP 2. During periods where timeouts are occurring we can see the number of go routines steadily increase until the server it is trying to communicate to starts responding again. See stack traces below:

goroutine profile: total 1451
1258 @ 0x438936 0x449473 0x44944d 0x464865 0x4726e5 0x81d5af 0x81d59a 0x81bfee 0x468741
#	0x464864	sync.runtime_SemacquireMutex+0x24			/usr/local/go/src/runtime/sema.go:71
#	0x4726e4	sync.(*Mutex).lockSlow+0x164				/usr/local/go/src/sync/mutex.go:162
#	0x81d5ae	sync.(*Mutex).Lock+0x30e				/usr/local/go/src/sync/mutex.go:81
#	0x81d599	net/http.(*http2clientStream).cleanupWriteRequest+0x2f9	/usr/local/go/src/net/http/h2_bundle.go:8159
#	0x81bfed	net/http.(*http2clientStream).doRequest+0x2d		/usr/local/go/src/net/http/h2_bundle.go:7905

36 @ 0x438936 0x406a0c 0x406478 0x1005f13 0x468741
#	0x1005f12	main.main.func3.1+0xf2	/XXXXXX/main.go:383

16 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x51af74 0x4c5f7a 0x7fe96e 0x7fe928 0x7ff1d5 0x8214f0 0x82094f 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c		/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97			/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4		/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115	/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e		/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f			/usr/local/go/src/crypto/tls/conn.go:1285
#	0x51af73	bufio.(*Reader).Read+0x1b3			/usr/local/go/src/bufio/bufio.go:236
#	0x4c5f79	io.ReadAtLeast+0x99				/usr/local/go/src/io/io.go:331
#	0x7fe96d	io.ReadFull+0x6d				/usr/local/go/src/io/io.go:350
#	0x7fe927	net/http.http2readFrameHeader+0x27		/usr/local/go/src/net/http/h2_bundle.go:1566
#	0x7ff1d4	net/http.(*http2Framer).ReadFrame+0x94		/usr/local/go/src/net/http/h2_bundle.go:1830
#	0x8214ef	net/http.(*http2clientConnReadLoop).run+0x12f	/usr/local/go/src/net/http/h2_bundle.go:8820
#	0x82094e	net/http.(*http2ClientConn).readLoop+0x6e	/usr/local/go/src/net/http/h2_bundle.go:8716

13 @ 0x438936 0x449473 0x44944d 0x464745 0x474172 0x100ab15 0x10099af 0x1006073 0x468741
#	0x464744	sync.runtime_Semacquire+0x24		/usr/local/go/src/runtime/sema.go:56
#	0x474171	sync.(*WaitGroup).Wait+0x51		/usr/local/go/src/sync/waitgroup.go:136
#	0x100ab14	main.(*server).doProcessPacket+0x1034	XXXXXX/processpacket.go:168
#	0x10099ae	main.(*server).processPacket+0x18e	XXXXXX/processpacket.go:45
#	0x1006072	main.main.func3.1+0x252			XXXXXX/main.go:390

12 @ 0x438936 0x448392 0x857275 0x468741
#	0x857274	net/http.(*persistConn).writeLoop+0xf4	/usr/local/go/src/net/http/transport.go:2392

12 @ 0x438936 0x449473 0x44944d 0x464865 0x4726e5 0x825fb2 0x825f7e 0x81d545 0x81bfee 0x468741
#	0x464864	sync.runtime_SemacquireMutex+0x24			/usr/local/go/src/runtime/sema.go:71
#	0x4726e4	sync.(*Mutex).lockSlow+0x164				/usr/local/go/src/sync/mutex.go:162
#	0x825fb1	sync.(*Mutex).Lock+0x51					/usr/local/go/src/sync/mutex.go:81
#	0x825f7d	net/http.(*http2ClientConn).writeStreamReset+0x1d	/usr/local/go/src/net/http/h2_bundle.go:9574
#	0x81d544	net/http.(*http2clientStream).cleanupWriteRequest+0x2a4	/usr/local/go/src/net/http/h2_bundle.go:8145
#	0x81bfed	net/http.(*http2clientStream).doRequest+0x2d		/usr/local/go/src/net/http/h2_bundle.go:7905

11 @ 0x438936 0x448392 0x81bb71 0x817cb7 0x8264db 0x8264d5 0x84b790 0x832ab9 0x7f43f8 0x7f3c9b 0x7f60d5 0x100e0e6 0x100e0d9 0x468741
#	0x81bb70	net/http.(*http2ClientConn).RoundTrip+0x450		/usr/local/go/src/net/http/h2_bundle.go:7874
#	0x817cb6	net/http.(*http2Transport).RoundTripOpt+0x1b6		/usr/local/go/src/net/http/h2_bundle.go:7218
#	0x8264da	net/http.(*http2Transport).RoundTrip+0x1a		/usr/local/go/src/net/http/h2_bundle.go:7179
#	0x8264d4	net/http.http2noDialH2RoundTripper.RoundTrip+0x14	/usr/local/go/src/net/http/h2_bundle.go:9682
#	0x84b78f	net/http.(*Transport).roundTrip+0x38f			/usr/local/go/src/net/http/transport.go:539
#	0x832ab8	net/http.(*Transport).RoundTrip+0x18			/usr/local/go/src/net/http/roundtrip.go:17
#	0x7f43f7	net/http.send+0x5d7					/usr/local/go/src/net/http/client.go:252
#	0x7f3c9a	net/http.(*Client).send+0x9a				/usr/local/go/src/net/http/client.go:176
#	0x7f60d4	net/http.(*Client).do+0x8f4				/usr/local/go/src/net/http/client.go:725
#	0x100e0e5	net/http.(*Client).Do+0x985				/usr/local/go/src/net/http/client.go:593
#	0x100e0d8	main.(*server).sendRestMessage+0x978			/XXXXXX/rest.go:89

10 @ 0x438936 0x406a0c 0x406438 0x8f57d8 0x468741
#	0x8f57d7	google.golang.org/grpc.(*addrConn).resetTransport+0x477	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/clientconn.go:1155

10 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x51af74 0x4c5f7a 0x86c2ce 0x86c288 0x86cb35 0x889c73 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88					/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31					/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259				/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247						/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28							/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44							/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c					/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97						/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4					/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115				/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e					/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f						/usr/local/go/src/crypto/tls/conn.go:1285
#	0x51af73	bufio.(*Reader).Read+0x1b3						/usr/local/go/src/bufio/bufio.go:236
#	0x4c5f79	io.ReadAtLeast+0x99							/usr/local/go/src/io/io.go:331
#	0x86c2cd	io.ReadFull+0x6d							/usr/local/go/src/io/io.go:350
#	0x86c287	golang.org/x/net/http2.readFrameHeader+0x27				/var/go/go/pkg/mod/golang.org/x/net@v0.0.0-20220425223048-2871e0cb64e4/http2/frame.go:237
#	0x86cb34	golang.org/x/net/http2.(*Framer).ReadFrame+0x94				/var/go/go/pkg/mod/golang.org/x/net@v0.0.0-20220425223048-2871e0cb64e4/http2/frame.go:498
#	0x889c72	google.golang.org/grpc/internal/transport.(*http2Client).reader+0x232	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/http2_client.go:1270

10 @ 0x438936 0x448392 0x8790b5 0x879745 0x882045 0x468741
#	0x8790b4	google.golang.org/grpc/internal/transport.(*controlBuffer).get+0x114	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/controlbuf.go:395
#	0x879744	google.golang.org/grpc/internal/transport.(*loopyWriter).run+0x84	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/controlbuf.go:513
#	0x882044	google.golang.org/grpc/internal/transport.newHTTP2Client.func3+0x64	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/internal/transport/http2_client.go:346

9 @ 0x438936 0x448392 0x81c16c 0x81bfde 0x468741
#	0x81c16b	net/http.(*http2clientStream).writeRequest+0x14b	/usr/local/go/src/net/http/h2_bundle.go:7929
#	0x81bfdd	net/http.(*http2clientStream).doRequest+0x1d		/usr/local/go/src/net/http/h2_bundle.go:7904

6 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x85476e 0x51a9a3 0x51aafd 0x85558c 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c		/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97			/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4		/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115	/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e		/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f			/usr/local/go/src/crypto/tls/conn.go:1285
#	0x85476d	net/http.(*persistConn).Read+0x4d		/usr/local/go/src/net/http/transport.go:1929
#	0x51a9a2	bufio.(*Reader).fill+0x102			/usr/local/go/src/bufio/bufio.go:106
#	0x51aafc	bufio.(*Reader).Peek+0x5c			/usr/local/go/src/bufio/bufio.go:144
#	0x85558b	net/http.(*persistConn).readLoop+0x1ab		/usr/local/go/src/net/http/transport.go:2093

3 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x85476e 0x51a9a3 0x51b58f 0x7f12a5 0x7f0d28 0x7f0fee 0x8492dc 0x8491a5 0x858ea2 0x517fbf 0x517bbb 0x517818 0xf2b36b 0xf2afba 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88				/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31				/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259			/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247					/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28						/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44						/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c				/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97					/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4				/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115			/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e				/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f					/usr/local/go/src/crypto/tls/conn.go:1285
#	0x85476d	net/http.(*persistConn).Read+0x4d				/usr/local/go/src/net/http/transport.go:1929
#	0x51a9a2	bufio.(*Reader).fill+0x102					/usr/local/go/src/bufio/bufio.go:106
#	0x51b58e	bufio.(*Reader).ReadSlice+0x2e					/usr/local/go/src/bufio/bufio.go:371
#	0x7f12a4	net/http/internal.readChunkLine+0x24				/usr/local/go/src/net/http/internal/chunked.go:129
#	0x7f0d27	net/http/internal.(*chunkedReader).beginChunk+0x27		/usr/local/go/src/net/http/internal/chunked.go:48
#	0x7f0fed	net/http/internal.(*chunkedReader).Read+0x14d			/usr/local/go/src/net/http/internal/chunked.go:98
#	0x8492db	net/http.(*body).readLocked+0x3b				/usr/local/go/src/net/http/transfer.go:844
#	0x8491a4	net/http.(*body).Read+0x124					/usr/local/go/src/net/http/transfer.go:836
#	0x858ea1	net/http.(*bodyEOFSignal).Read+0x141				/usr/local/go/src/net/http/transport.go:2774
#	0x517fbe	encoding/json.(*Decoder).refill+0x17e				/usr/local/go/src/encoding/json/stream.go:165
#	0x517bba	encoding/json.(*Decoder).readValue+0xba				/usr/local/go/src/encoding/json/stream.go:140
#	0x517817	encoding/json.(*Decoder).Decode+0x77				/usr/local/go/src/encoding/json/stream.go:63
#	0xf2b36a	XXXXXX/kuberesolver.(*streamWatcher).Decode+0x6a	XXXXXX/kuberesolver@v2.1.7+incompatible/stream.go:96
#	0xf2afb9	XXXXXX/kuberesolver.(*streamWatcher).receive+0x119	XXXXXX/kuberesolver@v2.1.7+incompatible/stream.go:72

3 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x8345ed 0x51a9a3 0x51b58f 0x51b7e7 0x7d4ff9 0x82f9b9 0x82f9ba 0x835b8a 0x839f8b 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x8345ec	net/http.(*connReader).Read+0x16c		/usr/local/go/src/net/http/server.go:780
#	0x51a9a2	bufio.(*Reader).fill+0x102			/usr/local/go/src/bufio/bufio.go:106
#	0x51b58e	bufio.(*Reader).ReadSlice+0x2e			/usr/local/go/src/bufio/bufio.go:371
#	0x51b7e6	bufio.(*Reader).ReadLine+0x26			/usr/local/go/src/bufio/bufio.go:400
#	0x7d4ff8	net/textproto.(*Reader).readLineSlice+0x98	/usr/local/go/src/net/textproto/reader.go:57
#	0x82f9b8	net/textproto.(*Reader).ReadLine+0x78		/usr/local/go/src/net/textproto/reader.go:38
#	0x82f9b9	net/http.readRequest+0x79			/usr/local/go/src/net/http/request.go:1029
#	0x835b89	net/http.(*conn).readRequest+0x249		/usr/local/go/src/net/http/server.go:988
#	0x839f8a	net/http.(*conn).serve+0x32a			/usr/local/go/src/net/http/server.go:1891

3 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x85476e 0x51a9a3 0x51aafd 0x85558c 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x85476d	net/http.(*persistConn).Read+0x4d		/usr/local/go/src/net/http/transport.go:1929
#	0x51a9a2	bufio.(*Reader).fill+0x102			/usr/local/go/src/bufio/bufio.go:106
#	0x51aafc	bufio.(*Reader).Peek+0x5c			/usr/local/go/src/bufio/bufio.go:144
#	0x85558b	net/http.(*persistConn).readLoop+0x1ab		/usr/local/go/src/net/http/transport.go:2093

3 @ 0x438936 0x448392 0x856179 0x468741
#	0x856178	net/http.(*persistConn).readLoop+0xd98	/usr/local/go/src/net/http/transport.go:2213

3 @ 0x438936 0x448392 0x8ed3cf 0x468741
#	0x8ed3ce	google.golang.org/grpc.(*ccBalancerWrapper).watcher+0x8e	/var/go/go/pkg/mod/google.golang.org/grpc@v1.28.1/balancer_conn_wrappers.go:69

3 @ 0x438936 0x448392 0xf29867 0xf289db 0xf2b5de 0xf2b50e 0x468741
#	0xf29866	XXXXXX/kuberesolver.(*kResolver).watch+0x206		XXXXXX/kuberesolver@v2.1.7+incompatible/builder.go:276
#	0xf289da	XXXXXX/kuberesolver.(*kubeBuilder).Build.func1+0x3a	XXXXXX/kuberesolver@v2.1.7+incompatible/builder.go:161
#	0xf2b5dd	XXXXXX/kuberesolver.until.func1+0x3d			XXXXXX/kuberesolver@v2.1.7+incompatible/util.go:19
#	0xf2b50d	XXXXXX/kuberesolver.until+0x4d				XXXXXX/kuberesolver@v2.1.7+incompatible/util.go:20

2 @ 0x438936 0x448392 0x81c9e9 0x81bfde 0x468741
#	0x81c9e8	net/http.(*http2clientStream).writeRequest+0x9c8	/usr/local/go/src/net/http/h2_bundle.go:8042
#	0x81bfdd	net/http.(*http2clientStream).doRequest+0x1d		/usr/local/go/src/net/http/h2_bundle.go:7904

2 @ 0x438936 0x448392 0x858134 0x84bbc9 0x832ab9 0x7f43f8 0x7f3c9b 0x7f60d5 0x100e0e6 0x100e0d9 0x468741
#	0x858133	net/http.(*persistConn).roundTrip+0x973	/usr/local/go/src/net/http/transport.go:2620
#	0x84bbc8	net/http.(*Transport).roundTrip+0x7c8	/usr/local/go/src/net/http/transport.go:594
#	0x832ab8	net/http.(*Transport).RoundTrip+0x18	/usr/local/go/src/net/http/roundtrip.go:17
#	0x7f43f7	net/http.send+0x5d7			/usr/local/go/src/net/http/client.go:252
#	0x7f3c9a	net/http.(*Client).send+0x9a		/usr/local/go/src/net/http/client.go:176
#	0x7f60d4	net/http.(*Client).do+0x8f4		/usr/local/go/src/net/http/client.go:725
#	0x100e0e5	net/http.(*Client).Do+0x985		/usr/local/go/src/net/http/client.go:593
#	0x100e0d8	main.(*server).sendRestMessage+0x978	XXXXXXs/rest.go:89

1 @ 0x40bf14 0x464eaf 0xf2c6d9 0x468741
#	0x464eae	os/signal.signal_recv+0x2e	/usr/local/go/src/runtime/sigqueue.go:151
#	0xf2c6d8	os/signal.loop+0x18		/usr/local/go/src/os/signal/signal_unix.go:23

1 @ 0x438936 0x406a0c 0x406438 0x10055e9 0x438572 0x468741
#	0x10055e8	main.main+0x1028	XXXXXX/main.go:425
#	0x438571	runtime.main+0x211	/usr/local/go/src/runtime/proc.go:250

1 @ 0x438936 0x406a0c 0x406438 0xf38312 0x468741
#	0xf38311	XXXXXX/app.(*Server).handleSignals+0x151	XXXXXX/app/server.go:194

1 @ 0x438936 0x406a0c 0x406478 0xea7614 0x468741
#	0xea7613	github.com/eclipse/paho%2emqtt%2egolang.startComms.func2+0x73	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:430

1 @ 0x438936 0x406a0c 0x406478 0xea76b1 0x468741
#	0xea76b0	github.com/eclipse/paho%2emqtt%2egolang.startComms.func1+0x70	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:407

1 @ 0x438936 0x406a0c 0x406478 0xeaab1c 0x468741
#	0xeaab1b	github.com/eclipse/paho%2emqtt%2egolang.(*router).matchAndDispatch.func2+0xbb	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/router.go:174

1 @ 0x438936 0x406a0c 0x406478 0xf62628 0x468741
#	0xf62627	github.com/streadway/amqp.(*consumers).buffer+0x107	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/consumers.go:61

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x51af74 0x4c5f7a 0xf63993 0xf6396f 0xf6048a 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88			/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31			/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259		/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247				/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28					/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44					/usr/local/go/src/net/net.go:183
#	0x51af73	bufio.(*Reader).Read+0x1b3				/usr/local/go/src/bufio/bufio.go:236
#	0x4c5f79	io.ReadAtLeast+0x99					/usr/local/go/src/io/io.go:331
#	0xf63992	io.ReadFull+0x52					/usr/local/go/src/io/io.go:350
#	0xf6396e	github.com/streadway/amqp.(*reader).ReadFrame+0x2e	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/read.go:49
#	0xf60489	github.com/streadway/amqp.(*Connection).reader+0x1c9	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/connection.go:521

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x787c1d 0x4f8b78 0x787e05 0x785416 0x78ae6f 0x78ae70 0x4c5f7a 0xe8111f 0xe81102 0xea4ce5 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88					/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31					/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259				/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247						/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28							/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44							/usr/local/go/src/net/net.go:183
#	0x787c1c	crypto/tls.(*atLeastReader).Read+0x3c					/usr/local/go/src/crypto/tls/conn.go:785
#	0x4f8b77	bytes.(*Buffer).ReadFrom+0x97						/usr/local/go/src/bytes/buffer.go:204
#	0x787e04	crypto/tls.(*Conn).readFromUntil+0xe4					/usr/local/go/src/crypto/tls/conn.go:807
#	0x785415	crypto/tls.(*Conn).readRecordOrCCS+0x115				/usr/local/go/src/crypto/tls/conn.go:614
#	0x78ae6e	crypto/tls.(*Conn).readRecord+0x16e					/usr/local/go/src/crypto/tls/conn.go:582
#	0x78ae6f	crypto/tls.(*Conn).Read+0x16f						/usr/local/go/src/crypto/tls/conn.go:1285
#	0x4c5f79	io.ReadAtLeast+0x99							/usr/local/go/src/io/io.go:331
#	0xe8111e	io.ReadFull+0x5e							/usr/local/go/src/io/io.go:350
#	0xe81101	github.com/eclipse/paho.mqtt.golang/packets.ReadPacket+0x41		/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/packets/packets.go:131
#	0xea4ce4	github.com/eclipse/paho%2emqtt%2egolang.startIncoming.func1+0xe4	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:124

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ccb5a 0x4ccb48 0x6939a9 0x6a5345 0x83411f 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ccb59	internal/poll.(*pollDesc).waitRead+0x259	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4ccb47	internal/poll.(*FD).Read+0x247			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x6939a8	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x6a5344	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x83411e	net/http.(*connReader).backgroundRead+0x3e	/usr/local/go/src/net/http/server.go:672

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4ceb05 0x4ceaf2 0x694229 0x6a5525 0x788528 0x7889f1 0x789cf4 0x8176e9 0x51c5a2 0x825fdc 0x81d545 0x81bfee 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88			/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31			/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4ceb04	internal/poll.(*pollDesc).waitWrite+0x2c4		/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
#	0x4ceaf1	internal/poll.(*FD).Write+0x2b1				/usr/local/go/src/internal/poll/fd_unix.go:391
#	0x694228	net.(*netFD).Write+0x28					/usr/local/go/src/net/fd_posix.go:96
#	0x6a5524	net.(*conn).Write+0x44					/usr/local/go/src/net/net.go:195
#	0x788527	crypto/tls.(*Conn).write+0x107				/usr/local/go/src/crypto/tls/conn.go:920
#	0x7889f0	crypto/tls.(*Conn).writeRecordLocked+0x350		/usr/local/go/src/crypto/tls/conn.go:988
#	0x789cf3	crypto/tls.(*Conn).Write+0x3f3				/usr/local/go/src/crypto/tls/conn.go:1159
#	0x8176e8	net/http.http2stickyErrWriter.Write+0x148		/usr/local/go/src/net/http/h2_bundle.go:7133
#	0x51c5a1	bufio.(*Writer).Flush+0x61				/usr/local/go/src/bufio/bufio.go:628
#	0x825fdb	net/http.(*http2ClientConn).writeStreamReset+0x7b	/usr/local/go/src/net/http/h2_bundle.go:9576
#	0x81d544	net/http.(*http2clientStream).cleanupWriteRequest+0x2a4	/usr/local/go/src/net/http/h2_bundle.go:8145
#	0x81bfed	net/http.(*http2clientStream).doRequest+0x2d		/usr/local/go/src/net/http/h2_bundle.go:7905

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4d0a0c 0x4d09f9 0x695ab5 0x6ae528 0x6ad6fd 0x83f2e5 0x83ef1d 0x10069cf 0x100695b 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4d0a0b	internal/poll.(*pollDesc).waitRead+0x22b	/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4d09f8	internal/poll.(*FD).Accept+0x218		/usr/local/go/src/internal/poll/fd_unix.go:614
#	0x695ab4	net.(*netFD).accept+0x34			/usr/local/go/src/net/fd_unix.go:172
#	0x6ae527	net.(*TCPListener).accept+0x27			/usr/local/go/src/net/tcpsock_posix.go:139
#	0x6ad6fc	net.(*TCPListener).Accept+0x3c			/usr/local/go/src/net/tcpsock.go:288
#	0x83f2e4	net/http.(*Server).Serve+0x384			/usr/local/go/src/net/http/server.go:3039
#	0x83ef1c	net/http.(*Server).ListenAndServe+0x7c		/usr/local/go/src/net/http/server.go:2968
#	0x10069ce	net/http.ListenAndServe+0xce			/usr/local/go/src/net/http/server.go:3222
#	0x100695a	main.main.func1+0x5a				XXXXXX/main.go:286

1 @ 0x438936 0x4315b7 0x4630e9 0x4cb7f2 0x4d0a0c 0x4d09f9 0x695ab5 0x6ae528 0x6ad6fd 0x83f2e5 0xf37dff 0x468741
#	0x4630e8	internal/poll.runtime_pollWait+0x88				/usr/local/go/src/runtime/netpoll.go:302
#	0x4cb7f1	internal/poll.(*pollDesc).wait+0x31				/usr/local/go/src/internal/poll/fd_poll_runtime.go:83
#	0x4d0a0b	internal/poll.(*pollDesc).waitRead+0x22b			/usr/local/go/src/internal/poll/fd_poll_runtime.go:88
#	0x4d09f8	internal/poll.(*FD).Accept+0x218				/usr/local/go/src/internal/poll/fd_unix.go:614
#	0x695ab4	net.(*netFD).accept+0x34					/usr/local/go/src/net/fd_unix.go:172
#	0x6ae527	net.(*TCPListener).accept+0x27					/usr/local/go/src/net/tcpsock_posix.go:139
#	0x6ad6fc	net.(*TCPListener).Accept+0x3c					/usr/local/go/src/net/tcpsock.go:288
#	0x83f2e4	net/http.(*Server).Serve+0x384					/usr/local/go/src/net/http/server.go:3039
#	0xf37dfe	XXXXXX/app.(*Server).Start.func1+0xbe	XXXXXX/app/server.go:146

1 @ 0x438936 0x448392 0x1006492 0x468741
#	0x1006491	main.main.func2+0x151	XXXXXX/main.go:356

1 @ 0x438936 0x448392 0xe9c8fb 0x468741
#	0xe9c8fa	github.com/eclipse/paho%2emqtt%2egolang.(*client).startCommsWorkers.func2+0xfa	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/client.go:663

1 @ 0x438936 0x448392 0xe9ce5c 0x468741
#	0xe9ce5b	github.com/eclipse/paho%2emqtt%2egolang.(*client).startCommsWorkers.func1+0x17b	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/client.go:628

1 @ 0x438936 0x448392 0xea51cd 0x468741
#	0xea51cc	github.com/eclipse/paho%2emqtt%2egolang.startIncomingComms.func1+0x18c	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:175

1 @ 0x438936 0x448392 0xea6565 0x468741
#	0xea6564	github.com/eclipse/paho%2emqtt%2egolang.startOutgoingComms.func1+0x1e4	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:277

1 @ 0x438936 0x448392 0xea9569 0x468741
#	0xea9568	github.com/eclipse/paho%2emqtt%2egolang.keepalive+0x1e8	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/ping.go:48

1 @ 0x438936 0x448392 0xead22a 0xf480d3 0xf47f0d 0xf48226 0x10017bf 0x100b2c5 0x10099af 0x1006073 0x468741
#	0xead229	github.com/eclipse/paho%2emqtt%2egolang.(*baseToken).WaitTimeout+0x69			/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/token.go:84
#	0xf480d2	XXXXXX/mqtt.(*MqttBuddy).SendMessage.func1+0x32	XXXXXX/mqtt/buddy.go:212
#	0xf47f0c	XXXXXX/mqtt.(*MqttBuddy).SendMessage+0x14c		XXXXXX/mqtt/buddy.go:222
#	0xf48225	XXXXXX/mqtt.BuddySendAndDone+0x105			XXXXXX/mqtt/buddy.go:244
#	0x10017be	main.(*server).sendAzureMessage+0x73e							XXXXXX/azure.go:98
#	0x100b2c4	main.(*server).doProcessPacket+0x17e4							XXXXXX/processpacket.go:157
#	0x10099ae	main.(*server).processPacket+0x18e							XXXXXX/processpacket.go:45
#	0x1006072	main.main.func3.1+0x252									XXXXXX/main.go:390

1 @ 0x438936 0x448392 0xf60738 0x468741
#	0xf60737	github.com/streadway/amqp.(*Connection).heartbeater+0x1b7	/var/go/go/pkg/mod/github.com/streadway/amqp@v0.0.0-20190404075320-75d898a42a94/connection.go:551

1 @ 0x438936 0x448392 0xf73aa5 0x468741
#	0xf73aa4	github.com/assembla/cony.(*Client).Loop.func1+0x104	/var/go/go/pkg/mod/github.com/assembla/cony@v0.3.2/client.go:156

1 @ 0x438936 0x448392 0xf742b9 0x468741
#	0xf742b8	github.com/assembla/cony.(*Consumer).serve+0x1d8	/var/go/go/pkg/mod/github.com/assembla/cony@v0.3.2/consumer.go:83

1 @ 0x438936 0x449473 0x44944d 0x464745 0x474172 0xea750a 0x468741
#	0x464744	sync.runtime_Semacquire+0x24					/usr/local/go/src/runtime/sema.go:56
#	0x474171	sync.(*WaitGroup).Wait+0x51					/usr/local/go/src/sync/waitgroup.go:136
#	0xea7509	github.com/eclipse/paho%2emqtt%2egolang.startComms.func3+0x29	/var/go/go/pkg/mod/github.com/eclipse/paho.mqtt.golang@v1.4.2/net.go:438

1 @ 0x438936 0x449473 0x44944d 0x464865 0x4726e5 0x81ce1b 0x81cdf1 0x81c4c5 0x81bfde 0x468741
#	0x464864	sync.runtime_SemacquireMutex+0x24				/usr/local/go/src/runtime/sema.go:71
#	0x4726e4	sync.(*Mutex).lockSlow+0x164					/usr/local/go/src/sync/mutex.go:162
#	0x81ce1a	sync.(*Mutex).Lock+0x9a						/usr/local/go/src/sync/mutex.go:81
#	0x81cdf0	net/http.(*http2clientStream).encodeAndWriteHeaders+0x70	/usr/local/go/src/net/http/h2_bundle.go:8064
#	0x81c4c4	net/http.(*http2clientStream).writeRequest+0x4a4		/usr/local/go/src/net/http/h2_bundle.go:7986
#	0x81bfdd	net/http.(*http2clientStream).doRequest+0x1d			/usr/local/go/src/net/http/h2_bundle.go:7904

1 @ 0x462bc5 0xee1f75 0xee1d8d 0xedeccb 0xeed9a5 0xf38167 0x83b78f 0x83d129 0xefa162 0x83ed7b 0x83a237 0x468741
#	0x462bc4	runtime/pprof.runtime_goroutineProfileWithLabels+0x24		/usr/local/go/src/runtime/mprof.go:753
#	0xee1f74	runtime/pprof.writeRuntimeProfile+0xb4				/usr/local/go/src/runtime/pprof/pprof.go:725
#	0xee1d8c	runtime/pprof.writeGoroutine+0x4c				/usr/local/go/src/runtime/pprof/pprof.go:685
#	0xedecca	runtime/pprof.(*Profile).WriteTo+0x14a				/usr/local/go/src/runtime/pprof/pprof.go:332
#	0xeed9a4	net/http/pprof.handler.ServeHTTP+0x4a4				/usr/local/go/src/net/http/pprof/pprof.go:253
#	0xf38166	XXXXXX/app.pprofIndexOverride+0x206	XXXXXX/app/server.go:161
#	0x83b78e	net/http.HandlerFunc.ServeHTTP+0x2e				/usr/local/go/src/net/http/server.go:2084
#	0x83d128	net/http.(*ServeMux).ServeHTTP+0x148				/usr/local/go/src/net/http/server.go:2462
#	0xefa161	github.com/gorilla/mux.(*Router).ServeHTTP+0x201		/var/go/go/pkg/mod/github.com/gorilla/mux@v1.7.2/mux.go:212
#	0x83ed7a	net/http.serverHandler.ServeHTTP+0x43a				/usr/local/go/src/net/http/server.go:2916
#	0x83a236	net/http.(*conn).serve+0x5d6					/usr/local/go/src/net/http/server.go:1966

@dmitshur
Copy link
Contributor

CC @neild.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 28, 2022
@dmitshur dmitshur added this to the Backlog milestone Dec 28, 2022
@rhysh
Copy link
Contributor

rhysh commented Jan 5, 2023

Thanks for splitting this out. Can you clarify what specifically the problem is? It sounds like the number of goroutines is the bug for you ("I expected all go routines that the std library creates to handle the HTTP request to be cleaned up"), but looking at the goroutines stacks I wonder if the larger problem is that the Transport continues to schedule new requests onto the stuck http2 connection even after many requests on that connection have all timed out.


I see goroutines doing the following outbound http/2 work:

16 outbound http/2 connections that exist in the app (rooted at net/http.(*http2ClientConn).readLoop)

11 requests the app is trying to send over http/2 right now (in net/http.(*http2ClientConn).RoundTrip)

1258 + 12 trying to clean up requests that were scheduled onto the stuck connection (in net/http.(*http2clientStream).cleanupWriteRequest)

9+2 in net/http.(*http2clientStream).doRequest calling net/http.(*http2clientStream).writeRequest, with 9 on line 7929 and 2 on line 8042. I think these correspond to the 11 goroutines in net/http.(*http2ClientConn).RoundTrip.

1 more in net/http.(*http2clientStream).doRequest, net/http.(*http2clientStream).writeRequest (line 7986), net/http.(*http2clientStream).encodeAndWriteHeaders, sync.(*Mutex).Lock on cc.wmu. This request holds cc.reqHeaderMu, but does not hold cc.wmu.

1 goroutine trying to write a frame to an http/2 client connection (net/http.(*http2clientStream).doRequest, net/http.(*http2clientStream).cleanupWriteRequest, net/http.(*http2ClientConn).writeStreamReset, bufio.(*Writer).Flush, ..., crypto/tls.(*Conn).Write, ..., internal/poll.(*pollDesc).waitWrite. This request holds cc.wmu.

The following outbound http/1.1 work (I think unrelated, but helpful to classify):
12 outbound http/1.1 connections that exist in the app (rooted at net/http.(*persistConn).writeLoop)
6+3 idle http/1.1 client connections, 6 using https and 3 using http (net/http.(*persistConn).readLoop leading to net/http.(*persistConn).Read)
3 active http/1.1 client connections, reading response body (net/http.(*body).Read leading to net/http.(*persistConn).Read), corresponding to another 3 goroutines in net/http.(*persistConn).readLoop without any Read calls on their stack.
2 outbound http/1.1 requests still waiting for response headers (net/http.(*persistConn).roundTrip).

The following inbound http/2 work (I think unrelated, but helpful to classify):
10 inbound http/2 connections for gRPC, google.golang.org/grpc/internal/transport.(*http2Client).reader


As I understand it:

  1. The request whose goroutine holds cc.wmu and is blocked in crypto/tls.(*Conn).Write remains blocked for a long time because there's a problem with the underlying TCP connection (maybe the network is dropping all packets, or all packets for that peer, or all packets for that five-tuple). This request has been waiting since the kernel's write buffer for the TCP connection filled up / writes to the file descriptor started blocking.
  2. The request whose goroutine holds cc.reqHeaderMu cannot proceed because it's trying to acquire cc.wmu. This request has been waiting since shortly after writes to the file descriptor started blocking.
  3. The 2 goroutines in writeRequest on line 8042 have managed to use cc.reqHeaderMu before their context values timed out. That's interesting .. maybe the TCP connection isn't completely broken, but is instead very, very slow? Or is it possible that these 2 requests didn't have a timeout set, and have been around since the start of the problem?
  4. The 9 goroutines in writeRequest on line 7929 are waiting for cc.reqHeaderMu (or for their context values to expire), but they can't get it because the http/2 connection is busy (items 2 and 1). The requests for these goroutines are very recent / current. The Transport chose this http/2 connection as the right one to use for putting those requests on the wire.
  5. The 1258 + 12 goroutines in cleanupWriteRequest (I haven't investigated the difference between them) are each for a request that was scheduled to a (the?) slow connection and have not yet had a chance to use cc.wmu. That gives an indication of how long the crypto/tls.(*Conn).Write call has been blocked (a long time).

@stampy88
Copy link
Author

stampy88 commented Jan 9, 2023

Hi @rhysh , well, the bug isn't necessarily the go routines themselves, but rather the potential for these resources, e.g. go routines, memory, etc. to keep growing because of the problem you stated.

@adg
Copy link
Contributor

adg commented Jun 19, 2023

I think I'm seeing a manifestation of the same problem here. HTTP/2 requests where the context deadline has been reached, but the requests remain outstanding, with each goroutine blocked trying to acquire a mutex in abortStream. In this case, the upstream server is not responding. Here are the active goroutines:

goroutine profile: total 45
7 @ 0x43dd76 0x44f4af 0x44f486 0x46dd06 0x47c925 0x732e46 0x732e14 0x737d0e 0x733ae5 0x73363b 0x7678cb 0x74e7f9 0x70a9d7 0x70a25b 0x70c55b 0x101cedd 0x101cecc 0x101d6fd 0x101d41a 0x101f2d3 0x101e28f 0x10484b4 0x1046ae5 0x1038225 0x1042345 0x96a258 0x103bf23 0x950039 0xb79d43 0x950039 0xae3659 0x94fdcf
#	0x46dd05	sync.runtime_SemacquireMutex+0x25							/usr/local/go/src/runtime/sema.go:77
#	0x47c924	sync.(*Mutex).lockSlow+0x164								/usr/local/go/src/sync/mutex.go:171
#	0x732e45	sync.(*Mutex).Lock+0x65									/usr/local/go/src/sync/mutex.go:90
#	0x732e13	net/http.(*http2clientStream).abortStream+0x33						/usr/local/go/src/net/http/h2_bundle.go:7373
#	0x737d0d	net/http.(*http2ClientConn).RoundTrip+0x56d						/usr/local/go/src/net/http/h2_bundle.go:8248
#	0x733ae4	net/http.(*http2Transport).RoundTripOpt+0x1c4						/usr/local/go/src/net/http/h2_bundle.go:7523
#	0x73363a	net/http.(*http2Transport).RoundTrip+0x1a						/usr/local/go/src/net/http/h2_bundle.go:7475
#	0x7678ca	net/http.(*Transport).roundTrip+0x7ea							/usr/local/go/src/net/http/transport.go:601
#	0x74e7f8	net/http.(*Transport).RoundTrip+0x18							/usr/local/go/src/net/http/roundtrip.go:17
#	0x70a9d6	net/http.send+0x5f6									/usr/local/go/src/net/http/client.go:252
#	0x70a25a	net/http.(*Client).send+0x9a								/usr/local/go/src/net/http/client.go:176
#	0x70c55a	net/http.(*Client).do+0x8fa								/usr/local/go/src/net/http/client.go:716
#	0x101cedc	net/http.(*Client).Do+0x11c								/usr/local/go/src/net/http/client.go:582
#	0x101cecb	redacted ...
#	0x101d6fc	redacted ...
#	0x101d419	redacted ...
#	0x101f2d2	redacted ...
#	0x101e28e	redacted ...
#	0x10484b3	redacted ...
#	0x1046ae4	redacted ...
#	0x1038224	redacted ...
#	0x1042344	redacted ...
#	0x96a257	redacted ...
#	0x103bf22	redacted ...
#	0x950038	google.golang.org/grpc.getChainUnaryHandler.func1+0xb8					/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1164
#	0xb79d42	redacted ...
#	0x950038	google.golang.org/grpc.getChainUnaryHandler.func1+0xb8					/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1164
#	0xae3658	redacted ...
#	0x94fdce	google.golang.org/grpc.chainUnaryInterceptors.func1+0x8e				/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1155

7 @ 0x43dd76 0x44f4af 0x44f486 0x46dd06 0x47c925 0x73944a 0x739413 0x7380ee 0x471f81
#	0x46dd05	sync.runtime_SemacquireMutex+0x25			/usr/local/go/src/runtime/sema.go:77
#	0x47c924	sync.(*Mutex).lockSlow+0x164				/usr/local/go/src/sync/mutex.go:171
#	0x739449	sync.(*Mutex).Lock+0x89					/usr/local/go/src/sync/mutex.go:90
#	0x739412	net/http.(*http2clientStream).cleanupWriteRequest+0x52	/usr/local/go/src/net/http/h2_bundle.go:8476
#	0x7380ed	net/http.(*http2clientStream).doRequest+0x2d		/usr/local/go/src/net/http/h2_bundle.go:8262

7 @ 0x43dd76 0x44f4af 0x44f486 0x46dd06 0x47c925 0x73ce8a 0x73ce77 0x43aa33 0x45223d 0x45220d 0x72235c 0x73fe34 0x73d874 0x73cacf 0x471f81
#	0x46dd05	sync.runtime_SemacquireMutex+0x25			/usr/local/go/src/runtime/sema.go:77
#	0x47c924	sync.(*Mutex).lockSlow+0x164				/usr/local/go/src/sync/mutex.go:171
#	0x73ce89	sync.(*Mutex).Lock+0x149				/usr/local/go/src/sync/mutex.go:90
#	0x73ce76	net/http.(*http2clientConnReadLoop).cleanup+0x136	/usr/local/go/src/net/http/h2_bundle.go:9125
#	0x43aa32	runtime.gopanic+0x212					/usr/local/go/src/runtime/panic.go:884
#	0x45223c	runtime.panicmem+0x37c					/usr/local/go/src/runtime/panic.go:260
#	0x45220c	runtime.sigpanic+0x34c					/usr/local/go/src/runtime/signal_unix.go:841
#	0x72235b	net/http.(*http2pipe).Write+0x17b			/usr/local/go/src/net/http/h2_bundle.go:3710
#	0x73fe33	net/http.(*http2clientConnReadLoop).processData+0x253	/usr/local/go/src/net/http/h2_bundle.go:9642
#	0x73d873	net/http.(*http2clientConnReadLoop).run+0x3f3		/usr/local/go/src/net/http/h2_bundle.go:9221
#	0x73cace	net/http.(*http2ClientConn).readLoop+0x6e		/usr/local/go/src/net/http/h2_bundle.go:9082

2 @ 0x43dd76 0x436397 0x46c229 0x4ebff2 0x4ed3d9 0x4ed3c7 0x649189 0x65ac65 0x6a26fd 0x4ac3f8 0x6a28e5 0x69fdd6 0x6a5ccf 0x6a5cd0 0x56cfbb 0x4a35ba 0x8f6c4e 0x8f6c08 0x8f7495 0x920574 0x94e409 0x94dc86 0x471f81
#	0x46c228	internal/poll.runtime_pollWait+0x88						/usr/local/go/src/runtime/netpoll.go:306
#	0x4ebff1	internal/poll.(*pollDesc).wait+0x31						/usr/local/go/src/internal/poll/fd_poll_runtime.go:84
#	0x4ed3d8	internal/poll.(*pollDesc).waitRead+0x298					/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
#	0x4ed3c6	internal/poll.(*FD).Read+0x286							/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x649188	net.(*netFD).Read+0x28								/usr/local/go/src/net/fd_posix.go:55
#	0x65ac64	net.(*conn).Read+0x44								/usr/local/go/src/net/net.go:183
#	0x6a26fc	crypto/tls.(*atLeastReader).Read+0x3c						/usr/local/go/src/crypto/tls/conn.go:788
#	0x4ac3f7	bytes.(*Buffer).ReadFrom+0x97							/usr/local/go/src/bytes/buffer.go:202
#	0x6a28e4	crypto/tls.(*Conn).readFromUntil+0xe4						/usr/local/go/src/crypto/tls/conn.go:810
#	0x69fdd5	crypto/tls.(*Conn).readRecordOrCCS+0x115					/usr/local/go/src/crypto/tls/conn.go:617
#	0x6a5cce	crypto/tls.(*Conn).readRecord+0x16e						/usr/local/go/src/crypto/tls/conn.go:583
#	0x6a5ccf	crypto/tls.(*Conn).Read+0x16f							/usr/local/go/src/crypto/tls/conn.go:1316
#	0x56cfba	bufio.(*Reader).Read+0x1ba							/usr/local/go/src/bufio/bufio.go:237
#	0x4a35b9	io.ReadAtLeast+0x99								/usr/local/go/src/io/io.go:332
#	0x8f6c4d	io.ReadFull+0x6d								/usr/local/go/src/io/io.go:351
#	0x8f6c07	golang.org/x/net/http2.readFrameHeader+0x27					/builder/home/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:237
#	0x8f7494	golang.org/x/net/http2.(*Framer).ReadFrame+0x94					/builder/home/go/pkg/mod/golang.org/x/net@v0.10.0/http2/frame.go:498
#	0x920573	google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams+0x173	/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_server.go:637
#	0x94e408	google.golang.org/grpc.(*Server).serveStreams+0x188				/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:950
#	0x94dc85	google.golang.org/grpc.(*Server).handleRawConn.func1+0x45			/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:892

2 @ 0x43dd76 0x44e3be 0x9052f5 0x905a51 0x91d16e 0x471f81
#	0x9052f4	google.golang.org/grpc/internal/transport.(*controlBuffer).get+0x114	/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/controlbuf.go:417
#	0x905a50	google.golang.org/grpc/internal/transport.(*loopyWriter).run+0x90	/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/controlbuf.go:549
#	0x91d16d	google.golang.org/grpc/internal/transport.NewServerTransport.func2+0xcd	/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_server.go:336

2 @ 0x43dd76 0x44e3be 0x924993 0x471f81
#	0x924992	google.golang.org/grpc/internal/transport.(*http2Server).keepalive+0x232	/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/internal/transport/http2_server.go:1150

1 @ 0x40e794 0x46e40f 0xff9319 0x471f81
#	0x46e40e	os/signal.signal_recv+0x2e	/usr/local/go/src/runtime/sigqueue.go:152
#	0xff9318	os/signal.loop+0x18		/usr/local/go/src/os/signal/signal_unix.go:23

1 @ 0x432a76 0x46bd85 0xb15ab5 0xb158cd 0xb1264b 0xb72c45 0xb73793 0x75768f 0x759029 0x75a7d6 0x7561b2 0x471f81
#	0x46bd84	runtime/pprof.runtime_goroutineProfileWithLabels+0x24	/usr/local/go/src/runtime/mprof.go:844
#	0xb15ab4	runtime/pprof.writeRuntimeProfile+0xb4			/usr/local/go/src/runtime/pprof/pprof.go:734
#	0xb158cc	runtime/pprof.writeGoroutine+0x4c			/usr/local/go/src/runtime/pprof/pprof.go:694
#	0xb1264a	runtime/pprof.(*Profile).WriteTo+0x14a			/usr/local/go/src/runtime/pprof/pprof.go:329
#	0xb72c44	net/http/pprof.handler.ServeHTTP+0x4a4			/usr/local/go/src/net/http/pprof/pprof.go:259
#	0xb73792	net/http/pprof.Index+0xf2				/usr/local/go/src/net/http/pprof/pprof.go:376
#	0x75768e	net/http.HandlerFunc.ServeHTTP+0x2e			/usr/local/go/src/net/http/server.go:2122
#	0x759028	net/http.(*ServeMux).ServeHTTP+0x148			/usr/local/go/src/net/http/server.go:2500
#	0x75a7d5	net/http.serverHandler.ServeHTTP+0x315			/usr/local/go/src/net/http/server.go:2936
#	0x7561b1	net/http.(*conn).serve+0x611				/usr/local/go/src/net/http/server.go:1995

1 @ 0x43dd76 0x40901d 0x408b18 0xb76f13 0x103ba4b 0xb57d73 0x471f81
#	0xb76f12	redacted ...
#	0x103ba4a	redacted ...
#	0xb57d72	redacted ...

1 @ 0x43dd76 0x40901d 0x408b18 0xb770c9 0x471f81
#	0xb770c8	redacted ...

1 @ 0x43dd76 0x40901d 0x408b18 0xff97f5 0x471f81
#	0xff97f4	redacted ...

1 @ 0x43dd76 0x40901d 0x408b58 0xb55bbc 0x471f81
#	0xb55bbb	redacted ...

1 @ 0x43dd76 0x436397 0x46c229 0x4ebff2 0x4ed3d9 0x4ed3c7 0x649189 0x65ac65 0x750351 0x56c9df 0x56cb3d 0x75631c 0x471f81
#	0x46c228	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:306
#	0x4ebff1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:84
#	0x4ed3d8	internal/poll.(*pollDesc).waitRead+0x298	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
#	0x4ed3c6	internal/poll.(*FD).Read+0x286			/usr/local/go/src/internal/poll/fd_unix.go:167
#	0x649188	net.(*netFD).Read+0x28				/usr/local/go/src/net/fd_posix.go:55
#	0x65ac64	net.(*conn).Read+0x44				/usr/local/go/src/net/net.go:183
#	0x750350	net/http.(*connReader).Read+0x170		/usr/local/go/src/net/http/server.go:782
#	0x56c9de	bufio.(*Reader).fill+0xfe			/usr/local/go/src/bufio/bufio.go:106
#	0x56cb3c	bufio.(*Reader).Peek+0x5c			/usr/local/go/src/bufio/bufio.go:144
#	0x75631b	net/http.(*conn).serve+0x77b			/usr/local/go/src/net/http/server.go:2030

1 @ 0x43dd76 0x436397 0x46c229 0x4ebff2 0x4f18fd 0x4f18eb 0x64b315 0x663c25 0x662d1d 0x75ad45 0x1033f8a 0x1033f8b 0x471f81
#	0x46c228	internal/poll.runtime_pollWait+0x88		/usr/local/go/src/runtime/netpoll.go:306
#	0x4ebff1	internal/poll.(*pollDesc).wait+0x31		/usr/local/go/src/internal/poll/fd_poll_runtime.go:84
#	0x4f18fc	internal/poll.(*pollDesc).waitRead+0x2bc	/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
#	0x4f18ea	internal/poll.(*FD).Accept+0x2aa		/usr/local/go/src/internal/poll/fd_unix.go:614
#	0x64b314	net.(*netFD).accept+0x34			/usr/local/go/src/net/fd_unix.go:172
#	0x663c24	net.(*TCPListener).accept+0x24			/usr/local/go/src/net/tcpsock_posix.go:148
#	0x662d1c	net.(*TCPListener).Accept+0x3c			/usr/local/go/src/net/tcpsock.go:297
#	0x75ad44	net/http.(*Server).Serve+0x384			/usr/local/go/src/net/http/server.go:3059
#	0x1033f89	net/http.Serve+0x49				/usr/local/go/src/net/http/server.go:2581
#	0x1033f8a	redacted ...

1 @ 0x43dd76 0x436397 0x46c229 0x4ebff2 0x4f18fd 0x4f18eb 0x64b315 0x663c25 0x662d1d 0x94d2b5 0xb7704e 0x471f81
#	0x46c228	internal/poll.runtime_pollWait+0x88				/usr/local/go/src/runtime/netpoll.go:306
#	0x4ebff1	internal/poll.(*pollDesc).wait+0x31				/usr/local/go/src/internal/poll/fd_poll_runtime.go:84
#	0x4f18fc	internal/poll.(*pollDesc).waitRead+0x2bc			/usr/local/go/src/internal/poll/fd_poll_runtime.go:89
#	0x4f18ea	internal/poll.(*FD).Accept+0x2aa				/usr/local/go/src/internal/poll/fd_unix.go:614
#	0x64b314	net.(*netFD).accept+0x34					/usr/local/go/src/net/fd_unix.go:172
#	0x663c24	net.(*TCPListener).accept+0x24					/usr/local/go/src/net/tcpsock_posix.go:148
#	0x662d1c	net.(*TCPListener).Accept+0x3c					/usr/local/go/src/net/tcpsock.go:297
#	0x94d2b4	google.golang.org/grpc.(*Server).Serve+0x474			/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:824
#	0xb7704d	redacted ...

1 @ 0x43dd76 0x44e3be 0x103a5c5 0x471f81
#	0x103a5c4	redacted ...

1 @ 0x43dd76 0x44e3be 0x51964d 0x471f81
#	0x51964c	database/sql.(*DB).connectionOpener+0x8c	/usr/local/go/src/database/sql/sql.go:1218

1 @ 0x43dd76 0x44e3be 0xb2540d 0x471f81
#	0xb2540c	go.opencensus.io/stats/view.(*worker).start+0xac	/builder/home/go/pkg/mod/go.opencensus.io@v0.24.0/stats/view/worker.go:292

1 @ 0x43dd76 0x44e3be 0xb33c89 0xb32310 0x103bcb1 0x953379 0xb7a2d2 0x953379 0xae3ac5 0x95310f 0x954723 0x955f50 0x94e7d8 0x471f81
#	0xb33c88	google.golang.org/grpc/health.(*Server).Watch+0x2c8			/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/health/server.go:92
#	0xb3230f	google.golang.org/grpc/health/grpc_health_v1._Health_Watch_Handler+0xcf	/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/health/grpc_health_v1/health_grpc.pb.go:187
#	0x103bcb0	redacted ...
#	0x953378	google.golang.org/grpc.getChainStreamHandler.func1+0xb8			/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1491
#	0xb7a2d1	redacted ...
#	0x953378	google.golang.org/grpc.getChainStreamHandler.func1+0xb8			/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1491
#	0xae3ac4	redacted ...
#	0x95310e	google.golang.org/grpc.chainStreamInterceptors.func1+0x8e		/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1482
#	0x954722	google.golang.org/grpc.(*Server).processStreamingRPC+0x1362		/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1646
#	0x955f4f	google.golang.org/grpc.(*Server).handleStream+0x9ef			/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:1726
#	0x94e7d7	google.golang.org/grpc.(*Server).serveStreams.func1.2+0x97		/builder/home/go/pkg/mod/google.golang.org/grpc@v1.54.0/server.go:966

1 @ 0x43dd76 0x44e3be 0xb38c65 0x471f81
#	0xb38c64	redacted ...

1 @ 0x43dd76 0x44f4af 0x44f486 0x46dbe7 0x47e42b 0xb57c6f 0x1033d2b 0x43d947 0x471f81
#	0x46dbe6	sync.runtime_Semacquire+0x26					/usr/local/go/src/runtime/sema.go:62
#	0x47e42a	sync.(*WaitGroup).Wait+0x4a					/usr/local/go/src/sync/waitgroup.go:116
#	0xb57c6e	redacted ...
#	0x1033d2a	redacted ...
#	0x43d946	runtime.main+0x206						/usr/local/go/src/runtime/proc.go:250

1 @ 0x43dd76 0x46ed75 0xb54a32 0x471f81
#	0x46ed74	time.Sleep+0x134						/usr/local/go/src/runtime/time.go:195
#	0xb54a31	redacted ...

1 @ 0x471f81

1 @ 0x488cd6 0x8970b8 0xb3671d 0xb359d5 0x471f81
#	0x488cd5	syscall.Syscall6+0x35						/usr/local/go/src/syscall/syscall_linux.go:91
#	0x8970b7	golang.org/x/sys/unix.EpollWait+0x57				/builder/home/go/pkg/mod/golang.org/x/sys@v0.8.0/unix/zsyscall_linux_amd64.go:56
#	0xb3671c	github.com/fsnotify/fsnotify.(*fdPoller).wait+0x7c		/builder/home/go/pkg/mod/github.com/fsnotify/fsnotify@v1.5.1/inotify_poller.go:87
#	0xb359d4	github.com/fsnotify/fsnotify.(*Watcher).readEvents+0x274	/builder/home/go/pkg/mod/github.com/fsnotify/fsnotify@v1.5.1/inotify.go:193

@pond177
Copy link

pond177 commented Jan 21, 2024

How to fix😂

@90wukai
Copy link

90wukai commented Mar 25, 2024

@stampy88 Do you have any ideas for fix it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants