Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: RoundTrip and/or CloseIdleConnections may leave connections in use after returning #50027

Open
bcmills opened this issue Dec 7, 2021 · 3 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Dec 7, 2021

In #43176, we observed that x/net/http2.TestTransportGroupsPendingDials sometimes fails on various platforms.

The test performs multiple concurrent calls to a Transport's RoundTrip method, waits for them to return, closes the Response bodies, and then calls CloseIdleConnections. If all of those functions are synchronous — their signatures suggest that they should be, and nothing in their documentation states otherwise — then the Transport's client connection pool should be empty as soon as that sequence is complete.

However, the connection pool is empirically not empty at that point, and to my knowledge never has been. The test has had a retry loop ever since it was added (in CL 17134), and even then a single call to CloseIdleConnections before the retry loop is empirically not enough to ensure that it succeeds.

We should add more explicit tests for the expected invariants after RoundTrip and CloseIdleConnections, and/or update the documentation for those methods (and the corresponding methods and interfaces in the net/http package) to make those invariants clearer for end users, who may want to make use of any such invariants in their own tests.

(CC @neild @tombergan @bradfitz)

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 7, 2021
@bcmills bcmills added this to the Backlog milestone Dec 7, 2021
@gopherbot
Copy link

Change https://golang.org/cl/369936 mentions this issue: http2: in TestTransportGroupsPendingDials, call CloseIdleConnections on every retry

@neild
Copy link
Contributor

neild commented Dec 7, 2021

CloseIdleConnections closes all idle connections synchronously.

"Closing" a connection marks the connection as closed (sets *ClientConn.closed), so it should never be used for any future requests. It also closes the connection net.Conn. However, closing a connection does not block until the connection read loop returns, and the read loop is what removes the *ClientConn from the connection pool.

So after CloseIdleConnections, some connections may be no longer in use, but still temporarily in the connection pool.

Perhaps closing a connection should block until the connection goroutines have all returned.

@edaniels
Copy link
Contributor

It would be great if CloseIdleConnections waited for all goroutines to have returned. Right now, after an error in a http.Client.Do, closing idle connections results in at least 1 goroutine leaked:

Goroutine 184 in state IO wait, with internal/poll.runtime_pollWait on top of the stack:
goroutine 184 [IO wait]:
internal/poll.runtime_pollWait(0x111747278, 0x72)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/runtime/netpoll.go:234 +0xa4
internal/poll.(*pollDesc).wait(0x14000628798, 0x72, 0x0)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/internal/poll/fd_poll_runtime.go:84 +0x38
internal/poll.(*pollDesc).waitRead(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0x14000628780, {0x1400086a000, 0x163b, 0x163b})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/internal/poll/fd_unix.go:167 +0x1dc
net.(*netFD).Read(0x14000628780, {0x1400086a000, 0x163b, 0x163b})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/fd_posix.go:56 +0x44
net.(*conn).Read(0x140006de010, {0x1400086a000, 0x163b, 0x163b})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/net.go:183 +0x4c
crypto/tls.(*atLeastReader).Read(0x14000748078, {0x1400086a000, 0x163b, 0x163b})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/crypto/tls/conn.go:777 +0x58
bytes.(*Buffer).ReadFrom(0x140004cb3f8, {0x108108aa0, 0x14000748078})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/bytes/buffer.go:204 +0xa4
crypto/tls.(*Conn).readFromUntil(0x140004cb180, {0x111874818, 0x140006de010}, 0x5)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/crypto/tls/conn.go:799 +0xe0
crypto/tls.(*Conn).readRecordOrCCS(0x140004cb180, 0x0)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/crypto/tls/conn.go:606 +0xf4
crypto/tls.(*Conn).readRecord(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/crypto/tls/conn.go:574
crypto/tls.(*Conn).Read(0x140004cb180, {0x14000728000, 0x1000, 0x1000})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/crypto/tls/conn.go:1277 +0x164
bufio.(*Reader).Read(0x140002e1560, {0x1400059c578, 0x9, 0x9})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/bufio/bufio.go:227 +0x20c
io.ReadAtLeast({0x1081088c0, 0x140002e1560}, {0x1400059c578, 0x9, 0x9}, 0x9)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/io/io.go:328 +0xa0
io.ReadFull(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/io/io.go:347
net/http.http2readFrameHeader({0x1400059c578, 0x9, 0x9}, {0x1081088c0, 0x140002e1560})
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:1555 +0x5c
net/http.(*http2Framer).ReadFrame(0x1400059c540)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:1813 +0x90
net/http.(*http2clientConnReadLoop).run(0x1400007dfb0)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:8608 +0x10c
net/http.(*http2ClientConn).readLoop(0x1400079c600)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:8531 +0x58
created by net/http.(*http2Transport).newClientConn
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:7325 +0xb7c
[originating from goroutine 183]:
net/http.(*http2Transport).newClientConn(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:7326 +0xb7c
net/http.(*http2Transport).NewClientConn(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:7256 +0x50
net/http.(*http2addConnCall).run(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:916 +0x3c
created by net/http.(*http2clientConnPool).addConnIfNeeded
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:895 +0x314
[originating from goroutine 227]:
net/http.(*http2clientConnPool).addConnIfNeeded(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:903 +0x314
net/http.http2configureTransports.func1(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/h2_bundle.go:6833 +0x80
net/http.(*Transport).dialConn(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/transport.go:1736 +0x14d4
net/http.(*Transport).dialConnFor(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/transport.go:1446 +0x90
created by net/http.(*Transport).queueForDial
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/transport.go:1415 +0x37c
[originating from goroutine 58]:
net/http.(*Transport).queueForDial(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/transport.go:1416 +0x37c
net/http.(*Transport).getConn(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/transport.go:1373 +0x420
net/http.(*Transport).roundTrip(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/transport.go:581 +0x7cc
net/http.(*Transport).RoundTrip(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/roundtrip.go:18 +0x30
net/http.send(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/client.go:252 +0x594
net/http.(*Client).send(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/client.go:176 +0xa4
net/http.(*Client).do(...)
	/opt/homebrew/Cellar/go/1.17.5/libexec/src/net/http/client.go:725 +0x858

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants