New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: http.Transport leaks request objects in certain situations #51354
Comments
Additional notes that I didn't think belong in the issue body:
|
CC @neild |
Follows up on CL 245357 and adds missing returns in waitCondition (CL 477196) Fixes golang#51354
This is because of Line 546 in ed8cbaf
Lines 661 to 665 in ed8cbaf
I think this was overlooked in CL https://go-review.googlesource.com/c/go/+/245357 fixing #40453 In the client, when deadline (Timeout) Lines 193 to 198 in ed8cbaf
is not zero Lines 352 to 355 in ed8cbaf
there is also a piece that calls Transport.CancelRequest(req) on the original request Lines 380 to 389 in ed8cbaf
which breaks if custom transport does not implement CancelRequest like in the subject reproducer. The reproducer https://go.dev/play/p/iUpPf4G8vZy could be fixed by adding func (*Transport) CancelRequest(r *http.Request) {
http.DefaultTransport.(*http.Transport).CancelRequest(r)
} |
Change https://go.dev/cl/515435 mentions this issue: |
Follows up on CL 245357 and adds missing returns in waitCondition (CL 477196) Fixes golang#51354 Change-Id: I7950ff889ad72c4927a969c35fedc0186e863bd6 GitHub-Last-Rev: 52ce05b GitHub-Pull-Request: golang#61724 Reviewed-on: https://go-review.googlesource.com/c/go/+/515435 Reviewed-by: Damien Neil <dneil@google.com> Reviewed-by: Bryan Mills <bcmills@google.com> Run-TryBot: Damien Neil <dneil@google.com> Run-TryBot: Bryan Mills <bcmills@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Auto-Submit: Damien Neil <dneil@google.com>
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Go 1.17.7 is the latest stable release at the time of writing. I was also able to reproduce the issue on the current "Go dev branch" version on the playground.
What operating system and processor architecture are you using (
go env
)?go env
OutputWe also observed this issue in various environments using the
golang:1.17
Docker image on commodity cloud hardware, and our reproduction case demonstrates erroneous behavior in the Go playground.What did you do?
https://go.dev/play/p/iUpPf4G8vZy is a small example demonstrating this behavior. We start a local HTTP server, and then create two
http.Client
s with timeouts set. One client useshttp.DefaultTransport
directly, and the other uses a localTransport
type that trivially wrapshttp.DefaultTransport
. Then, we make a large number ofPOST
requests to that local HTTP server using each client, allow them to time out by neglecting to read the response body, and read the size ofhttp.Transport.reqCanceler
using reflection.(The issue can be reproduced with
const count = 1
set as well, though since there's a race condition involved it may take several runs of the test program to see an incorrect result.)What did you expect to see?
Our understanding is that
http.Transport.reqCanceler
should always be empty when there are no HTTP requests in flight. Since it maps "requests" to "functions that cancel requests", once a request has completed, errored, timed out, or otherwise entered a state where "cancellation" is not a meaningful operation, we'd expect that request to be removed fromreqCanceler
. For this example, we would expect to see three lines ofcount = 0
printed.What did you see instead?
Under certain conditions (described below), a request will not be correctly removed from
reqCanceler
. This results in the request and associated objects being unable to be garbage-collected until the associated HTTP transport is dropped, which could be "for the lifetime of the program" if the associated transport ishttp.DefaultTransport
. In the example, the last line of the output has a non-zerocount
despite there being no outstanding HTTP requests. In production, we observed excessive heap use involvingpersistConn
objects, leading in some cases to application servers running out of memory.A detailed investigation
We believe the problem to be line 2220 of net/http/transport.go:
The original HTTP request has been stored in
pc.t.reqCanceler
underrc.cancelKey
. If the original HTTP request had a body (like with aPOST
),rc.req
will not be the same request thatrc.cancelKey
refers to, and if the<-rc.req.Cancel
case is executed, this code will remove the wrong key fromreqCanceler
(often one that doesn't exist to begin with). If no otherCancelRequest
orcancelRequest
call is made with the correct cancel key, then the request stored inreqCanceler
underrc.cancelKey
will not be removed and will remain attached to the transport.This will take place if:
net.http.knownRoundTripperImpl
returnedfalse
for the given request, causingsetRequestCancel
to setreq.Cancel
to a new channel and start a timer to close it after the request's timeout period expires, andreq.Context().Done()
. Since both timers will have the same timeout, this does not always happen. In our testing, we found it occurs slightly more than half the time.Should all these things happen, only the
<-rc.req.Cancel
case will be executed.We believe the correct fix for this problem would be to replace that line with a call to
cancelRequest
with the correctrc.cancelKey
, similar to what's done later on in http.Transport (in fact, this case will trigger if you do read the response body and cancel the request correctly, which is why that's a precondition). We have been experimenting with an appropriately-patchednet/http
and found it to resolve the issue, both in that the test case given will correctly print three lines ofcount = 0
and that our production memory issues ceased.The text was updated successfully, but these errors were encountered: