New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: using Transport.IdleConnTimeout may lead to goroutine leak in an edge case #25621
Comments
CC @bradfitz |
A belated thanks for the fun bug report. Sorry for missing this earlier. I'm able to reproduce it and have started debugging. It's not immediately obvious what's happening, but I'll keep looking. |
One problem is the rsp.Body.Close() shouldn't be deferred since we don't ever exit the block. That doesn't fix the underlying issue. |
@fraenkel, no that's not the issue. It is in a func scope above, so the defer does work. But the bug also repros with an immediate Body.Close before the sleep. I've found that it's related to the Transport's idle connection management, but then I had to do something else. I'll look at it again today. |
Each increase of 3 goroutines (the Transport persistConn's readLoop+writeLoop + the server side handling it) happens right after each time this "not idle" case sometimes fires: func (pc *persistConn) closeConnIfStillIdle() {
// ....
if _, ok := t.idleLRU.m[pc]; !ok {
// Not idle.
return
} So whatever is removing it from that map isn't also closing the conn, which would break the readLoop's Peek which would make that goroutine return, closing the writeLoop, and the server goroutine would end when the conn closed also. |
The cases which fail always are connections pulled from getIdleConn. |
So from what I see, we call getIdleConn, and then see there is an idleTimer associated with it so its removed from the LRU but not returned. Then the closeConnIfStillIdle is invoked and decides its not idle, therefore we lose the pconn and its never closed. |
Yeah, that check just seems wrong. Sending a CL with explanation. |
Change https://golang.org/cl/123315 mentions this issue: |
What version of Go are you using (
go version
)?1.10.2
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?What did you do?
I ran an HTTP short-polling client and the polling timeout value was equal to the IdleConnTimeout of the used http.Transport.
What did you expect to see?
I expected the existing connection from the pool either reused or not, without other issues (even if having the same timeouts for the short-polling and the IdleConnTimeout is quite an edge case).
What did you see instead?
I saw a steadily growing number of goroutines and memory usage.
The text was updated successfully, but these errors were encountered: