Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: receiving DATA after sending (RST_STREAM) kills connection. #16974

Closed
bmizerany opened this issue Sep 2, 2016 · 18 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@bmizerany
Copy link
Contributor

bmizerany commented Sep 2, 2016

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version devel +e492d9f Tue Aug 16 21:52:44 2016 +0000 darwin/amd64

I'm also using the http2 lib directly at: 1358eff22f0dd0c54fc521042cc607f6ff4b531a

What operating system and processor architecture are you using (go env)?

$ go env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/bmizerany"
GORACE=""
GOROOT="/usr/local/go"
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/dx/jcs03lw901vc05cjjqyjf8y00000gn/T/go-build664828230=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"

What did you do?

I'm seeing DATA frames come in after sending a RST_STREAM which causes a write on a closed pipe buffer here:

https://github.com/golang/net/blob/master/http2/transport.go#L1651

Log:

client: http2: Framer 0xc420deefc0: wrote RST_STREAM stream=477 len=4 ErrCode=CANCEL
client: http2: Framer 0xc420deefc0: read DATA flags=END_STREAM stream=477 len=23 data="\x1f\x8b\b\x00\x00\tn\x88\x00\xff\x01\x00\x00\xff\xff\x00\x00\x00\x00\x
client: http2: Transport received DATA flags=END_STREAM stream=477 len=23 data="\x1f\x8b\b\x00\x00\tn\x88\x00\xff\x01\x00\x00\xff\xff\x00\x00\x00\x00\x00\x00\
client: http2: Transport conn 0xc4218b0ea0 received error from processing frame DATA flags=END_STREAM stream=477 len=23 data="\x1f\x8b\b\x00\x00\tn\x88\x00\xf
server: http2: Framer 0xc4201600c0: read RST_STREAM stream=477 len=4 ErrCode=CANCEL
server: http2: server read frame RST_STREAM stream=477 len=4 ErrCode=CANCEL

What did you expect to see?

The transport ignore the frame.

What did you see instead?

The transport attempts to write data to the response body causing an error and that error returned high enough to close the connection and error out all inflight requests.

@bmizerany bmizerany changed the title net/http: recv DATA after respond body close (RST_STREAM) kills connection. net/http: recv DATA after response body close (RST_STREAM) kills connection. Sep 2, 2016
@bmizerany bmizerany changed the title net/http: recv DATA after response body close (RST_STREAM) kills connection. net/http: receiving DATA after sending (RST_STREAM) kills connection. Sep 2, 2016
@quentinmit quentinmit added this to the Go1.8 milestone Sep 6, 2016
@quentinmit
Copy link
Contributor

/cc @bradfitz

@quentinmit quentinmit added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 7, 2016
@bradfitz
Copy link
Contributor

@bmizerany, from our in person chat, I seem to recall you thought this was a false alarm?

Your cited logs don't seem to say the same thing as your description.

Is this still relevant?

@bradfitz bradfitz added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Oct 21, 2016
@bradfitz
Copy link
Contributor

Blake, ping.

@bmizerany
Copy link
Contributor Author

Sorry. I believe this is still a thing but can be fixed by discarding the response body before the close as to allow the transport to complete the state machine. This isn't how h1 behaves though. I assumed closing the body before reading would automatically discard the body.

@bradfitz
Copy link
Contributor

I don't follow. Closing the body before reading should automatically discard the body. Please provide evidence that it doesn't.

@bmizerany
Copy link
Contributor Author

bmizerany commented Oct 26, 2016

Closing the body before reading should automatically discard the body.

Correct, but it doesn't always do that. We've seen the race in production and were able to reproduce after long runs of our server under load locally to get the debug log output that I've posted above and annotated below.

Annotated Logs

The client closes the Response.Body BEFORE receiving DATA frames from the server. This sends a RST_STREAM to the server and then closes the bodies pipe with the non-nil error errClosedResponseBody but DOES NOT remove the stream from the list of open streams.

client: http2: Framer 0xc420deefc0: wrote RST_STREAM stream=477 len=4 ErrCode=CANCEL
client: http2: Framer 0xc420deefc0: read DATA flags=END_STREAM stream=477 len=23 data="\x1f\x8b\b\x00\x00\tn\x88\x00\xff\x01\x00\x00\xff\xff\x00\x00\x00\x00\x

See: https://github.com/golang/net/blob/master/http2/transport.go#L1608

Next the DATA frame for the now closed stream comes in:

client: http2: Transport received DATA flags=END_STREAM stream=477 len=23 data="\x1f\x8b\b\x00\x00\tn\x88\x00\xff\x01\x00\x00\xff\xff\x00\x00\x00\x00\x00\x00\

The client sees the stream in the list of open streams:
https://github.com/golang/net/blob/master/http2/transport.go#L1614

and then attempts to write to the closed pipe:
https://github.com/golang/net/blob/master/http2/transport.go#L1672

which returns an err:
https://github.com/golang/net/blob/master/http2/transport.go#L1674

which causes run to err:
https://github.com/golang/net/blob/master/http2/transport.go#L1315

produce debug logs:
https://github.com/golang/net/blob/master/http2/transport.go#L1336

client: http2: Transport conn 0xc4218b0ea0 received error from processing frame DATA flags=END_STREAM stream=477 len=23 data="\x1f\x8b\b\x00\x00\tn\x88\x00\xf

then return the err:
https://github.com/golang/net/blob/master/http2/transport.go#L1338

and close the conn:
https://github.com/golang/net/blob/master/http2/transport.go#L1210

Here is the servers log for ref:

server: http2: Framer 0xc4201600c0: read RST_STREAM stream=477 len=4 ErrCode=CANCEL
server: http2: server read frame RST_STREAM stream=477 len=4 ErrCode=CANCEL

Does that help? I was having trouble getting a test together to isolate at the time but was able to reproduce with private code if I didn't drain the body before closing which keeps the pipe in a non-errored state to avoid the path above.

I hope this clears things up.

@bradfitz
Copy link
Contributor

Cool, thanks. That's great. I should be able to write a test for that. I have a pending change from earlier to fix this, but without a repro I didn't like the fix.

@bmizerany
Copy link
Contributor Author

No problem. Thank you for taking a look at this.

@bradfitz
Copy link
Contributor

Are you sure you're running Go 1.7 or tip?

I wrote a test for this, but it doesn't occur like you describe. You wrote:

See: https://github.com/golang/net/blob/master/http2/transport.go#L1608

That line for reference is:

cs.bufPipe.BreakWithError(errClosedResponseBody)

... which sets the pipe's p.breakErr:

        breakErr error         // immediate read error (caller doesn't see rest of b)

It does not set its p.err.

Future Writes occur fine.

There's likely still a problem here (we should note that we gave up on that stream and not even try to write), but the problem isn't a you describe, and I still can't make the connection die as you've said.

What version are you running?

@bradfitz
Copy link
Contributor

I sent https://go-review.googlesource.com/32233 but I still have no clue what is happening for you.

Please try that out and see if there's any difference.

I still want to know what you're hitting.

Can you reproduce it easily?

@gopherbot
Copy link

CL https://golang.org/cl/32233 mentions this issue.

@bmizerany
Copy link
Contributor Author

I was able to repro easily in an old commit of our codebase. I took out the discard and got the same logs as above after several minutes of the server running.

@bradfitz
Copy link
Contributor

What version of the http2 code are you running?

@bmizerany
Copy link
Contributor Author

Our vendor-log shows 6c89f9617983ee917132513a791d8b5850fe90c5

@bradfitz
Copy link
Contributor

Even at golang/net@6c89f96 the code did https://github.com/golang/net/blob/6c89f9617983ee917132513a791d8b5850fe90c5/http2/transport.go#L1585 (BreakWithError) instead of CloseWithError, so the whole code path you described doesn't follow, since that first step you cite doesn't set the sticky error you later cite as causing as a Write error.

What is causing the Write error you see?

We still don't have a repro here.

@bmizerany
Copy link
Contributor Author

I'm emailing you private logs and a diff.

@gopherbot
Copy link

CL https://golang.org/cl/32571 mentions this issue.

gopherbot pushed a commit to golang/net that referenced this issue Nov 2, 2016
…ncels

If the client canceled a request, the Transport would then send a
RST_STREAM, but also would close the Response.Body's pipe. Meanwhile,
the server's DATA response could already be on the wire in
flight. We'd then read it, attempt to write its bytes to the buffer,
find it already closed, bubble up that error, and ultimately close the
whole TCP connection (breaking all other open streams).

So, don't do that.

Keep track of which connections we've sent RST_STREAM to, and ignore
DATA frames on those streams.

Updates golang/go#16974 (fixes after bundle to std)

Change-Id: Ic29a3aefff5241146cd2ca80aafa35fc4fb18b6e
Reviewed-on: https://go-review.googlesource.com/32571
Reviewed-by: Tom Bergan <tombergan@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot
Copy link

CL https://golang.org/cl/32578 mentions this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

4 participants