Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: closing a requestBody in the ServerHTTP method drops conn-level flow control #28634

Closed
jared2501 opened this issue Nov 7, 2018 · 12 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@jared2501
Copy link

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.11 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/jnewman/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/jnewman/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/_t/hg9f_j4x1q743pqh00kdv6m00000gn/T/go-build803039253=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do? / What did you expect to see? / What did you see instead?

I'm using an httputil.ReverseProxy to proxy a client-streaming RPC (using go grpc). If the upstream connection that the ReverseProxy is copying to breaks then the http2 transport attempts to close the request's body. This causes the http2.pipe that back's the http2.requestBody to have it's pipeBuffer set to nil see here. When the http2 goes to retrieve the length of the requestBody so that it can return the remaining data to the connection-level flow control it discovers the length is 0 and does not return the data in the pipe to the conn-leel flow control. Therefore, the client get's stuck.

Here's some logs I added:

interceptor write. len=16393
http2 server sendWindowUpdate32. n=16384, avail=32768
smux write. id=7, len=16413
controlBuffer.get. block=true
controlBuffer.get completed. block=true
handle: *transport.incomingWindowUpdate
sendQuota. 16384
controlBuffer.get. block=false
controlBuffer.get completed. block=false
handle: *transport.incomingWindowUpdate
sendQuota. 16384
processing dataItem
down here
controlBuffer.get. block=false
controlBuffer.get completed. block=false
sendQuota. 0
interceptor write. len=16393
all sent. id=7 len=16413
controlBuffer.get. block=true
smux write. id=7, len=38
all sent. id=7 len=38
smux write. id=7, len=16413
http2 server sendWindowUpdate32. n=16384, avail=16384
controlBuffer.get completed. block=true
handle: *transport.incomingWindowUpdate
sendQuota. 16384
controlBuffer.get. block=false
controlBuffer.get completed. block=false
sendQuota. 16384
controlBuffer.get. block=false
controlBuffer.get completed. block=false
sendQuota. 16384
controlBuffer.get. block=true
controlBuffer.get completed. block=true
handle: *transport.incomingWindowUpdate
sendQuota. 16384
processing dataItem
down here
controlBuffer.get. block=false
controlBuffer.get completed. block=false
sendQuota. 0
interceptor write. len=16393
controlBuffer.get. block=true
2018/11/07 09:12:55 http: proxy error: write tcp 10.11.74.1:49006->10.111.142.1:6001: write: connection reset by peer
requestBody.Close invoked BreakWithError
setting p.b = nil, prevLen=1048576
smux write. id=7, len=31
smux write. id=5, len=31
server closeStream! Len=
trying to return closed stuff... len=0!
2018-11-07T09:12:55.613Z	INFO	remote/mux.go:567	Disconnecting from peer. peerHostID=12
2018-11-07T09:12:55.613Z	ERROR	remote/mux.go:247	Error connecting to peer. peerAddr=10.111.142.1:6001, err=error in client flux stream

Here's the diff to apply the logs:

commit 5cb80cec760b692f307ccd41e21f384fb583e0d1
Author: Jared Newman <jnewman2501@gmail.com>
Date:   Wed Nov 7 01:08:00 2018 -0800

    wip

diff --git a/lifecycle/lifecycle.go b/lifecycle/lifecycle.go
index 0936ef4..aa4b1b4 100644
--- a/lifecycle/lifecycle.go
+++ b/lifecycle/lifecycle.go
@@ -106,7 +106,7 @@ func NewLifecycle(config *FluxConfig) *Lifecycle {
 		MeshStreamRetryInterval:      5 * time.Second,
 		GroupKeyRotationInterval:     10 * time.Minute,
 		PeerReconnectInterval:        5 * time.Second,
-		BackfillLoopSleepTimeout:     10 * time.Second,
+		BackfillLoopSleepTimeout:     5 * time.Second,
 		ExpireEventLogOpenAfter:      2 * time.Minute,
 		MoveHWMForwardPeriod:         5 * time.Second,
 		GetNextEpochRangeTimeout:     10 * time.Second,
diff --git a/remote/mux.go b/remote/mux.go
index 78c3d24..97743e5 100644
--- a/remote/mux.go
+++ b/remote/mux.go
@@ -25,6 +25,7 @@ import (
 	"golang.org/x/net/http2"
 	"google.golang.org/grpc"
 	"io"
+	"io/ioutil"
 	"math/big"
 	"net"
 	"net/http"
@@ -405,6 +406,7 @@ func (s *Server) ServeHTTP(resp http.ResponseWriter, req *http.Request) {
 		FlushInterval: 100 * time.Millisecond,
 		Director:      func(*http.Request) {},
 		Transport: roundTripperFunc(func(req *http.Request) (*http.Response, error) {
+			req.Body = ioutil.NopCloser(req.Body)
 			return nextHopPeer.RoundTrip(priority, destHostID, environment, req)
 		}),
 	}
diff --git a/vendor/golang.org/x/net/http2/pipe.go b/vendor/golang.org/x/net/http2/pipe.go
index a614009..f45806d 100644
--- a/vendor/golang.org/x/net/http2/pipe.go
+++ b/vendor/golang.org/x/net/http2/pipe.go
@@ -8,6 +8,7 @@ import (
 	"errors"
 	"io"
 	"sync"
+	"fmt"
 )
 
 // pipe is a goroutine-safe io.Reader/io.Writer pair. It's like
@@ -117,6 +118,9 @@ func (p *pipe) closeWithError(dst *error, err error, fn func()) {
 	}
 	p.readFn = fn
 	if dst == &p.breakErr {
+		if p.b != nil{
+			fmt.Printf("setting p.b = nil, prevLen=%d\n", p.b.Len())
+		}
 		p.b = nil
 	}
 	*dst = err
diff --git a/vendor/golang.org/x/net/http2/server.go b/vendor/golang.org/x/net/http2/server.go
index b57b6e2..4ab8542 100644
--- a/vendor/golang.org/x/net/http2/server.go
+++ b/vendor/golang.org/x/net/http2/server.go
@@ -1454,6 +1454,7 @@ func (sc *serverConn) processResetStream(f *RSTStreamFrame) error {
 }
 
 func (sc *serverConn) closeStream(st *stream, err error) {
+	fmt.Printf("server closeStream! Len=\n")
 	sc.serveG.check()
 	if st.state == stateIdle || st.state == stateClosed {
 		panic(fmt.Sprintf("invariant; can't close stream in state %v", st.state))
@@ -1478,6 +1479,7 @@ func (sc *serverConn) closeStream(st *stream, err error) {
 		}
 	}
 	if p := st.body; p != nil {
+		fmt.Printf("trying to return closed stuff... len=%d!\n", p.Len())
 		// Return any buffered unread bytes worth of conn-level flow control.
 		// See golang.org/issue/16481
 		sc.sendWindowUpdate(nil, p.Len())
@@ -2219,6 +2221,7 @@ func (sc *serverConn) sendWindowUpdate32(st *stream, n int32) {
 	var ok bool
 	if st == nil {
 		ok = sc.inflow.add(n)
+		fmt.Printf("http2 server sendWindowUpdate32. n=%d, avail=%d\n", n, sc.inflow.available())
 	} else {
 		ok = st.inflow.add(n)
 	}
@@ -2240,6 +2243,7 @@ type requestBody struct {
 
 func (b *requestBody) Close() error {
 	if b.pipe != nil && !b.closed {
+		fmt.Printf("requestBody.Close invoked BreakWithError\n")
 		b.pipe.BreakWithError(errClosedBody)
 	}
 	b.closed = true
diff --git a/vendor/golang.org/x/net/http2/transport.go b/vendor/golang.org/x/net/http2/transport.go
index 2c9fe88..fbfa0eb 100644
--- a/vendor/golang.org/x/net/http2/transport.go
+++ b/vendor/golang.org/x/net/http2/transport.go
@@ -2008,6 +2008,7 @@ func (b transportResponseBody) Close() error {
 		cc.mu.Unlock()
 	}
 
+	fmt.Printf("transportResponseBody.Close invoked BreakWithError\n")
 	cs.bufPipe.BreakWithError(errClosedResponseBody)
 	cc.forgetStreamID(cs.ID)
 	return nil
@jared2501
Copy link
Author

Possibly related to #28441.

@jared2501
Copy link
Author

I'm happy to explain the logs in more detail. Note that a quick work around for me is to do:

proxy := httputil.ReverseProxy{
	FlushInterval: 100 * time.Millisecond,
	Director:      func(*http.Request) {},
	Transport: roundTripperFunc(func(req *http.Request) (*http.Response, error) {
		req.Body = ioutil.NopCloser(req.Body)
		return rt.RoundTrip(req)
	}),
}

But I think this is a hack since I don't think the http2.Server should break if a http handler closed the body.

@jared2501
Copy link
Author

jared2501 commented Nov 7, 2018

I think @wujieliulan actually describes this problem in #16481.

@bradfitz
Copy link
Contributor

bradfitz commented Nov 7, 2018

Got a self-contained repro?

You link to #16481 but that looks like it was fixed. Do you think it wasn't fixed?

@bradfitz bradfitz added help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 7, 2018
@jared2501
Copy link
Author

jared2501 commented Nov 7, 2018

Yeah, I think this patch handles an unhandled edge case from #16481:

commit 6ab7b917aea89ad6a26b16ab8cc1c370d2b79b94 (HEAD -> master, anduril/master)
Author: jared2501 <jnewman2501@gmail.com>
Date:   Wed Nov 7 14:09:41 2018 -0800

    Closed requestBody should return conn-level flow control (#4)

diff --git a/http2/pipe.go b/http2/pipe.go
index a614009..46e2a1e 100644
--- a/http2/pipe.go
+++ b/http2/pipe.go
@@ -14,13 +14,14 @@ import (
 // io.Pipe except there are no PipeReader/PipeWriter halves, and the
 // underlying buffer is an interface. (io.Pipe is always unbuffered)
 type pipe struct {
-       mu       sync.Mutex
-       c        sync.Cond     // c.L lazily initialized to &p.mu
-       b        pipeBuffer    // nil when done reading
-       err      error         // read error once empty. non-nil means closed.
-       breakErr error         // immediate read error (caller doesn't see rest of b)
-       donec    chan struct{} // closed on error
-       readFn   func()        // optional code to run in Read before error
+       mu        sync.Mutex
+       c         sync.Cond     // c.L lazily initialized to &p.mu
+       b         pipeBuffer    // nil when done reading
+       unusedLen int           // contains the unused if b is closed with BreakWithError
+       err       error         // read error once empty. non-nil means closed.
+       breakErr  error         // immediate read error (caller doesn't see rest of b)
+       donec     chan struct{} // closed on error
+       readFn    func()        // optional code to run in Read before error
 }

 type pipeBuffer interface {
@@ -38,6 +39,15 @@ func (p *pipe) Len() int {
        return p.b.Len()
 }

+func (p *pipe) UnusedLen() int {
+       p.mu.Lock()
+       defer p.mu.Unlock()
+       if p.b == nil {
+               return p.unusedLen
+       }
+       return p.b.Len()
+}
+
 // Read waits until data is available and copies bytes
 // from the buffer into p.
 func (p *pipe) Read(d []byte) (n int, err error) {
@@ -117,6 +127,9 @@ func (p *pipe) closeWithError(dst *error, err error, fn func()) {
        }
        p.readFn = fn
        if dst == &p.breakErr {
+               if p.b != nil {
+                       p.unusedLen = p.b.Len()
+               }
                p.b = nil
        }
        *dst = err
diff --git a/http2/server.go b/http2/server.go
index 75d3071..01f4ecc 100644
--- a/http2/server.go
+++ b/http2/server.go
@@ -1480,9 +1480,9 @@ func (sc *serverConn) closeStream(st *stream, err error) {
        if p := st.body; p != nil {
                // Return any buffered unread bytes worth of conn-level flow control.
                // See golang.org/issue/16481
-               sc.sendWindowUpdate(nil, p.Len())
+               sc.sendWindowUpdate(nil, p.UnusedLen())

-               p.CloseWithError(err)
+               p.BreakWithError(err)
        }
        st.cw.Close() // signals Handler's CloseNotifier, unblocks writes, etc
        sc.writeSched.CloseStream(st.id)

Applying that to my fork I see my issues go away. I'll work on a contained reproduction sometime next week, although maybe the patch gives you some insight on the issue?

@prashantv
Copy link
Contributor

prashantv commented Jul 23, 2019

(found this via #28204)

We ran into the same issue a few times, and it seems to be exactly as @jared2501 points out -- closing a request body on the server doesn't update the connection flow-control, leading to the connection stalling as the client side thinks it has exhausted the connection flow-control.

I have a repro up:
https://github.com/prashantv/go-bug-28634-repro

(Edit: The repro almost always fails on Linux within 100 iterations, but requires a higher number of iterations on other platforms).

The repro does the following in a loop:

  • Client makes a huge request, but the server closes the request body without reading it.
  • Client then makes a small echo request to verify the connection is still working

It typically breaks in < 50 iterations, but sometimes (probably 5% of the time) makes it through 100 iterations. When it fails, we see a timeout on the echo call. If I remove the r.Body.Close(), then it works without any issues.

Digging into the code for Close,

The last call nils out the pipe buffer field b, which drops unread bytes in the buffer. When the handler returns, we try to send a window update for any unread bytes, but there's none at that point.

Logs from the repro running with GODEBUG=http2debug=2:
https://gist.github.com/prashantv/5c70ea7a38bec121352b2f36110bb3c2

I tried the patch above on a branch, and it seemed to improve the success rate of getting through 100 iterations (probably to 10-20% instead of 5%) but it didn't seem to fix the issue.

@agnivade
Copy link
Contributor

@fraenkel

@fraenkel
Copy link
Contributor

I have a slightly easier fix, but I am trying to come up with a testcase that makes sense.

@gopherbot
Copy link

Change https://golang.org/cl/187377 mentions this issue: http2: Track unread bytes when the pipe is broken

@prashantv
Copy link
Contributor

Is there anything we can do to help make progress on this issue? We still see the issue periodically.

@bradfitz
Copy link
Contributor

Reviewed.

gopherbot pushed a commit to golang/net that referenced this issue Oct 14, 2019
Once the pipe is broken, any remaining data needs to be reported as well
as any data that is written but dropped.

The client side flow control can eventually run out of available bytes
to be sent since no WINDOW_UPDATE is sent to reflect the data that is
never read in the pipe.

Updates golang/go#28634

Change-Id: I83f3c9d3614cd92517af2687489d2ccbf3a65456
Reviewed-on: https://go-review.googlesource.com/c/net/+/187377
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/209077 mentions this issue: net/http: update bundled x/net/http2

@golang golang locked and limited conversation to collaborators Nov 26, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants