Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: check connection from pool before use #51202

Open
donotlietoanyone opened this issue Feb 15, 2022 · 6 comments
Open

net/http: check connection from pool before use #51202

donotlietoanyone opened this issue Feb 15, 2022 · 6 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@donotlietoanyone
Copy link

What version of Go are you using (go version)?

go1.16.3 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/laozhang/Library/Caches/go-build"
GOENV="/Users/laozhang/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/laozhang/go/pkg/mod"
GONOPROXY="git.code.oa.com"
GONOSUMDB="git.code.oa.com,git.code.woa.com"
GOOS="darwin"
GOPATH="/Users/laozhang/go"
GOPRIVATE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GOVCS=""
GOVERSION="go1.16.3"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/pg/7d_r_s8j6v362rgv9nqkrsbh0000gn/T/go-build3091927261=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I used "net/http/httputil/reverseproxy.go" to build a http proxy, but it often happens: Request forwarding failed with error "EOF". It should be because: after the link is taken out from the link pool, the peer end closes the link when the proxy sends the request, causing the proxy to report an "EOF" error when it reads the response. Because it is a POST request, retrying may cause other unexpected errors.
Can an optimization be done here? After removing the link from the link pool, do a health check before using it. For example, if you try to read a byte from the link, the theoretical result should be: neither data can be read nor an error reported; any other result means that The currently fetched link is unavailable and should be discarded.

What did you expect to see?

has already been said above.

What did you see instead?

has already been said above.

@donotlietoanyone donotlietoanyone changed the title net/http; net/http/httputil net/http; net/http/httputil:After removing the link from the link pool, can we do a health check before using it Feb 15, 2022
@donotlietoanyone donotlietoanyone changed the title net/http; net/http/httputil:After removing the link from the link pool, can we do a health check before using it net/http; net/http/httputil:After removing the link from the link pool, should a health check be done before using it Feb 15, 2022
@dmitshur dmitshur changed the title net/http; net/http/httputil:After removing the link from the link pool, should a health check be done before using it net/http/httputil, net/http: after removing the link from the link pool, should a health check be done before using it Feb 15, 2022
@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 15, 2022
@dmitshur dmitshur added this to the Backlog milestone Feb 15, 2022
@seankhliao seankhliao changed the title net/http/httputil, net/http: after removing the link from the link pool, should a health check be done before using it net/http: check connection from pool before use Feb 16, 2022
@donotlietoanyone
Copy link
Author

donotlietoanyone commented Feb 21, 2022

Sorry to ask, is anyone following this question? @dmitshur

@mengzhuo
Copy link
Contributor

There is failover looking for a good connection.
Can you provide a minium example?

go/src/net/http/transport.go

Lines 1032 to 1080 in 851ecea

// Look for most recently-used idle connection.
if list, ok := t.idleConn[w.key]; ok {
stop := false
delivered := false
for len(list) > 0 && !stop {
pconn := list[len(list)-1]
// See whether this connection has been idle too long, considering
// only the wall time (the Round(0)), in case this is a laptop or VM
// coming out of suspend with previously cached idle connections.
tooOld := !oldTime.IsZero() && pconn.idleAt.Round(0).Before(oldTime)
if tooOld {
// Async cleanup. Launch in its own goroutine (as if a
// time.AfterFunc called it); it acquires idleMu, which we're
// holding, and does a synchronous net.Conn.Close.
go pconn.closeConnIfStillIdle()
}
if pconn.isBroken() || tooOld {
// If either persistConn.readLoop has marked the connection
// broken, but Transport.removeIdleConn has not yet removed it
// from the idle list, or if this persistConn is too old (it was
// idle too long), then ignore it and look for another. In both
// cases it's already in the process of being closed.
list = list[:len(list)-1]
continue
}
delivered = w.tryDeliver(pconn, nil)
if delivered {
if pconn.alt != nil {
// HTTP/2: multiple clients can share pconn.
// Leave it in the list.
} else {
// HTTP/1: only one client can use pconn.
// Remove it from the list.
t.idleLRU.remove(pconn)
list = list[:len(list)-1]
}
}
stop = true
}
if len(list) > 0 {
t.idleConn[w.key] = list
} else {
delete(t.idleConn, w.key)
}
if stop {
return delivered
}
}

Kindly cc owner @bradfitz

@donotlietoanyone
Copy link
Author

donotlietoanyone commented Feb 21, 2022

There are two cases in which the link in the link pool is closed. One is that the client actively closes the link, and the other is that the server closes the link due to the timeout of the link idle time. The problem is in the second case.
The following is the specific problem I encountered. I encountered two kinds of errors, and the errors were printed in the ErrorHandler of ReverseProxy. One is "connection reset by peer" and the other is "EOF", I think the reason for these two problems is because:

  1. "connection reset by peer": the client first takes the link out of the connection pool, then the peer closes it, and then when the client sends a request, an error is reported;
  2. "EOF": The client first takes the link out of the link pool, then sends the request and the sending is completed. At the same time, the peer closes the link before receiving the request data, and then the client reports an error when reading the response ;

Through observation and monitoring, the probability of the above two problems occurring is one in a million.
In fact, I have solved this problem, that is, I set the IdleConnTimeout of the Transport to be shorter, as long as it is shorter than the idle timeout time of the server, so that the idle link will be actively closed by the client, not by the server. , in this case, there is no problem above. The fact is indeed this: after the change, the problem is gone.
But I think, after taking out the link from the link pool, is it possible to detect whether the link is healthy? Or is there any other way to avoid the above two problems from happening when the peer end actively closes the link?
@mengzhuo

@donotlietoanyone
Copy link
Author

The code looks like this:

var (
	defaultTransport *http.Transport
	once             sync.Once
)

func GetDefaultTransport() *http.Transport {
	once.Do(func() {
		defaultTransport = &http.Transport{
			DialContext: (&net.Dialer{
				Timeout:   30 * time.Second, 
				KeepAlive: 15 * time.Second, 
			}).DialContext,
			MaxIdleConns:        200,             
			IdleConnTimeout:     30 * time.Second, 
			MaxIdleConnsPerHost: 100,         
			MaxConnsPerHost:     200,             
		}
	})
	return defaultTransport
}

func (s *Service) registerHTTPCGI() {
	router := mux.NewRouter()
	router.PathPrefix("/").HandlerFunc(ToInnerAPI)
	s.CGIRouter = router
}

func ToInnerAPI(rsp http.ResponseWriter, inReq *http.Request) {
	forwardedHost := inReq.Header.Get(omodel.HTTPHeaderXForwardedHost)
	if forwardedHost == "" {
		rlog.Error("failed to redirect request, header[X-Forwarded-Host] is empty", rlog.CTX(inReq.Context()))
		responseErr(rsp, inReq, http.StatusNotAcceptable, "header[X-Forwarded-Host] is empty, "+
			"value should be a string in format of IP:Port or IP")
		return
	}
	proxy := newReverseProxyWithReporting(&httputil.ReverseProxy{
		Director: func(req *http.Request) {
			req.URL.Host = forwardedHost
			req.Host = forwardedHost
			req.URL.Scheme = getRequestScheme(inReq)
		},
		Transport: GetDefaultTransport(),
		ErrorHandler: func(w http.ResponseWriter, req *http.Request, err error) {
			rlog.Error("failed to redirect request", rlog.CTX(req.Context()), rlog.Err(err))
			responseErr(w, req, http.StatusServiceUnavailable,
				fmt.Sprintf("%s,hint:[%+v]\n", "failed to redirect request", err))
		},
	})
	proxy.serveHTTP(rsp, inReq)
}


@ianlancetaylor
Copy link
Contributor

CC @neild

@suiriass
Copy link

@neild

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants