Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: Client.Do() in HTTP/2 mode returns unexpected nil error after client timeout #65375

Closed
bobvawter opened this issue Jan 30, 2024 · 2 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bobvawter
Copy link

Go version

go1.20.13 darwin/arm64, go1.21.6 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/bob/Library/Caches/go-build"
GOENV="/Users/bob/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/bob/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/bob/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/bob/go/current"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/bob/go/current/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.20.13"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/bob/cdc-sink/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/2x/kgchp7tx4938vcs04rk3nyz40000gq/T/go-build1903809151=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

It appears that if http.Client.Do() encounters a client-initiated timeout, it can sometimes returns a nil error and a server response if the http.Request object has been reused. This only appears to happen if the client is speaking HTTP/2.

Here's a minimum reproduction case: https://gist.github.com/bobvawter/c5fb99bbc8de533bb53881fe78b8db44

In the linked repro, there's an HTTP/2 enabled server that shows a pathologically slow endpoint. This endpoint returns 200/OK if it is able to completely consume the request body, or 400/Bad Request if the body can't be read after a delay. The http.Client is configured with a timeout that will fire most of the time. The client loop retries the request if there is an error, since we can inspect http.NewRequestWithContext() to see that its implementation of GetBody makes copies of a bytes.Reader.

After about 30 seconds of running the test on my M1 laptop, the repro will demonstrate that a timed-out request to the server can cause Client.Do() to return a nil error and that the HTTP 400 status code is visible to the client. We often, but not always, see the expected "context deadline exceeded" error.

The final log output that the repro creates before halting will look like:

repro_400_test.go:141: request declared a Content-Length of 6228922 but only wrote 0 bytes
repro_400_test.go:124: BAD STATUS 400

The message from line 141 is an error generated by reading the http.Request.Body. There are cases too, where the body is some non-zero, but partial, amount of the content length. Since only the client-configured timeout should be introducing errors, it's odd that we see the resulting HTTP 400 sent by the handler.

This repro proceeds as expected if:

  • HTTP/2 is turned off in the client.
  • The request object is not reused.

What did you see happen?

The repro case linked above gets to the block identified as ISSUE, where client.Do() has returned a nil error and we see the 400 status code returned by the trivial http handler appear, the BAD STATUS message.

What did you expect to see?

I would expect that if the client timeout fires or the client is unable to transmit the entire payload to the server, that http.Client.Do() should return a non-nil error.

I would also expect the behaviors to be broadly similar between HTTP/1.1 and HTTP/2 protocols.

@mknyszek
Copy link
Contributor

CC @neild

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 30, 2024
@mknyszek mknyszek added this to the Backlog milestone Jan 30, 2024
@bobvawter
Copy link
Author

It was pointed out to me off-thread that Body field must be re-set during reuse. The GetBody helper set up by http.NewRequestWithContext() is only used when processing redirects and not if a request is re-used. Thus, there was a condition where a previous retry iteration could have consumed some or all of the payload, leading to a legitimate short-write to the server.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

2 participants