Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/http2: downloading files from kernel.org results in PROTOCOL_ERROR stream error #19035

Closed
stapelberg opened this issue Feb 11, 2017 · 9 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@stapelberg
Copy link
Contributor

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.8rc3 linux/amd64 (also reproducible using go version go1.7.1 linux/amd64)

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/michael/go"
GORACE=""
GOROOT="/home/michael/go1.8rc3"
GOTOOLDIR="/home/michael/go1.8rc3/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build924866453=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

https://play.golang.org/p/KUK5nfKxgu

What did you expect to see?

The program should terminate without an error.

What did you see instead?

(map reformatted for readability)

2017/02/11 11:31:59 resp = &{200 OK 200 HTTP/2.0 2 0 map[
Date:[Sat, 11 Feb 2017 10:31:59 GMT]
Content-Type:[application/x-xz]
Last-Modified:[Thu, 09 Feb 2017 07:19:24 GMT]
Accept-Ranges:[bytes]
Strict-Transport-Security:[max-age=15768000]
Server:[nginx]
Content-Length:[93208832]
Etag:["58e4100-54813ccf8f700"]
X-Frame-Options:[DENY]] {0xc4203e0500} 93208832 [] false false map[] 0xc42000a800 0xc4201aa370}
2017/02/11 11:32:59 stream error: stream ID 1; PROTOCOL_ERROR
panic: stream error: stream ID 1; PROTOCOL_ERROR

goroutine 1 [running]:
log.Panic(0xc420051f08, 0x1, 0x1)
	/home/michael/go1.8rc3/src/log/log.go:322 +0xc0
main.main()
	/tmp/poc.go:27 +0x283
exit status 2

I think this is an x/net/http2 related error, because running the program with the environment variable GODEBUG=http2client=0 works as expected.

Other HTTP2 clients, such as curl --http2 -O https://www.kernel.org/pub/linux/kernel/v4.x/linux-4.9.9.tar.xz, work as expected.

@stapelberg
Copy link
Contributor Author

Here’s the result of running with GODEBUG=http2debug=2: debug.gz

@bradfitz bradfitz added this to the Go1.9 milestone Feb 11, 2017
@bradfitz
Copy link
Contributor

Thanks for the report.

So, the trace says that kernel.org is sending us the PROTOCOL_ERROR:

...
2017/02/11 11:38:34 http2: Framer 0xc4200680d0: wrote WINDOW_UPDATE stream=1 len=4 incr=12288
2017/02/11 11:39:34 http2: Framer 0xc4200680d0: read RST_STREAM stream=1 len=4 ErrCode=PROTOCOL_ERROR

But I see nothing obviously wrong. I imagine its accounting of flow control differs from ours and it's complaining about that.

The response says they're using nginx.

Unfortunately, I'm unable to reproduce using your program. Oh, except the machine I'm running it on doesn't have IPv6, if that's relevant. (Your example connected via IPv6)

/cc @tombergan

@stapelberg
Copy link
Contributor Author

Thanks for the quick reply.

IPv6 does not seem to be relevant: debug.gz is another trace of the same problem after deleting my IPv6 address and default route.

I can reproduce the issue on my Google workstation. I’ve granted you login permission to it, see the email I sent you on your corp account. Hope that helps.

Note that all computers from which I can reproduce this are located in Zürich, Switzerland, whereas all kernel.org mirrors (as per traceroute) are located in the US, i.e. ≥ 100ms of latency away.

@tombergan
Copy link
Contributor

I can repro the problem. @stapelberg's example downloaded 155400 bytes before failing and my repro downloaded 85768 bytes before failing, so the failure appears to be nondeterministic. Go's WINDOW_UPDATEs look correct to me, in both traces. There is consistently a 1 minute gap between Go sending the last WINDOW_UPDATE and receiving the RST_STREAM(PROTOCOL_ERROR). My ping time to the server is only 5ms. This suggests we're hitting a timeout, possibly due to flow control not being properly updated as @bradfitz suggested. We might be hitting this code in nginx.

I do not see any WINDOW_UPDATEs on the stream, only on the connection. But Go sends a connection-level WINDOW_UPDATE with incr=1GB at startup, so nginx should not be blocked on connection-level flow control tokens.

I suspect a bug in nginx, since everything looks ok on Go's end, but I cannot be sure. @VBart, any ideas?

/cc @VBart

@bradfitz
Copy link
Contributor

Ping @VBart.

@VBart
Copy link

VBart commented Apr 24, 2017

The exact reason of RST_STREAM should be in error log. But yes, this looks like a request timeout (see the send_timeout directive), which is 1 minute by default.

While from Go logs you see that WINDOW_UPDATE was sent to nginx, I suspect nginx didn't receive it and closed stream after a minute of waiting. That can be a network problem, e.g. some intermediate buffering. Enabling debug log in nginx will show the truth.

@bradfitz bradfitz added help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 24, 2017
@bradfitz
Copy link
Contributor

It would be helpful if somebody could put together a standalone repro for this, ideally in a Docker container with both nginx and Go, along with any necessary network delaying (e.g. http://bencane.com/2012/07/16/tc-adding-simulated-network-latency-to-your-linux-server/), which probably doesn't run in a container.

But it also seems like an nginx bug to me.

@VBart says:

That can be a network problem, e.g. some intermediate buffering.

But Go only speaks HTTP/2 over TLS, so almost certainly not any intermediate buffering.

In any case, a repro that doesn't involve kernel.org so we can enable nginx debug logs and modify nginx source code at will would be super helpful.

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9 Jun 14, 2017
@bradfitz
Copy link
Contributor

I'm back from leave. Can somebody try and reproduce this and see if it's still happening, ideally with Go tip?

Maybe something in Go or something in nginx changed.

@stapelberg
Copy link
Contributor Author

Cannot reproduce this anymore with go +21672b36eb:

% go run /tmp/http2.go
2017/11/29 08:31:46 resp = &{200 OK 200 HTTP/1.1 1 1 map[Date:[Wed, 29 Nov 2017 07:31:46 GMT] Content-Type:[application/x-xz] Strict-Transport-Security:[max-age=15768000] X-Frame-Options:[DENY] Server:[nginx] Content-Length:[93208832] Last-Modified:[Thu, 09 Feb 2017 07:19:24 GMT] Accept-Ranges:[bytes] X-Content-Type-Options:[nosniff] X-Frontend:[packetnet-amsterdam]] 0xc420072100 93208832 [] false false map[] 0xc420126000 0xc420530d10}
% 

@golang golang locked and limited conversation to collaborators Nov 29, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants