
x/net/http2: max connections overflow #39389

Closed
stevenh opened this issue Jun 3, 2020 · 7 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@stevenh
Contributor

stevenh commented Jun 3, 2020

What version of Go are you using (go version)?

go version go1.14.4 freebsd/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE=""
GOARCH="amd64"
GOBIN="/data/go/bin"
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="freebsd"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="freebsd"
GOPATH="/data/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/freebsd_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/data/go/src/github.com/multiplay/go/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build080411694=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Highly concurrent requests to an HTTP/2 endpoint.

Sample reproduction code:

package main

import (
        "flag"
        "log"
        "net"
        "net/http"
        "sync"
        "sync/atomic"
        "time"
)

var (
        client    *http.Client
        successes int32
        errors    int32
        max       int
        url       string
)

func worker(wg *sync.WaitGroup, run <-chan struct{}) {
        defer wg.Done()

        req, err := http.NewRequest(http.MethodGet, url, nil)
        if err != nil {
                log.Println(err)
                atomic.AddInt32(&errors, 1)
                return
        }

        <-run
        resp, err := client.Do(req)
        if err != nil {
                log.Println(err)
                atomic.AddInt32(&errors, 1)
                return
        }

        // Close the body so the underlying connection can be reused.
        resp.Body.Close()
        atomic.AddInt32(&successes, 1)
}

func main() {
        flag.IntVar(&max, "max", 10, "maximum concurrent requests")
        flag.StringVar(&url, "url", "https://example.com", "url to test against")
        flag.Parse()

        client = &http.Client{
                Transport: &http.Transport{
                        Proxy: http.ProxyFromEnvironment,
                        DialContext: (&net.Dialer{
                                Timeout:   30 * time.Second,
                                KeepAlive: 30 * time.Second,
                                DualStack: true,
                        }).DialContext,
                        ForceAttemptHTTP2:     true,
                        MaxIdleConns:          max,
                        IdleConnTimeout:       90 * time.Second,
                        TLSHandshakeTimeout:   10 * time.Second,
                        ExpectContinueTimeout: 1 * time.Second,
                },
        }

        var wg sync.WaitGroup
        wg.Add(max)
        run := make(chan struct{})
        for i := 0; i < max; i++ {
                go worker(&wg, run)
        }
        close(run)
        wg.Wait()
        log.Printf("successes: %d errors: %d\n", successes, errors)
}

In our tests we ran the above with -max=1000 to trigger the issue, as the server has MAX_CONCURRENT_STREAMS=256.

What did you expect to see?

All requests succeed.

What did you see instead?

http2: server sent GOAWAY and closed the connection; LastStreamID=533, ErrCode=PROTOCOL_ERROR

Debugging so far

After some debugging, the issue comes down to the fact that the transport configures maxConcurrentStreams to 1000. With lots of very quick requests, as generated by the code above, a connection ends up with more streams than the server's maximum before the client processes the SETTINGS frame from the server, which includes MAX_CONCURRENT_STREAMS=256.

The result of this is that all streams > 256 get refused with a mixture of:

  1. RST_STREAM ErrCode=REFUSED_STREAM
  2. GOAWAY ErrCode=PROTOCOL_ERROR

The former is handled by http2canRetryError, but GOAWAY isn't, as there are two distinct GOAWAY error types, http2errClientConnGotGoAway and http2GoAwayError, and it's http2GoAwayError that I'm seeing.

This raises the question: should http2GoAwayError also be retryable? If so, the fix is simply:

@@ -7028,10 +7047,14 @@ func http2canRetryError(err error) bool {
        if err == http2errClientConnUnusable || err == http2errClientConnGotGoAway {
                return true
        }
-       if se, ok := err.(http2StreamError); ok {
-               return se.Code == http2ErrCodeRefusedStream
+       switch err := err.(type) {
+       case http2GoAwayError:
+               return err.ErrCode == http2ErrCodeProtocol
+       case http2StreamError:
+               return err.Code == http2ErrCodeRefusedStream
+       default:
+               return false
        }
-       return false
 }

 func (t *http2Transport) dialClientConn(addr string, singleUse bool) (*http2ClientConn, error) {

With this in place, retries no longer get GOAWAY errors, but they start getting unexpected EOF instead.

Finally, if I change maxConcurrentStreams to 256, as desired by the server, everything works perfectly.

Reading the HTTP/2 spec, it seems valid for the client to send additional frames after its connection preface, and in particular before it has received the server's connection preface, so setting up additional streams is valid.

Given all this, I wonder if a lower SETTINGS_MAX_CONCURRENT_STREAMS, such as the spec's recommended minimum of 100, should be used until a SETTINGS frame has been received, at which point it could be raised to either the value sent or the current default if no SETTINGS_MAX_CONCURRENT_STREAMS is present. This should have minimal impact on normal operation, but in highly concurrent situations it would avoid the significant overhead of trying to set up streams that will never be accepted.
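The proposed behavior could be sketched as follows; the constants come from the discussion (100 as the spec-recommended minimum, 1000 as the current default), but the helper itself is hypothetical:

```go
package main

import "fmt"

const (
	initialMaxConcurrentStreams = 100  // spec-recommended minimum, used until SETTINGS arrives
	defaultMaxConcurrentStreams = 1000 // fallback if the server never advertises a limit
)

// effectiveLimit returns the stream limit a connection should enforce.
// seenSettings reports whether the initial SETTINGS frame has been processed;
// serverLimit is the advertised SETTINGS_MAX_CONCURRENT_STREAMS (0 = absent).
func effectiveLimit(seenSettings bool, serverLimit uint32) uint32 {
	if !seenSettings {
		return initialMaxConcurrentStreams
	}
	if serverLimit > 0 {
		return serverLimit
	}
	return defaultMaxConcurrentStreams
}

func main() {
	fmt.Println(effectiveLimit(false, 0))  // 100: before SETTINGS
	fmt.Println(effectiveLimit(true, 256)) // 256: server's advertised limit
	fmt.Println(effectiveLimit(true, 0))   // 1000: no limit advertised
}
```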

To put this into context, I've seen the Go client hit 500 streams before it processes the SETTINGS frame, so in our case, talking to a Cloudflare endpoint, up to half the streams need to be retried on different connections.
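As a client-side stopgap, bounding in-flight requests with a counting semaphore sized to the server's known limit avoids racing past it. This is a hypothetical workaround sketch, not the proposed transport fix:

```go
package main

import (
	"fmt"
	"sync"
)

// semaphore is a simple counting semaphore that bounds concurrency so the
// client never exceeds the server's known MAX_CONCURRENT_STREAMS.
type semaphore chan struct{}

func newSemaphore(n int) semaphore { return make(semaphore, n) }
func (s semaphore) acquire()       { s <- struct{}{} }
func (s semaphore) release()       { <-s }

func main() {
	const serverLimit = 256 // the limit observed from the server above
	sem := newSemaphore(serverLimit)

	var mu sync.Mutex
	inFlight, peak := 0, 0

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem.acquire()
			defer sem.release()

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			// ... perform the request here ...

			mu.Lock()
			inFlight--
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(peak <= serverLimit) // true: concurrency never exceeds the limit
}
```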

Thoughts?

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 4, 2020
@dmitshur dmitshur added this to the Backlog milestone Jun 4, 2020
@dmitshur
Contributor

dmitshur commented Jun 4, 2020

/cc @bradfitz @tombergan

stevenh added a commit to stevenh/net that referenced this issue Jun 4, 2020
Prevent the client from trying to establish more streams than the server is
willing to accept during the initial lifetime of a connection by limiting
maxConcurrentStreams to 100, the HTTP/2 specification's recommended
minimum, until we've received the initial SETTINGS frame from the server.

After a SETTINGS frame has been received, use the server's
MAX_CONCURRENT_STREAMS if present, otherwise use 1000 as a reasonable
value.

For normal consumers this will have very little impact, allowing a decent
level of concurrency from the start; for highly concurrent consumers or
large bursts it will prevent a significant number of rejected streams
being attempted, actually increasing performance.

Fixes golang/go#39389
@gopherbot

Change https://golang.org/cl/236497 mentions this issue: http2: limit client initial MAX_CONCURRENT_STREAMS

@networkimprov

cc @fraenkel

@dmitshur, note that Michael is a good reference for http2 issues.

@fraenkel
Contributor

fraenkel commented Jun 7, 2020

My one concern is the number of connections opened prior to upgrading the MAX_CONCURRENT_STREAMS. You described a situation whereby 250+ requests were processed before the initial SETTINGS. This would create at least 4 additional connections prior to adjusting the count.

I am looking at other implementations both client and server to get a better idea of what defaults are being used. Envoy is 1 << 29, nginx is 128, Java went from 100 -> Integer.MAX_VALUE.

There are more examples of using a larger default over a smaller default.

@stevenh
Contributor Author

stevenh commented Jun 7, 2020

I think our example is very much a worst-case scenario: a load test specifically designed to make as many requests at startup as configured, which was 10k, resulting in 100-2000 GOAWAYs from Cloudflare due to this. Enforcing a limit may even help the client process SETTINGS sooner; with debugging enabled it was clear the SETTINGS frame was received well before it was processed, likely due to lock contention from the new streams being opened.

Given this, ensuring we stay within the limit seems more important in real-world scenarios.

If you compare this to HTTP/1, every request would be an individual connection, so in the grand scheme of things 4 connections is not a huge number.

So I guess the question is: keep the minimum at 100, or go slightly higher to 128 to match nginx? Thoughts?

@ofavre

ofavre commented Mar 24, 2021

In my use case, I suddenly have thousands of requests that I must perform at once.

We've lowered the default value to 1 to avoid seeing REFUSED_STREAM and PROTOCOL_ERROR, at the expense of creating a second connection until the first connection receives MAX_CONCURRENT_STREAMS from the server.
As connections seem to be created only one at a time, the first connection receives its settings and can then accept more streams, so the overhead in the number of opened connections is low.

I have seen servers in the wild that default to MAX_CONCURRENT_STREAMS=50, or even a dynamically determined value, like Apple Push Service for instance.

I see in the code that REFUSED_STREAM errors are retried... provided that you used http.Request.GetBody instead of http.Request.Body, or you're lucky enough that the body has not already been read.

I see at least two possibilities: waiting for the SETTINGS frame before opening new connections, or defaulting MAX_CONCURRENT_STREAMS to 1 instead of 100 as proposed.
I don't know which is better, or whether there are better alternatives.

@nekomeowww

nekomeowww commented Sep 1, 2021

Are there any restrictions preventing a default value for clients in the Transport configuration?
Or should we implement it as it is now, and set MAX_CONCURRENT_STREAMS when we receive SETTINGS?

What if the origin server (or remote server) can handle the concurrency and never returns a SETTINGS frame specifying a MAX_CONCURRENT_STREAMS value, such as in a gRPC setup? In that case, the client and server are both high-concurrency ends. We aren't expecting unlimited or very low concurrency; we expect to change the value dynamically based on our needs (or configure it in configuration files).

What would be the best practice on that?

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Sep 7, 2021
@dmitshur dmitshur modified the milestones: Backlog, Go1.18 Sep 7, 2021
dteh pushed a commit to dteh/fhttp that referenced this issue Jun 22, 2022

Change-Id: I35fecd501ca39cd059c7afd1d44090b023f16e1e
GitHub-Last-Rev: 0d1114d3a558cefed17008aba3e4a4d7b2ad3866
GitHub-Pull-Request: golang/net#73
Reviewed-on: https://go-review.googlesource.com/c/net/+/236497
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Trust: Brad Fitzpatrick <bradfitz@golang.org>
Trust: Joe Tsai <joetsai@digital-static.net>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
@golang golang locked and limited conversation to collaborators Sep 7, 2022