
net/http: http/2 throughput is very slow compared to http/1.1 #47840

Open
kgersen opened this issue Aug 20, 2021 · 44 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@kgersen

kgersen commented Aug 20, 2021

What version of Go are you using (go version)?

1.17

$ go version
go version go1.17 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

Linux, Windows, etc

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/user/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/user/dev/http2issue/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build38525640=/tmp/go-build -gno-record-gcc-switches"

What did you do?

test http/2 vs http/1.1 transfer speed with client and server from the standard lib

see a complete POC here: https://github.com/nspeed-app/http2issue

The issue is general, loopback (localhost) or over the wire.
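A minimal stdlib-only sketch of this kind of comparison (not the linked POC, which uses h2c; this one serves both protocols over TLS via `net/http/httptest`, so the comparison is encrypted like-for-like — `benchProto` is a name I made up for illustration):

```go
package main

import (
	"crypto/rand"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// benchProto serves a fixed payload over TLS with HTTP/2 on or off and
// returns the observed download throughput in Gbps.
func benchProto(enableHTTP2 bool, payload []byte) float64 {
	ts := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write(payload)
	}))
	ts.EnableHTTP2 = enableHTTP2
	ts.StartTLS() // TLS for both protocols, comparing like with like
	defer ts.Close()

	start := time.Now()
	resp, err := ts.Client().Get(ts.URL)
	if err != nil {
		panic(err)
	}
	n, _ := io.Copy(io.Discard, resp.Body)
	resp.Body.Close()
	return float64(n) * 8 / time.Since(start).Seconds() / 1e9
}

func main() {
	payload := make([]byte, 64<<20) // 64 MB response body
	rand.Read(payload)
	fmt.Printf("HTTP/1.1: %.2f Gbps\n", benchProto(false, payload))
	fmt.Printf("HTTP/2:   %.2f Gbps\n", benchProto(true, payload))
}
```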

What did you expect to see?

the same order of speed as http/1.1

What did you see instead?

http/2 is at least 5x slower, or worse

@neild
Contributor

neild commented Aug 20, 2021

You're comparing encrypted HTTP/2 with unencrypted HTTP. You need to compare HTTP/2 with HTTPS to compare like with like.

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 20, 2021
@mknyszek mknyszek added this to the Backlog milestone Aug 20, 2021
@kgersen
Author

kgersen commented Aug 20, 2021

hi, the POC is using H2C to avoid encryption.
I also compared encrypted versions with the product we're developing and we have the same issue.

@kgersen
Author

kgersen commented Aug 25, 2021

Caddy a web server written in Go has the same issue. I've updated the POC with a sample Caddy config: https://github.com/nspeed-app/http2issue/tree/main/3rd-party/Caddy

@neild
Contributor

neild commented Nov 8, 2021

hi, the POC is using H2C to avoid encryption.

Apologies, I missed this in the original issue and didn't see the followup.

I have not had time to look at this further, adding to my queue. (But no promises on timing.)

@andig
Contributor

andig commented Nov 8, 2021

Wouldn‘t the first step be to run this against a non-go server/client to localize it on either side if possible?

@tandr

tandr commented Nov 10, 2021

Robert Engels has a CL related to this https://go-review.googlesource.com/c/net/+/362834

@robaho

robaho commented Nov 10, 2021

Copying my comment from https://groups.google.com/d/msgid/golang-nuts/89926c2f-ec73-43ad-be49-a8bc76a18345n%40googlegroups.com

Http2 is a multiplexed protocol with independent streams. The Go implementation uses a common reader thread/routine to read all of the connection content, and then demuxes the streams and passes the data via pipes to the stream readers. This multithreaded nature requires the use of locks to coordinate. By managing the window size, the connection reader should never block writing to a stream buffer - but a stream reader may stall waiting for data to arrive - get descheduled - only to be quickly rescheduled when the reader places more data in the buffer - which is inefficient.

Out of the box on my machine, http1 is about 37 Gbps and http2 is about 7 Gbps.

Some things that jump out:

  1. The chunk size is too small. Using 1MB pushed http1 from 37 Gbps to 50 Gbps, and http2 to 8 Gbps.

  2. The default buffer in io.Copy() is too small. Use io.CopyBuffer() with a larger buffer - I changed to 4MB. This pushed http1 to 55 Gbps, and http2 to 8.2 Gbps. Not a big difference but needed for later.

  3. The http2 receiver frame size of 16k is way too small. There is overhead on every frame - the most costly is updating the window.

I made some local mods to the net library, increasing the frame size to 256k, and the http2 performance went from 8Gbps to 38Gbps.

  4. I haven’t tracked it down yet, but I don’t think the window size update code is working as intended - it seems to be sending window updates (which are expensive due to locks) far too frequently. I think this is the area that could use the most improvement - using some heuristics there is the possibility to detect the sender rate, and adjust the refresh rate (using high/low water marks).

  5. The implementation might need improvements using lock-free structures, atomic counters, and busy-waits in order to achieve maximum performance.

So 38 Gbps for http2 vs 55 Gbps for http1. Better but still not great. Still, with some minor changes, the net package could allow setting of a large frame size on a per stream basis - which would enable much higher throughput. The gRPC library allows this.

@robaho

robaho commented Nov 12, 2021

My CL goes a long way to addressing the issue.

Still, some additional testing has shown that the calls to update the window from the client (the downloader) don't seem to be optimal for large transfers - even with the 10x frame size. The window update calls cause contention on the locks.

@robaho

robaho commented Nov 12, 2021

Changing transport.go:2418 from

	if v < transportDefaultStreamFlow-transportDefaultStreamMinRefresh {

to

	if v < transportDefaultStreamFlow/2 {

results in a nearly 50% increase in throughput using the default frame size of 16k.

Ideally, this code would make the determination based on receive and consume rates along with the frame size.
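The "refresh at half the window" heuristic could be sketched like this (purely illustrative; `streamFlow` and `flowTracker` are hypothetical names, not the transport's actual types):

```go
package main

import "fmt"

// Sketch of the refresh heuristic discussed above: instead of sending a
// WINDOW_UPDATE as soon as unacknowledged bytes exceed a small fixed
// refresh threshold, wait until half the stream window has been consumed,
// cutting the number of (lock-acquiring) updates.
const streamFlow = 4 << 20 // assumed per-stream window, e.g. 4 MiB

type flowTracker struct {
	unacked int // bytes read by the application but not yet acknowledged
}

// consume records n bytes read and reports whether a WINDOW_UPDATE should
// be sent now, returning the increment to send.
func (f *flowTracker) consume(n int) (int, bool) {
	f.unacked += n
	if f.unacked < streamFlow/2 {
		return 0, false
	}
	inc := f.unacked
	f.unacked = 0
	return inc, true
}

func main() {
	var f flowTracker
	inc, ok := f.consume(1 << 20)
	fmt.Println(inc, ok) // 0 false: only 1 MiB consumed so far
	inc, ok = f.consume(1 << 20)
	fmt.Println(inc, ok) // 2097152 true: half the window reached
}
```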

@aojea
Contributor

aojea commented Dec 1, 2021

I don't know if you are referring to this buffer in one of your comments, but the profiler shows a lot of contention on the bufWriterPool

go/src/net/http/h2_bundle.go

Lines 3465 to 3468 in 0e1d553

// TODO: pick a less arbitrary value? this is a bit under
// (3 x typical 1500 byte MTU) at least. Other than that,
// not much thought went into it.
const http2bufWriterPoolBufferSize = 4 << 10

Increasing the value there to the maxFrameSize value, and using the shared code in the description, you can go from 6Gbps to 12.6 Gbps

@robaho

robaho commented Dec 1, 2021

As I pointed out above, increasing the frame size can achieve 38 Gbps. The issue is that constant is used for all connections. The 'max frame size' is connection dependent.

More importantly, that constant does not exist in golang.org/x/net/http - which is the basis of the future version.

@aojea
Contributor

aojea commented Dec 2, 2021

As I pointed out above, increasing the frame size can achieve 38 Gbps. The issue is that constant is used for all connections. The 'max frame size' is connection dependent.

yeah, I should have explained myself better, sorry. Increasing the frame size requires manual configuration - it's a user decision to maximize throughput. I just wanted to add that maybe we can improve the default throughput with that change, since it's clear the author left that parameter open to debate and it's a 2x win (at the cost of increased memory, of course, but this is inside a sync.Pool, which may alleviate the problem a bit)

More importantly, that constant does not exist in golang.org/x/net/http - which is the basis of the future version.

it does, just with a different name 😄

https://github.com/golang/net/blob/0fccb6fa2b5ce302a9da5afc2513d351bd175889/http2/http2.go#L256-L259

IIUIC the http2 code in golang/go is a bundle created from the x/net/http2

@neild
Contributor

neild commented Dec 6, 2021

Thanks for the excellent analysis!

Ideally, we shouldn't require the user to twiddle configuration parameters to get good performance. However, making the maximum client-initiated frame size user-configurable seems like a reasonable first step.

@gopherbot

Change https://golang.org/cl/362834 mentions this issue: http2: add Transport.MaxReadFrameSize configuration setting

@kgersen
Author

kgersen commented May 19, 2022

any update to this ? is someone at Google even working on this or is there no point waiting ?

@robaho

robaho commented May 20, 2022

There has been a CL submitted - it is stuck in review.

@jfgiorgi

There has been a CL submitted - it is stuck in review.

some update: according to https://groups.google.com/g/golang-dev/c/qzNbs3phVuI/m/fcFcCCsaBQAJ
Your CL awaits a response from you.

@robaho

robaho commented Aug 31, 2022

The scope kept growing beyond what was necessary to address the issue. The last comment isn’t a question or change request - it’s simply an idea.

@robaho

robaho commented Aug 31, 2022

I am more than willing to make additional changes but I need specifics.

@neild
Contributor

neild commented Sep 2, 2022

Sorry you felt that the review scope was expanding. For what it's worth, I thought my last comment on that CL was actionable: The Transport.MaxReadFrameSize setting needs to be consistent with Server.MaxReadFrameSize:

  • The default value when Transport.MaxReadFrameSize is 0 should be 1MiB, for consistency with Server.MaxReadFrameSize.
  • The doc comment for Transport.MaxReadFrameSize should match that of Server.MaxReadFrameSize.
  • An out-of-spec value for Transport.MaxReadFrameSize (outside [16KiB, 16MiB]) should be treated as the default, again for consistency with Server.MaxReadFrameSize.
  • The client should always send the SETTINGS_MAX_FRAME_SIZE option.

Apologies for not being clearer.
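The first and third points above amount to a clamp-to-default rule, which might look something like this sketch (a hypothetical helper mirroring the review feedback, not the actual x/net code):

```go
package main

import "fmt"

const (
	minMaxFrameSize         = 1 << 14   // 16 KiB, the RFC 7540 minimum
	maxMaxFrameSize         = 1<<24 - 1 // ~16 MiB, the RFC 7540 maximum
	defaultMaxReadFrameSize = 1 << 20   // 1 MiB, matching Server.MaxReadFrameSize
)

// effectiveMaxReadFrameSize treats a zero or out-of-spec configured value
// as the default, for consistency with the server-side setting.
func effectiveMaxReadFrameSize(configured uint32) uint32 {
	if configured < minMaxFrameSize || configured > maxMaxFrameSize {
		return defaultMaxReadFrameSize // covers 0 and out-of-spec values
	}
	return configured
}

func main() {
	fmt.Println(effectiveMaxReadFrameSize(0))       // 1048576: default
	fmt.Println(effectiveMaxReadFrameSize(1 << 14)) // 16384: in range, kept
	fmt.Println(effectiveMaxReadFrameSize(1 << 25)) // 1048576: out of spec
}
```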

For general reference, Gerrit has a feature called the "attention set", which is the set of people who will see a CL in their dashboards. Reviewers with their name in bold are in the attention set. If you'd like a response from a reviewer make sure that they're in the attention set, because otherwise it's pretty invisible. The author making comments on a CL should add all the reviewers into the attention set.

In addition, if you think that a reviewer's comments are insufficiently actionable, out of scope, or just not worth doing, it's always okay to tell us that.

One thing I mistakenly didn't note in the review of https://go.dev/cl/362834 is that since this is expanding the API surface of golang.org/x/net/http2, it should have a proposal. I just filed #54850 with a proposal.

@robaho, if you would like to update your CL, I'm happy to continue reviewing it. Otherwise, I'll take over adding the Transport.MaxReadFrameSize setting after #54850 is approved.

@robaho

robaho commented Sep 3, 2022

Sorry, after signing in I saw that a couple of comments I had drafted remained in the draft state and were never sent.

gopherbot pushed a commit to golang/net that referenced this issue Nov 15, 2022
For golang/go#47840.
Fixes golang/go#54850.

Change-Id: I44efec8d1f223b3c2be82a2e11752fbbe8bf2cbf
Reviewed-on: https://go-review.googlesource.com/c/net/+/362834
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Roland Shoemaker <roland@golang.org>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
@joshdover

For my info, when would https://go.dev/cl/362834 be available in a Go release? Is it eligible for 1.19.x or only 1.20?

@seankhliao
Member

Per the minor release policy, this should only be in 1.20.

@francislavoie

francislavoie commented Nov 28, 2022

I've been watching this issue as a maintainer of Caddy.

I'm seeing from the proposal #54850 and the merged changes that changes are only being made to the HTTP/2 transport (i.e. client).

Apparently there already exists a MaxReadFrameSize option in the HTTP/2 server, but we're using the http.Server auto-configure right now; I don't see any way to set MaxReadFrameSize via http.Server, unless I'm missing something obvious.

Is the plan to adjust the automatic behaviour so that it's tuned better by default for the server as well? If not, how are we meant to tune this knob? I'd rather not have to stop using http.Server's auto-configure so that we can fix this performance issue.

@edwardwc

You can pass in the http.Server to modify MaxReadFrameSize:

	http2.ConfigureServer(<your http.Server obj>, &http2.Server{
		MaxReadFrameSize:             256000, // 256K read frame, https://github.com/golang/go/issues/47840
	})

@robaho

robaho commented Nov 28, 2022

You can pass in the http.Server to modify MaxReadFrameSize:

	http2.ConfigureServer(<your http.Server obj>, &http2.Server{
		MaxReadFrameSize:             256000, // 256K read frame, https://github.com/golang/go/issues/47840
	})

Fixing it on the server won’t help. A properly written server uses the value provided by the client or the default - which is 16k.

Changing the server side typically either 1) tells the client how big a buffer it will receive, or 2) restricts the size received from the client.

The connections are bidirectional with independent values for client and server.
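With the CL merged, the client-side knob looks roughly like this (a configuration sketch; assumes a golang.org/x/net version that includes Transport.MaxReadFrameSize):

```go
package main

import (
	"net/http"

	"golang.org/x/net/http2"
)

func main() {
	// Client side: MaxReadFrameSize (added in CL 362834) controls the
	// SETTINGS_MAX_FRAME_SIZE the client advertises, i.e. how large the
	// DATA frames it receives on downloads may be.
	t := &http2.Transport{
		MaxReadFrameSize: 1 << 20, // 1 MiB instead of the 16 KiB default
	}
	client := &http.Client{Transport: t}
	_ = client // use client.Get(...) against an HTTP/2 server
}
```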

@francislavoie

francislavoie commented Nov 28, 2022

Thanks @edwardwc, that's a very wacky API! I definitely missed that.

@robaho According to #47840 (comment) they were seeing performance problems with Caddy without an HTTP/2 client (e.g. Caddy's reverse_proxy, which can use an HTTP/2 transport, either H2C or HTTPS). So I'm not sure I understand. How would we fix the performance issue in Caddy, then?

@robaho

robaho commented Nov 28, 2022

I don’t know what Caddy is, but often http2 will be used behind the scenes automatically.

@francislavoie

francislavoie commented Nov 28, 2022

Caddy is a general purpose HTTP server written in Go. Configurable by users via a config file. Supports HTTP 1/2/3.

Yes, we do get HTTP/2 support automatically from Go stdlib. But you can see from https://github.com/nspeed-app/http2issue#caddy that they had performance issues with HTTP/2. From your analysis earlier in this issue, I assumed the fix would be to increase MaxReadFrameSize. Are you saying that's not the case, and that only helps in client mode?

@robaho

robaho commented Nov 28, 2022

Many http2 clients have this issue - not just Go. But I’m not sure I understand your comment.

@francislavoie

In https://github.com/nspeed-app/http2issue#caddy, they used curl as the client, with Caddy as the HTTP/2 server (i.e. Go stdlib implementation). Do you think curl is causing a bottleneck, then?

I don't have a deep enough understanding of HTTP/2 internals to know where the problem lies. I'm just trying to grasp if there's anything we (Caddy maintainers) should do to mitigate any performance issues for our users.

@robaho

robaho commented Nov 28, 2022

Yes. You need to provide similar options to the curl command to increase the buffer size.

@robaho

robaho commented Nov 29, 2022

libcurl uses nghttp2 for http2 support. I submitted similar changes to nghttp2 here

@robaho

robaho commented Nov 29, 2022

The other related nghttp2 issue is nghttp2/nghttp2#1647, which it doesn't appear the author wants to fix as suggested.

@krotz-dieter

Have the same issue; in my case HTTP/2 is always half the speed of HTTP/1.1 (server exposing DICOMweb APIs).
Tried all the settings, like MaxReadFrameSize = 16MB or MaxConcurrentStreams = 250, still no improvements.
Using go version 1.20.2.

@robaho

robaho commented Mar 22, 2023

It could be on the server side. You need to provide more analysis.

@guanzo

guanzo commented May 17, 2023

I ran some http1 vs http2 benchmarks and found that http2 throughput suffers the larger the files are.

Changing the tcp congestion control algo to "bbr" helped a lot (https://blog.cloudflare.com/http-2-prioritization-with-nginx/).
TTFB and bandwidth improved by 10x...
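For reference, switching the congestion control algorithm on Linux can be done with sysctl (system-wide, non-persistent until added to sysctl.conf; requires root and a kernel with the tcp_bbr module):

```shell
# Check which congestion control algorithms are available.
sysctl net.ipv4.tcp_available_congestion_control

# bbr usually needs the tcp_bbr module loaded first.
sudo modprobe tcp_bbr

# Switch the system-wide default to bbr.
sudo sysctl -w net.ipv4.tcp_congestion_control=bbr
```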

fwiw, I ran some node.js benchmarks and http2 also performs worse than http1.

Maybe the issue is with the protocol itself. I'll just wait for http3 🤷

@robaho

robaho commented May 17, 2023

as mentioned prior, you need to ensure both sides of the connection are fixed.

@mitar
Contributor

mitar commented Sep 15, 2023

I think this issue is why Docker/Moby disabled HTTP2 for pushing? moby/buildkit#1420

@jfgiorgi

Probably, but since we opened this issue the Go team has made some changes to http/2 performance:

We made a small program here: https://github.com/kgersen/h3ctx to benchmark the 3 versions of HTTP.

Here are some results for http/1.1 vs http/2 vs http/3:

HTTP/1.1             13.8 Gbps
HTTP/2               5.2 Gbps
HTTP/3 with GSO:     4.7 Gbps
HTTP/3 without GSO:  1.6 Gbps

It's a test on the loopback interface with TLS 1.3 (so it's a CPU-bound test, but the same perf ratios are observed between multi-gigabit machines).

Without HTTP/2 explicit settings, HTTP/1.1 is >2x faster.

With Go 1.17 it was 5x.

Now with Go 1.21.1 when using explicit HTTP/2 settings (MaxReadFrameSize) we can double HTTP/2 performance:

Here are some sample results from our app made with Go and using such explicit settings for HTTP/2:

batch #0:
 Id| Read speed| Write speed| Time| Bytes read| Bytes written|command
 #1|  16.4 Gbps|       0 bps| 8.00|    16.4 GB|           0 B|get -http1.1 https://[::1]:40129/20g (IPv6 - 0.122 ms - HTTP/1.1 - TLS 1.3)

batch #1:
 Id| Read speed| Write speed| Time| Bytes read| Bytes written|command
 #3|  10.4 Gbps|       0 bps| 8.00|    10.4 GB|           0 B|get -http2 https://[::1]:45075/20g (IPv6 - 0.99 ms - HTTP/2.0 - TLS 1.3)

so 16 Gbps for HTTP/1.1 vs 10 Gbps HTTP/2

Over the Internet, the congestion protocol can also change the result if there is congestion and packet loss - even more if you're competing with other machines using bbr without using bbr yourself - but this impact should be the same for both versions of HTTP.

Our preliminary measures with HTTP/3 (quic-go) are not that good (there is no MaxReadFrameSize setting to boost performance like with HTTP/2). The Go team's own HTTP/3 implementation is still being developed, so no results yet.

tl;dr: keep using HTTP/1.1 if bandwidth performance is an issue (pushing big data over high-speed networks).

@mitar
Contributor

mitar commented Sep 15, 2023

I came here by trying to understand this issue from GitLab, where multiple people are noticing very slow uploads when using kaniko for Docker push. So it seems to me there are still some issues with speed?

@robaho

robaho commented Sep 15, 2023

As I showed above it is possible to have http/2 match the performance of http/1 with the changes I detailed.

The changes are aligned with the http/2 specification.

@suleimi

suleimi commented Sep 21, 2023

Fixing it on the server won’t help. A properly written server uses the value provided by the client or the default - which is 16k.

@robaho For my understanding, does the current http2 server implementation in go honour the value when provided by the client (up to a maximum)? Or would that be the next step after https://go.dev/cl/362834 was merged?

@robaho

robaho commented Sep 21, 2023

As I said earlier - there is a setting for both directions. Slow uploads imply a too small frame size on the server. Slow downloads imply a too small frame size on the client.

It has been a while since I looked at this so your mileage may vary.
