Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: Simple HTTP server causes high thread count on macOS Monterey #49679

Open
chrisprobst opened this issue Nov 19, 2021 · 19 comments
Open
Labels
arch-arm64 compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Milestone

Comments

@chrisprobst
Copy link

chrisprobst commented Nov 19, 2021

What version of Go are you using (go version)?

$ go version
go version go1.17.2 darwin/arm64

Does this issue reproduce with the latest release?

Cannot test 1.17.3 currently.

What operating system and processor architecture are you using (go env)?

Apple M1 Max MacBook Pro 16.

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/XXX/Library/Caches/go-build"
GOENV="/Users/XXX/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/XXX/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/XXX/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.17.2/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.17.2/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.17.2"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/06/1d1vrn6541133ymsn2s2vwvw0000gn/T/go-build3566489552=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

  1. Spawn a simple http server
package main

import "net/http"

func main() {
	http.HandleFunc("/", func(rw http.ResponseWriter, r *http.Request) {
		rw.WriteHeader(200)
	})
	if err := http.ListenAndServe(":8080", nil); err != nil {
		panic(err)
	}
}

  1. Run some requests against it
    wrk -c500 -d10 -t1 http://localhost:8080

What did you expect to see?

With the previous macOS versions (as with every other OS I've tested lately) the app uses a couple of threads (usually around 8-12 all the time).

What did you see instead?

Thread count goes very high and never drops:
Screen Shot 2021-11-19 at 16 22 40

@chrisprobst chrisprobst changed the title runtime: Net-related code causes high thread usage on macOS Monterey runtime: Simple HTTP server code causes high thread usage on macOS Monterey Nov 19, 2021
@chrisprobst chrisprobst changed the title runtime: Simple HTTP server code causes high thread usage on macOS Monterey runtime: Simple HTTP server causes high thread count on macOS Monterey Nov 19, 2021
@heschi heschi added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 22, 2021
@heschi heschi added this to the Go1.18 milestone Nov 22, 2021
@heschi
Copy link
Contributor

heschi commented Nov 22, 2021

cc @neild

@odeke-em
Copy link
Member

odeke-em commented Jan 6, 2022

@chrisprobst thank you for the report and happy new year to you and to you @heschi :-)

@chrisprobst, I am on MacOS Monterrey as well and I've failed to reproduce this problem with various values of -c and -t in ranges too

Screen Shot 2022-01-05 at 8 36 35 PM

go1.17.5

go-49679-1.17.5.mp4

go1.18

go-49679-1.18.mp4

@chrisprobst can you reproduce this even right now?

@chrisprobst
Copy link
Author

@odeke-em Hi, thanks for checking. I am running on M1 Max CPU and 12.0.1 (so still not updated to 12.1, will do shortly). The problem exists both for 1.17.5 and 1.18Beta1. Will notify you when updated to 12.1.

@chrisprobst
Copy link
Author

@odeke-em I've updated to 12.1, same results. Can you reproduce with an M1 Max? I have the feeling it is linked to the M1 architecture. But the problem is still present. With Rust, etc. this problem is not present so it is related to thread management of Go probably.

@odeke-em
Copy link
Member

odeke-em commented Jan 7, 2022

Interesting, thank you @chrisprobst! Let me kindly ask a colleague @kirbyquerby who has an M1 to try to reproduce it, and then I shall go purchase an M1 Pro or M1 Max computer tomorrow morning too and try to reproduce it then debug it.

@kirbyquerby
Copy link

Here's my results with Go 1.17 on Big Sur and then after updating to Monterey. The number of threads seemed to increase (peak 22 on Big Sur -> 32 on Monterey) but didn't balloon to large values like in the OP.

Big Sur results:
https://drive.google.com/file/d/1IXAvDKuBSsuCX3GfpXCju2gKljn-sarV/view?usp=sharing

Monterey results:
https://drive.google.com/file/d/1GIg5I7rsUc19cp0aOvpnbXw7SHZX3y1-/view?usp=sharing

`go env` output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN="/Users/nathan/go/bin"
GOCACHE="/Users/nathan/Library/Caches/go-build"
GOENV="/Users/nathan/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/nathan/go/pkg/mod"
GONOPROXY="github.com/orijtech/*"
GONOSUMDB="github.com/orijtech/*"
GOOS="darwin"
GOPATH="/Users/nathan/go"
GOPRIVATE="github.com/orijtech/*"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/h5/hnjslbmn52zb11xzqg9dmy7w0000gn/T/go-build2021634348=/tmp/go-build -gno-record-gcc-switches -fno-common"

Let me know if I can help test anything else :D

@chrisprobst
Copy link
Author

I will record a video as soon as I can. In my example, the Threads go up to the hundreds.

@chrisprobst
Copy link
Author

1.17.5.mp4

@chrisprobst
Copy link
Author

@odeke-em @kirbyquerby As you can see, the threads go extremely high. It's now macOS 12.1 and Go 1.17.5 (1.18beta1 is the same).

@chrisprobst
Copy link
Author

@kirbyquerby I noticed that you are using an M1, maybe it's related to M1 Max? The Low-Power cores are different, maybe it is somehow affected by this?

@davecheney
Copy link
Contributor

Is this possibly related to some security software on that laptop; previously users have reported that “anti virus” software from semantic and McAfee tend to have detrimental impact on macOS computers.

@kirbyquerby
Copy link

I diff'd our go envs and noticed that you installed using brew. I tried the brew version of go1.17.5 and still don't get the issue, so we can rule that out :)

@chrisprobst
Copy link
Author

@davecheney It's a stock device used in our lab. Based on the feedback it really sounds like an M1 Max issue.

@chrisprobst
Copy link
Author

I tested this app on a few other M1 Pro and also Max devices (same CPU), same result.
Still, what could cause that different behaviour? How can a CPU model can have this impact on the app?
Is this even possible, I mean theoretically? Or is it maybe due to different system libraries that Apple might use for these CPUs?

@ianlancetaylor
Copy link
Contributor

CC @cherrymui

@odeke-em
Copy link
Member

odeke-em commented Mar 2, 2022

Unfortunately there isn't much we can do for Go1.18 but we are keeping this open for investigation for Go1.19 and if/when we encounter a need for a fix, we can backport. Moving this to the Go1.19 milestone. Thank you @chrisprobst and everyone.

@odeke-em odeke-em modified the milestones: Go1.18, Go1.19 Mar 2, 2022
@chrisprobst
Copy link
Author

chrisprobst commented Mar 22, 2022

After the release 1.18 I did some more tests and found some interesting observations that I would like to share, maybe it helps.

The M1 Max (Pro) has 8 performance cores and 2 efficiency cores. When I limit the processor count using GOMAXPROCS, the behaviour changes completely.

If I set a value between 1-4 (GOMAXPROCS=1 ... GOMAXPROCS=4), the problem disappears. The problem suddenly starts with GOMAXPROCS=5+. Also, with a value between 1-4 the performance is superb, as usual and expected. The numbers mean nothing, but the delta is interesting:

With 1-4 I get 175K req/s with wrk and the thread count stays under 10.
With 5+ I get "only" 60-75K req/s and the thread count goes wild, 300+.

I am not too much into how Go's scheduler works exactly but I do know that slow syscalls might cause other threads to spawn to take over. Now, maybe the efficiency cores are too slow to take over and Go thinks the syscall takes too long and therefore simply spawns another thread?

If this is true, even modern Alder Lake CPUs with Big.Little might have similar issues.
Again, it's a guess, but I have the feeling that it is related to this architecture design.

Let me know your thoughts.

@odeke-em @ianlancetaylor

@ianlancetaylor
Copy link
Contributor

CC @golang/runtime

I guess this did't get investigated for 1.19. Rolling forward to 1.20.

@ianlancetaylor ianlancetaylor modified the milestones: Go1.19, Go1.20 Jun 24, 2022
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@mknyszek mknyszek modified the milestones: Go1.20, Backlog Nov 30, 2022
@chrisprobst
Copy link
Author

chrisprobst commented Dec 4, 2022

Just tested lately with latest 1.19 and up-to-date M1 Pro device, same result. The simple HTTP server listed above still generates hundreds of threads on macOS.

I've already mentioned that only a value of GOMAXPROCS=5+ causes issues. An M1 Pro/Max has 8 performance cores and 2 efficiency cores. A value of 4 works perfectly, so half of the performance cores. Not sure, maybe it's linked.

I really believe that the efficiency cores are the problem, maybe they take so much longer compared to high performance cores for certain operations that Go thinks it has to spawn a new thread to keep up with the load. Could this be related?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-arm64 compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin
Projects
Status: No status
Development

No branches or pull requests

8 participants