Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: build fails in tests with "too many open files" #40498

Closed
MichaelTJones opened this issue Jul 30, 2020 · 12 comments
Closed

cmd/go: build fails in tests with "too many open files" #40498

MichaelTJones opened this issue Jul 30, 2020 · 12 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@MichaelTJones
Copy link
Contributor

MichaelTJones commented Jul 30, 2020

What version of Go are you using (go version)?

$ go version
go version devel +54e75e8f9d Thu Jul 30 16:00:05 2020 +0000 darwin/amd64

Does this issue reproduce with the latest release?

no, an in-develpment build issue

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="off"
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/mtj/Library/Caches/go-build"
GOENV="/Users/mtj/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/mtj/Documents/gocode/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/mtj/Documents/gocode"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/mtj/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/mtj/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/7b/r8p9wkn118bcmz4r3blf8mwc0000gn/T/go-build078910258=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

cd $GOROOT/src; ./all.bash

What did you expect to see?

build run to completion and all tests pass.

What did you see instead?

build worked but one test fails.
this is our old nemesis "too many open files" on high core count (18 +18smt) apple machines.

[failure excerpt]
--- FAIL: TestScript (0.01s)
    --- FAIL: TestScript/mod_gopkg_unstable (0.59s)
        script_test.go:210: 
            > env GO111MODULE=on
            > cp go.mod.empty go.mod
            > go get -d gopkg.in/dummy.v2-unstable
            [stderr]
            go: downloading gopkg.in/dummy.v2-unstable v2.0.0
            go: gopkg.in/dummy.v2-unstable upgrade => v2.0.0
            > cp x.go.txt x.go
            > cp go.mod.empty go.mod
            > go list
            FAIL: testdata/script/mod_gopkg_unstable.txt:8: unexpected error starting command: pipe: too many open files
            
    --- FAIL: TestScript/mod_go_version_mixed (0.23s)
        script_test.go:210: 
            FAIL: testdata/script/mod_go_version_mixed.txt:0: open $WORK/gopath/src/sub/go.mod: too many open files
            
    --- FAIL: TestScript/mod_get_tags (0.09s)
        script_test.go:210: 
            FAIL: testdata/script/mod_get_tags.txt:0: open testdata/script/mod_get_tags.txt: too many open files
            
    --- FAIL: TestScript/mod_get_none (0.14s)
        script_test.go:210: 
            > env GO111MODULE=on
            > go mod init example.com/foo
            FAIL: testdata/script/mod_get_none.txt:3: unexpected error starting command: pipe: too many open files
            
    --- FAIL: TestScript/modfile_flag (5.39s)
        script_test.go:210: 
            # Tests the behavior of the -modfile flag in commands that support it.
            # The go.mod file exists but should not be read or written.
            # Same with go.sum. (0.037s)
            # go mod init should create a new file, even though go.mod already exists. (0.134s)
            # 'go env GOMOD' should print the path to the real file.
            # 'go env' does not recognize the '-modfile' flag. (0.084s)
            # 'go list -m' should print the effective go.mod file as GoMod though. (0.209s)
            # go mod edit should operate on the alternate file (0.200s)
            # other 'go mod' commands should work. 'go mod vendor' is tested later. (1.628s)
            # 'go list' and other commands with build flags should work.
            # They should update the alternate go.mod when a dependency is missing. (1.292s)
            # 'go mod vendor' should work. (0.122s)
            # Automatic vendoring should be broken by editing an explicit requirement
            # in the alternate go.mod file. (1.023s)
            # 'go generate' should use the alternate file when resolving packages.
            # Recursive go commands started with 'go generate' should not get an explicitly
            # passed -modfile, but they should see arguments from GOFLAGS. (0.217s)
            # The original files should not have been modified. (0.010s)
            # If the altnernate mod file does not have a ".mod" suffix, an error
            # should be reported. (0.000s)
            > cp go.alt.mod goaltmod
            FAIL: testdata/script/modfile_flag.txt:78: open $WORK/gopath/src/go.alt.mod: too many open files
            
    --- FAIL: TestScript/mod_goroot_errors (0.82s)
        script_test.go:210: 
            # Regression test for https://golang.org/issue/34769.
            # Missing standard-library imports should refer to GOROOT rather than
            # complaining about a malformed module path.
            # This is especially important when GOROOT is set incorrectly,
            # since such an error will occur for every package in std.
            # Building a nonexistent std package directly should fail usefully. (0.150s)
            # Building a nonexistent std package indirectly should also fail usefully. (0.226s)
            # Building an *actual* std package should fail if GOROOT is set to something bogus. (0.000s)
            > [!short] go build ./importjson  # Prove that it works when GOROOT is valid.
            > env GOROOT=$WORK/not-a-valid-goroot
            > ! go build ./importjson
            FAIL: testdata/script/mod_goroot_errors.txt:38: unexpected error starting command: pipe: too many open files
            
    --- FAIL: TestScript/mod_sumdb_proxy (4.46s)
        script_test.go:210: 
            # basic fetch (through proxy) works (1.514s)
            # can fetch by explicit URL (1.113s)
            # direct access fails (because localhost.localdev does not exist)
            # The text of the error message is hard to predict because some DNS servers
            # will resolve unknown domains like localhost.localdev to a real IP
            # to serve ads. (0.102s)
            # proxy 404 falls back to direct access (which fails) (0.173s)
            # proxy non-200/404/410 stops direct access (0.184s)
            # the error from the last attempted proxy should be returned. (0.070s)
            # if proxies are separated with '|', fallback is allowed on any error. (0.378s)
            > cp go.mod.orig go.mod
            > env GOSUMDB=$sumdb
            > env GOPROXY=$proxy/sumdb-503|https://0.0.0.0|$proxy
            > go get -d rsc.io/fortune@v1.0.0
            [stderr]
            go: finding module for package rsc.io/quote
            go: found rsc.io/quote in rsc.io/quote v1.5.2
            > rm $GOPATH/pkg/mod/cache/download/sumdb
            FAIL: testdata/script/mod_sumdb_proxy.txt:63: readdirnames $WORK/gopath/pkg/mod/cache/download/sumdb/localhost.localdev/sumdb/lookup: fcntl: too many open files
            
    --- FAIL: TestScript/mod_dot (1.19s)
        script_test.go:210: 
            # golang.org/issue/32917 and golang.org/issue/28459: 'go build' and 'go test'
            # in an empty directory should refer to the path '.' and should not attempt
            # to resolve an external module. (0.151s)
            > cd dir
            $WORK/gopath/src/dir
            > ! go get .
            [stderr]
            go get .: path $WORK/gopath/src/dir is not a package in module rooted at $WORK/gopath/src/dir
            [exit status 1]
            > stderr 'go get \.: path .* is not a package in module rooted at .*[/\\]dir$'
            > ! go list
            FAIL: testdata/script/mod_dot.txt:9: unexpected error starting command: pipe: too many open files
            
FAIL
FAIL	cmd/go	62.525s
@cagedmantis cagedmantis changed the title build fails in tests with "too many open files" cmd/go: build fails in tests with "too many open files" Jul 30, 2020
@cagedmantis cagedmantis added this to the Backlog milestone Jul 30, 2020
@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 30, 2020
@cagedmantis
Copy link
Contributor

/cc @bcmills @jayconrod @matloob

@MichaelTJones
Copy link
Contributor Author

FYI, this is MacOS Catalina

plum:~ mtj$ uname -a
Darwin plum.local 19.6.0 Darwin Kernel Version 19.6.0: Sun Jul  5 00:43:10 PDT 2020; root:xnu-6153.141.1~9/RELEASE_X86_64 x86_64

@bcmills
Copy link
Contributor

bcmills commented Jul 31, 2020

In general we expect available resources to scale more-or-less linearly with core count. (That's why the -parallel test flag tracks GOMAXPROCS, and why GOMAXPROCS tracks the core count.)

@MichaelTJones, could you supply the output of ulimit -n and the default value of runtime.GOMAXPROCS(0) on your machine? (I assume that ulimit -n is still at its default value, and hasn't been manually lowered?)

Have you tried using ulimit -n to raise the limit explicitly? If so, at what limit do the tests pass reliably?

@bcmills bcmills added OS-Darwin WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jul 31, 2020
@MichaelTJones
Copy link
Contributor Author

$ ulimit -n
256

runtime.GOMAXPROCS(0) = 36

did not think to try, but have done so now
build passes at 'ulimit -n 512'

do you want me to do bisectional analysis to find the least value that passes?

@brock1337
Copy link

I have seen this with Python too. @MichaelTJones, you might want to try 2048.

@bcmills bcmills added Testing An issue that has been verified to require only test changes, not just a test failure. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Aug 1, 2020
@bcmills
Copy link
Contributor

bcmills commented Aug 4, 2020

@MichaelTJones, we're trying to figure out whether 256 is the hard limit or the soft limit. Could you run ulimit -n -H and ulimit -n -S and post the results?

@bcmills bcmills added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Aug 4, 2020
@jayconrod
Copy link
Contributor

As a datapoint, with a MBP on macOS 10.15.5:

~ $ ulimit -n -S
256
~ $ ulimit -n -H
unlimited

@MichaelTJones
Copy link
Contributor Author

plum:~ mtj$ ulimit -n -S
256
plum:~ mtj$ ulimit -n -H
unlimited
plum:~ mtj$ sw_vers
ProductName:	Mac OS X
ProductVersion:	10.15.6
BuildVersion:	19G73

@bcmills
Copy link
Contributor

bcmills commented Aug 4, 2020

Ok, so it sounds like maybe we can use syscall.Setrlimit to raise the limit explicitly in TestMain, on macOS and any other platform that defaults to too low a value (are there any others?).

That's probably worth pursuing before we try reducing test concurrency, since ideally we really do want the test to make use of more resources when on a multi-core machine.

@MichaelTJones
Copy link
Contributor Author

MichaelTJones commented Aug 4, 2020 via email

@bcmills
Copy link
Contributor

bcmills commented Aug 5, 2020

We have plenty of high-core builders — just not for macOS, which seems to have an unusually low limit on open files.

(For comparison, the default soft limit on my 6-core Xeon workstation running a Debian Linux variant is 32,768 — or ~5k files per core — compared to the ~7 files per core observed on your machine.)

@bcmills
Copy link
Contributor

bcmills commented Mar 15, 2022

This should be fixed by CL 392415 (#46279). Please let me know if you find that is not the case.

@bcmills bcmills closed this as completed Mar 15, 2022
@golang golang locked and limited conversation to collaborators Mar 15, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

6 participants