Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: test cached run slower than real test run #26562

Open
dbentley opened this issue Jul 24, 2018 · 13 comments
Open

cmd/go: test cached run slower than real test run #26562

dbentley opened this issue Jul 24, 2018 · 13 comments
Labels
GoCommand cmd/go help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. ToolSpeed
Milestone

Comments

@dbentley
Copy link

What version of Go are you using (go version)?

1.10.3

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

linux; amd64

What did you do?

go test --count 1 .; time go test .

foo_test.go:
// package and imports snipped
func TestCache(t testing.T) {
tmp := os.TempDir()
for i := 0; i < 10
1000*1000; i++ {
os.Stat(filepath.Join(tmp, fmt.Sprintf("%d", i)))
}
}

What did you expect to see?

$ go test --count 1 .; time go test .
ok github.com/dbentley/testcache 11.839s
ok github.com/dbentley/testcache (cached)

real 2s

What did you see instead?

$ go test --count 1 .; time go test .
ok github.com/dbentley/testcache 11.839s
ok github.com/dbentley/testcache (cached)

real 1m2.278s

Further investigation

Using GODEBUG=gocachetest=1 or GODEBUG=gocachehash=1 doesn't cause output for 30+s.

I think the function inDir (currently at https://github.com/golang/go/blob/master/src/cmd/go/internal/test/test.go#L1450 ) evaluates symlinks, which requires lots of stat'ing.

It looks like this was introduced in 37d5627

@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 24, 2018
@ianlancetaylor
Copy link
Contributor

Agreed. On my laptop, running this test program:

package foo_test

import (
	"fmt"
	"os"
	"path/filepath"
	"testing"
)

// package and imports snipped
func TestCache(t *testing.T) {
	tmp := os.TempDir()
	for i := 0; i < 1000*1000; i++ {
		os.Stat(filepath.Join(tmp, fmt.Sprintf("%d", i)))
	}
}

I see this:

> time go test -test.count=1 ~/foo_test.go
ok  	command-line-arguments	1.746s

real	0m1.931s
user	0m0.796s
sys	0m1.272s
> time go test ~/foo_test.go
ok  	command-line-arguments	(cached)

real	0m5.260s
user	0m3.392s
sys	0m2.328s

In other words, the cached run actually takes longer than the real run.

We may need a symlink cache. Or maybe it's something else, I'm not sure.

@ianlancetaylor ianlancetaylor added this to the Go1.12 milestone Jul 24, 2018
@ianlancetaylor ianlancetaylor changed the title cmd/go: test cache lookup seemingly pauses for a minute cmd/go: test cached run slower than real test run Jul 24, 2018
@rsc
Copy link
Contributor

rsc commented Jul 26, 2018

Or we should just drop that whole symlink thing there. I don't see the point of evaluating symlinks in inDir.

@rsc
Copy link
Contributor

rsc commented Jul 26, 2018

Also the test input list should filter those out when recorded, not each time it is used.

@gopherbot
Copy link

Change https://golang.org/cl/127635 mentions this issue: cmd/go: no longer eval symlinks in inDir

@gopherbot
Copy link

Change https://golang.org/cl/127715 mentions this issue: cmd/go: revert "cmd/go: no longer eval symlinks in inDir"

gopherbot pushed a commit that referenced this issue Aug 3, 2018
It breaks on the Darwin builders, with

--- FAIL: TestTestCacheInputs (21.65s)
    go_test.go:5371: file.txt: old=2018-08-02 15:12:21.136438 -0700 PDT m=+219.458659208, info.ModTime=2018-08-02 15:12:21 -0700 PDT
    go_test.go:5377: running testgo [test testcache]
    go_test.go:5377: standard output:
    go_test.go:5377: ok  	testcache	0.013s

    go_test.go:5378: running testgo [test testcache]
    go_test.go:5378: standard output:
    go_test.go:5378: ok  	testcache	(cached)

    go_test.go:5382: running testgo [test testcache]
    go_test.go:5382: standard output:
    go_test.go:5382: ok  	testcache	0.172s

    go_test.go:5384: running testgo [test testcache]
    go_test.go:5384: standard output:
    go_test.go:5384: ok  	testcache	(cached)

    go_test.go:5387: running testgo [test testcache -run=FileSize]
    go_test.go:5387: standard output:
    go_test.go:5387: ok  	testcache	0.012s

    go_test.go:5388: running testgo [test testcache -run=FileSize]
    go_test.go:5388: standard output:
    go_test.go:5388: ok  	testcache	(cached)

    go_test.go:5391: running testgo [test testcache -run=FileSize]
    go_test.go:5391: standard output:
    go_test.go:5391: ok  	testcache	0.008s

    go_test.go:5393: running testgo [test testcache -run=FileSize]
    go_test.go:5393: standard output:
    go_test.go:5393: ok  	testcache	(cached)

    go_test.go:5396: running testgo [test testcache -run=Chdir]
    go_test.go:5396: standard output:
    go_test.go:5396: ok  	testcache	0.417s

    go_test.go:5397: running testgo [test testcache -run=Chdir]
    go_test.go:5397: standard output:
    go_test.go:5397: ok  	testcache	(cached)

    go_test.go:5400: running testgo [test testcache -run=Chdir]
    go_test.go:5400: standard output:
    go_test.go:5400: ok  	testcache	(cached)

    go_test.go:5401: did not notice file size change
    go_test.go:5401: pattern \(cached\) found unexpectedly in standard output
FAIL

Original CL description:

    Evaluating the symlinks was slowing down test cache checks.

    Fixes #26562
    Fixes #26726

            ijt:~/gopath/src/issue26562$ cat foo_test.go
            package foo_test

            import (
                    "fmt"
                    "os"
                    "path/filepath"
                    "testing"
            )

            // package and imports snipped
            func TestCache(t *testing.T) {
                    tmp := os.TempDir()
                    for i := 0; i < 1000000; i++ {
                            os.Stat(filepath.Join(tmp, fmt.Sprintf("%d", i)))
                    }
            }
            ijt:~/gopath/src/issue26562$ time ~/github/go/bin/go test -count=1
            PASS
            ok      issue26562      9.444s

            real    0m10.021s
            user    0m2.344s
            sys     0m7.835s
            ijt:~/gopath/src/issue26562$ time ~/github/go/bin/go test .
            ok      issue26562      (cached)

            real    0m0.802s
            user    0m0.551s
            sys     0m0.306s

Updates #26562
Updates #26726

Change-Id: I7914ee57dc75bcbd7f0ea01c70bed97d67c810ea
Reviewed-on: https://go-review.googlesource.com/127715
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@bradfitz bradfitz reopened this Aug 3, 2018
@gopherbot
Copy link

Change https://golang.org/cl/127818 mentions this issue: cmd/go: cache results of EvalSymlinks

@bcmills
Copy link
Contributor

bcmills commented Nov 15, 2018

On https://golang.org/cl/127818, @rsc says:

I'd still really like to find a way to make the approach of CL 127635 - stop evaluating symlinks at all - work. I'll try to take a look in the next couple weeks.

@andybons andybons modified the milestones: Go1.12, Go1.13 Feb 12, 2019
@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@bcmills bcmills unassigned rsc Sep 18, 2019
@bcmills bcmills modified the milestones: Go1.14, Unplanned Sep 18, 2019
@maiamcc
Copy link

maiamcc commented Mar 4, 2021

This reproduces for me on go1.16 darwin/amd64 -- any update on when this will be fixed?

@bcmills
Copy link
Contributor

bcmills commented Mar 4, 2021

@maiamcc, this issue is milestoned Unplanned and tagged “help wanted“. To my knowledge, no one is actively working on it at the moment.

@bcmills
Copy link
Contributor

bcmills commented Jan 26, 2022

On #50775, @millerresearch notes:

Computing the TestInputsID takes a stupendously long time because it's going through a testlog.txt which lists 357000 filenames, each requiring a stat call after being expanded by filepath.EvalSymlinks (which itself requires a stat call per filename component). In fact there are only 5841 unique lines in the testlog.txt, so sorting the list and eliminating duplicates would speed that up quite a bit.

That sounds like delightfully low-hanging fruit.

@bcmills bcmills modified the milestones: Unplanned, Go1.19 Jan 26, 2022
@matthewmueller
Copy link

matthewmueller commented Apr 27, 2022

I've been struggling with this issue. The last place you think to look is go test itself. When I kill -ABRT <pid>, it does seem to be getting stuck in internal symlink libraries. Not very scientific, but I've seen my 30s tests take 15m.

@bcmills it does seem like this deduping has been tried already: #26732, or at least not computing caches so many times. There might be something helpful in that PR.

@ianlancetaylor
Copy link
Contributor

@bcmills @matloob This issue is marked for 1.19. Should it move to Backlog?

@bcmills
Copy link
Contributor

bcmills commented Jun 24, 2022

I've got a cluster of test-caching issues that I'd like to get to in the near future, but given scheduling constraints probably not until 1.21. Backlog it is for now.

@bcmills bcmills modified the milestones: Go1.19, Backlog Jun 24, 2022
@bcmills bcmills added the GoCommand cmd/go label Dec 13, 2023
@bcmills bcmills removed their assignment Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GoCommand cmd/go help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. ToolSpeed
Projects
Status: No status
Development

No branches or pull requests

10 participants