Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: TestLinkXImportPathEscape flaky on windows/amd64 #19491

Closed
josharian opened this issue Mar 10, 2017 · 15 comments
Closed

cmd/go: TestLinkXImportPathEscape flaky on windows/amd64 #19491

josharian opened this issue Mar 10, 2017 · 15 comments
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. OS-Windows Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@josharian
Copy link
Contributor

Noticed during a trybot run for CL 38006:

https://storage.googleapis.com/go-build-log/0401f0f8/windows-amd64-gce_ffacd647.log

--- FAIL: TestLinkXImportPathEscape (0.38s)
	go_test.go:260: running testgo [build -o ./linkx.exe -ldflags -X=my.pkg.Text=linkXworked my.pkg/main]
	go_test.go:167: remove ./linkx.exe: Access is denied.
FAIL
FAIL	cmd/go	83.071s
@josharian josharian added the Testing An issue that has been verified to require only test changes, not just a test failure. label Mar 10, 2017
@josharian josharian added this to the Go1.9Maybe milestone Mar 10, 2017
@alexbrainman
Copy link
Member

remove ./linkx.exe: Access is denied.

That is what you get if you try removing executable file while program is still running.

Alex

@bradfitz
Copy link
Contributor

That doesn't appear to be the issue, since linkx.exe is never ran in that test.

The sketchy thing though is that the test is a parallel test, but the deferred cleanup method does an os.Chdir.

I haven't taken the time to gomote a test run with high -count=NNN yet to repro it, but that's my guess.

@johnsonj
Copy link
Contributor

johnsonj commented Apr 28, 2017

Another hit: https://build.golang.org/log/da012f4f18560e41d0a0c309cfd92fd005936a29 (on windows-amd64-2012)

@alexbrainman
Copy link
Member

linkx.exe is never ran in that test.

Why do you say that?

The sketchy thing though is that the test is a parallel test, but the deferred cleanup method does an os.Chdir.

It does not:

C:\dev\go\src\cmd\go>git diff
diff --git a/src/cmd/go/go_test.go b/src/cmd/go/go_test.go
index 0b1fe70..2a3f52e 100644
--- a/src/cmd/go/go_test.go
+++ b/src/cmd/go/go_test.go
@@ -600,13 +600,16 @@ func (tg *testgoData) wantNotStale(pkg, reason, msg string

 // cleanup cleans up a test that runs testgo.
 func (tg *testgoData) cleanup() {
+       tg.t.Logf("ALEX1\n")
        if tg.wd != "" {
+               tg.t.Logf("ALEX2\n")
                if err := os.Chdir(tg.wd); err != nil {
                        // We are unlikely to be able to continue.
                        fmt.Fprintln(os.Stderr, "could not restore working direc
                        os.Exit(2)
                }
        }
+       tg.t.Logf("ALEX3\n")
        for _, path := range tg.temps {
                tg.check(os.RemoveAll(path))
        }

C:\dev\go\src\cmd\go>go test -run=TestLinkXImportPathEscape -v
=== RUN   TestLinkXImportPathEscape
--- PASS: TestLinkXImportPathEscape (0.58s)
        go_test.go:269: running testgo [build -o ./linkx.exe -ldflags -X=my.pkg.Text=linkXworked my.pkg/main]
        go_test.go:603: ALEX1
        go_test.go:612: ALEX3
PASS
ok      cmd/go  12.654s

C:\dev\go\src\cmd\go>

a test run with high -count=NNN yet to repro it, ...

I get nothing here:

C:\dev\go\src>go test -run=TestLinkXImportPathEscape -count=1000 cmd/go
ok      cmd/go  251.248s

C:\dev\go\src>

Alex

@josharian
Copy link
Contributor Author

Two more on the front page of build.golang.org right now (testing 6897030 / CL 42430). The potentially interesting thing is that both windows-386-2008 and windows-amd64-2012 failed on the same commit. I'm not sure what they share (hardware? pods?), but tying this to some shared resource might provide a hint.

@bradfitz
Copy link
Contributor

They share nothing. Each Windows build is done in its own VM and then destroyed.

@johnsonj
Copy link
Contributor

I can repro a similar failure on a GCE buidlet (Server 2012) with 1000x runs:

$ gomote run -path  'C:/godep/gcc64/bin,$WORKDIR/go/bin,$PATH' -e 'GOROOT=C:\workdir\go' "$BUILDLET" go/bin/go.exe test -run=TestLinkXImportPathEscape -count=1000 cmd/go

--- FAIL: TestLinkXImportPathEscape (0.00s)
	go_test.go:160: GetFileAttributesEx ./linkx.exe: Access is denied.
[ repeated several times ]

@bradfitz bradfitz modified the milestones: Go1.9, Go1.9Maybe May 22, 2017
@rsc
Copy link
Contributor

rsc commented Jun 23, 2017

Using windows-amd64-gce builder I was unable to reproduce with:

gomote run -path '$WORKDIR/go/bin,$PATH' $VM go/bin/go test -run=TestLinkXImportPathEscape -v -count=1000 cmd/go

@broady broady modified the milestones: Go1.9Maybe, Go1.9 Jul 17, 2017
@bradfitz bradfitz modified the milestones: Go1.9Maybe, Go1.10 Jul 20, 2017
@rsc
Copy link
Contributor

rsc commented Dec 1, 2017

No update since June when I could not reproduce the problem. Timed out.

@rsc rsc closed this as completed Dec 1, 2017
@mvdan
Copy link
Member

mvdan commented Apr 6, 2018

This is still hitting us from time to time; these failures are all in the four most recent build.golang.org pages:

https://build.golang.org/log/4d702c4750f3910bafed225c17694a537caea679
https://build.golang.org/log/de9ece6da16cb8f3279157f69cb3f56c80363a69
https://build.golang.org/log/49fdf1941aa90a5b6d8bb5b1bd35458226860ee9
https://build.golang.org/log/d995f239f753c573137f125bdd0599b12120b883
https://build.golang.org/log/7014743d36bf8e6a15f5cda5b318a18b78951de6

Given the four Windows builders and 30 builds per page, 5 occurrences in those gives a quick and dirty ~1% failure rate. That seems problematic enough to warrant reopening and investigating further.

@bcmills
Copy link
Contributor

bcmills commented Jun 6, 2019

@bcmills bcmills changed the title cmd/go: TestLinkXImportPathEscape flake on windows/amd64 cmd/go: TestLinkXImportPathEscape flaky on windows/amd64 Jun 10, 2019
@bcmills bcmills added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 10, 2019
@bcmills
Copy link
Contributor

bcmills commented Jun 10, 2019

Is this related to (or the same issue as) #25965?
Now that #32188 is addressed, this seems to be the largest source of test flakes on Windows.

A couple more examples:
https://build.golang.org/log/01ac695423831a5d648300163646dcf173c1c849
https://build.golang.org/log/0d6419f868c42defdfe365364c60d8ab97760ea0

@gopherbot
Copy link

Change https://golang.org/cl/181541 mentions this issue: cmd/go: factor the I/O-retry logic out of renameio

@bcmills
Copy link
Contributor

bcmills commented Jun 11, 2019

Now that I have a workaround in hand, I'm unable to replicate the original failure on a windows-amd64-2016 builder.

I notice that all of the other links here seem to be on pre-2016 Windows releases too. I'm going to try a 2012 builder and see if that changes things — perhaps this was a kernel bug that has since been fixed. (That would be nice!)

@bcmills
Copy link
Contributor

bcmills commented Jun 11, 2019

I was able to get a repro on the windows-amd64-2012 builder only by running other tests alongside TestLinkXImportPathEscape, from which it was difficult to extract a useful signal due to the flakiness of the gomote connection (#28365). That makes it difficult to draw any conclusions about whether the issue is fixed in the 2016 edition.

Nonetheless, the workaround CL does appear to fix the flakes on the 2012 edition.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done. OS-Windows Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

10 participants