Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: TestExecutableGOROOT occasionally fails with "text file busy" #37306

Closed
josharian opened this issue Feb 20, 2020 · 15 comments
Closed

cmd/go: TestExecutableGOROOT occasionally fails with "text file busy" #37306

josharian opened this issue Feb 20, 2020 · 15 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@josharian
Copy link
Contributor

Noticed on the build dashboard today:

https://build.golang.org/log/b1c4550d9629753e0fc2909f97f5610c385b0846

@matloob

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 20, 2020
@bcmills bcmills added this to the Go1.15 milestone Feb 20, 2020
@bcmills bcmills changed the title cmd/go: TestExecutableGOROOT flake cmd/go: TestExecutableGOROOT is flaky Feb 20, 2020
@matloob
Copy link
Contributor

matloob commented Feb 20, 2020

Hm, the only change we made to this test is adding the t.Parallel() call. I'll see if I can figure out what the problem is.

@josharian
Copy link
Contributor Author

I vaguely recall that a few years ago several of us undertook an effort to speed up all.bash. That included sprinkling a bunch of t.Parallel calls. Several of those had to get backed out, particularly in the runtime and cmd/go, because they caused flakiness. In some cases it was scheduling delays, in others assumptions that no one else shared resources we were tracking (fds), in others it was mutating global testing state like hooks.

Which doesn’t shed much light in this case; it is just to say “just adding t.Parallel” is known to be a common cause of flakiness. :)

@matloob
Copy link
Contributor

matloob commented Feb 20, 2020

Yup, that gives me more confidence that it was the right call to convert tests to the script framework rather than put in t.Parallel. I'll remove this particular t.Parallel at if I can't figure out what's going on by EOD.

@bcmills bcmills changed the title cmd/go: TestExecutableGOROOT is flaky cmd/go: TestExecutableGOROOT occasionally fails with "text file busy" Feb 20, 2020
@bcmills bcmills added the Testing An issue that has been verified to require only test changes, not just a test failure. label Feb 20, 2020
@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

A hypothesis: perhaps the filesystem can't keep up during the parallel portion of the test, and certain kernels fail with ETXTBSY when attempting to execute a binary that has not yet been flushed to the filesystem?

I would be inclined to add a df.Sync() at the end of the copyFile helper function to see if that helps.

@gopherbot
Copy link

Change https://golang.org/cl/220317 mentions this issue: cmd/go: in tests, add sync before closing file in copyFile helper

@matloob
Copy link
Contributor

matloob commented Feb 20, 2020

Okay, let's see if it helps. Sent a CL to add the Sync in.

gopherbot pushed a commit that referenced this issue Feb 20, 2020
An experiment to see if this helps flakiness go away in
TestExecutableGOROOT.

Updates #37306

Change-Id: I2f4f63bdb507359ca07267d86cdb41fe4968f151
Reviewed-on: https://go-review.googlesource.com/c/go/+/220317
Run-TryBot: Michael Matloob <matloob@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

Well, there goes that theory:
https://build.golang.org/log/60bf5a1fb2bf300903733c248c3ea30f3f4bec52

At least the negative signal was immediate! 😅

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

Oh, sweet! This is a symptom of #22315.

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

Actually, it's not. That issue is talking about a bug that occurs despite O_CLOEXEC, but we're not even setting that yet. Let's try adding O_CLOEXEC to the os.OpenFile call.

@matloob
Copy link
Contributor

matloob commented Feb 20, 2020

alright, another CL on its way!

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

Reading #22315 in more detail, I suspect that it will still apply.

A more extreme solution might be to have TestExecutableGOROOT fork off a separate subprocess to copy the executable file. When that subprocess exits, we will know that the file does not remain open on any thread.

A less extreme solution might be to move the copyFile call ahead of the call to tg.parallel().

I think we can ditch the Sync, too.

@matloob
Copy link
Contributor

matloob commented Feb 20, 2020

I've decided to go for the most extreme solution: to just convert the test to the script framework.

@gopherbot
Copy link

Change https://golang.org/cl/220319 mentions this issue: cmd/go: convert TestExecutableGOROOT to the script framework

@bcmills
Copy link
Contributor

bcmills commented Feb 20, 2020

I wonder whether that will even work — I don't think the cp command in script_test.go currently executes in a subprocess...

@golang golang locked and limited conversation to collaborators Feb 19, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

4 participants