Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/cmd/goimports: escaping Go path resulting in 55 million syscalls #17916

Closed
cstockton opened this issue Nov 14, 2016 · 10 comments
Closed
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@cstockton
Copy link

What version of Go are you using (go version)?

Go 1.7.3

What operating system and processor architecture are you using (go env)?

Linux / amd64

What did you do?

I am using Atom with go-plus, which runs goimports on file save. It's always taken a small delay of 500ms-1s to save files, something I've lived with because it's convenient. Today I symlinked a folder in the root of my Go path from my backup partition to access previous work temporarily.. after saving a file that needed a new import added atom locked up for 20-30 seconds, I waited and after this the notice delays were still 4-6 seconds each save.

I decided to see what the cause was and started with a top, noticed all 24 of my dual xeon processors maxed out with goimports. I run goimports -d client_test.go on the file I was working on and noticed the same delay. Then I ran strace -o out -f goimports -d client_test.go to see what work it was doing, this resulted in about 10 minutes and a 6gb strace file -rw-r--r-- 1 cstockton cstockton 6179399530 Nov 14 11:16 out. After inspecting it I noticed it was primarily stat and open calls. I grepped out all of my $GOPATH and noticed it was entering into other directories on my machines main storage.

I believe the reason this was happening is due to a unrelated python project located at $GOPATH(/one)/local/py/ which had a symlink to the parent dir of my $GOPATH (/one). This is the mount points root file system which is bind mounted with mount -t none -o bind /storage/one /one though I don't think this detail matters, it could. The /storage dir is where I keep all my files and goimports ends up visiting every file, music directory, etc. I haven't noticed this because I have 6 ssd's in raid 10 and 256gb of ram, so the first pass is slow but my OS caching makes work reasonable after that despite this heavy I/O.

When I symlinked to my /backup partition to reference another project the same thing happened, but to 8 physical spinning disks with many times the data. I'm not sure how it ended up in the root of my backup partition.

File:
-rw-r--r-- 1 cstockton cstockton 6179399530 Nov 14 11:16 out

Grepping out signal noise, the total syscalls was 55million:
cat out |grep -v " <... " | wc -l 55861212

What did you expect to see?

No escaping my GOPATH under any circumstance, but perhaps it is documented that symlinks outside GOPATH are permitted and this could be considered my "fault" and not a bug. However if memory serves me properly in the past symlinks did not work well with much of Go's tooling and build tools so I've never relied on them, so didn't think anything of it.

What did you see instead?

Escaping GOPATH.

Sorry for the noise if this is not a bug or the appropriate place to post this, I visited the tools repo but it only has contribution guidelines, nothing for bugs or issues.

@bradfitz bradfitz added this to the Unreleased milestone Nov 14, 2016
@bradfitz
Copy link
Contributor

When we don't support symlinks, people complain.
When we DO support symlinks, people complain. :-)

Do you know about the $GOPATH/src/.goimportsignore config file? Sounds like maybe you just need to blacklist some directories.

I don't think we want to disable symlinks.

@bradfitz bradfitz added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 14, 2016
@cstockton
Copy link
Author

@bradfitz Yes I looked at the goimportsignore, but it doesn't support globbing and paths are relative to the goimportsfile, I tried setting it a few different ways and I couldn't find out how to stop the behavior since it's a blacklist rather than a whitelist. If absolute paths worked I could just dump all root partitions into it and the folders in my /one folder.

# Only allow $GOPATH
/storage
/backup
/home
...
/one/cfg
/one/etc
/one/usr
# allow my $GOPATH
# /one/src   
...

Though supporting regex or even globbing be even better, because I could have a simple negation:

# Ignore anything that is not my $GOPATH
^/one/src

I would implement either one if you guys wanted to support it.

@bradfitz
Copy link
Contributor

I'd prefer no regex or globbing. That'll also hurt performance, having to evaluate that everywhere.

I think the bug is that x/tools/imports/fix.go's func shouldTraverse needs to call func skipDir after it's determined that ts.IsDir() is true.

Want to test and send that?

@cstockton
Copy link
Author

cstockton commented Nov 14, 2016

Sure, I'll look after work. Slightly unrelated- but has their ever been a discussion about a "Go Daemon" that ran the Go tools in a persistent process if someone chose to. Something similar to Go doc's server option, but able to serve other tools? I'm not sure how things in cmd/compile work, that may be unreasonable (but I imagine it would make for some very interesting optimization possibilities for compilation), but the simpler members of the go tooling might be nice (i.e. goimports).

@bradfitz
Copy link
Contributor

It's been discussed. The Google-internal version of goimports worked like that for a while. The Google build system (https://bazel.build/) works like that. There's compiler cleanup that's taking priority at the moment before we're even at a position to prototype enough to see whether it's a good idea. It might end up paying off on Windows & Darwin (slow filesystems, poor caching) more than Linux. But we'll see.

@cstockton
Copy link
Author

Hello, I just looked at this code and based on the description:

..., or lines naming a directory relative to the configuration file to ignore when scanning.

How it works now, assuming the GOPATH structure of:

/one/src
/one/src/backup --> /path/to/backup (symlink)
/one/src/backup2 --> /path/to/backup2 (symlink)
/one/src/local/backup --> /path/to/backup (symlink)
/one/src/local/backup2 --> /path/to/backup2 (symlink)

If I set my ignore path with:

backup

It detects that I want to add them.

Directory added to ignore list: /one/src/backup

It still visits all 4 directories, quadrupling the file system scans, though with the fix you described:

index a241c29..c74bdd2 100644
--- a/imports/fix.go
+++ b/imports/fix.go
@@ -462,6 +462,9 @@ func shouldTraverse(dir string, fi os.FileInfo) bool {
        if !ts.IsDir() {
                return false
        }
+       if skipDir(ts) {
+               return false
+       }

        realParent, err := filepath.EvalSymlinks(dir)
        if err != nil {

It stops visiting /patch/to/backup properly, however it only prevents a single target from being visited. With that bug fixed I think it is operating as intended. That said this code goes through a good bit of trouble to deal with symlinks and book keeping, but I think the complexity is missing the basic use case of, please stay away from this directory. and the design drifts a bit from other ignorefiles. It forces you to guard each individual subdirectory in your file system, which just isn't realistic.

Understanding it's a bit late for any big refactor, I suggest the behavior is modified to prevent traversal if the target path strings.HasSuffix (I don't think filepath.Dir is really needed for this) of the ignored file names. I don't think it would break existing configurations but it could require a leading slash to be enabled if so. This would make it work like most ignore files and allow me to keep goimports from going on wild file systems scans if I inadvertently copy something that has a symlink outside my GOPATH.

@cstockton
Copy link
Author

To clarify, my issue is that I had no idea what sym link was causing this massive scan, it turns out it was multiple symlinks ascending recursively (The resolved symlink following a symlink on the target system, and so on). Keeping a list of the directory file node of the visited symlinks parent doesn't save you from re-ascending the same file system all over a gain in a child directory. I think it would probably be better to have a map of abspath roots that have been traversed and make sure the current dir does not exist in it.

@bradfitz
Copy link
Contributor

Let's start with that minimal fix. Please send a CL with a test and we'll get that in first.

@gopherbot
Copy link

CL https://golang.org/cl/33246 mentions this issue.

@cstockton
Copy link
Author

@bradfitz I pushed a fix + test, gonna close this out, thanks for the help have a good thanksgiving.

gopherbot pushed a commit to golang/tools that referenced this issue Nov 23, 2016
Fixes golang/go#17916

Added tests.

Change-Id: Ie44e4bcbec267b6c16249336c5d48bae86acc2b5
Reviewed-on: https://go-review.googlesource.com/33246
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@golang golang locked and limited conversation to collaborators Nov 23, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

3 participants