Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/internal/fastwalk: use os.File.ReadDir on Darwin to iterate directories #51356

Closed
charlievieth opened this issue Feb 25, 2022 · 6 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Tools This label describes issues relating to any tools in the x/tools repository.
Milestone

Comments

@charlievieth
Copy link
Contributor

On Darwin, fastwalk.Walk() is 2x faster when os.File.ReadDir is used instead of syscall.ReadDirent to iterate directory entries. The performance difference is due to ReadDirent calling syscall.Getdirentries which on Darwin is simulated and slow (requires many more syscalls) because the macOS AppStore prohibits the use of _getdirentries64 (see: #30933 and 9da6530).

I propose we change x/tools/internal/fastwalk to use os.File.ReadDir on Darwin and all other non-Unix OSes (we could use os.ReadDir, but we don't need the results to be sorted). Non-Unix OSes currently use ioutil.ReadDir and can be safely transitioned to os.File.ReadDir.

I investigated using os.File.ReadDir for all OSes, but this resulted in a ~10% slowdown on Linux. I didn't measure the performance impact on Windows since os.File.ReadDir and ioutil.ReadDir use the same logic.

A more aggressive approach would be to port the readdir_r based logic from os/dir_darwin.go. This ended up being 3x faster, but requires adding stubs for: closedir, readdir_r, and opendir (see: charlievieth/tools@d4e49d7).

Benchmarks:

$ go test -run XXX -bench BenchmarkFastWalk -benchmem -count 10

Using os.File.ReadDir on Darwin: golang/tools@b7525f4 vs. charlievieth/tools@70e44e2:

goos: darwin
goarch: arm64
version: go version go1.17.6 darwin/arm64

name         old time/op    new time/op    delta
FastWalk-10    45.7ms ± 5%    22.6ms ± 9%   -50.69%  (p=0.000 n=10+10)

name         old alloc/op   new alloc/op   delta
FastWalk-10    1.30MB ± 0%    4.26MB ± 0%  +227.61%  (p=0.000 n=9+10)

name         old allocs/op  new allocs/op  delta
FastWalk-10     28.2k ± 0%     47.8k ± 0%   +69.75%  (p=0.000 n=9+8)

Changing all OSes to use os.File.ReadDir resulted in a ~10% slowdown on Linux: golang/tools@b7525f4 vs. charlievieth/tools@e799d18:

goos: linux
goarch: amd64
pkg: golang.org/x/tools/internal/fastwalk
cpu: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz

name         old time/op    new time/op    delta
FastWalk-16    2.57ms ± 2%    2.84ms ± 1%   +10.48%  (p=0.008 n=5+5)

name         old alloc/op   new alloc/op   delta
FastWalk-16     926kB ± 0%    2436kB ± 0%  +163.01%  (p=0.008 n=5+5)

name         old allocs/op  new allocs/op  delta
FastWalk-16     25.0k ± 0%     44.3k ± 0%   +76.77%  (p=0.016 n=4+5)

Branches

  • fastwalk: changes Darwin and non-Unix OSes to use os.File.ReadDir - 2x speedup on Darwin
  • fastwalk-darwin: changes Darwin to use readdir_r: 3x speedup
  • fastwalk-readdir: changes all OSes to use os.File.ReadDir: 2x speedup on Darwin, 10% slowdown on Linux, but vastly simplifies the code
@gopherbot gopherbot added this to the Proposal milestone Feb 25, 2022
@ianlancetaylor
Copy link
Contributor

This isn't an API change so taking it out of the proposal process.

CC @ianthehat @matloob @heschi

@ianlancetaylor ianlancetaylor changed the title proposal: x/tools/internal/fastwalk: use os.File.ReadDir on Darwin to iterate directories x/tools/internal/fastwalk: use os.File.ReadDir on Darwin to iterate directories Feb 25, 2022
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed Proposal labels Feb 25, 2022
@gopherbot gopherbot added the Tools This label describes issues relating to any tools in the x/tools repository. label Feb 25, 2022
@ianlancetaylor ianlancetaylor modified the milestones: Proposal, Unreleased Feb 25, 2022
@heschi
Copy link
Contributor

heschi commented Mar 7, 2022

@charlievieth, thank you for the research. This looks like a big opportunity for macOS.

I have two concerns. First, while we don't have a strict compatibility guarantee for goimports/gopls, requiring 1.16 is almost certainly going to cause trouble. So relying entirely on os.File.ReadDir isn't going to work. But using it on Darwin behind a build tag seems like a fairly easy win.

Second, that allocation increase is nontrivial. We have to scan entire module caches, and ~tripling the memory usage is unfortunate. If we roll our own logic, can we improve on that?

@charlievieth
Copy link
Contributor Author

charlievieth commented Mar 10, 2022

@heschi, thank you for taking a look at this.

requiring 1.16 is almost certainly going to cause trouble. So relying entirely on os.File.ReadDir isn't going to work. But using it on Darwin behind a build tag seems like a fairly easy win.

Agreed.

Second, that allocation increase is nontrivial. We have to scan entire module caches, and ~tripling the memory usage is unfortunate. If we roll our own logic, can we improve on that?

I agree that the 3x increase in allocations it's unfortunate, but those allocations are transitory and should be quickly GC'd as we don't keep any references to the []fs.DirEntry slice returned by os.File.ReadDir, which is responsible for the increase in allocs.

That said, rolling our own logic and using readdir_r (like os/dir_darwin.go) results in the same alloc size as the current Getdirentries based solution, uses ~9% fewer allocations, and is ~62% faster. The downside of this approach is that we have to implement and link closedir, readdir_r, and opendir on Darwin, which requires .s stub files and I'm not sure that would be welcome in this repo (see my cev/fastwalk-darwin branch and the diff).

Benchmark comparing: golang/tools@b59c441...charlievieth:cev/fastwalk-darwin

name         old time/op    new time/op    delta
FastWalk-10    46.7ms ± 5%    17.4ms ± 1%  -62.66%  (p=0.000 n=10+10)

name         old alloc/op   new alloc/op   delta
FastWalk-10    1.30MB ± 0%    1.29MB ± 0%   -0.46%  (p=0.000 n=10+9)

name         old allocs/op  new allocs/op  delta
FastWalk-10     28.2k ± 0%     25.5k ± 0%   -9.34%  (p=0.000 n=9+10)

@heschi
Copy link
Contributor

heschi commented Mar 10, 2022

(I saw the fastwalk-darwin branch, you just didn't include benchmarks for it :) )

If it were just goimports I'd agree with you, but I'm a little worried about the allocation volume increasing GC pressure for gopls, which has very large heaps. cc @findleyr just in case he feels strongly one way or the other.

You might be right about the allocations not being a problem, but since you've gone to the trouble of writing the readdir_r implementation, I'd say let's try it. I don't think a couple of trivial assembly files will be a problem. Worst case, falling back to the ReadDir implementation will be easy enough if necessary. Send a CL?

charlievieth added a commit to charlievieth/tools that referenced this issue Mar 12, 2022
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent()
on Darwin. This improves performance by ~3x and matches the behavior of
os.readDir(). The ReadDirent function is slow on Darwin because it uses
syscall.Getdirentries(), which due to golang/go#30933 is now simulated
on Darwin.

To support using readdir_r we this change adds the following syscalls:
closedir, readdir_r, and opendir.

goos: darwin
goarch: arm64

name         old time/op    new time/op    delta
FastWalk-10    52.6ms ± 6%    17.8ms ± 1%  -66.15%  (p=0.000 n=20+20)

name         old alloc/op   new alloc/op   delta
FastWalk-10    1.30MB ± 0%    1.29MB ± 0%   -0.48%  (p=0.000 n=20+19)

name         old allocs/op  new allocs/op  delta
FastWalk-10     28.2k ± 0%     25.5k ± 0%   -9.34%  (p=0.000 n=19+19)

Fixes golang/go#51356

Change-Id: I266e2aff98af2711ff5e477c62ca594efe5681d0
charlievieth added a commit to charlievieth/tools that referenced this issue Mar 12, 2022
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent()
on Darwin. This improves performance by ~3x and matches the behavior of
os.readDir(). The ReadDirent function is slow on Darwin because it uses
syscall.Getdirentries(), which due to golang/go#30933 is now simulated
on Darwin.

To support using readdir_r we this change adds the following syscalls:
closedir, readdir_r, and opendir.

goos: darwin
goarch: arm64

name         old time/op    new time/op    delta
FastWalk-10    52.6ms ± 6%    17.8ms ± 1%  -66.15%  (p=0.000 n=20+20)

name         old alloc/op   new alloc/op   delta
FastWalk-10    1.30MB ± 0%    1.29MB ± 0%   -0.48%  (p=0.000 n=20+19)

name         old allocs/op  new allocs/op  delta
FastWalk-10     28.2k ± 0%     25.5k ± 0%   -9.34%  (p=0.000 n=19+19)

Fixes golang/go#51356

Change-Id: Idfd120ba6bd09b2960d75aba09c739428306ce13
charlievieth added a commit to charlievieth/tools that referenced this issue Mar 12, 2022
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent()
on Darwin. This improves performance by ~3x and matches the behavior of
os.readDir(). The ReadDirent function is slow on Darwin because it uses
syscall.Getdirentries(), which due to golang/go#30933 is now simulated
on Darwin.

To support using readdir_r we this change adds the following syscalls:
closedir, readdir_r, and opendir.

goos: darwin
goarch: arm64

name         old time/op    new time/op    delta
FastWalk-10    52.6ms ± 6%    17.8ms ± 1%  -66.15%  (p=0.000 n=20+20)

name         old alloc/op   new alloc/op   delta
FastWalk-10    1.30MB ± 0%    1.29MB ± 0%   -0.48%  (p=0.000 n=20+19)

name         old allocs/op  new allocs/op  delta
FastWalk-10     28.2k ± 0%     25.5k ± 0%   -9.34%  (p=0.000 n=19+19)

Fixes golang/go#51356

Change-Id: I90628ba248e83950613f06cc3ea6e7aeda33f41e
@gopherbot
Copy link

Change https://go.dev/cl/392094 mentions this issue: internal/fastwalk: improve Darwin performance by ~3x

@gopherbot
Copy link

Change https://go.dev/cl/436780 mentions this issue: internal/fastwalk: improve Darwin performance by ~3x XX

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 14, 2022
rinchsan pushed a commit to rinchsan/gosimports that referenced this issue Feb 19, 2023
On darwin/cgo use readdir_r instead of syscall.ReadDirent() since the
later is simulated and slow (see: golang/go#30933).

Unlike CL 392094, this uses CGO instead of "//go:linkname" and assembly.

goos: darwin
goarch: arm64

FastWalk-10    68.9ms ±11%    20.7ms ± 2%  -69.89%  (p=0.008 n=5+5)

name         old alloc/op   new alloc/op   delta
FastWalk-10    1.49MB ± 0%    1.51MB ± 0%   +1.06%  (p=0.008 n=5+5)

name         old allocs/op  new allocs/op  delta
FastWalk-10     32.2k ± 0%     30.7k ± 0%   -4.61%  (p=0.016 n=5+4)

Fixes golang/go#51356

Change-Id: Ia3afd06c8f14bd2036b2a1ea6e3cafbef81d3530
Reviewed-on: https://go-review.googlesource.com/c/tools/+/436780
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Heschi Kreinick <heschi@google.com>
Reviewed-by: Robert Findley <rfindley@google.com>
Run-TryBot: Robert Findley <rfindley@google.com>
Auto-Submit: Heschi Kreinick <heschi@google.com>
gopls-CI: kokoro <noreply+kokoro@google.com>
zchee pushed a commit to zchee/golang-tools that referenced this issue Jun 23, 2023
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent()
on Darwin. This improves performance by ~3x and matches the behavior of
os.readDir(). The ReadDirent function is slow on Darwin because it uses
syscall.Getdirentries(), which due to golang/go#30933 is now simulated
on Darwin.

To support using readdir_r we this change adds the following syscalls:
closedir, readdir_r, and opendir.

goos: darwin
goarch: arm64

name         old time/op    new time/op    delta
FastWalk-10    52.6ms ± 6%    17.8ms ± 1%  -66.15%  (p=0.000 n=20+20)

name         old alloc/op   new alloc/op   delta
FastWalk-10    1.30MB ± 0%    1.29MB ± 0%   -0.48%  (p=0.000 n=20+19)

name         old allocs/op  new alocs/op  delta
FastWalk-10     28.2k ± 0%     25.5k ± 0%   -9.34%  (p=0.000 n=19+19)

Fixes golang/go#51356

Change-Id: I8aecd7babe26c1194b4e026c563655c0ace49824
@golang golang locked and limited conversation to collaborators Nov 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Tools This label describes issues relating to any tools in the x/tools repository.
Projects
None yet
Development

No branches or pull requests

5 participants