New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/tools/internal/fastwalk: use os.File.ReadDir on Darwin to iterate directories #51356
Comments
This isn't an API change so taking it out of the proposal process. |
@charlievieth, thank you for the research. This looks like a big opportunity for macOS. I have two concerns. First, while we don't have a strict compatibility guarantee for goimports/gopls, requiring 1.16 is almost certainly going to cause trouble. So relying entirely on Second, that allocation increase is nontrivial. We have to scan entire module caches, and ~tripling the memory usage is unfortunate. If we roll our own logic, can we improve on that? |
@heschi, thank you for taking a look at this.
Agreed.
I agree that the 3x increase in allocations it's unfortunate, but those allocations are transitory and should be quickly GC'd as we don't keep any references to the That said, rolling our own logic and using Benchmark comparing: golang/tools@b59c441...charlievieth:cev/fastwalk-darwin
|
(I saw the fastwalk-darwin branch, you just didn't include benchmarks for it :) ) If it were just goimports I'd agree with you, but I'm a little worried about the allocation volume increasing GC pressure for gopls, which has very large heaps. cc @findleyr just in case he feels strongly one way or the other. You might be right about the allocations not being a problem, but since you've gone to the trouble of writing the |
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent() on Darwin. This improves performance by ~3x and matches the behavior of os.readDir(). The ReadDirent function is slow on Darwin because it uses syscall.Getdirentries(), which due to golang/go#30933 is now simulated on Darwin. To support using readdir_r we this change adds the following syscalls: closedir, readdir_r, and opendir. goos: darwin goarch: arm64 name old time/op new time/op delta FastWalk-10 52.6ms ± 6% 17.8ms ± 1% -66.15% (p=0.000 n=20+20) name old alloc/op new alloc/op delta FastWalk-10 1.30MB ± 0% 1.29MB ± 0% -0.48% (p=0.000 n=20+19) name old allocs/op new allocs/op delta FastWalk-10 28.2k ± 0% 25.5k ± 0% -9.34% (p=0.000 n=19+19) Fixes golang/go#51356 Change-Id: I266e2aff98af2711ff5e477c62ca594efe5681d0
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent() on Darwin. This improves performance by ~3x and matches the behavior of os.readDir(). The ReadDirent function is slow on Darwin because it uses syscall.Getdirentries(), which due to golang/go#30933 is now simulated on Darwin. To support using readdir_r we this change adds the following syscalls: closedir, readdir_r, and opendir. goos: darwin goarch: arm64 name old time/op new time/op delta FastWalk-10 52.6ms ± 6% 17.8ms ± 1% -66.15% (p=0.000 n=20+20) name old alloc/op new alloc/op delta FastWalk-10 1.30MB ± 0% 1.29MB ± 0% -0.48% (p=0.000 n=20+19) name old allocs/op new allocs/op delta FastWalk-10 28.2k ± 0% 25.5k ± 0% -9.34% (p=0.000 n=19+19) Fixes golang/go#51356 Change-Id: Idfd120ba6bd09b2960d75aba09c739428306ce13
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent() on Darwin. This improves performance by ~3x and matches the behavior of os.readDir(). The ReadDirent function is slow on Darwin because it uses syscall.Getdirentries(), which due to golang/go#30933 is now simulated on Darwin. To support using readdir_r we this change adds the following syscalls: closedir, readdir_r, and opendir. goos: darwin goarch: arm64 name old time/op new time/op delta FastWalk-10 52.6ms ± 6% 17.8ms ± 1% -66.15% (p=0.000 n=20+20) name old alloc/op new alloc/op delta FastWalk-10 1.30MB ± 0% 1.29MB ± 0% -0.48% (p=0.000 n=20+19) name old allocs/op new allocs/op delta FastWalk-10 28.2k ± 0% 25.5k ± 0% -9.34% (p=0.000 n=19+19) Fixes golang/go#51356 Change-Id: I90628ba248e83950613f06cc3ea6e7aeda33f41e
Change https://go.dev/cl/392094 mentions this issue: |
Change https://go.dev/cl/436780 mentions this issue: |
On darwin/cgo use readdir_r instead of syscall.ReadDirent() since the later is simulated and slow (see: golang/go#30933). Unlike CL 392094, this uses CGO instead of "//go:linkname" and assembly. goos: darwin goarch: arm64 FastWalk-10 68.9ms ±11% 20.7ms ± 2% -69.89% (p=0.008 n=5+5) name old alloc/op new alloc/op delta FastWalk-10 1.49MB ± 0% 1.51MB ± 0% +1.06% (p=0.008 n=5+5) name old allocs/op new allocs/op delta FastWalk-10 32.2k ± 0% 30.7k ± 0% -4.61% (p=0.016 n=5+4) Fixes golang/go#51356 Change-Id: Ia3afd06c8f14bd2036b2a1ea6e3cafbef81d3530 Reviewed-on: https://go-review.googlesource.com/c/tools/+/436780 TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Heschi Kreinick <heschi@google.com> Reviewed-by: Robert Findley <rfindley@google.com> Run-TryBot: Robert Findley <rfindley@google.com> Auto-Submit: Heschi Kreinick <heschi@google.com> gopls-CI: kokoro <noreply+kokoro@google.com>
This changes fastwalk to use readdir_r() instead of syscall.ReadDirent() on Darwin. This improves performance by ~3x and matches the behavior of os.readDir(). The ReadDirent function is slow on Darwin because it uses syscall.Getdirentries(), which due to golang/go#30933 is now simulated on Darwin. To support using readdir_r we this change adds the following syscalls: closedir, readdir_r, and opendir. goos: darwin goarch: arm64 name old time/op new time/op delta FastWalk-10 52.6ms ± 6% 17.8ms ± 1% -66.15% (p=0.000 n=20+20) name old alloc/op new alloc/op delta FastWalk-10 1.30MB ± 0% 1.29MB ± 0% -0.48% (p=0.000 n=20+19) name old allocs/op new alocs/op delta FastWalk-10 28.2k ± 0% 25.5k ± 0% -9.34% (p=0.000 n=19+19) Fixes golang/go#51356 Change-Id: I8aecd7babe26c1194b4e026c563655c0ace49824
On Darwin,
fastwalk.Walk()
is 2x faster whenos.File.ReadDir
is used instead ofsyscall.ReadDirent
to iterate directory entries. The performance difference is due toReadDirent
callingsyscall.Getdirentries
which on Darwin is simulated and slow (requires many more syscalls) because the macOS AppStore prohibits the use of_getdirentries64
(see: #30933 and 9da6530).I propose we change
x/tools/internal/fastwalk
to useos.File.ReadDir
on Darwin and all other non-Unix OSes (we could useos.ReadDir
, but we don't need the results to be sorted). Non-Unix OSes currently useioutil.ReadDir
and can be safely transitioned toos.File.ReadDir
.I investigated using
os.File.ReadDir
for all OSes, but this resulted in a ~10% slowdown on Linux. I didn't measure the performance impact on Windows sinceos.File.ReadDir
andioutil.ReadDir
use the same logic.A more aggressive approach would be to port the
readdir_r
based logic fromos/dir_darwin.go
. This ended up being 3x faster, but requires adding stubs for:closedir
,readdir_r
, andopendir
(see: charlievieth/tools@d4e49d7).Benchmarks:
$ go test -run XXX -bench BenchmarkFastWalk -benchmem -count 10
Using
os.File.ReadDir
on Darwin: golang/tools@b7525f4 vs. charlievieth/tools@70e44e2:Changing all OSes to use
os.File.ReadDir
resulted in a ~10% slowdown on Linux: golang/tools@b7525f4 vs. charlievieth/tools@e799d18:Branches
os.File.ReadDir
- 2x speedup on Darwinreaddir_r
: 3x speedupos.File.ReadDir
: 2x speedup on Darwin, 10% slowdown on Linux, but vastly simplifies the codeThe text was updated successfully, but these errors were encountered: