Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: allow 'mod verify' to use multiple CPUs #38623

Closed
mvdan opened this issue Apr 23, 2020 · 11 comments
Closed

cmd/go: allow 'mod verify' to use multiple CPUs #38623

mvdan opened this issue Apr 23, 2020 · 11 comments
Labels
FrozenDueToAge GoCommand cmd/go help wanted modules NeedsFix The path to resolution is known, but the work has not been done. ToolSpeed
Milestone

Comments

@mvdan
Copy link
Member

mvdan commented Apr 23, 2020

For example, on a moderately sized module of mine:

$ /usr/bin/time go mod verify
all modules verified
13.33user 2.39system 0:15.78elapsed 99%CPU (0avgtext+0avgdata 24472maxresident)k
1558152inputs+0outputs (0major+4086minor)pagefaults 0swaps

Given that I have four physical CPU cores, and that the bottleneck seems to be hashing moderately sized zip files on disk, I think there's plenty of room for improvement.

For example, if we verified up to GOMAXPROCS concurrently, I expect the process could easily get two or three times faster on my laptop.

I realise that this command doesn't need to be super fast, but I make use of it for "module sanity checks" in CI along with others like go mod tidy, so it's unfortunate that it doesn't make proper use of the resources available.

/cc @bcmills @jayconrod @matloob

@mvdan
Copy link
Member Author

mvdan commented Apr 23, 2020

I also wonder if there are other commands which are similarly bottlenecked on a single CPU and could be improved. I haven't really noticed any others; for example, go mod tidy regularly uses at least three of my cores.

@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 23, 2020
@andybons andybons added this to the Unplanned milestone Apr 23, 2020
@jayconrod jayconrod added help wanted NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 23, 2020
@jayconrod
Copy link
Contributor

I also wonder if there are other commands which are similarly bottlenecked on a single CPU and could be improved.

Definitely. @matloob is looking at adding tracing that would help identify bottlenecks.

@matloob
Copy link
Contributor

matloob commented Apr 23, 2020

Yeah I'm hoping tracing will help us find more of those commands.

@mvdan
Copy link
Member Author

mvdan commented Apr 24, 2020

This was marked as "help wanted", and I think it's a safe and easy change to make a week before the freeze, so I'll mail a CL shortly.

@gopherbot
Copy link

Change https://golang.org/cl/229817 mentions this issue: cmd/go: make 'mod verify' use multiple CPUs

xujianhai666 pushed a commit to xujianhai666/go-1 that referenced this issue May 21, 2020
'go mod verify' checksums one module zip at a time, which is
CPU-intensive on most modern machines with fast disks. As a result, one
can see a CPU bottleneck when running the command on, for example, a
module where 'go list -m all' lists ~440 modules:

	$ /usr/bin/time go mod verify
	all modules verified
	11.47user 0.77system 0:09.41elapsed 130%CPU (0avgtext+0avgdata 24284maxresident)k
	0inputs+0outputs (0major+4156minor)pagefaults 0swaps

Instead, verify up to GOMAXPROCS zips at once, which should line up
pretty well with the amount of processors we can use on a machine. The
results below are obtained via 'benchcmd -n 5 GoModVerify go mod verify'
on the same large module.

	name         old time/op         new time/op         delta
	GoModVerify          9.35s ± 1%          3.03s ± 2%  -67.60%  (p=0.008 n=5+5)

	name         old user-time/op    new user-time/op    delta
	GoModVerify          11.2s ± 1%          16.3s ± 3%  +45.38%  (p=0.008 n=5+5)

	name         old sys-time/op     new sys-time/op     delta
	GoModVerify          841ms ± 9%          865ms ± 8%     ~     (p=0.548 n=5+5)

	name         old peak-RSS-bytes  new peak-RSS-bytes  delta
	GoModVerify         27.8MB ±13%         50.7MB ±27%  +82.01%  (p=0.008 n=5+5)

The peak memory usage nearly doubles, and there is some extra overhead,
but it seems clearly worth the tradeoff given that we see a ~3x speedup
on my laptop with 4 physical cores. The vast majority of developer
machines nowadays should have 2-4 cores at least.

No test or benchmark is included; one can benchmark 'go mod verify'
directly, as I did above. The existing tests also cover correctness,
including any data races via -race.

Fixes golang#38623.

Change-Id: I45d8154687a6f3a6a9fb0e2b13da4190f321246c
Reviewed-on: https://go-review.googlesource.com/c/go/+/229817
Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@golang golang locked and limited conversation to collaborators Apr 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge GoCommand cmd/go help wanted modules NeedsFix The path to resolution is known, but the work has not been done. ToolSpeed
Projects
None yet
Development

No branches or pull requests

7 participants
@jayconrod @andybons @mvdan @bcmills @gopherbot @matloob and others