Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: build the cgo artifacts for each package in parallel #9887

Open
petermattis opened this issue Feb 16, 2015 · 24 comments
Open

cmd/go: build the cgo artifacts for each package in parallel #9887

petermattis opened this issue Feb 16, 2015 · 24 comments
Milestone

Comments

@petermattis
Copy link

The lack of parallelization for compilations within a single package doesn't matter too much when building go files, but does affect cgo compilation. I have a package with about 32k lines of (generated) C++. Compiling it serially takes ~10.5s. A small hack to cmd/go/build.go to compile C/C++/Obj-C files in parallel brings this down to ~3.7s.

The patch I hacked together to parallelize cgo compilation work is ~50 lines of deltas and is mostly contained in builder.cgo().

PS The compilation times above were measured using go 1.4.1 on Mac OS X, but this code hasn't changed at head and similar compilation times can be seen there.

@minux minux changed the title "go build" doesn't parallelize compilations within a package cmd/go: "go build" doesn't parallelize compilations within a package Feb 16, 2015
@robpike
Copy link
Contributor

robpike commented Feb 16, 2015

Seems reasonable to me. Why not send the patch?

@petermattis
Copy link
Author

Done: https://go-review.googlesource.com/#/c/4931/

@rsc rsc added this to the Go1.5Maybe milestone Apr 10, 2015
@gopherbot
Copy link

CL https://golang.org/cl/4931 mentions this issue.

@petermattis
Copy link
Author

I'm curious if there is any plan to parallelize cgo compilation for go1.7. The finer-grained work scheduling for go tool actions mentioned in https://go-review.googlesource.com/#/c/4931/ would be great if it ever happens, though it doesn't look like there is any movement on that front.

@bcmills bcmills changed the title cmd/go: "go build" doesn't parallelize compilations within a package cmd/go: build the cgo artifacts for each package in parallel Jan 23, 2019
@bcmills
Copy link
Contributor

bcmills commented Jan 23, 2019

Contrast #21605. Are there any similar ordering issues for other languages?

@ianlancetaylor
Copy link
Contributor

Currently we only support C, C++, and Fortran, and C and C++ have no ordering issues.

I'm not aware of ordering issues for any other language.

@r0l1
Copy link

r0l1 commented Dec 19, 2019

Any updates on this? Compiling one of our packages takes around 30 seconds...
Would be great, if cgo compilation could be parallelized.

@bcmills
Copy link
Contributor

bcmills commented Dec 19, 2019

@r0l1, this issue is milestoned to Unplanned, meaning that we do not currently plan to work on it.

The previous CL was closed in favor of “implementation of finer-grained actions”, which I think may be #8893.

@mvdan
Copy link
Member

mvdan commented Jan 7, 2020

I had a brief look at this last week, because this is causing an annoying build time bottleneck in a project of mine. I assume the file we're interested in is src/cmd/go/internal/work/exec.go.

In particular, the Builder.cgo method makes calls to Builder.gcc synchronously in a loop.

It's unclear to me how we'd fix this, though. Can anyone point me to a doc that explains how work.Action is designed at a high level?

For example, I can imagine these ways to fix the issue, but it's unclear which would be the right one:

  • In Builder.cgo, create a child Action for each gcc call, and ensure they're all run before we continue (is this possible?)
  • Before the entire cgo method is called via build, separate the work into a tree of actions (how? cfiles is constructed as part of cgo, for example)

@mvdan
Copy link
Member

mvdan commented Jan 27, 2020

I had a sudden realisation: one of the packages that are really slow to compile with cgo, https://github.com/olebedev/go-duktape/tree/v3, simply bundles a ton of C code as a single 3.5MiB file. I presume that parallelizing cgo compilation per-file wouldn't help at all in that case, and I also assume that there is no way to make that any faster.

Luckily, that's not the norm. For example, https://github.com/ipsn/go-libtor builds tor from its C source, directly building its thousands of C files. That seems like it would get a big win from a fix for this issue.

@diamondburned
Copy link

I had a brief look at this last week, because this is causing an annoying build time bottleneck in a project of mine. I assume the file we're interested in is src/cmd/go/internal/work/exec.go.

In particular, the Builder.cgo method makes calls to Builder.gcc synchronously in a loop.

From my observation, both cmd/go/internal/work and cmd/cgo invoke gcc
sequentially a lot of times, but cmd/cgo takes up the bulk of the time.

I was able to write a prototype that attempts to parallelize both of them,
however the CL for cmd/go/internal/work was denied in favor of another
unplanned refactor. I still believe that parallelizing just cmd/cgo will
significantly improve compile time, though.

If anybody wants to test how much faster the changes are, it can be found at
diamondburned/go/work-gcc-parallelism.
When using this branch, use ./make.bash to compile instead, as the change
currently fails a Cgo test that I'm still not sure why.

I'm currently daily driving Go 1.17 with these 2 patches applied on top almost
daily, and I find compile time to be roughly 3-4x faster. The implementation
will limit the maximum workers at 4 per package regardless of the thread count,
so the number is to be expected.

The branch doesn't account for Fortran, however, so it might not work at all for
codebases that need those.

@r0l1
Copy link

r0l1 commented Dec 12, 2021

@diamondburned thanks for the info. I am currently testing your patch (cmd/go/internal/work) on latest go master and all tests pass. Could you tell me more about the first patch (cmd/cgo)? I didn't find it. Is it already merged into upstream?

Edit:

I tested the patch with our cgo-based project and the build acceleration is enormous. Thanks!

  • unpatched: 66.91 sec
  • patch: 28.34 sec
  • patch with 8 routines on an 8 core: 26.99 sec

@bcmills The linked issue has been deprioritized by @rsc

@rsc willing to include the patch from @diamondburned ?

Edit 2:
@diamondburned found the second patch. Didn't see, that you have split it into multiple branches. With the second patch the build times are 23.6 sec.

@diamondburned
Copy link

Just as a small update, branch go-cgo-gcc-parallelism-1.18 now tracks the 2 patches for Go 1.18.

For Nix users, the drop-in overlay is

(self: super: {
	go = super.go.overrideAttrs (old: {
		version = "1.18";
		src = builtins.fetchurl {
			url    = "https://golang.org/dl/go1.18.linux-amd64.tar.gz";
			sha256 = "0kr6h1ddaazibxfkmw7b7jqyqhskvzjyc2c4zr8b3kapizlphlp8";
		};
		doCheck = false;
		patches = [
			# cmd/go/internal/work: concurrent ccompile routines
			(builtins.fetchurl "https://github.com/diamondburned/go/commit/ec3e1c9471f170187b6a7c83ab0364253f895c28.patch")
			# cmd/cgo: concurrent file generation
			(builtins.fetchurl "https://github.com/diamondburned/go/commit/50e04befeca9ae63296a73c8d5d2870b904971b4.patch")
		];
	});
})

@Evengard
Copy link

Evengard commented Apr 11, 2022

@diamondburned Unfortunately your code have problems. When attempting to compile my project with your version, it failed with a rather cryptic error:

cgo: duplicate pkg.Name nox_xxx_objGetTeamByNetCode_418C80, &{nox_xxx_objGetTeamByNetCode_418C80 _Cfunc_nox_xxx_objGetTeamByNetCode_418C80 nox_xxx_objGetTeamByNetCode_418C80  func <nil> 0xc001b12870 false } == &{nox_xxx_objGetTeamByNetCode_418C80 _Cfunc_nox_xxx_objGetTeamByNetCode_418C80 nox_xxx_objGetTeamByNetCode_418C80  func <nil> 0xc0007171d0 false }

nox_xxx_objGetTeamByNetCode_418C80 is a function imported from C via cgo.
The project compiles fine without multithreaded cgo though (aka regular Golang build)
The project in question is https://github.com/noxworld-dev/opennox, if need to check the bug out.

I used this branch: https://github.com/diamondburned/go/tree/go-cgo-gcc-parallelism-1.18

@diamondburned
Copy link

Unfortunately your code have problems.

The modifications are definitely band-aid. It works for my purposes so I've stopped there, because cmd/cgo's code is very complicated and weird for me to work with.

If you can't get the compiler to work with just the cmd/cgo patch, then you might need to investigate deeper yourself.

@diamondburned
Copy link

I've made go-cgo-gcc-parallelism-1.19 to track Go 1.19.

@Evengard
Copy link

https://github.com/Evengard/go/tree/go-cgo-gcc-parallelism-1.20.3 tracks the Go 1.20.3 release for anyone who needs that.

@podtserkovskiy
Copy link
Contributor

It's been a while since this issue was created and two CLs were abandoned in favor to properly redesign cmd/internal/work to make actions API more fine grained.

This is definetely rught thing to do, but nobody had time to implement that.
I do anticipate that this is not an easy project and personally I don't have enough time to invest into it at the moment.

Are we happy to do a tiny fix to just parallelize the work inside Builder.cgo as a temporary solution?

@gopherbot
Copy link

Change https://go.dev/cl/579815 mentions this issue: cmd/go/internal/work: parallelize C/C++ code compilation

@podtserkovskiy
Copy link
Contributor

I've tried to keep it as less invasive as possible, let me know if it works for you
https://go-review.googlesource.com/c/go/+/579815

@matloob
Copy link
Contributor

matloob commented Apr 17, 2024

It would be ideal to try to fix this by making actions more fine grained, especially since there aren't any other cases where we do work concurrently within an action. It complicates to have two layers of concurrency- one layer being the actions that are running and the other being the C compilations within the action.

@diamondburned
Copy link

It would be ideal to try to fix this by making actions more fine grained, especially since there aren't any other cases where we do work concurrently within an action. It complicates to have two layers of concurrency- one layer being the actions that are running and the other being the C compilations within the action.

AFAIK, this was one of the reasons for not merging the CLs. I definitely agree that that's the ideal solution, but as @podtserkovskiy pointed out, it wasn't really being worked on, unfortunately.

@podtserkovskiy
Copy link
Contributor

It would be ideal to try to fix this by making actions more fine grained

I fully agree that the redesign of the actions API would be ideal, but this is not a trivial change and it requires much more time for proper implementation than this fix.

there aren't any other cases where we do work concurrently within an action

I can recall one more place with a potential to execute things concurrently gcToolchain.asm within an action.

@matloob
Copy link
Contributor

matloob commented Apr 17, 2024

there aren't any other cases where we do work concurrently within an action

I can recall one more place with a potential to execute things concurrently gcToolchain.asm within an action.

Sorry, I meant that there aren't any other places currently where we do work concurrently with an action

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests