proposal: cmd/cgo: add support for pre-compiled cgo packages #38917

Closed
eliasnaur opened this issue May 7, 2020 · 15 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal

@eliasnaur
Contributor

eliasnaur commented May 7, 2020

This is the GitHub issue for my half-baked proposal posted on golang-dev.

Inspired by #35721 (comment) and my own desire to easily cross-compile Gio programs,
I wonder whether it is feasible to reuse cgo processing by pre-generating the cgo tool output.

Packages that import "C" require quite a lot of support software:
a native toolchain, development libraries, headers. Invoking the host compiler
is also slow. For example, cross-compiling Cgo programs for macOS and iOS is particularly
difficult (and perhaps legally impossible), and cross-compiling for Android requires
a ~700MB NDK.

On the other hand, it's not always possible to avoid C dependencies: Gio uses
system libraries that are impossible to implement in Go.

One option is .syso files (https://github.com/golang/go/wiki/GcToolchainTricks),
pre-compiled native code that the Go linker knows how to link in. However .syso
files are too low level: they can't link to shared (system) libraries and there is no
Go<->C bridging for making calls across the language barrier safe.

So I wonder: can you imagine a Goldilocks option where a package has access to
all Cgo features through pre-generated cgo files? It's similar to the old binary-only package
feature, but only for the Cgo parts. I hope that C's much more stable ABI makes a pre-generated
cgo file format more viable than pre-built Go binary packages.

I wouldn't mind if the hypothetical cgo file format changes once in a while, say every Go
release, as long as I could supply multiple files covering the Go releases I support. If none
of the pre-generated files are usable, the Go toolchain falls back to just running cgo processing.

As a strawman, I suggest a way to invoke

$ go tool cgo build <package>

which then outputs, say, <package_go115>.cgo. If the package is later built, possibly on a different
machine, the Go toolchain will look for *.cgo files in the package directory and use one of them
if possible, skipping cgo processing altogether.
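
The lookup-with-fallback behaviour described above could be sketched roughly as follows. This is only an illustration of the strawman: the function name `pickPregenerated`, the `gl` package name, and the exact-match rule on `<package>_<release>.cgo` are assumptions, not an existing toolchain feature or file format.

```go
package main

import "fmt"

// pickPregenerated returns the pre-generated file that matches the package
// name and Go release tag (for example "go115"), or "" when no candidate
// matches and the toolchain would have to run cgo processing itself.
// The "<package>_<release>.cgo" naming follows the strawman; it is hypothetical.
func pickPregenerated(pkg, release string, files []string) string {
	want := pkg + "_" + release + ".cgo"
	for _, f := range files {
		if f == want {
			return f
		}
	}
	return ""
}

func main() {
	// Hypothetical contents of a package directory.
	files := []string{"gl_go114.cgo", "gl_go115.cgo", "gl.go"}
	for _, rel := range []string{"go115", "go116"} {
		if f := pickPregenerated("gl", rel, files); f != "" {
			fmt.Println(rel, "-> use", f)
		} else {
			fmt.Println(rel, "-> fall back to running cgo")
		}
	}
}
```

Shipping one file per supported release keeps the selection rule trivial, at the cost of regenerating the files whenever the format changes.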

Advantages

With pre-compiled Cgo the building process is faster, and I can avoid having the headers and development libraries available for my target platform, needing only a native toolchain for external linking.

What's really interesting is then adding support for -linkmode=internal when Cgo is used, alleviating the need for a native toolchain completely. This is #38918.

In other words, pre-generated cgo could give Cgo programs some of the nice cross-platform properties of pure Go programs.

@FiloSottile
Contributor

This is exactly what I wanted to do ad-hoc for SQLite, would love to see it!

https://twitter.com/FiloSottile/status/1245831899290820608

I think there wouldn't even be any need for new .cgo file types, it should be possible to just generate .go and .syso files that build with the current toolchain. It might even be relatively stable: the ABI is fixed, so the only unstable internal is asmcgocall.

This actually means it could be a fully external tool, at least for prototyping. Indeed, the thing that needs toolchain support is #38918, which is where things get interesting.

@eliasnaur
Contributor Author

@FiloSottile I had your SQLite and x509 work in mind when filing this proposal :) Perhaps you'd like to join the next golang-tools call May 13th for further discussion?

@eliasnaur
Contributor Author

> I think there wouldn't even be any need for new .cgo file types, it should be possible to just generate .go and .syso files that build with the current toolchain. It might even be relatively stable: the ABI is fixed, so the only unstable internal is asmcgocall.
>
> This actually means it could be a fully external tool, at least for prototyping.

Indeed. What I'd like to achieve first is some indication that work towards pre-generation may end up in the main repository. I don't fancy maintaining an out-of-tree tool.

@ianlancetaylor
Contributor

I don't really understand how to do this in general.

cgo packages can and do refer to external libraries; presumably we would have to find all of those external libraries and make them available somewhere. But the libraries are going to vary a lot. How do we find them and where do we put them?

Some of those libraries are written in C++. C++ code requires extra work from the linker to support things like global constructors and destructors and exception handling. If we can't invoke the external linker, we have to implement all of that code in cmd/link, and we have to maintain it as C++ changes.

Maybe I'm missing something, but while I can see how this could work in the simplest cases, I don't see how it could work in general.

@eliasnaur
Contributor Author

eliasnaur commented May 8, 2020

> I don't really understand how to do this in general.
>
> cgo packages can and do refer to external libraries; presumably we would have to find all of those external libraries and make them available somewhere. But the libraries are going to vary a lot. How do we find them and where do we put them?

Can we ask the native toolchain while pre-generating, similar to the way Cgo currently asks the toolchain about the structure of native types, functions, etc.?

> Some of those libraries are written in C++. C++ code requires extra work from the linker to support things like global constructors and destructors and exception handling. If we can't invoke the external linker, we have to implement all of that code in cmd/link, and we have to maintain it as C++ changes.
>
> Maybe I'm missing something, but while I can see how this could work in the simplest cases, I don't see how it could work in general.

My use-case is system libraries, which are usually designed to have simple C (or Objective-C) interfaces as well as being impossible to replace with Go by definition. Having pre-generation work on a best-effort basis would be ok for me, I would simply not provide pre-generated files for the platforms/libraries that are not (yet) supported.

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 11, 2020
@toothrot toothrot added this to the Backlog milestone May 11, 2020
@dosgo

dosgo commented Aug 15, 2021

I am looking forward to this feature, but there seems to be no progress.

@dosgo

dosgo commented Aug 16, 2021

Is there a simple way to implement it? Pack the intermediate files generated by gcc into a zip, and continue to use a third-party linker. The main trouble with C language cross-compilation is various dependencies and libraries. I don't know much about compiling and linking, and I might understand it incorrectly.

@ianlancetaylor
Contributor

If you have to have the third party linker and the third party libraries anyhow, then I don't think this idea is going to help very much.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
@myaaaaaaaaa

myaaaaaaaaa commented Oct 22, 2023

> I think there wouldn't even be any need for new .cgo file types, it should be possible to just generate .go and .syso files that build with the current toolchain. It might even be relatively stable: the ABI is fixed, so the only unstable internal is asmcgocall.

Even this may be unnecessary - it should be possible to precompile CGO code to Go assembly. This would then reduce the scope of the problem to creating a CGO -> C .S assembly -> Go .S assembly tool, which would presumably be run via go generate. The only change to the build toolchain would probably then be a magic comment that defines shared libraries to link to.

This way, CGO can be moved from the build process to the go generate process and become an internal detail purely for developers of CGO packages, making it more in line with other code generation projects. This would also resolve the difficulties of reproducible builds when using system libraries.

> I don't really understand how to do this in general.
>
> cgo packages can and do refer to external libraries; presumably we would have to find all of those external libraries and make them available somewhere. But the libraries are going to vary a lot. How do we find them and where do we put them?
>
> Some of those libraries are written in C++. C++ code requires extra work from the linker to support things like global constructors and destructors and exception handling. If we can't invoke the external linker, we have to implement all of that code in cmd/link, and we have to maintain it as C++ changes.
>
> Maybe I'm missing something, but while I can see how this could work in the simplest cases, I don't see how it could work in general.

I think those simplest cases are all that's needed, at least as far as C system libraries (arguably the most important use case) are concerned.

For context, there are plenty of Go projects that involve automatically generated bindings to C libraries, such as go-gl. It would be amazing if they also gained the ability to precompile themselves to all supported systems from CI, giving Go developers the newfound ability to cross-compile windowed applications.

@myaaaaaaaaa

> I think there wouldn't even be any need for new .cgo file types, it should be possible to just generate .go and .syso files that build with the current toolchain. It might even be relatively stable: the ABI is fixed, so the only unstable internal is asmcgocall.

> Even this may be unnecessary - it should be possible to precompile CGO code to Go assembly. This would then reduce the scope of the problem to creating a CGO -> C .S assembly -> Go .S assembly tool, which would presumably be run via go generate.

Some prior art for this approach can be found at c2goasm, with an article describing it here: https://medium.com/@frankwessels_nl/c2goasm-c-to-go-assembly-bb723d2f777f

@myaaaaaaaaa

myaaaaaaaaa commented Nov 6, 2023

> The only build system change required would probably then be a magic comment that defines shared libraries to link to.

After some further investigation, it looks like this is already how CGO does linking today:

go/src/cmd/cgo/doc.go, lines 761 to 775 in 883f062:

The extra functions here are stubs to satisfy the references in the C
code generated for gcc. The build process links this stub, along with
_cgo_export.c and *.cgo2.c, into a dynamic executable and then lets
cgo examine the executable. Cgo records the list of shared library
references and resolved names and writes them into a new file
_cgo_import.go, which looks like:

    //go:cgo_dynamic_linker "/lib64/ld-linux-x86-64.so.2"
    //go:cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"
    //go:cgo_import_dynamic __libc_start_main __libc_start_main#GLIBC_2.2.5 "libc.so.6"
    //go:cgo_import_dynamic stdout stdout#GLIBC_2.2.5 "libc.so.6"
    //go:cgo_import_dynamic fflush fflush#GLIBC_2.2.5 "libc.so.6"
    //go:cgo_import_dynamic _ _ "libpthread.so.0"
    //go:cgo_import_dynamic _ _ "libc.so.6"

With that in mind, how feasible would it be to implement precompiled CGO packages via a go tool that takes a list of files that import "C", invokes the system C compiler on them, and then translates the resulting C assembly into Go assembly with //go:cgo_dynamic_linker and //go:cgo_import_dynamic directives?
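
To make the role of those directives concrete, here is a minimal sketch that reads directive lines like the ones quoted from doc.go and lists the shared libraries they name. The helper `sharedLibs` is hypothetical and only handles the three-operand form shown above; it is not part of cmd/cgo, and it assumes library names contain no spaces.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// sharedLibs scans Go source text for //go:cgo_import_dynamic directives
// and returns the sorted, de-duplicated library names they mention.
func sharedLibs(src string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Expected shape: //go:cgo_import_dynamic <local> <remote> "<library>"
		if len(fields) != 4 || fields[0] != "//go:cgo_import_dynamic" {
			continue
		}
		lib, err := strconv.Unquote(fields[3])
		if err != nil {
			continue // the library operand is not a quoted string
		}
		seen[lib] = true
	}
	libs := make([]string, 0, len(seen))
	for lib := range seen {
		libs = append(libs, lib)
	}
	sort.Strings(libs)
	return libs
}

func main() {
	src := `//go:cgo_dynamic_linker "/lib64/ld-linux-x86-64.so.2"
//go:cgo_import_dynamic puts puts#GLIBC_2.2.5 "libc.so.6"
//go:cgo_import_dynamic _ _ "libpthread.so.0"
//go:cgo_import_dynamic _ _ "libc.so.6"
`
	fmt.Println(sharedLibs(src)) // the cgo_dynamic_linker line is skipped
}
```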

@bcmills bcmills changed the title cmd/cgo: add support for pre-compiled cgo packages proposal: cmd/cgo: add support for pre-compiled cgo packages Nov 8, 2023
@bcmills bcmills modified the milestones: Backlog, Proposal Nov 8, 2023
@eliottness

eliottness commented Nov 10, 2023

👋 Hey. I'm coming from the dd-trace-go library, where we wanted to add bindings to a C library at a point when we already had a large user base, so adding a CGO requirement for all these users was out of the question. We studied a lot of different options and decided to use a tool called purego, which uses a lot of internal mechanisms of the runtime directly. This has been working for the last 6 months already, but having an official method that does not impact users when a library decides to start using CGO certainly seems more reliable. Looking forward to seeing this happen.

@rsc
Contributor

rsc commented Dec 4, 2023

In general this is very difficult / impossible. One of many difficulties is keeping both static and dynamic builds working correctly. If you need C code, use cgo.

@rsc
Contributor

rsc commented Dec 4, 2023

This proposal has been declined as infeasible.
— rsc for the proposal review group

@rsc rsc closed this as completed Dec 4, 2023
@Julio-Guerra

Julio-Guerra commented Dec 5, 2023

As @eliottness shared, this is more or less already feasible. For the record, this is how we do it in dd-trace-go:

  1. Compile your CGO bindings with the options -x -work to be able to copy the CGO-generated files, especially the Go type definitions generated out of the C ones.
  2. Copy and adapt them to work without CGO, since a lot of things in there will rely on compiler directives that don't work without CGO_ENABLED=1.
  3. Build your C library as a shared library that you can embed in your binary with go:embed.
  4. Use purego, a higher-level helper around runtime/cgo, to open the shared lib and call into it the same way CGO would, since it relies on runtime/cgo too (present with or without CGO_ENABLED).

Status: Declined