Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/cgo: add ${MAIN_MODULE_DIR} or similar #34867

Closed
josharian opened this issue Oct 12, 2019 · 12 comments
Closed

cmd/cgo: add ${MAIN_MODULE_DIR} or similar #34867

josharian opened this issue Oct 12, 2019 · 12 comments

Comments

@josharian
Copy link
Contributor

Problem

I'm using git2go (a libgit2 wrapper), and I need to generate static builds. I've hit a build system conflict of sorts.

Here's how static builds work in GOPATH mode:

  1. git2go has libgit2 as a submodule.
  2. git2go contains build scripts that do a static build of the submodule libgit2, including generating a pkg-config .pc file. The build artifacts are located inside the submodule, which is in a predictable subdirectory relative to the git2go code.
  3. git2go's cgo directives point to the .pc file, using ${SRCDIR}.
  4. Build as usual.

This breaks down immediately in module mode. There's no submodule present in the module cache. And it doesn't make sense to try write a script to manually put a libgit2 checkout in the module cache: (1) the module cache is intentionally and forcefully readonly and (2) the module cache is a cache.

Unfortunately, these considerations seem to rule out using a git2go-relative directory (via ${SRCDIR}) for the libgit2 build artifacts.

But using a hard-coded absolute directory is undesirable, for the usual reasons. (And any absolute directory must be hard-coded, since we can't modify the copy of git2go in the module cache in order to insert an appropriate absolute directory.)

Proposed solution

I could use a directory relative to the module for the code I'm working on. Then I could write a script that would build libgit2 there. But there's no way I see for the git2go package cgo directives to refer to it. If there was a cgo ${MAIN_MODULE_DIR} or the like, I could use that.

This has a major downside: Every cgo package containing the string ${MAIN_MODULE_DIR} now depends on the contents of the main module directory. This could cause lots of cgo rebuilds, which are expensive.

I'd love to hear other alternative solutions as well.

cc @bcmills @jayconrod @ianlancetaylor

@katiehockman katiehockman added this to the Proposal milestone Oct 14, 2019
@jayconrod
Copy link
Contributor

This has a major downside: Every cgo package containing the string ${MAIN_MODULE_DIR} now depends on the contents of the main module directory. This could cause lots of cgo rebuilds, which are expensive.

This sounds like a pretty big change. I don't think we should allow dependency cycles like this. Cycles in general lead to a lot of bugs and complication. I could see something like ${MODULE_DIR} expanding to the top-level directory of the current module (though this might also run into caching issues and cycles at the package level). Maybe another approach is possible?

Is there any way that git2go can be shipped as a module zip that is buildable with relatively little modification to cgo or cmd/go? In general, I don't think it's feasible to do a full build of a complicated C library within cgo, but perhaps it's reasonable to add minimal .go files in directories of C files or just to require that libgit2 is installed on the system?

@josharian
Copy link
Contributor Author

josharian commented Oct 14, 2019

Is there any way that git2go can be shipped as a module zip that is buildable with relatively little modification to cgo or cmd/go?

In theory I guess it could include static builds of libgit2 for all platforms. Seems pretty undesirable, though.

In general, I don't think it's feasible to do a full build of a complicated C library within cgo

Indeed. libgit2 uses cmake, which is pretty common. I guess in theory cgo could be taught to use cmake, but that seems like a giant can of worms.

perhaps it's reasonable to add minimal .go files in directories of C files

There are really two problems: getting ahold of the code in the submodule, and coordinating on where to put (and look for) the C build artifacts. There are a few ways to solve the submodule problem, such as distributing git2go as a module zip or switching from git submodules to git subtrees. Coordinating on where to put the C build artifacts is harder.

require that libgit2 is installed on the system?

This is the absolute directory approach I was really trying to avoid. It may be unavoidable, I guess.

What if we added ${MODULE_VERSION} and ${MODULE_CACHE_DIR} to cgo? Then I could coordinate on putting the C build artifacts in ${MODULE_CACHE_DIR}/libgit2/${MODULE_VERSION}. Then there's no concern with C libraries and Go code falling out of sync, and I can use another technique (distribute module as zip, download on the fly, use git subtrees) to get ahold of and build the C code.

@josharian
Copy link
Contributor Author

(Aside: Had #24094 and #27161 been decided differently, this would be easy. I understand the rationale behind those decisions, but it really feels like I'm fighting cmd/go far, far more than I ever did before Go modules.)

@jayconrod
Copy link
Contributor

How much do you expect to get done within the go command (and cgo) and how much by an external build system that wraps the go command (even if it's simple like a bash script run ahead of time)?

Ideally, everything could be done with the go command and cgo, but cmd/go is not a fully general build system. That simplicity is really valuable in my opinion: any tools built on top of the go command (using go list directly or golang.org/x/tools/go/packages or go/build) needs to support this. So I'd lean toward doing whatever's necessary in a build script outside the go command.

There are really two problems: getting ahold of the code in the submodule, and coordinating on where to put (and look for) the C build artifacts. There are a few ways to solve the submodule problem, such as distributing git2go as a module zip. Coordinating on where to put the C build artifacts is harder.

About distributing a module zip, if there are release version tags, the go command will create zips from the repository at those commits. So the structure doesn't need to match the main development branch exactly.

About build artifacts, please consider the module cache read-only after a module is downloaded. It's hard to be confident in the integrity of downloaded modules without that invariant.

This is the absolute directory approach I was really trying to avoid. It may be unavoidable, I guess.

Not sure if this is feasible, but how about having the cgo package get all its type information from headers (perhaps copied to the same directory) but not compiling anything directly. As part of a build script wrapping the Go command, you could build a static library somewhere local, then pass -extldflags=-Lsomedir -lgit2 to the Go command?

@josharian
Copy link
Contributor Author

How much do you expect to get done within the go command (and cgo) and how much by an external build system that wraps the go command (even if it's simple like a bash script run ahead of time)?

Ideally I'd be able to write a one-time build script that builds the C library, and then I'd be able to use bog standard cmd/go commands thereafter. I would expect that that build script would have to get re-run per machine and after a module version change. That is the pre-module status quo. (See https://github.com/libgit2/git2go/blob/master/script/build-libgit2-static.sh and https://github.com/libgit2/git2go/blob/master/git_static.go.)

My experience is that any time plain 'go build' or 'go test' or 'go install' doesn't work it introduces a bunch of friction.

the structure doesn't need to match the main development branch exactly.

Tree mismatches between "development" and "release" make development harder and are a source of bugs.

But I think I can work around this half of the problem by switching from git submodules to git subtrees.

how about having the cgo package get all its type information from headers (perhaps copied to the same directory) but not compiling anything directly.

This is how git2go static builds work in GOPATH mode. It's what I'm trying to recreate in module mode.

you could build a static library somewhere local, then pass -extldflags=-Lsomedir -lgit2 to the Go command?

Right. There are two sticking spots here.

(1) We've gone from having all configuration "just work" by having everything be relative to the git2go ${SRCDIR} and having the config within git2go itself, to having configuration be duplicated by every user of git2go and requiring build scripts instead of 'go build'.

(2) I have to manually recreate any configuration that cmake has detected. Here's the libgit2.pc from my GOPATH build:

prefix="/Users/josh/src/github.com/libgit2/git2go/vendor/libgit2/install"
libdir=${prefix}/lib
includedir=${prefix}/include

Name: libgit2
Description: The git library, take 2
Version: 0.27.9

Libs: -L${libdir} -lgit2
Libs.private: -lcurl -framework CoreFoundation -framework Security -lz -L/usr/lib -liconv
Requires.private: 

Cflags: -I${includedir}

That's a fair few linker flags. Any mismatch between that and what I hard code in my build scripts are going to cause build failures. Plus I also need separate build scripts per OS. Much messier than just using the .pc file that cmake emits for exactly this purpose. It'd help a bit if there was a way to pass cgo directives via cmd/go; then I could provide an appropriate pkg-config directive. But it doesn't help with sticking point (1) above.

@bcmills
Copy link
Contributor

bcmills commented Oct 14, 2019

If you need to user to run some build step specific to the git2go library on their local machine, then it seems like the clear solution is a replace directive: that is the mechanism by which a user can replace the stock source code for a module with a modified version of that source code.

So, for example, you could have the build script for git2go locate the main module directory (using go list -m -f {{.Dir}}), then construct a git2go subdirectory there containing both the original git2go sources and the corresponding locally-generated artifacts, and finally run go mod edit -replace to locally replace its own source code with the modified copy.

@josharian
Copy link
Contributor Author

A relative dir replace directive would work.

Of course, that means we’ve lost version safety, because replace directives override everything. This sounds like a bug waiting to happen. I guess I could write a go test that parses go.mod and checks that the replacement code is identical to the replaced code except for the build artifacts.

Also it still feels like I’m fighting the go tool. And this is something that everyone using git2go will have to duplicate.

But this suggestion is the least unpalatable I’ve heard. I’ll implement and report back.

@bcmills
Copy link
Contributor

bcmills commented Oct 14, 2019

Of course, that means we’ve lost version safety, because replace directives override everything.

Note that a replace directive with an explicit version on the left side of the directive only replaces that version. So you could at least set it up so that the build breaks explicitly if a new dependency is added that requires a newer git2go.

@josharian
Copy link
Contributor Author

Ah! I missed that. Thanks, that helps a lot.

@josharian
Copy link
Contributor Author

josharian commented Oct 16, 2019

Got this working.

For anyone else who finds themselves in a similar situation, here's what I did.

On the git2go side:

  • remove the libgit2 submodule, because cmd/go does not support submodules
  • set up libgit2 as a git subtree, but in a directory called libgit2, not vendor/libgit2, because cmd/go ignores all subdirs of any dir called vendor
  • update build scripts and cgo directives to refer to libgit2's new home
  • change git2go build tags to do static builds by default

In the repo where you want to use git2go (note that I use github.com/josharian/git2go, my fork, throughout; adjust as needed):

#!/bin/bash

set -e
set -v

# Operate in top level dir (and fail if out of git tree).
TOP=$(git rev-parse --show-toplevel)
cd $TOP

# Remove any replaces lines, since they might be outdated,
# and the replacement might not exist on disk,
# which causes subsequent go command failures.
REPLACEMENTS=$(go mod edit -json | GO111MODULE=off go run find_replaces.go)
echo $REPLACEMENTS
for REPLACEMENT in $REPLACEMENTS
do
    go mod edit -dropreplace=github.com/josharian/git2go@$REPLACEMENT
done
# Now we can use 'go list'.
VERSION=$(go list -m -f '{{.Version}}' github.com/josharian/git2go)
BUILDDIR=build/git2go
DST=$BUILDDIR/$VERSION
# Ensure the module cache is populated.
go mod download github.com/josharian/git2go
SRC=$(go list -m -f '{{.Dir}}' github.com/josharian/git2go)
# Sanity check. If $SRC were "", then the cp command below could wreak some havoc.
if [ -z "$SRC" ]; then
    echo "could not locate git2go sources"
    exit 1
fi

# Clean up any previous builds.
mkdir -p $BUILDDIR  # so that chmod doesn't fail
chmod -R +rw $BUILDDIR  # so that rm -rf doesn't fail
rm -rf $BUILDDIR

# Copy everything to dst, make it writable.
mkdir -p $DST
cp -R $SRC/* $DST
chmod -R +rw $DST

# Do the build.
cd $DST
chmod +x script/build-libgit2-static.sh
script/build-libgit2-static.sh

# Return to top level
cd $TOP

# Update go.mod.
go mod edit -replace=github.com/josharian/git2go@$VERSION=./$DST

# Test build
go install github.com/josharian/git2go

You'll need find_replaces.go in your git repo root:

// +build ignore

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

type Module struct {
	Path    string
	Version string
}

type ModEditOut struct {
	Replace []struct {
		Old Module
		New Module
	}
}

func main() {
	dec := json.NewDecoder(os.Stdin)
	var mod ModEditOut
	if err := dec.Decode(&mod); err != nil {
		log.Fatalf("failed to decode stdin: %v", err)
	}
	for _, rep := range mod.Replace {
		if rep.Old.Path == "github.com/josharian/git2go" {
			fmt.Println(rep.Old.Version)
		}
	}
}

@josharian
Copy link
Contributor Author

josharian commented Oct 17, 2019

Drat. The script above fails when a new developer runs it: Attempting to get the version fails, because the relative directory referred to by the replace directive doesn't exist yet, so go list fails. So I need to get the version some other way. The obvious way is go mod edit -json, except that I can't just drop a Go program next to the shell script to extract it, since the module itself is broken, so go run fails.

Filed #34943 to request a way to do this without parsing json.

Edited the script above with a Go helper so that it should work.

@josharian
Copy link
Contributor Author

For anyone who finds this issue later, the Go helper above is not necessary. See the trick discussed at #34943 (comment).

@golang golang locked and limited conversation to collaborators Dec 5, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants