cmd/go: Embed import path to binary to enable rebuilding #16814

Closed
joneskoo opened this issue Aug 21, 2016 · 21 comments
Labels
early-in-cycle A change that should be done early in the 3 month dev cycle. FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

@joneskoo
Contributor

joneskoo commented Aug 21, 2016

Edit: After more thought and discussion embedding the import path in binary seems like the way to go. This would enable upgrading an installed binary application in many cases.

Original submission below:

I'd like go get and go install to produce a manifest along with the binary that records when the binary was compiled and which package produced it.

This is useful to understand why I have binaries in my bin/ and would allow e.g. updating them in bulk later through some mechanism (TBD outside of this issue).

Why should this be enabled by default? Many existing tools (e.g. editors) have a mechanism to install tools (e.g. go guru). For the manifest to be complete, including binaries installed by already shipping editors and external applications, it should be updated by default (with an opt-out flag, if someone insists).

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?

    $ go version
    go version go1.7 darwin/amd64
    
  2. What operating system and processor architecture are you using (go env)?

    Uh, isn't this redundant with question 1?

  3. What did you do?
    If possible, provide a recipe for reproducing the error.
    A complete runnable program is good.
    A link on play.golang.org is best.

    $ go get golang.org/x/review/git-codereview
    
  4. What did you expect to see?

    A manifest file, generated along with the binary, listing where the binary came from.

    E.g.

    2016-08-21T07:33:00Z package golang.org/x/review/git-codereview SHA256:619b8b31e9aed60e19051d403e01214847124370752bae131a0d96c022830e25
    

    This could be a simple text file in $GOPATH/bin/.manifest or something fancier. KISS would favor a simple log format. The exact fields are open to discussion, but I feel that the timestamp, the package, and a hash of the generated binary are the minimum; the VCS version could be a nice addition.

  5. What did you see instead?

    No manifest file generated.

@joneskoo joneskoo changed the title from "Generate manifest from go get, go install" to "cmd/go: Generate manifest from go get, go install" Aug 21, 2016
@joneskoo
Contributor Author

I just discovered the tool gorebuild which extracts import paths from the binary. However it seems to derive it from a filesystem path since it's printing absolute paths for me. The independent existence of that tool is another reason why storing this would be useful and necessary.

Maybe instead of a manifest file this information should in fact be embedded in the binary itself? That would have the additional benefit of carrying over to systems other than the one where the binary was built. If filesystem paths are already stored, the import path should be even less sensitive to add. I'm not sure whether there is a cross-platform place to embed such meta-information in a binary and retrieve it.

@quentinmit
Contributor

+1, I think this information should be embedded directly in the binary and object files.

@quentinmit quentinmit added this to the Go1.8 milestone Sep 6, 2016
@quentinmit
Contributor

I'm going to mark this as Go1.8, as I think it is a prerequisite for build caching (we need to know what went into a file to know whether it needs to be rebuilt).

@joneskoo
Contributor Author

What I would suggest is to include in binaries a kind of "birth certificate" or "bill of materials" describing where the binary came from (import path, VCS version?) and what it consists of (packages, VCS version?).

Here's an example of a very similar idea in @hlandau's acmetool:

$ acmetool --version
go version go1.6 linux/amd64 gc cgo=false
built by travis
git github.com/alecthomas/template 14fd436dd20c3cc65242a9f396b61bfc8a3926fc heads/master
git github.com/alecthomas/units 2efee857e7cfd4f3d0138cc3cbb1b4966962b93a heads/master
git github.com/coreos/go-systemd 7b2428fec40033549c68f54e26e89e7ca9a9ce31 v5
git github.com/godbus/dbus e2cf28118e66a6a63db46cf6088a35d2054d3bb0 heads/master
git github.com/hlandau/acme 064917b44dfd719e6332f2d62bf9b2597a3f20e5 v0.0.50
git github.com/hlandau/degoutils a7296d8b17fec87fa1fa6ba8fb6addd0897f36e2 heads/master
git github.com/hlandauf/gspt 25f3bd3f5948489aa5f31c949310ae9f2b0e956c heads/master
git github.com/hlandau/xlog 197ef798aed28e08ed3e176e678fda81be993a31 heads/master
git github.com/jmhodges/clock 3c4ebd218625c9364c33db6d39c276d80c3090c6 heads/master
git github.com/mattn/go-isatty 56b76bdf51f7708750eac80fa38b952bb9f32639 heads/master
git github.com/mitchellh/go-wordwrap ad45545899c7b13c020ea92b2072220eefad42b8 heads/master
git github.com/ogier/pflag 45c278ab3607870051a2ea9040bb85fcb8557481 heads/master
git github.com/peterhellberg/link 1053d3b2893eeebd482fce32550ec24bebed308c heads/master
git github.com/satori/go.uuid f9ab0dce87d815821e221626b772e3475a0d2749 heads/master
git github.com/shiena/ansicolor a422bbe96644373c5753384a59d678f7d261ff10 heads/master
git github.com/square/go-jose 293adbe25f48db25c2212e7313192715b5fc3cea heads/master
git golang.org/x/crypto c7e3b0ebdd409a0d024e3d71801427ab0e05fb2e heads/master
git golang.org/x/net 1600a4cd699d70009ec28edfd69d3568f6bd757d heads/master
git gopkg.in/alecthomas/kingpin.v2 8cccfa8eb2e3183254457fb1749b2667fbc364c7 tags/v2.1.11
git gopkg.in/cheggaaa/pb.v1 29ad9b62f9e0274422d738242b94a5b89440bfa6 v1.0.1
git gopkg.in/hlandau/configurable.v1 41496864a1fe3e0fef2973f22372b755d2897402 v1.0.1
git gopkg.in/hlandau/easyconfig.v1 f38184c467a3200c92ac929527daf77497b7ec69 v1.0.13
git gopkg.in/hlandau/service.v2 601cce2a79c1e61856e27f43c28ed4d7d2c7a619 v2.0.15
git gopkg.in/hlandau/svcutils.v1 09c5458e23bda3b8e4d925fd587bd44fbdb5950e v1.0.7
git gopkg.in/tylerb/graceful.v1 9a3d4236b03bb5d26f7951134d248f9d5510d599 tags/v1.2.5
git gopkg.in/yaml.v2 a83829b6f1293c91addabc89d0571c246397bbf4 heads/v2

@minux
Member

minux commented Sep 11, 2016 via email

@dmitshur
Contributor

dmitshur commented Sep 11, 2016

I don't understand why we must bake the entire list of import paths and revisions into the binary.

It would be helpful (to me, at least) if those who are in favor of this proposal would elaborate on what problem they're trying to solve with this information.

@joneskoo
Contributor Author

@minux Initially my need was to find out where the binaries in my bin/ came from (import path). This is useful for many things, including updating installed binaries (think go get -u). https://github.com/FiloSottile/gorebuild basically does this, but it only appears to work with the same GOPATH, because the import path is not stored in the binary, so it has its limitations. Version information (VCS id, tag/branch) stored in the binary would make this even better, and programs could even use it to display their own version information.

Hopefully the first part is uncontroversial and you can agree that at least the path usable with go get should be reliably available to enable a better gorebuild, regardless of whether the binary came from a binary distribution, was scp'ed from another machine, or was installed with go get.

My argument for the imported packages is about transparency, auditability, and secure development practices. For example, the list could be used to detect whether the binary is affected by a bug fix or vulnerability, and to re-create the build artifact (e.g. to validate that a build is not compromised). It would be great if it were possible to independently re-create a binary build, all the way to a matching checksum. Maybe this would come with an opt-out.

@joneskoo joneskoo changed the title from "cmd/go: Generate manifest from go get, go install" to "cmd/go: Embed import path to binary to enable rebuilding" Sep 11, 2016
@dmitshur
Contributor

I see you've provided more motivation for including actual import path in the binary in FiloSottile/gorebuild#7.

I do think the import path of the command could be nice to always include, but I'm not yet sure if going beyond that is worthwhile.

@joneskoo
Contributor Author

joneskoo commented Sep 11, 2016 via email

@minux
Member

minux commented Sep 11, 2016 via email

@joneskoo
Contributor Author

I don't think any mechanism would be foolproof, so the rebuild mechanism only makes sense as an external tool, like gorebuild. I believe most binaries include source paths, including those built by go and gcc. You can usually see a lot of paths with 'strings'.

Including the import path just makes this information reliable to parse, rather than relying on file paths.

I don't believe the go tools should manage binaries, but in this case enabling the external tools requires only a minor change.

I'm now convinced that for the used packages it's better to have external tools and manifests, like gvt, handle that.

I still want the path needed for go get and, unless there are strong objections, also the VCS hash for the root project.

On Sun, 11 September 2016 at 9:46, Minux Ma notifications@github.com wrote:

Does gcc embed the file path into the binary (except for the debugging info)? Why should the go tool do that?

To put it another way, if I compile random C programs and put them into my $PATH, and then forget where they came from, I won't ask gcc to embed such paths into the binary to make my life easier.

Managing binary assets is a much larger problem and I don't think the go tool should solve it. For example, I'd argue that having paths and revisions is not enough, because some of the packages might have custom patches applied. How could you represent that?

@minux
Member

minux commented Sep 11, 2016 via email

@dmitshur
Contributor

dmitshur commented Sep 11, 2016

Using debug/dwarf you can easily get such information. And by looking at the file that contains main.main, you can know the path of the main package.

That's what gorebuild (and importpathof) already do.

But it doesn't work reliably if the GOPATH is not set (or is different) when trying to reconstruct the import path.

Imagine that main.main is located at /home/user/src/foo/src/bar/main.go. Without knowing the GOPATH that was used when the binary was built, it's impossible to know whether the import path of the binary was foo/src/bar or bar. This is a contrived example of the possible ambiguity.

@minux
Member

minux commented Sep 11, 2016 via email

@quentinmit quentinmit added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 6, 2016
@quentinmit
Contributor

/cc @rsc

@rsc
Contributor

rsc commented Oct 6, 2016

This issue is asking for a particular implementation, not new functionality. Like @shurcooL said, this needs to be in service of a concrete larger goal in order to be evaluated. I'm not opposed to putting the import path into the binary next to the build ID, as long as there's some limit to how long it is (build IDs are designed to be limited size). We certainly don't want to try to record the specific version of every file.

None of this is necessary for proper build caching. The build IDs in the binary today already suffice.

In any event, this is vague enough that we can put any decisions off to the Go 1.9 cycle.

@rsc rsc modified the milestones: Go1.9Early, Go1.8 Oct 6, 2016
@dmitshur
Contributor

dmitshur commented Mar 9, 2017

I'm not opposed to putting the import path into the binary next to the build ID, as long as there's some limit to how long it is (build IDs are designed to be limited size).

What kind of limit do you have in mind? Would it be something large enough to cover the vast majority of realistic import path lengths (like 1024 bytes)? Or something else?

What happens if an import path exceeds that length? Doesn't that defeat the purpose of this field, since in some cases it would no longer accurately answer the question "what is the import path"? Either that, or it puts a hard limit on the length of an import path (something that was previously not there).

Is the limit needed to prevent someone from modifying Go binaries in a malicious way, by placing an unbounded number of bytes into the import path field?

@dmitshur
Contributor

dmitshur commented Mar 9, 2017

this needs to be in service of a concrete larger goal in order to be evaluated.

To help this get evaluated, I can share what goals I have that I think this would be helpful for. I currently want it for 2 use cases:

@jimmyfrasche
Member

Perhaps this could be included in the go tool, as something like go which binname. That would leave the implementation free to change, and anyone who wants access from another program can fork/exec it.

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.9Early May 3, 2017
@bradfitz bradfitz added early-in-cycle A change that should be done early in the 3 month dev cycle. and removed early-in-cycle A change that should be done early in the 3 month dev cycle. labels Jun 14, 2017
@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.10 Jun 14, 2017
@rsc rsc modified the milestones: Go1.10, Go1.11 Dec 1, 2017
@rsc
Contributor

rsc commented Apr 18, 2018

vgo does this already. When it moves into the go command, the go command will too (for module-based builds).
https://research.swtch.com/vgo-repro

Closing this issue - we're not going to make a special case for non-module-based builds at this point.

@rsc rsc closed this as completed Apr 18, 2018
@dmitshur
Contributor

dmitshur commented Mar 2, 2019

vgo does this already. When it moves into the go command, the go command will too (for module-based builds).

For posterity, this has happened by now.

This functionality is available in Go 1.12 under the runtime/debug package. See https://golang.org/doc/go1.12#runtime/debug.

Function debug.ReadBuildInfo allows determining the import path of the running binary only. If you're interested in getting the import path of a Go binary file that is on disk, I found the rsc.io/goversion/version package which implements that functionality. See version.Version.ModuleInfo. To extract the import path of the binary, ModuleInfo needs to be parsed similarly to how debug.ReadBuildInfo does it (or more simply, by reading the first line and trimming the "path\t" prefix).
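
A minimal sketch of both approaches follows. It assumes the module info is already available as a string whose first line starts with "path\t", as described above. The helper `importPathFromModInfo` and the sample string are made up for illustration and are not part of rsc.io/goversion/version.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"strings"
)

// importPathFromModInfo extracts the main package's import path from a
// module info string by reading its first line and trimming the "path\t"
// prefix. Hypothetical helper for illustration only.
func importPathFromModInfo(modinfo string) (string, bool) {
	first := modinfo
	if i := strings.IndexByte(modinfo, '\n'); i >= 0 {
		first = modinfo[:i]
	}
	const prefix = "path\t"
	if !strings.HasPrefix(first, prefix) {
		return "", false
	}
	return strings.TrimPrefix(first, prefix), true
}

func main() {
	// Running binary: ok is false when no build information is embedded,
	// for example when the binary was not built in module mode.
	if bi, ok := debug.ReadBuildInfo(); ok {
		fmt.Println("running binary:", bi.Path)
	}

	// Binary on disk: the sample below is invented and stands in for the
	// module info string extracted from a file.
	sample := "path\texample.com/cmd/tool\nmod\texample.com\t(devel)\t\n"
	if p, ok := importPathFromModInfo(sample); ok {
		fmt.Println("from module info:", p)
	}
}
```

Returning a boolean alongside the path lets callers tell a binary without module info apart from one whose import path happens to be empty.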

@golang golang locked and limited conversation to collaborators Mar 3, 2020