cmd/go: add compiler flags, relevant env vars to 'go version -m' output #35667

Closed
michael-obermueller opened this issue Nov 18, 2019 · 14 comments


@michael-obermueller

This is a proposal to add extensive build meta information to Go binaries for various use cases like:

  • Stability: maturity analysis
  • Security: vulnerability detection
  • Technology detection, which is the process of identifying if an application's underlying technology is Go

Currently it is hard to retrieve meta information from Go binaries - either information is missing completely or extraction requires extensive parsing of the binary file. The following table lists existing metadata entities and the mechanism required to extract the information.

Meta information | Extraction
--- | ---
Go build version | Symbol table lookup to access global variable runtime.buildVersion (type string)
Build information (modules and versions) | Symbol table lookup to access global variable runtime/debug.modinfo (type string)
Compiler options, e.g. build mode, compiler, gcflags, ldflags | Currently this information is not present in the executable
User defined custom data, e.g. application version, vendor name | Currently this is only possible when setting global string variables at compile-time. The downside of this approach is that it requires the symbol table to access them and implies data type knowledge.

This proposal is to provision extended build time meta information to Go binaries. Reading the information from binaries shall be trivial.

Go already provisions the go.buildid hash string into Go binaries and provides tools to read that information from the binary.

go.buildid is provisioned in the PT_NOTE segment for ELF based systems (see note sections (2-4)). For executable file formats which do not define appropriate mechanisms for enclosing meta information (e.g. Windows PE), go.buildid is added as non-instruction bytes at the very beginning of the .text segment.

Thus, a portable mechanism for meta information provisioning is already in place and can be re-used for build meta information. The proposed name for build meta information is go.metadata and it should be added after the existing go.buildid entry.

go.metadata format

The proposed format for go.metadata is JSON. JSON is extensible and Go has first class JSON parsing support. The following sample shows what meta information of a simple Go binary may look like:

{
    "version": "go1.13.4",
    "compileropts": {
        "compiler": "gc",
        "mode": "pie",
        "os": "linux",
        "arch": "amd64",
        "libcvendor": "GLIBC",
        "cgoenabled": true
    },
    "buildinfo": {
        "path": "HelloWorld",
        "main": {
            "path": "HelloWorld",
            "version": "(devel)",
            "sum": ""
        },
        "deps": [
            {
                "path": "github.com/pkg/errors",
                "version": "v0.8.1",
                "sum": "h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I="
            }
        ]
    },
    "user": {
        "customkey": "customval",
        "version": "1.0",
        "vendor": "my company name"
    }
}

go.metadata shall validate against the JSON schema attached to this issue.
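
As a rough illustration of how a consumer might read such a document, here is a minimal Go sketch that decodes the sample above with encoding/json. The Go type and field names (Metadata, Module, parseMetadata) are assumptions made for this example; only the JSON keys are taken from the sample, and the attached schema is not consulted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Module mirrors one module entry ("main" or an element of "deps") in the sample.
type Module struct {
	Path    string `json:"path"`
	Version string `json:"version"`
	Sum     string `json:"sum"`
}

// Metadata mirrors the top-level layout of the proposed go.metadata sample.
// The Go names are hypothetical; only the JSON keys come from the proposal.
type Metadata struct {
	Version      string `json:"version"`
	CompilerOpts struct {
		Compiler   string `json:"compiler"`
		Mode       string `json:"mode"`
		OS         string `json:"os"`
		Arch       string `json:"arch"`
		LibcVendor string `json:"libcvendor"`
		CgoEnabled bool   `json:"cgoenabled"`
	} `json:"compileropts"`
	BuildInfo struct {
		Path string   `json:"path"`
		Main Module   `json:"main"`
		Deps []Module `json:"deps"`
	} `json:"buildinfo"`
	User map[string]string `json:"user"`
}

// parseMetadata decodes a go.metadata JSON document into a Metadata value.
func parseMetadata(data []byte) (Metadata, error) {
	var m Metadata
	err := json.Unmarshal(data, &m)
	return m, err
}

// sample is the example document from the proposal, compacted.
const sample = `{
  "version": "go1.13.4",
  "compileropts": {"compiler": "gc", "mode": "pie", "os": "linux",
                   "arch": "amd64", "libcvendor": "GLIBC", "cgoenabled": true},
  "buildinfo": {
    "path": "HelloWorld",
    "main": {"path": "HelloWorld", "version": "(devel)", "sum": ""},
    "deps": [{"path": "github.com/pkg/errors", "version": "v0.8.1",
              "sum": "h1:iURUrRGxPUNPdy5/HRSm+Yj6okJ6UtLINN0Q9M4+h3I="}]
  },
  "user": {"customkey": "customval", "version": "1.0", "vendor": "my company name"}
}`

func main() {
	m, err := parseMetadata([]byte(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(m.Version, m.CompilerOpts.Mode, len(m.BuildInfo.Deps), m.User["vendor"])
}
```

Keeping "user" as a map rather than a struct reflects that its keys are user-defined and not fixed by the proposal.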

@michael-obermueller michael-obermueller changed the title cmd/link - Include build meta information cmd/link: Include build meta information Nov 18, 2019
@michael-obermueller michael-obermueller changed the title cmd/link: Include build meta information proposal: cmd/link: Include build meta information Nov 18, 2019
@gopherbot gopherbot added this to the Proposal milestone Nov 18, 2019
@rsc
Contributor

rsc commented Nov 27, 2019

This could get arbitrarily complex. We already have the first two rows in the table, accessible using go version <binary>, even for stripped binaries.

Generalizing to JSON will just make the binaries bigger and create more work for existing parsers, for very little benefit.

Generalizing to arbitrary metadata similarly adds complexity with not much benefit.

I think we should probably stop where we have stopped.

@networkimprov

Is there a way that apps could opt-in to this scheme? Perhaps by defining a const string (in arbitrary format) and then passing its name to a build flag to be sited at a known or locatable offset?

@michael-obermueller
Author

michael-obermueller commented Nov 29, 2019

@rsc - as you outlined, some of the data is already available with go version <binary>. The issue is that go version <binary> carries two implicit assumptions: the Go toolchain is installed, and the binary was built with Go.
If these two assumptions are dropped, reading the metadata gets a lot more tedious.

Tools (not necessarily implemented in Go) which operate on application meta information have to deal with all sorts of technologies. Performance monitoring tools and vulnerability scanners supervise production systems.
Thus, go version is not an option. Another, very different use case is extending the file command to recognize Go applications. Go is a great technology and already deploys a solid foundation of information into application binaries. Sure, reading .go.buildinfo and parsing runtime.modinfo are no major technical obstacles. Nonetheless, doing so increases tech currency and the risk of failure once these internal formats change.

This reasoning led us to propose JSON formatted build meta information. In our view, the very minor increase in binary size is outweighed by JSON's extensibility, standardization, and the availability of proven parsers. But the JSON format is by no means a mandatory requirement. If size is a roadblock, it can be replaced with another, more lightweight format.
Aside from the implicit overhead of JSON, all proposed data is either

  • already included in application binary today (version, module info),
  • limited in size (likely less than 1KB, tool chain options), or
  • in control of the user (custom data)

Thus, we think the proposal adds significant benefits to Go, by bringing it en par with other technologies.

@ianlancetaylor
Contributor

When you say "en par with other technologies" which technologies are you thinking of? If other languages are providing this kind of information we should consider doing what they do rather than inventing something new.

Note that for specific purposes the linker's -X option can be used to set run-time information based on build-time data.
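
For readers unfamiliar with the linker's -X option, a minimal sketch follows. The variable name version, its default value "dev", and the helper versionString are assumptions chosen for illustration, not anything prescribed in this thread.

```go
package main

import "fmt"

// version is a package-level string variable that the linker can overwrite
// at build time. The name and the default value are illustrative.
var version = "dev"

// versionString formats the embedded version for display.
func versionString() string {
	return "app version " + version
}

func main() {
	// Build with, for example:
	//   go build -ldflags "-X main.version=1.0" .
	// Without the flag, the default value above is used.
	fmt.Println(versionString())
}
```

As the proposal text points out, values set this way are ordinary string variables, so an external tool still needs the symbol table and knowledge of the string layout to find them.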

@Hollerberg

@ianlancetaylor [disclaimer - I am co-author of the proposal]

To my knowledge, no standardized deployment mechanism or defined set of meta information exists that Go could directly re-use. We tried to compile a set of properties that seemed reasonable for hopefully many applications (adding custom information is e.g. has no importance for our use cases).

For lack of a better mechanism, we proposed following the go.buildid embedding scheme, although it requires searching the .text segment in PE format, which is a real performance hit in technology detection / determination.

Different technologies use very divergent schemes and formats. Many technologies simply benefit from not being bound to a specific file format (like ELF or PE). Others benefit from being backed by organizations with the ability to extend file format definitions to their requirements.

Node.js has the package.json file, which defines a rich set of information (name, version, license, runtime version limitations, dependent packages etc.).

.NET manifest files contain quite a rich set of meta information (referenced assemblies, version, vendor etc., standardized in ECMA-335).

Java has a subset of the proposed information in its manifest file.

@rsc rsc added this to Incoming in Proposals (old) Dec 4, 2019
@rsc
Contributor

rsc commented Feb 5, 2020

OK, so to summarize, the proposal was to put the following in the binary in a new section:

  • Go build version
  • Build information (modules and versions)
  • Compiler options (build mode, compiler, gcflags, ldflags)
  • User-defined data

The first two are already present in the binary and don't need to be duplicated. (They can be extracted by parsing the binary or by running go version <binary> or at runtime by using debug.ReadBuildInfo.) The last (user-defined data) is apparently not needed, per previous comment ("adding custom information is e.g. has no importance for our use cases").

That leaves compiler information (basically, the go command-line options). It seems like we could plausibly write that down too. I've wanted it at least once in the last month. That would just be a single new line in the existing format, not a whole new section.

If we refocus this discussion to be adding go command command-line flags, does that address the initial request well enough?

@rsc rsc moved this from Incoming to Active in Proposals (old) Feb 5, 2020
@michael-obermueller
Author

@rsc, thanks for the summary and considering the proposal.
Of course, we would prefer the proposed solution, which imposes fewer dependencies on Go internals. A parsing application needs knowledge of the following (Go version dependent) details:

  • buildinfo binary format
  • memory layout of Go string type
  • composition of modinfo string

The JSON format would have mitigated these technical dependencies. As correctly stated, we do not have specific use cases for user defined data.
That being said, we very much welcome the proposed addition of build flags. If we understood your proposed solution correctly, buildinfo would be extended by a new entry, e.g.

offset | data
--- | ---
0x0 | build info magic = "\xff Go buildinf:"
0xe | binary ptrSize
0xf | endianness
0x10 | pointer to string runtime.buildVersion
0x10 + ptrSize | pointer to runtime.modinfo
0x10 + 2 * ptrSize | pointer to build flags
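
To make the layout in the table above concrete, here is a minimal sketch of decoding just the fixed 16-byte header. It assumes exactly the layout listed in this comment (including the proposed third pointer slot); the names header, parseHeader, and slotOffset are made up for the example, and newer toolchains may use a different layout.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// buildInfoMagic is the 14-byte marker at offset 0x0 in the table.
var buildInfoMagic = []byte("\xff Go buildinf:")

// header holds the two single-byte fields that follow the magic.
type header struct {
	PtrSize   int  // byte at offset 0xe
	BigEndian bool // byte at offset 0xf, non-zero meaning big-endian
}

// parseHeader validates the magic and decodes ptrSize and endianness.
func parseHeader(b []byte) (header, error) {
	if len(b) < 16 || !bytes.HasPrefix(b, buildInfoMagic) {
		return header{}, errors.New("not a Go build info header")
	}
	ptrSize := int(b[14])
	if ptrSize != 4 && ptrSize != 8 {
		return header{}, fmt.Errorf("unexpected pointer size %d", ptrSize)
	}
	return header{PtrSize: ptrSize, BigEndian: b[15] != 0}, nil
}

// slotOffset returns the offset of the i-th pointer slot after the header:
// 0 = runtime.buildVersion, 1 = runtime.modinfo, 2 = the proposed build flags.
func (h header) slotOffset(i int) int {
	return 0x10 + i*h.PtrSize
}

func main() {
	// A synthetic header for a 64-bit little-endian binary.
	sample := append([]byte("\xff Go buildinf:"), 8, 0)
	h, err := parseHeader(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("ptrSize=%d bigEndian=%v flagsSlot=%#x\n", h.PtrSize, h.BigEndian, h.slotOffset(2))
}
```

Following the pointers themselves additionally requires mapping virtual addresses back to file offsets through the ELF, PE, or Mach-O headers, which is left out of this sketch.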

Would you consider including environment variables that affect the build process (e.g. CGO_ENABLED), too?

@rsc
Contributor

rsc commented Feb 12, 2020

Note that there is a parser today at https://rsc.io/goversion/version. We could move that somewhere in golang.org/x. Many programs will be just fine shelling out to go version -m <file>. We can also add a -json flag to that command to print JSON instead.

Yes, the idea is that we'd add relevant go command line flags. We could look into relevant go environment variables too, but we don't want to break reproducible builds, so we shouldn't write down things like GOPATH, which might mention temporary directories.

@aarzilli
Contributor

The compiler flags are already saved in the DW_AT_producer attribute in debug_info.
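
A minimal sketch of reading that attribute with the standard debug/elf and debug/dwarf packages is shown below. It assumes an ELF binary that still carries DWARF data (a binary stripped of debug information will return an error here); the function name producers is an assumption for this example.

```go
package main

import (
	"debug/dwarf"
	"debug/elf"
	"fmt"
	"os"
)

// producers returns the DW_AT_producer string of every compile unit
// found in the DWARF data of the ELF file at path.
func producers(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	data, err := f.DWARF()
	if err != nil {
		return nil, err // e.g. the binary has no debug information
	}

	var out []string
	r := data.Reader()
	for {
		entry, err := r.Next()
		if err != nil {
			return nil, err
		}
		if entry == nil {
			break // end of DWARF entries
		}
		if entry.Tag == dwarf.TagCompileUnit {
			if p, ok := entry.Val(dwarf.AttrProducer).(string); ok {
				out = append(out, p)
			}
		}
		// Only top-level entries matter here, so skip each unit's children.
		r.SkipChildren()
	}
	return out, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: producers <elf-binary>")
		return
	}
	list, err := producers(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range list {
		fmt.Println(p)
	}
}
```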

@michael-obermueller
Author

@rsc, thanks for the parser reference and proposal to add JSON formatted output. Although we cannot use it directly for our use case, it can serve as a reference for our parser implementation. Moving the parser to golang.org/x makes sense.
We would only consider environment variables which are essential to reproduce the build; GOPATH and related variables are not among these.

@rsc rsc changed the title proposal: cmd/link: Include build meta information proposal: cmd/go: add compiler flags, relevant env vars to 'go version -m' output Feb 26, 2020
@rsc
Contributor

rsc commented Feb 26, 2020

I've retitled based on the discussion above. Based on the discussion this seems like a likely accept.

@bradfitz is also filing a separate issue to discuss whether we should have version information from the current VCS working directory.

@rsc
Contributor

rsc commented Mar 4, 2020

No change in consensus, so accepted.

@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) Mar 4, 2020
@rsc rsc modified the milestones: Proposal, Backlog Mar 4, 2020
@rsc rsc changed the title proposal: cmd/go: add compiler flags, relevant env vars to 'go version -m' output cmd/go: add compiler flags, relevant env vars to 'go version -m' output Mar 4, 2020
@imjasonh

Is this done now at head?

$ gotip version
go version devel go1.18-eba0e866fa Mon Oct 18 22:56:07 2021 +0000 darwin/amd64
$ gotip build ./
$ gotip version -m ko
...
	build	compiler	gc
	build	tags	goexperiment.regabiwrappers,goexperiment.regabireflect,goexperiment.regabiargs
	build	CGO_ENABLED	true
	build	CGO_CPPFLAGS	
	build	CGO_CFLAGS	
	build	CGO_CXXFLAGS	
	build	CGO_LDFLAGS	
	build	gitrevision	6447264ff8b5d48aff64000f81bb0847aefc7bac
	build	gituncommitted	true
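
For programs that would rather not shell out, the same build lines can probably be read with the debug/buildinfo package that ships with Go 1.18 and later. The sketch below is an assumption-laden illustration: the helper name buildSettings is made up, and the exact setting keys depend on the toolchain version, so they may not match the development output shown above.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
)

// buildSettings reads the build settings recorded in the Go binary at path
// and returns them as a key/value map.
func buildSettings(path string) (map[string]string, error) {
	info, err := buildinfo.ReadFile(path)
	if err != nil {
		return nil, err // not a Go binary, unreadable file, and so on
	}
	settings := make(map[string]string, len(info.Settings))
	for _, s := range info.Settings {
		settings[s.Key] = s.Value
	}
	return settings, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: settings <go-binary>")
		return
	}
	settings, err := buildSettings(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for k, v := range settings {
		fmt.Printf("build\t%s\t%s\n", k, v)
	}
}
```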

@AlekSi
Contributor

AlekSi commented Nov 12, 2021

I think it is, someone should close this issue.

10 participants