cmd/go: support reproducible buildid when building with -trimpath #34186

Closed
longsleep opened this issue Sep 9, 2019 · 2 comments


@longsleep

When building with the new -trimpath flag in Go 1.13, the resulting binaries are almost reproducible. The only difference is the buildid, which is also written into the resulting build artifacts.

#16860 implemented the -trimpath flag, which strips the paths successfully. But apparently the buildid still takes the path into account in its actionID parts:

// The "one-element cache" purpose is a bit more complex for installed
// binaries. For a binary, like cmd/gofmt, there are two steps: compile
// cmd/gofmt/*.go into main.a, and then link main.a into the gofmt binary.
// We do not install gofmt's main.a, only the gofmt binary. Being able to
// decide that the gofmt binary is up-to-date means computing the action ID
// for the final link of the gofmt binary and comparing it against the
// already-installed gofmt binary. But computing the action ID for the link
// means knowing the content ID of main.a, which we did not keep.
// To sidestep this problem, each binary actually stores an expanded build ID:
//
//    actionID(binary)/actionID(main.a)/contentID(main.a)/contentID(binary)

This results in non-reproducible builds when the same source is built from different paths.

For example, CGO_ENABLED=0 go build -trimpath -o bin/tool ./cmd/tool yields

a: t6sFhx64vDfJLhVcRcjW/J56i1RIbPcgOquS7FGtO/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0

for source in folder a, and

b: EnJtrdNkod77vx69dugv/lCsJ2qPEOefIbrg_fZ5U/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0

for source in folder b.

This is the only thing different in the resulting binary. Is that intentional? It would be nice for reproducible builds if the buildid could be the same no matter in what folder the source is actually built.
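To make the difference concrete, the two build IDs above can be split along the four-field layout quoted from the Go source comment. This is only a sketch: the helper names `split_buildid` and `differing_fields` are made up for illustration and are not part of the Go toolchain.

```python
# Field names follow the layout quoted from the Go source comment:
# actionID(binary)/actionID(main.a)/contentID(main.a)/contentID(binary)
FIELDS = (
    "actionID(binary)",
    "actionID(main.a)",
    "contentID(main.a)",
    "contentID(binary)",
)


def split_buildid(buildid):
    """Split an expanded build ID into its four named fields (hypothetical helper)."""
    parts = buildid.strip().split("/")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def differing_fields(id_a, id_b):
    """Return the names of the fields whose values differ, in layout order."""
    a, b = split_buildid(id_a), split_buildid(id_b)
    return [name for name in FIELDS if a[name] != b[name]]


# The two build IDs reported in this issue (folders a and b).
ID_A = "t6sFhx64vDfJLhVcRcjW/J56i1RIbPcgOquS7FGtO/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0"
ID_B = "EnJtrdNkod77vx69dugv/lCsJ2qPEOefIbrg_fZ5U/aDD2rPWxLW9E5uImuM8n/V-uwd08cQ2E04lQdOuy0"

print(differing_fields(ID_A, ID_B))
# ['actionID(binary)', 'actionID(main.a)']
```

Only the two action IDs differ between the folders; both content IDs are identical.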

For the time being I use a small Python script to overwrite the differing parts in the resulting binary, like this:

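    import subprocess

    # go and fn are defined elsewhere in the script; presumably the go
    # command and the path of the built binary.
    buildid = subprocess.check_output([go, 'tool', 'buildid', fn]).strip()
    # The first two '/'-separated fields are the two action IDs.
    actionid = b'/'.join(buildid.split(b'/', 2)[:2])
    with open(fn, 'r+b') as f:
        data = f.read()
        idx = data.find(actionid)
        if idx == -1:
            raise ValueError('actionid not found in file')
        # Overwrite the action IDs in place, keeping the length unchanged.
        f.seek(idx)
        f.write(b'0'*(len(actionid)-2))
        f.write(b'/0')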

but this might not be very reliable. Can this be improved?

@ALTree
Member

ALTree commented Sep 9, 2019

Thanks for reporting this.

This is a dup of #33772.

It would be useful if you could confirm that the workaround mentioned at #33772 (comment) works for you.

This difference is due to the .note.go.buildid section added by the linker. It can be set to something static e.g. -ldflags=-buildid= (empty string) to gain reproducibility.
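A rough sketch of how the workaround could be checked from Python follows. It takes the flags from this thread (-trimpath and -ldflags=-buildid=); the helper names `build_command` and `sha256_of` and the default `go` path are assumptions made for illustration only.

```python
import hashlib


def build_command(output, package, go="go"):
    """Return the go build argv with path trimming and an empty build ID."""
    return [go, "build", "-trimpath", "-ldflags=-buildid=", "-o", output, package]


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large binaries need not fit in memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Intended use (not run here): build in each checkout with CGO_ENABLED=0 set,
# e.g. subprocess.check_call(build_command("bin/tool", "./cmd/tool"), cwd="a"),
# then compare sha256_of("a/bin/tool") with sha256_of("b/bin/tool").
print(" ".join(build_command("bin/tool", "./cmd/tool")))
# go build -trimpath -ldflags=-buildid= -o bin/tool ./cmd/tool
```

If the workaround holds, the digests of the binaries built in folders a and b should be equal.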

@longsleep
Author

It would be useful if you could confirm that the workaround mentioned at #33772 (comment) works for you.

Oh yeah - indeed adding -ldflags=-buildid= does the trick.

Closing as dup of #33772 - thanks @ALTree
