-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: cmd/link: support BUILD_PATH_PREFIX_MAP for reproducible binaries when built under varying path #22491
Comments
Change https://golang.org/cl/73291 mentions this issue: |
CC @neild |
This issue is titled with the feature you want instead of the issue you have. If the goal is to allow users to have debug info in their binaries but also trim file name prefixes in order to enable reproducible builds, then let's do that directly instead of introducing a dependence on a new, untested spec that in turn depends on a magic environment variable with a magic encoding that I can't even understand. It makes absolutely no sense to me that a toolchain that aims for simplicity would give any meaning at all to (quoting an example from the spec):
If gcc already implements this spec in its linker, then it seems like you could use -linkmode=external to have gcc do the rewrite, if you must use this environment variable. The script in the first comment above (which I amended to cat into a file instead of a directory) does not require toolchain changes to produce identical binaries from gopath1 and gopath2.
There is an issue that perhaps the toolchain should support multiple trimpath arguments (#22382). Perhaps for more complex examples there are conflicts between the toolchain's use of trimpath and an extra use in -gcflags. If that's the root problem, then let's solve that. I don't see any compelling reason to add a second, completely different and frankly very complicated mechanism targeting the same goal as -trimpath. |
Thanks for taking a look at this issue. I now realize that the instructions I gave in the first comment weren’t sufficient; sorry about that. We’re already using -trimpath in -gcflags and -asmflags in Debian, but -trimpath does not actually trim all paths: Here is an updated test script: #!/bin/zsh
set -e
for GOPATH in /tmp/gopath1 /tmp/gopath2; do
mkdir -p ${GOPATH}/src/helloworld
cat >${GOPATH}/src/helloworld/main.go <<'EOT'
package main
import (
"log"
"github.com/docker/docker/pkg/term"
"golang.org/x/crypto/curve25519"
)
func main() {
var a term.Winsize
log.Printf("a = %v", a)
var dst, in [32]byte
curve25519.ScalarBaseMult(&dst, &in)
}
EOT
go get golang.org/x/crypto/curve25519 github.com/docker/docker/pkg/term
go install -gcflags=-trimpath=$GOPATH -asmflags=-trimpath=$GOPATH helloworld
done
colordiff -u <(strings -n 10 /tmp/gopath1/bin/helloworld) <(strings -n 10 /tmp/gopath2/bin/helloworld) Is this an oversight in cmd/link? |
Doing anything in the linker seems to me too late. We should really be aiming to not put the paths into the intermediate objects either, since that will at the very least result in better caching, and if you do need to verify something, you'd think you'd want reproducible intermediate objects too. Also, the content hash of the inputs affects the build ID recorded in the binary, so if the intermediate objects differ, the output will differ too. This is by design. Also the change in the build ID would not be fixable with BUILD_PATH_PREFIX_MAP since the path prefixes causing the build ID change have been hashed and are no longer visible to the linker. Also, expecting users to cobble together the necessary magic incantations seems to me unreasonable, especially since as the toolchain evolves the magic incantations may change. If builds without full file paths are something we want to support for the long term, that should have a simple API (like 'go install -newflag' / 'go build -newflag', for some value of newflag) that we can maintain as the underlying tools change and that would do the right thing in any toolchain, instead of hard-coding the answer for the gc toolchain of a specific day. That seems to me exactly #16860, which we didn't get to for Go 1.10 but that we could probably plan to get to for Go 1.11. I'm honestly a little surprised these reproducible builds bogged down by file name issues don't just use chroots to get control over the file names. |
Hi, author of the spec here. It was mainly designed for languages that allow much greater freedom on where source files are placed (e.g. C/C++), and I can understand that this level of complexity might not be necessary for Go. I'd like to also point out again the original 5-line patch I wrote in the first comment of #16860. That will cause absolute paths to be relative, however the downside is that builders can't specify a specific new "base" dir, which is possible with BPPM. AIUI, there was some concern that forcing everything to be relative, might affect debugging tools. Perhaps some sort of halfway-point could be found between my original patch and @stapelberg's proposed BPPM patch, if the latter is considered too complex for Go. Though regarding the former, I haven't rebased it or tested it recently. As for chroots, we think it would be a nice goal to support build-path-independent reproducibility in the toolchain itself, without needing extra container management software that would be yet another dependency for developers (and rebuilders and bootstrappers, etc). Since build-time paths don't exist on run-time systems anyway, the information is not useful, and getting rid of it ought not to be a complex thing for a reasonable toolchain. If it is complex, that means something is depending on these invalid paths, and that behaviour is perhaps broken or at least questionable. So it's a useful QA exercise. |
As I noted above (#22491 (comment)), the new toolchain invalidates the 5-line patch in #16860. Stripping the file names during link, or even linking with -w to strip dwarf entirely, will not produce reproducible builds if there were different files recorded in the .a files in the two builds. There needs to be a separate explicit signal to the go command, and the go command needs to be responsible for doing the right thing, both with the gccgo and gc toolchains. Let's leave that for #16860. |
Upstream has rejected the patch in this form and promised to implement an alternative they are happy with instead. That hasn't happened yet though. Bug: golang/go#22491, golang/go#16860 Forwarded: https://golang.org/cl/73291 (rejected upstream though) Last-Update: 2018-02-08 Gbp-Pq: Name 0001-reproducible-BUILD_PATH_PREFIX_MAP.patch
(Splitting this out of #16860 as per iant’s request in https://golang.org/cl/73291)
Currently, the full file system path to the source code files is included in Go binaries, e.g.:
The two resulting binaries differ:
colordiff -u <(strings $(go env GOPATH)/bin/helloworld) <(strings /tmp/gopath2/bin/helloworld)
prints:For the https://reproducible-builds.org/ project, it is desirable to add a way to make the compilation output independent of the build path (see also https://reproducible-builds.org/docs/build-path/), or, in other words, “reproducible when built under varying build path”.
Change https://golang.org/cl/73291 implements the BUILD_PATH_PREFIX_MAP specification.
The above test has been done with go1.9.2, but git HEAD behaves the same:
The text was updated successfully, but these errors were encountered: