New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/link: Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) #36028
Comments
For context, we'd like to build gVisor's binaries as PIE (with internal linking, as we forbid cgo/libc). Our main binary ends up with 222k relocations and takes 12s to start (vs ~35ms normally linked). This seems like a massive number of relocations (it seems like type are mostly for type information and constant strings? I'm not very familiar with the extra data Go generates). It would be great to reduce the number of relocations, but just switching to R_X86_64_RELATIVE will make things at least reasonable. |
Change https://golang.org/cl/210181 mentions this issue: |
Seems like a clearly worthwhile improvement, and there's already a CL under review. It'll have to wait for 1.15 though (and the CL is tagged R=go1.15 accordingly), since it's not a regression. |
Hey @prattmic, thank you for the report and for the CL. I examined this bug and your CL but noticed there isn't a test and thought that perhaps if need be we can create a regression test that'll check for an anomaly between those numbers and then report one. The caveat here is that my version of Go complains about your flags for the last build as per (also I don't have formal education in statistics :)) internal linking requested but external linking required: buildmode=pie but perhaps this test can help after you tweak the slow buildmode arguments package main
import (
"io/ioutil"
"math"
"os"
"os/exec"
"path/filepath"
"testing"
"time"
)
const prog = `
package main
import (
"fmt"
"html"
"log"
"net/http"
)
func main() {
http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
})
go http.ListenAndServe(":8080", nil)
// Exit immediately so we can easily measure startup.
log.Printf("Exiting...")
}`
func TestBuildModeSpeedComparisons(t *testing.T) {
tmpDir, err := ioutil.TempDir("", "issue36028")
if err != nil {
t.Fatalf("Failed to create tmp directory: %v", err)
}
defer os.RemoveAll(tmpDir)
mainGoFile := filepath.Join(tmpDir, "http.go")
if err := ioutil.WriteFile(mainGoFile, []byte(prog), 0600); err != nil {
t.Fatalf("Failed to write file: %v", err)
}
staticFile := filepath.Join(tmpDir, "http.static")
pieFile := filepath.Join(tmpDir, "http.ext.pie")
pieLinkFile := filepath.Join(tmpDir, "http.int.pie")
var noEnv []string
_, output, err := execIt(noEnv, "go", "build", "-o", staticFile, mainGoFile)
if err != nil {
t.Fatalf("Failed to build object file: %v\n%s", err, output)
}
_, output, err = execIt(noEnv, "go", "build", "-buildmode=pie", "-o", pieFile, mainGoFile)
if err != nil {
t.Fatalf("Failed to build PIE file: %v\n%s", err, output)
}
cgoEnabled0 := append(os.Environ(), "CGO_ENABLED=0")
// TODO: Get @prattmic to properly add the slow build flags here.
_, output, err = execIt(cgoEnabled0, "go", "build", "-ldflags=-linkmode=internal", "-o", pieLinkFile, mainGoFile)
if err != nil {
t.Fatalf("Failed to build PIE -linkmode file: %v\n%s", err, output)
}
durStatic, _, _ := execIt(noEnv, staticFile)
durPIE, _, _ := execIt(noEnv, pieFile)
durPIELink, _, _ := execIt(noEnv, pieLinkFile)
values := []struct {
name string
value time.Duration
}{
{"durStatic", durStatic},
{"durPIE", durPIE},
{"durPIELink", durPIELink},
}
sum := 0.0
for _, value := range values {
sum += float64(value.value)
}
mean := sum / float64(len(values))
sumOfSquares := 0.0
for _, value := range values {
sumOfSquares += math.Pow(float64(value.value)-mean, 2.0)
}
variance := sumOfSquares / float64(len(values))
σ := math.Pow(variance, 0.5)
// Heuristic: outlier detection by 3 sigma-rule.
t.Logf("σ stdDev: %s", time.Duration(σ))
meanMinus2σ := time.Duration(mean - 2*σ)
meanPlus2σ := time.Duration(mean + 2*σ)
for _, value := range values {
if value.value > meanPlus2σ {
t.Errorf("Anomaly: %q: %s > (μ + 2σ = %s)", value.name, value.value, meanPlus2σ)
} else if value.value < meanMinus2σ {
t.Errorf("Anomaly: %q: %s < (μ - 2σ = %s)", value.name, value.value, meanMinus2σ)
}
}
}
func execIt(env []string, prog string, args ...string) (dur time.Duration, output []byte, err error) {
startTime := time.Now()
defer func() {
dur = time.Now().Sub(startTime)
}()
cmd := exec.Command(prog, args...)
cmd.Env = env
output, err = cmd.CombinedOutput()
return
} |
I can certainly add a timing test like that if desired. Are such tests common/desired in Go? These kinds of tests have the potential to be flaky due to small timing differences. I could also add a test looking for specific kinds of relocations in the output, but that is fragile w.r.t. future changes to the linker. |
Currently, internally linked PIE binaries take an extremely long time to start.
Suppose an example Go http server. Large enough to produce an interesting binary, but still simple:
hello.go
:Let's build it a few different ways:
(cgo is not supported for internally linked PIE)
Now let's run them:
The internally linked binary takes over 1s to start!
The problem is the relocations. Both the internally and externally binaries contain ~60k relocations (!). The externally linked version has R_X86_64_RELATIVE relocations, which are trivial for the dynamic linker to resolve. The internally linked version has R_X86_64_64 relocations, which require a symbol lookup to resolve.
Sure enough, profiling reveals that most of that 1s is going to symbol resolution in the dynamic linker. We should switch to more efficient relocations to keep dynamic linking times reasonable.
We already have a TODO to switch these relocations:
go/src/cmd/link/internal/amd64/asm.go
Line 325 in c3e8a20
The text was updated successfully, but these errors were encountered: