Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) #36028

Closed
prattmic opened this issue Dec 6, 2019 · 6 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@prattmic
Copy link
Member

prattmic commented Dec 6, 2019

Currently, internally linked PIE binaries take an extremely long time to start.

Suppose an example Go http server. Large enough to produce an interesting binary, but still simple:

hello.go:

package main

import (
        "fmt"
        "html"
        "log"
        "net/http"
)

func main() {
        http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
                fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
        })

        go http.ListenAndServe(":8080", nil)

        // Exit immediately so we can easily measure startup.
        log.Printf("Exiting...")
}

Let's build it a few different ways:

$ go build -o ./http.static http.go
$ go build -buildmode=pie -o ./http.ext.pie http.go
$ CGO_ENABLED=0 go build -buildmode=pie -ldflags=-linkmode=internal -o ./http.int.pie http.go

(cgo is not supported for internally linked PIE)

Now let's run them:

$ time ./http.static
2019/12/06 17:16:58 Exiting...
./http.static  0.00s user 0.01s system 102% cpu 0.005 total

$ time ./http.ext.pie
2019/12/06 17:17:17 Exiting...
./http.ext.pie  0.01s user 0.01s system 103% cpu 0.015 total

$ time ./http.int.pie    
2019/12/06 17:17:30 Exiting...
./http.int.pie  1.03s user 0.00s system 99% cpu 1.032 total

The internally linked binary takes over 1s to start!

The problem is the relocations. Both the internally and externally binaries contain ~60k relocations (!). The externally linked version has R_X86_64_RELATIVE relocations, which are trivial for the dynamic linker to resolve. The internally linked version has R_X86_64_64 relocations, which require a symbol lookup to resolve.

Sure enough, profiling reveals that most of that 1s is going to symbol resolution in the dynamic linker. We should switch to more efficient relocations to keep dynamic linking times reasonable.

We already have a TODO to switch these relocations:

// TODO: We generate a R_X86_64_64 relocation for every R_ADDR, even
. It turns out making the switch is quite simple (unless I missed something), so I'll send a CL shortly.

@prattmic prattmic changed the title Generate R_X86_64_RELATIVE relocations instead of R_X86_64_64 for buildmode=PIE linkmode=internal Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) Dec 6, 2019
@prattmic
Copy link
Member Author

prattmic commented Dec 6, 2019

For context, we'd like to build gVisor's binaries as PIE (with internal linking, as we forbid cgo/libc). Our main binary ends up with 222k relocations and takes 12s to start (vs ~35ms normally linked).

This seems like a massive number of relocations (it seems like type are mostly for type information and constant strings? I'm not very familiar with the extra data Go generates). It would be great to reduce the number of relocations, but just switching to R_X86_64_RELATIVE will make things at least reasonable.

@gopherbot
Copy link

Change https://golang.org/cl/210181 mentions this issue: cmd/link: generate R_X86_64_RELATIVE relocs for PIE

@toothrot toothrot changed the title Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) cmd/link: Extremely slow start for buildmode=PIE linkmode=internal binaries (R_X86_64_64 relocations) Dec 9, 2019
@toothrot toothrot added this to the Backlog milestone Dec 9, 2019
@toothrot toothrot added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Dec 9, 2019
@toothrot
Copy link
Contributor

toothrot commented Dec 9, 2019

@mdempsky
Copy link
Member

mdempsky commented Dec 9, 2019

Seems like a clearly worthwhile improvement, and there's already a CL under review. It'll have to wait for 1.15 though (and the CL is tagged R=go1.15 accordingly), since it's not a regression.

@mdempsky mdempsky added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Dec 9, 2019
@mdempsky mdempsky modified the milestones: Backlog, Go1.15 Dec 9, 2019
@odeke-em
Copy link
Member

odeke-em commented Dec 15, 2019

Hey @prattmic, thank you for the report and for the CL.

I examined this bug and your CL but noticed there isn't a test and thought that perhaps if need be we can create a regression test that'll check for an anomaly between those numbers and then report one. The caveat here is that my version of Go complains about your flags for the last build as per (also I don't have formal education in statistics :))

internal linking requested but external linking required: buildmode=pie

but perhaps this test can help after you tweak the slow buildmode arguments

package main

import (
	"io/ioutil"
	"math"
	"os"
	"os/exec"
	"path/filepath"
	"testing"
	"time"
)

const prog = `
package main

import (
        "fmt"
        "html"
        "log"
        "net/http"
)

func main() {
        http.HandleFunc("/bar", func(w http.ResponseWriter, r *http.Request) {
                fmt.Fprintf(w, "Hello, %q", html.EscapeString(r.URL.Path))
        })

        go http.ListenAndServe(":8080", nil)

        // Exit immediately so we can easily measure startup.
        log.Printf("Exiting...")
}`

func TestBuildModeSpeedComparisons(t *testing.T) {
	tmpDir, err := ioutil.TempDir("", "issue36028")
	if err != nil {
		t.Fatalf("Failed to create tmp directory: %v", err)
	}
	defer os.RemoveAll(tmpDir)

	mainGoFile := filepath.Join(tmpDir, "http.go")
	if err := ioutil.WriteFile(mainGoFile, []byte(prog), 0600); err != nil {
		t.Fatalf("Failed to write file: %v", err)
	}

	staticFile := filepath.Join(tmpDir, "http.static")
	pieFile := filepath.Join(tmpDir, "http.ext.pie")
	pieLinkFile := filepath.Join(tmpDir, "http.int.pie")
	var noEnv []string
	_, output, err := execIt(noEnv, "go", "build", "-o", staticFile, mainGoFile)
	if err != nil {
		t.Fatalf("Failed to build object file: %v\n%s", err, output)
	}
	_, output, err = execIt(noEnv, "go", "build", "-buildmode=pie", "-o", pieFile, mainGoFile)
	if err != nil {
		t.Fatalf("Failed to build PIE file: %v\n%s", err, output)
	}
	cgoEnabled0 := append(os.Environ(), "CGO_ENABLED=0")
	// TODO: Get @prattmic to properly add the slow build flags here.
	_, output, err = execIt(cgoEnabled0, "go", "build", "-ldflags=-linkmode=internal", "-o", pieLinkFile, mainGoFile)
	if err != nil {
		t.Fatalf("Failed to build PIE -linkmode file: %v\n%s", err, output)
	}

	durStatic, _, _ := execIt(noEnv, staticFile)
	durPIE, _, _ := execIt(noEnv, pieFile)
	durPIELink, _, _ := execIt(noEnv, pieLinkFile)

	values := []struct {
		name  string
		value time.Duration
	}{
		{"durStatic", durStatic},
		{"durPIE", durPIE},
		{"durPIELink", durPIELink},
	}
	sum := 0.0
	for _, value := range values {
		sum += float64(value.value)
	}
	mean := sum / float64(len(values))
	sumOfSquares := 0.0
	for _, value := range values {
		sumOfSquares += math.Pow(float64(value.value)-mean, 2.0)
	}
	variance := sumOfSquares / float64(len(values))
	σ := math.Pow(variance, 0.5)

        // Heuristic: outlier detection by 3 sigma-rule.
	t.Logf("σ stdDev: %s", time.Duration(σ))
	meanMinus2σ := time.Duration(mean - 2*σ)
	meanPlus2σ := time.Duration(mean + 2*σ)
	for _, value := range values {
		if value.value > meanPlus2σ {
			t.Errorf("Anomaly: %q: %s > (μ + 2σ = %s)", value.name, value.value, meanPlus2σ)
		} else if value.value  < meanMinus2σ {
			t.Errorf("Anomaly: %q: %s < (μ - 2σ = %s)", value.name, value.value, meanMinus2σ)
		}
	}
}

func execIt(env []string, prog string, args ...string) (dur time.Duration, output []byte, err error) {
	startTime := time.Now()
	defer func() {
		dur = time.Now().Sub(startTime)
	}()
	cmd := exec.Command(prog, args...)
	cmd.Env = env
	output, err = cmd.CombinedOutput()
	return
}

@prattmic
Copy link
Member Author

I can certainly add a timing test like that if desired. Are such tests common/desired in Go? These kinds of tests have the potential to be flaky due to small timing differences.

I could also add a test looking for specific kinds of relocations in the output, but that is fragile w.r.t. future changes to the linker.

@golang golang locked and limited conversation to collaborators Feb 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

5 participants