Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fmt: fmt.Fprintf("%+q", large byte string) allocates quite a lot #31472

Closed
kevinburke opened this issue Apr 15, 2019 · 4 comments
Closed

fmt: fmt.Fprintf("%+q", large byte string) allocates quite a lot #31472

kevinburke opened this issue Apr 15, 2019 · 4 comments
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@kevinburke
Copy link
Contributor

kevinburke commented Apr 15, 2019

I maintain a program called go-bindata which allows users to embed static assets (images, CSS, Javascript) in a Go program. The performance is not great, as this benchmark indicates: https://github.com/kevinburke/go-bindata/blob/master/benchmark_test.go#L24-L50

$ go test -run=BenchmarkBindata -bench=Bindata -benchmem -memprofile=mem.out .
goos: darwin
goarch: amd64
pkg: github.com/kevinburke/go-bindata
BenchmarkBindata-4   	       5	 239407710 ns/op	  28.67 MB/s	177947792 B/op	   16211 allocs/op
PASS
ok  	github.com/kevinburke/go-bindata	3.824s

When we embed a large image into the binary, we call this to ensure it can be embedded in Go source code - I'm eliding some of the details but this is the meat of it.

	if len(b) > 0 && utf8.Valid(b) && !bytes.Contains(b, []byte{0}) {
		w.Write(sanitize(b))
	} else {
		fmt.Fprintf(w, "%+q", b)
	}

Fprintf in turn ends up calling fmtQ, which calls strconv.AppendQuoteToASCII. The latter has a high number of very small append calls which per pprof get called quite a bit. I think the "grow by 25%" threshold gets triggered quite a lot when you start with an empty buffer and have a large format string. Some samples:

Total: 179339
ROUTINE ======================== strconv.appendQuotedWith in /Users/kevin/go/src/strconv/quote.go
      7963      16661 (flat, cum)  9.29% of Total
         .          .     31:		width = 1
         .          .     32:		if r >= utf8.RuneSelf {
         .          .     33:			r, width = utf8.DecodeRuneInString(s)
         .          .     34:		}
         .          .     35:		if width == 1 && r == utf8.RuneError {
      3792       3792     36:			buf = append(buf, `\x`...)
      1831       1831     37:			buf = append(buf, lowerhex[s[0]>>4])
      2340       2340     38:			buf = append(buf, lowerhex[s[0]&0xF])
         .          .     39:			continue
         .          .     40:		}
         .       8698     41:		buf = appendEscapedRune(buf, r, quote, ASCIIonly, graphicOnly)
         .          .     42:	}
         .          .     43:	buf = append(buf, quote)
         .          .     44:	return buf
         .          .     45:}
         .          .     46:
         .          .     74:	switch r {
         .          .     75:	case '\a':
       165        165     76:		buf = append(buf, `\a`...)
         .          .     77:	case '\b':
        22         22     78:		buf = append(buf, `\b`...)
         .          .     79:	case '\f':
        78         78     80:		buf = append(buf, `\f`...)
         .          .     81:	case '\n':
        40         40     82:		buf = append(buf, `\n`...)
         .          .     83:	case '\r':
        75         75     84:		buf = append(buf, `\r`...)
         .          .     85:	case '\t':
         .          .     86:		buf = append(buf, `\t`...)
         .          .     87:	case '\v':
        21         21     88:		buf = append(buf, `\v`...)
         .          .     89:	default:
         .          .     90:		switch {
         .          .     91:		case r < ' ':
      2236       2236     92:			buf = append(buf, `\x`...)
      1533       1533     93:			buf = append(buf, lowerhex[byte(r)>>4])
      1381       1381     94:			buf = append(buf, lowerhex[byte(r)&0xF])

It would be great if we could improve the performance of this. Maybe we could do one pass to compute the size and then allocate a single buffer instead of allocating throughout the program execution.

@gopherbot
Copy link

Change https://golang.org/cl/172077 mentions this issue: strconv: benchmark large string in AppendQuoteToASCII

@kevinburke
Copy link
Contributor Author

I'm interested in making improvements here but not sure I have any great ideas besides just try a bunch of different tweaks to the strconv code and benchmark them, if you have suggestions that would be great!

@bradfitz bradfitz added this to the Unplanned milestone Apr 15, 2019
@bradfitz bradfitz added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 15, 2019
@robpike robpike self-assigned this Apr 15, 2019
@gopherbot
Copy link

Change https://golang.org/cl/172677 mentions this issue: strconv: pre-allocate in appendQuotedWith

@kevinburke
Copy link
Contributor Author

FWIW, performance on the github.com/kevinburke/go-bindata benchmarks, with tip just before this commit:

name            time/op
Bindata-4          242ms ± 0%

name            speed
Bindata-4       28.3MB/s ± 0%

name            alloc/op
Bindata-4          178MB ± 0%

name            allocs/op
Bindata-4          16.2k ± 0%

Including this commit:

name            time/op
Bindata-4          242ms ± 0%

name            speed
Bindata-4       28.4MB/s ± 0%

name            alloc/op
Bindata-4          159MB ± 0%

name            allocs/op
Bindata-4          15.1k ± 0%

@golang golang locked and limited conversation to collaborators Apr 17, 2020
@rsc rsc unassigned robpike Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

4 participants