Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: teach the compiler that arguments passed to {strings|bytes}.{Index*|Count|Has*} don't escape #25864

Closed
valyala opened this issue Jun 13, 2018 · 11 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@valyala
Copy link
Contributor

valyala commented Jun 13, 2018

The issue

Go tip doesn't know that string / []byte arguments passed to functions like strings.IndexByte don't escape. This leads to unnecessary allocation and copy in the following code:

package stringsIndex_test

import (
        "strings"
        "testing"
)


func BenchmarkStringsIndexByte(b *testing.B) {
        bb := []byte("foobar baz")
        b.ReportAllocs()
        for i := 0; i < b.N; i++ {
                // string(bb) unnecessarily allocates new string and copies bb contents to it.
                n := strings.IndexByte(string(bb), ' ')
                Sink += n
        }
}

var Sink int

Benchmark results indicate an unnecessary allocation:

BenchmarkStringsIndexByte 	50000000	        30.5 ns/op	      16 B/op	       1 allocs/op

The solution

Mark strings and bytes functions accepting string / []byte as go:noescape in order to eliminate unnecessary allocations in the code above.

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 13, 2018
@ianlancetaylor ianlancetaylor added this to the Go1.12 milestone Jun 13, 2018
@ianlancetaylor
Copy link
Contributor

CC @dr2chase

I don't think we want to start using go:noescape in places like this. But perhaps the compiler can figure it out.

@valyala
Copy link
Contributor Author

valyala commented Jun 13, 2018

I don't think we want to start using go:noescape in places like this. But perhaps the compiler can figure it out.

I think it will be too difficult figuring out this for functions written in assembly. Probably, it would be OK adding go:noescape for assembly-written functions?

@valyala
Copy link
Contributor Author

valyala commented Jun 13, 2018

This change may have significant positive impact on standard packages, since then escape analysis may prove that arguments to many functions written in Go that call strings.Index* and co don't escape.

@ianlancetaylor
Copy link
Contributor

Yes, sorry, it's fine to use go:noescape for functions written in assembly.

@mundaym
Copy link
Member

mundaym commented Jun 13, 2018

IndexByte is already marked with go:noescape:

//go:noescape
func IndexByte(b []byte, c byte) int

I suspect this is just a case where the string cast could be cleverer.

@randall77
Copy link
Contributor

I guess we could fix this, but why are you using strings.IndexByte(string(bb), ' ') instead of bytes.IndexByte(bb, ' ')?

@valyala
Copy link
Contributor Author

valyala commented Jun 15, 2018

IndexByte is already marked with go:noescape

Then the []byte -> string conversion with a memory allocation and a copy for non-escaped argument looks like a bug.

why are you using strings.IndexByte(string(bb), ' ') instead of bytes.IndexByte(bb, ' ')

This is just an artificial example :)
The strings.Index(string(b), stringPrefix) is more real-life. It may be used interchangeably with bytes.Index(b, []byte(stringPrefix)), since there is no a preferred code here.

@quasilyte
Copy link
Contributor

quasilyte commented Sep 19, 2018

The non-escaping string(b) still can allocate if it doesn't fit the small tmp buffer.
I think #27148 (comment) can be relevant here.

bytes.Index(b, []byte(stringPrefix))

This is a preferred way for now, since prefix is usually smaller than haystack itself, and it can probably fit fixes 32-byte stack allocated buffer.

quasilyte added a commit to go-critic/go-critic that referenced this issue Sep 19, 2018
Performance check.

Suggests to replace strings.Index with bytes.Index
where it can make code more efficient.

See golang/go#25864
quasilyte added a commit to go-critic/go-critic that referenced this issue Sep 19, 2018
Performance check.

Suggests to replace strings.Index with bytes.Index
where it can make code more efficient.

See golang/go#25864
@quasilyte
Copy link
Contributor

quasilyte commented Sep 19, 2018

Here is a simple benchmarking code:

package foo

import (
	"bytes"
	"strings"
	"testing"
)

var haystack = bytes.Repeat([]byte("a"), 100)
var needle = "aaa"

func BenchmarkStringsIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = strings.Index(string(haystack), needle)
	}
}

func BenchmarkBytesIndex(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = bytes.Index(haystack, []byte(needle))
	}
}

The results:

BenchmarkStringsIndex-8 20000000  73.3 ns/op 112 B/op 1 allocs/op
BenchmarkBytesIndex-8   100000000 15.1 ns/op   0 B/op 0 allocs/op

I've written a simple check that detects such calls that can be replaced with bytes.Index to aid in optimization sessions of big projects (https://go-critic.github.io/overview#indexAlloc-ref).

quasilyte pushed a commit to go-critic/go-critic that referenced this issue Sep 19, 2018
Performance check.

Suggests to replace strings.Index with bytes.Index
where it can make code more efficient.

See golang/go#25864
@gopherbot
Copy link

Change https://golang.org/cl/146018 mentions this issue: strings: declare Index as noescape

@randall77
Copy link
Contributor

The CL I just mailed fixes strings.Index. I think all the other methods mentioned in the issue title are already ok. Let me know if that doesn't mesh with your understanding.

@golang golang locked and limited conversation to collaborators Oct 30, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants