Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis #10170

Closed
tdewolff opened this issue Mar 15, 2015 · 10 comments

Comments

@tdewolff
Copy link

Whenever a constant string (such as a literal string) is casted to a byte slice the compiler should perform the cast at compile time and not runtime.

Any []byte("foo") should be transformed implicitly to []byte{'f', 'o', 'o'}.

package main

import (
    "testing"
)

var c = 0
func count(b []byte) {
    c += len(b)
}

func BenchmarkStringToSlice(b *testing.B) {
    for i := 0; i < b.N; i++ {
        count([]byte("lorem"))
    }
}

func BenchmarkInlineSlice(b *testing.B) {
    for i := 0; i < b.N; i++ {
        count([]byte{'l', 'o', 'r', 'e', 'm'})
    }
}
BenchmarkStringToSlice  30000000                56.8 ns/op
BenchmarkInlineSlice    1000000000               2.09 ns/op
ok      test3   4.092s
@robpike
Copy link
Contributor

robpike commented Mar 15, 2015

This cannot be done. Consider this expression:

var x = []byte("0123")

x is now a byte slice. If the conversion is done at compile time there can be only one instance of the memory holding the string. For some simple cases, that will result in incorrect behavior. This function:

func f(b byte) []byte {
var x = []byte("0123")
x[0] = b
return x
}

If x is built at compile time, every call to f will return the same memory, so if we call f('a') and then f('b'), the second call will overwrite the memory returned by the first call.

In other words, this is a change in language semantics that cannot be done in general. There are some cases where a compiler might be able to guarantee this cannot happen, but semantically the conversion cannot be done only once. The program must behave as though new memory is created for each conversion.

@robpike robpike closed this as completed Mar 15, 2015
@infogulch
Copy link
Contributor

@robpike You may have misunderstood. The pull quote of the request is:

Any []byte("foo") should be transformed implicitly to []byte{'f', 'o', 'o'}.

If I perform that change to your example manually, there is no incorrect behavior: (See a playground example where two calls to this function produce different slices with different contents.)

func g(b byte) []byte {
    var x = []byte{'0', '1', '2', '3'}
    x[0] = b
    return x
}

Perhaps I'm missing something and there is another example where this transformation wouldn't preserve semantics? In any case, an almost 25x speedup for a commonly used pattern with such a simple transformation could be important.

@minux
Copy link
Member

minux commented Mar 16, 2015 via email

@tdewolff
Copy link
Author

I'm not sure what the relation is between escaping and allocating, but indeed most of the difference between both benchmarks is that BenchmarkStringToSlice allocates on the heap which is not necessary. This was actually a common use pattern while I was working on a minifier which writes literal strings to an io.Writer.

Edit: I see, escape analysis, I thought unicode escapes!

@rsc rsc changed the title cmd/gc: literal byte slice from string cmd/gc: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis Mar 16, 2015
@rsc
Copy link
Contributor

rsc commented Mar 16, 2015

I am going to reopen this, and I've changed the title. What you're asking for in the initial report is not quite right, but it is definitely the case that []byte("xyz") and []byte{'x', 'y', 'z'} should behave the same, whatever that behavior is. As Minux has pointed out, one is getting more benefit from escape analysis than the other, which causes the inconsistency in the benchmarks. We should fix the inconsistency. We will not change the semantics, though.

@rsc rsc reopened this Mar 16, 2015
@rsc rsc added this to the Go1.5 milestone Apr 10, 2015
@rsc
Copy link
Contributor

rsc commented Apr 10, 2015

@dr2chase

@dr2chase
Copy link
Contributor

I think we are closer (non-escaping with or without inlining enabled)

drchase$ go tool 6g -m asdf_test.go
asdf_test.go:9: can inline count
asdf_test.go:15: inlining call to count
asdf_test.go:21: inlining call to count
asdf_test.go:9: count b does not escape
asdf_test.go:13: BenchmarkStringToSlice b does not escape
asdf_test.go:15: BenchmarkStringToSlice ([]byte)("lorem") does not escape
asdf_test.go:19: BenchmarkInlineSlice b does not escape
asdf_test.go:21: BenchmarkInlineSlice []byte literal does not escape

and

BenchmarkStringToSlice  100000000           14.6 ns/op
BenchmarkInlineSlice    1000000000           2.30 ns/op

But though those two times are closer, they are clearly not equal.

@dr2chase dr2chase self-assigned this May 11, 2015
@rsc
Copy link
Contributor

rsc commented May 11, 2015

We will defer this to Go 1.6. David confirmed they are the same for escape analysis. The issue is that []byte("xyz") does a runtime conversion of string to byte, not a compile-time conversion. We disabled the compile-time conversion because it was very bad for large strings. Any fix will have to reckon with that.

See issue #6643 and https://golang.org/cl/15930045

@rsc rsc modified the milestones: Go1.6, Go1.5 May 11, 2015
@rsc rsc changed the title cmd/gc: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis cmd/compile: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis Jun 8, 2015
@rsc rsc modified the milestones: Unplanned, Go1.6 Dec 14, 2015
@quasilyte
Copy link
Contributor

quasilyte commented Sep 19, 2018

So, this is not about escape analysis anymore.
I also think it can be closed in favor of any other issue that describes the last issue pointed out
(Probably #26498).

In any case, the problem is []byte("literal") conversion and lack of compile-time optimization for that, not the escape analysis, so the title is misleading.

@gopherbot
Copy link

Change https://golang.org/cl/140698 mentions this issue: cmd/compile: make []byte("...") more efficient

@golang golang locked and limited conversation to collaborators Oct 10, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants