cmd/compile: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis #10170

tdewolff · 2015-03-15T18:43:55Z

Whenever a constant string (such as a string literal) is converted to a byte slice, the compiler should perform the conversion at compile time rather than at run time.

Any []byte("foo") should be transformed implicitly to []byte{'f', 'o', 'o'}.

package main

import (
    "testing"
)

var c = 0
func count(b []byte) {
    c += len(b)
}

func BenchmarkStringToSlice(b *testing.B) {
    for i := 0; i < b.N; i++ {
        count([]byte("lorem"))
    }
}

func BenchmarkInlineSlice(b *testing.B) {
    for i := 0; i < b.N; i++ {
        count([]byte{'l', 'o', 'r', 'e', 'm'})
    }
}

BenchmarkStringToSlice  30000000                56.8 ns/op
BenchmarkInlineSlice    1000000000               2.09 ns/op
ok      test3   4.092s


robpike · 2015-03-15T19:15:48Z

This cannot be done. Consider this expression:

var x = []byte("0123")

x is now a byte slice. If the conversion is done at compile time there can be only one instance of the memory holding the string. For some simple cases, that will result in incorrect behavior. This function:

func f(b byte) []byte {
    var x = []byte("0123")
    x[0] = b
    return x
}

If x is built at compile time, every call to f will return the same memory, so if we call f('a') and then f('b'), the second call will overwrite the memory returned by the first call.

In other words, this is a change in language semantics that cannot be done in general. There are some cases where a compiler might be able to guarantee this cannot happen, but semantically the conversion cannot be done only once. The program must behave as though new memory is created for each conversion.
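A minimal sketch of the behavior described above. The function f is taken from the comment; fShared and the package-level variable shared are hypothetical names that only simulate what a single compile-time instance of the memory would look like, and are not how any compiler actually implements the conversion.

```go
package main

import "fmt"

// f is the function from the comment: the conversion runs on every call,
// so each call must return its own memory.
func f(b byte) []byte {
	x := []byte("0123")
	x[0] = b
	return x
}

// shared is a hypothetical stand-in for "only one instance of the memory".
var shared = []byte("0123")

// fShared simulates a conversion that was performed only once:
// every call hands out the same backing array.
func fShared(b byte) []byte {
	x := shared
	x[0] = b
	return x
}

func main() {
	a1, b1 := f('a'), f('b')
	fmt.Printf("%s %s\n", a1, b1) // prints "a123 b123"

	a2, b2 := fShared('a'), fShared('b')
	fmt.Printf("%s %s\n", a2, b2) // prints "b123 b123": the second call overwrote the first result
}
```

The second line of output is the incorrect behavior the comment warns about: the caller of fShared('a') sees its slice change after an unrelated call.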

infogulch · 2015-03-15T23:22:42Z

@robpike You may have misunderstood. The pull quote of the request is:

Any []byte("foo") should be transformed implicitly to []byte{'f', 'o', 'o'}.

If I apply that change to your example by hand, the behavior remains correct (see a playground example where two calls to this function produce different slices with different contents):

func g(b byte) []byte {
    var x = []byte{'0', '1', '2', '3'}
    x[0] = b
    return x
}

Perhaps I'm missing something and there is another example where this transformation wouldn't preserve semantics? In any case, an almost 25x speedup for a commonly used pattern with such a simple transformation could be important.
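The claim can be checked with a small program. This is a sketch, not the playground example referred to above: f and g are copied from the thread, and the checks in main are an assumption about what "preserving semantics" should mean here (equal contents, distinct memory per call).

```go
package main

import (
	"bytes"
	"fmt"
)

// f uses the string conversion form.
func f(b byte) []byte {
	x := []byte("0123")
	x[0] = b
	return x
}

// g uses the composite literal form.
func g(b byte) []byte {
	x := []byte{'0', '1', '2', '3'}
	x[0] = b
	return x
}

func main() {
	fa, fb := f('a'), f('b')
	ga, gb := g('a'), g('b')

	// Both forms produce the same contents.
	fmt.Println(bytes.Equal(fa, ga), bytes.Equal(fb, gb)) // prints "true true"

	// Both forms produce distinct backing arrays for distinct calls.
	fmt.Println(&fa[0] != &fb[0], &ga[0] != &gb[0]) // prints "true true"
}
```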

minux · 2015-03-16T00:02:46Z

The benchmark is not comparing apples to apples. In BenchmarkInlineSlice, because the slice doesn't escape, it is actually allocated on the stack. In BenchmarkStringToSlice, however, the runtime always allocates a new []byte, so the benchmark is actually comparing heap allocation with stack allocation. If I make the b argument to count escape, by assigning it to a global []byte, the results are comparable:

BenchmarkStringToSlice  20000000                87.3 ns/op
BenchmarkInlineSlice    20000000                69.8 ns/op
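One way to see the escaping case directly is to count allocations instead of timing them. The sketch below assumes a package-level variable named sink (a hypothetical name) to force both slices to escape, and uses testing.AllocsPerRun rather than the benchmarks from the report; the helper names are also illustrative. Exact numbers depend on the compiler version, so none are shown.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a hypothetical global; assigning to it makes the slice escape,
// so it can no longer live on the stack.
var sink []byte

// allocsConversion reports the average number of heap allocations per
// call when the string conversion result escapes.
func allocsConversion() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = []byte("lorem")
	})
}

// allocsLiteral does the same for the composite literal form.
func allocsLiteral() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = []byte{'l', 'o', 'r', 'e', 'm'}
	})
}

func main() {
	fmt.Println("conversion, escaping:", allocsConversion())
	fmt.Println("literal, escaping:   ", allocsLiteral())
}
```

When both forms are forced to the heap like this, each has to allocate, which is consistent with the comparable timings reported above.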

tdewolff · 2015-03-16T09:33:15Z

I'm not sure what the relation is between escaping and allocating, but indeed most of the difference between the two benchmarks is that BenchmarkStringToSlice allocates on the heap, which is not necessary. This was actually a common usage pattern while I was working on a minifier that writes literal strings to an io.Writer.

Edit: I see, escape analysis, I thought unicode escapes!

rsc · 2015-03-16T14:48:28Z

I am going to reopen this, and I've changed the title. What you're asking for in the initial report is not quite right, but it is definitely the case that []byte("xyz") and []byte{'x', 'y', 'z'} should behave the same, whatever that behavior is. As Minux has pointed out, one is getting more benefit from escape analysis than the other, which causes the inconsistency in the benchmarks. We should fix the inconsistency. We will not change the semantics, though.

rsc · 2015-04-10T03:20:21Z

@dr2chase

dr2chase · 2015-05-11T20:24:45Z

I think we are closer (non-escaping with or without inlining enabled)

drchase$ go tool 6g -m asdf_test.go
asdf_test.go:9: can inline count
asdf_test.go:15: inlining call to count
asdf_test.go:21: inlining call to count
asdf_test.go:9: count b does not escape
asdf_test.go:13: BenchmarkStringToSlice b does not escape
asdf_test.go:15: BenchmarkStringToSlice ([]byte)("lorem") does not escape
asdf_test.go:19: BenchmarkInlineSlice b does not escape
asdf_test.go:21: BenchmarkInlineSlice []byte literal does not escape

and

BenchmarkStringToSlice  100000000           14.6 ns/op
BenchmarkInlineSlice    1000000000           2.30 ns/op

But though those two times are closer, they are clearly not equal.

rsc · 2015-05-11T20:45:47Z

We will defer this to Go 1.6. David confirmed they are the same for escape analysis. The issue is that []byte("xyz") does a runtime conversion of string to []byte, not a compile-time conversion. We disabled the compile-time conversion because it was very bad for large strings. Any fix will have to reckon with that.

See issue #6643 and https://golang.org/cl/15930045

quasilyte · 2018-09-19T13:52:13Z

So, this is not about escape analysis anymore.
I also think it can be closed in favor of any other issue that describes the last issue pointed out (probably #26498).

In any case, the problem is []byte("literal") conversion and lack of compile-time optimization for that, not the escape analysis, so the title is misleading.

gopherbot · 2018-10-10T15:51:07Z

Change https://golang.org/cl/140698 mentions this issue: cmd/compile: make []byte("...") more efficient

robpike closed this as completed Mar 15, 2015

rsc changed the title ~~cmd/gc: literal byte slice from string~~ cmd/gc: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis Mar 16, 2015

rsc reopened this Mar 16, 2015

rsc added this to the Go1.5 milestone Apr 10, 2015

dr2chase self-assigned this May 11, 2015

rsc modified the milestones: Go1.6, Go1.5 May 11, 2015

rsc changed the title ~~cmd/gc: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis~~ cmd/compile: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis Jun 8, 2015

josharian mentioned this issue Jun 18, 2015

proposal: cmd/link: redesign format of intermediate object files #11123

Closed

rsc modified the milestones: Unplanned, Go1.6 Dec 14, 2015

mvdan mentioned this issue Jul 20, 2018

cmd/compile: constant string -> []byte and []byte -> string conversions aren't constant folded #26498

Open

gopherbot closed this as completed in ceb0c37 Oct 10, 2018

golang locked and limited conversation to collaborators Oct 10, 2019

gopherbot added the FrozenDueToAge label Oct 10, 2019

rsc unassigned dr2chase Jun 23, 2022
