cmd/compile: treat []byte("xyz") and []byte{'x', 'y', 'z'} the same during escape analysis #10170
Comments
This cannot be done. Consider this expression:
var x = []byte("0123")
x is now a byte slice. If the conversion is done at compile time there can be only one instance of the memory holding the string. For some simple cases, that will result in incorrect behavior. This function:
func f(b byte) []byte {
If x is built at compile time, every call to f will return the same memory, so if we call f('a') and then f('b'), the second call will overwrite the memory returned by the first call.
In other words, this is a change in language semantics that cannot be done in general. There are some cases where a compiler might be able to guarantee this cannot happen, but semantically the conversion cannot be done only once. The program must behave as though new memory is created for each conversion.
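The requirement that each conversion behaves as though it creates new memory can be checked with a small program. This is only a sketch: the helper name fresh and its body are hypothetical and are not the function from the comment above, whose body is not shown here.

```go
package main

import "fmt"

// fresh is a hypothetical helper (not the f from the thread). It converts a
// constant string to a byte slice and overwrites the first byte with b.
func fresh(b byte) []byte {
	x := []byte("0123") // must behave as if new memory is created on every call
	x[0] = b
	return x
}

func main() {
	first := fresh('a')
	second := fresh('b')
	// If both calls shared one backing array, first would now read "b123".
	fmt.Printf("%s %s\n", first, second) // prints "a123 b123"
}
```

Because the two results are independent, writing through second leaves first untouched. Any compile-time treatment of the conversion would have to preserve that behavior.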
@robpike You may have misunderstood. The pull quote of the request is:
If I perform that change to your example manually, there is no incorrect behavior: (See a playground example where two calls to this function produce different slices with different contents.)
Perhaps I'm missing something and there is another example where this transformation wouldn't preserve semantics? In any case, an almost 25x speedup for a commonly used pattern with such a simple transformation could be important.
The benchmark is not comparing apples to apples.
In BenchmarkInlineSlice, because the slice doesn't escape, it is actually allocated on the stack.
However, in BenchmarkStringToSlice the runtime always allocates a new []byte, so that benchmark is actually comparing heap allocation vs. stack allocation.
If I make the b argument to count escape, by assigning it to a global []byte, the results are comparable:
BenchmarkStringToSlice 20000000 87.3 ns/op
BenchmarkInlineSlice 20000000 69.8 ns/op
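The effect described above (stack allocation when the slice stays local, heap allocation once it is assigned to a global) can be observed with testing.AllocsPerRun. The following is a rough sketch, not the benchmark from the thread: count and sink are hypothetical stand-ins, and the numbers printed depend on the Go version and compiler optimizations, so no expected output is given.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable; storing a slice in it forces the slice to escape.
var sink []byte

// count is a hypothetical stand-in for the benchmarked function. It only reads b.
func count(b []byte) int {
	n := 0
	for _, c := range b {
		if c != 0 {
			n++
		}
	}
	return n
}

func main() {
	const runs = 1000

	// Slices that stay local: the compiler is free to keep them off the heap.
	localConv := testing.AllocsPerRun(runs, func() { count([]byte("xyz")) })
	localLit := testing.AllocsPerRun(runs, func() { count([]byte{'x', 'y', 'z'}) })

	// Slices assigned to a global: both forms must be heap allocated.
	escConv := testing.AllocsPerRun(runs, func() {
		b := []byte("xyz")
		sink = b
		count(b)
	})
	escLit := testing.AllocsPerRun(runs, func() {
		b := []byte{'x', 'y', 'z'}
		sink = b
		count(b)
	})

	fmt.Printf("local:    conversion=%v literal=%v allocs/run\n", localConv, localLit)
	fmt.Printf("escaping: conversion=%v literal=%v allocs/run\n", escConv, escLit)
}
```

Building the same file with go build -gcflags=-m prints the compiler's escape analysis decisions, which is one way to see which of the slices are reported as escaping to the heap.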
I'm not sure what the relation is between escaping and allocating, but indeed most of the difference between the two benchmarks is that BenchmarkStringToSlice allocates on the heap, which is not necessary. This was actually a common usage pattern while I was working on a minifier which writes literal strings to an
Edit: I see, escape analysis, I thought unicode escapes!
I am going to reopen this, and I've changed the title. What you're asking for in the initial report is not quite right, but it is definitely the case that []byte("xyz") and []byte{'x', 'y', 'z'} should behave the same, whatever that behavior is. As Minux has pointed out, one is getting more benefit from escape analysis than the other, which causes the inconsistency in the benchmarks. We should fix the inconsistency. We will not change the semantics, though.
I think we are closer (non-escaping with or without inlining enabled)
and
But though those two times are closer, they are clearly not equal.
We will defer this to Go 1.6. David confirmed they are the same for escape analysis. The issue is that []byte("xyz") does a runtime conversion of string to []byte, not a compile-time conversion. We disabled the compile-time conversion because it was very bad for large strings. Any fix will have to reckon with that. See issue #6643 and https://golang.org/cl/15930045
So, this is not about escape analysis anymore. In any case, the problem is
Change https://golang.org/cl/140698 mentions this issue:
Whenever a constant string (such as a string literal) is cast to a byte slice, the compiler should perform the cast at compile time and not at runtime.
Any []byte("foo") should be transformed implicitly to []byte{'f', 'o', 'o'}.