Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Go 2: spec: remove special cases for strings in conversions, append, and copy #24529

Closed
jimmyfrasche opened this issue Mar 26, 2018 · 17 comments
Labels
FrozenDueToAge LanguageChange NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Proposal v2 A language change or incompatible library change
Milestone

Comments

@jimmyfrasche
Copy link
Member

Strings have a number of special cases in the spec.

Some or all of these special cases should be considered for removal in Go 2 in order to to buy complexity for more general features.

Conversions from string to []byte or []rune

These can be replaced by conversion functions in the strings package.

strings.Bytes(s) is not as concise as []byte(s) but it clearer that work (allocation) is being done.

Conversion from []byte to string

This can be a function in the bytes package.

Conversion from byte or rune to string

This could be a function in unicode/utf8 (taking rune since you can convert a byte to rune easily)

Append and copy

There is a special case in append allowing

var b []byte
b = append(b, "string"...)

and a special case in copy allowing

b := make([]byte, 6)
copy(b, "string")

The equivalent code with an explicit conversion is clearer.

@gopherbot gopherbot added this to the Proposal milestone Mar 26, 2018
@robpike robpike added the v2 A language change or incompatible library change label Mar 26, 2018
@ianlancetaylor ianlancetaylor changed the title proposal: Remove special cases for strings in conversions, append, and copy proposal: remove special cases for strings in conversions, append, and copy Mar 26, 2018
@ianlancetaylor
Copy link
Contributor

In making this argument, it would help to see counts of how often the feature is used in the standard library and in some collection of popular Go packages. If a feature is used frequently, then it is likely worth keeping even if removing it would simplify the language. Perhaps the feature should never have been added in the first place, but now that it is there removing it will have a cost. Go 2 is not starting from a blank slate; we can't ignore the cost of changing the existing language.

@jimmyfrasche
Copy link
Member Author

@ianlancetaylor Agreed, but there's been talk of removing features from Go 2 to keep the language simple on balance. I've had more than my share to say about what should be added so I thought it only fair to look for something that could be removed.

Replacing all of the conversions with library functions is easily go fixable.

For copy and append, removing these special cases could allow them to be defined as standard generic functions and is as go fixable.

I wrote a program to count uses of these special cases (as best as I was able, I'm not sure if I got all the cases correct, review/PRs welcome) https://github.com/jimmyfrasche/string-special-case-counter

Running

string-special-case-counter cmd std golang.org/x/{crypto,image,net,sys,text,tools}/...

(has to type check so this is just for GOARCH=amd64 GOOS=linux and it skips all vendor directories to avoid double counting) reports:

[]byte(string): 2834
string([]byte): 1576
[]rune(string): 88
string(rune): 204
string(byte): 10
append: 251
copy: 136

packages examined: 701
lloc examined: 1466948

(lloc are counted as distinct lines reported by the FileSet per File).

@ianlancetaylor
Copy link
Contributor

Thanks. I'm curious now: can you tell us what the 10 instances of string(byte) are? Removing that specific conversion, as well as string(rune), is already #3939.

@jimmyfrasche
Copy link
Member Author

Interestingly that particular one seems to almost entirely exist in xtest packages:

import/path:file:line

bufio_test:bufio_test.go:302
encoding/json:scanner.go:607
strconv_test:ftoa_test.go:138
strconv_test:ftoa_test.go:142
strconv_test:ftoa_test.go:147
strconv_test:ftoa_test.go:151
golang.org/x/net/http2:frame_test.go:552
golang.org/x/net/http2:frame_test.go:1167
golang.org/x/tools/cmd/godoc:codewalk.go:304
golang.org/x/tools/present:args.go:40

@jimmyfrasche
Copy link
Member Author

string(rune) seems to prefer xtest, too, though not quite as much: https://gist.github.com/jimmyfrasche/b1fb6fdc3495f0deadb87c1c26d278df

@dsnet dsnet changed the title proposal: remove special cases for strings in conversions, append, and copy proposal: Go 2: spec: remove special cases for strings in conversions, append, and copy Mar 27, 2018
@jimmyfrasche
Copy link
Member Author

A potential upgrade path, should this (or a subset, like #3939) be accepted:

  1. introduce the functions that are equivalent to the conversions
  2. go fix code owned by the Go org.
  3. (Go2) remove the special cases from the language but still have the compiler recognize the old versions and provide "you have to do it this way now" errors (this could be removed after a few versions as a cleanup).

That

  • let's people use the new functions in Go 1, if they want
  • ensures that there are no performance regressions when using the functions rather than the conversions (by testing it on Go itself)
  • and, if anyone forgets it was removed, there's a helpful error message.

The majority of the work can be done under Go 1.

@jimmyfrasche
Copy link
Member Author

Related #23536 and #23814.

@bcmills
Copy link
Contributor

bcmills commented Apr 3, 2018

To me, the conversion operations and the special-cases in append and copy are very different.

The conversion operations, as you note, hide expensive operations in seemingly-simple conversions and add a bunch of complexity when reasoning about defined types.

On the other hand, the special-cases in append and copy serve a real purpose: they allow programs to copy from an immutable string to a mutable byte slice in a way that clearly does not allocate an intermediate slice in the process. (The equivalent code using an explicit call to strings.Bytes or similar would appear to allocate when in fact it does not.)

@jimmyfrasche
Copy link
Member Author

@bcmills that's very true but it would still be more consistent and simpler if they were all gone. I'd rather something cheap looked expensive (and explicit) but was optimized out, personally.

Also, if Go2 gets generics, the existing generic built ins are going to have to work with whatever that entails and having a simpler type parameterization may make that simpler.

My main goal with this issue was to consider this class of cases as a whole: all, some, or none can be removed, of course.

@jimmyfrasche
Copy link
Member Author

I wonder how many times copy(byteSlice, []byte(aString)) and append(byteSlice, []byte(aString)...) have been written. I'm sure I've done it plenty.

I should update my program to count those separately. I wouldn't expect to find many, if any at all, in the stdlib or x/ packages.

@ianlancetaylor do you have any suggestions for a collection of popular Go packages to analyze? I guess I can just go with what gddo lists as popular (not otherwise counted so far).

@bcmills
Copy link
Contributor

bcmills commented Apr 3, 2018

If you find many (copy|append)(byteSlice, []byte(aString)), maybe that's a good candidate for a lint check (independently of any Go 2 changes).

@jimmyfrasche
Copy link
Member Author

In cmd std golang.org/x/{crypto,image,net,sys,text,tools}/... there are append(bytes, []byte(string)...) at

golang.org/x/crypto/ssh/terminal:terminal.go:222
golang.org/x/net/proxy:proxy_test.go:181

and copy(bytes, []byte(string)) at

crypto/tls:handshake_messages.go:181
crypto/tls:handshake_messages.go:636
crypto/tls:handshake_messages.go:671
crypto/tls:handshake_messages.go:1162
crypto/tls:handshake_client_test.go:55
crypto/tls:handshake_client_test.go:57
syscall:lsf_linux.go:55
golang.org/x/crypto/bcrypt:bcrypt.go:245
golang.org/x/net/dns/dnsmessage:message.go:1538
golang.org/x/net/icmp:interface.go:166

(for go 1.10.1 and I updated golang.org/x/ less than a week ago)

@jimmyfrasche
Copy link
Member Author

Here are the results for a handful of random packages. I looked at some others but only included results from trees that my program could fully analyze due to test deps I didn't want to track down:

gopkg.in/yaml.v2/...
[]byte(string): 42
string([]byte): 14
[]rune(string): 2

string(rune): 1
string(byte): 0

append([]byte, string...): 0
copy([]byte, string): 0

append([]byte, []byte(string)...): 0
copy([]byte, []byte(string)): 0

packages examined: 2
lloc examined: 6450

github.com/gorilla/websocket/...
[]byte(string): 17
string([]byte): 9
[]rune(string): 0

string(rune): 0
string(byte): 0

append([]byte, string...): 28
copy([]byte, string): 4

append([]byte, []byte(string)...): 0
copy([]byte, []byte(string)): 0

packages examined: 6
lloc examined: 3312

github.com/gorilla/mux/...
[]byte(string): 15
string([]byte): 0
[]rune(string): 0

string(rune): 0
string(byte): 0

append([]byte, string...): 0
copy([]byte, string): 0

append([]byte, []byte(string)...): 0
copy([]byte, []byte(string)): 0

packages examined: 2
lloc examined: 3353

github.com/dgrijalva/jwt-go/...
[]byte(string): 19
string([]byte): 4
[]rune(string): 0

string(rune): 0
string(byte): 0

append([]byte, string...): 0
copy([]byte, string): 0

append([]byte, []byte(string)...): 0
copy([]byte, []byte(string)): 0

packages examined: 5
lloc examined: 1872

github.com/aws/aws-sdk-go/...
[]byte(string): 212
string([]byte): 126
[]rune(string): 0

string(rune): 0
string(byte): 1

append([]byte, string...): 0
copy([]byte, string): 0

append([]byte, []byte(string)...): 0
copy([]byte, []byte(string)): 4

packages examined: 386
lloc examined: 454907

github.com/pkg/errors/...
[]byte(string): 0
string([]byte): 0
[]rune(string): 0

string(rune): 0
string(byte): 0

append([]byte, string...): 0
copy([]byte, string): 0

append([]byte, []byte(string)...): 0
copy([]byte, []byte(string)): 0

packages examined: 2
lloc examined: 1053

@jimmyfrasche
Copy link
Member Author

How is that substantially different than other compiler intrinsics?

@josephvusich
Copy link

josephvusich commented Apr 29, 2018

string(byte) and string(rune) are valid in constant expressions:

// Valid
const b = byte(255)
const s = string(b)

// Would this work?
const b = byte(255)
const s = utf8.RuneToString(rune(b))

It's my understanding that methods, intrinsic or not, are not allowed in constant expressions (feel free to provide a counter-example if I've forgotten something.)

@jimmyfrasche
Copy link
Member Author

@jvusich that's a good point.

Without additional loosening of what's considered a constant expression (unlikely), those are lost.

But they're not exactly widely used now and can be rewritten using escapes in string literals const s = "\u00ff".

A const block using that and iota could be unpleasant to go fix, though.

It would work if the functions were built ins (complex can be used in constant expressions), but if the idea is thin to spec that doesn't really add anything.

@ianlancetaylor ianlancetaylor added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label May 1, 2018
@ianlancetaylor
Copy link
Contributor

Conversions between string and []byte are widely used. Removing them from the language won't solve any problems. We should continue to consider removing conversions from int to string, which is #3939, but we should keep conversions between string and []byte.

@golang golang locked and limited conversation to collaborators Jan 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge LanguageChange NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Proposal v2 A language change or incompatible library change
Projects
None yet
Development

No branches or pull requests

6 participants