Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: clarify that conversions to slices don't guarantee slice capacity? #24163

Open
mattbostock opened this issue Feb 27, 2018 · 15 comments
Open
Labels
Documentation LanguageChange NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@mattbostock
Copy link
Contributor

What did you do?

Converted a string to a slice of runes then back to a string, and referenced non-existent elements on the slice of runes:

https://play.golang.org/p/MmJNLu6h6FT

What did you expect to see?

A runtime panic with a 'slice bounds out of range' error.

What did you see instead?

"\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"

System details

go version go1.10 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/matt/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/matt/go"
GORACE=""
GOROOT="/opt/boxen/homebrew/Cellar/go/1.10/libexec"
GOTMPDIR=""
GOTOOLDIR="/opt/boxen/homebrew/Cellar/go/1.10/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/k4/_18vs9g119b0q2_p950t0jl00000gn/T/go-build038036646=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.10 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.10
uname -v: Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.13.3
BuildVersion:	17D47
lldb --version: lldb-900.0.64
  Swift-4.0
@bradfitz
Copy link
Contributor

Interesting, thanks!

Here we can see that the cap is 32:
https://play.golang.org/p/PDAI5olA4YG

But uncomment the following line and the cap changes to 0:
https://play.golang.org/p/BA2ivc0sTY6

Definitely something messed up in the compiler.

/cc @randall77 @mdempsky

@bradfitz bradfitz changed the title Inconsistent panics on out-of-bounds slice index cmd/compile: inconsistent panics on out-of-bounds slice index Feb 27, 2018
@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Feb 27, 2018
@bradfitz bradfitz added this to the Go1.11 milestone Feb 27, 2018
@bradfitz
Copy link
Contributor

Actually, does the language guarantee the size of the cap on string->[]rune conversions?

https://golang.org/ref/spec#Conversions_to_and_from_a_string_type says only:

Converting a value of a string type to a slice of runes type yields a slice containing the individual Unicode code points of the string

But doesn't say anything about the slice's capacity.

/cc @griesemer

@mattbostock
Copy link
Contributor Author

Here's another example using len() that might help clarify how confusing this could be:
https://play.golang.org/p/YLwjpbY7_nX

@mdempsky
Copy link
Member

mdempsky commented Feb 27, 2018

@mattbostock Thanks for the examples. I think this example summarizes/demonstrates the root issue you're noticing: https://play.golang.org/p/Er5s4t9rO7R

(Also, note that the upper bound for slice operations is cap(s), not len(s); i.e., it's possible to grow a slice, as long as there's capacity.)

Like @bradfitz points out, I think the real question is whether string->[]byte or string->[]rune conversions guarantee a particular slice capacity. It seems the spec doesn't guarantee that.

@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Feb 27, 2018
@bradfitz
Copy link
Contributor

So seems everything is working fine, but the language spec could possibly be more specific here.

Moving to NeedsDecision to see whether the language gets changed.

/cc @ianlancetaylor too.

@gopherbot gopherbot removed the NeedsFix The path to resolution is known, but the work has not been done. label Feb 27, 2018
@bradfitz bradfitz added LanguageChange NeedsFix The path to resolution is known, but the work has not been done. labels Feb 27, 2018
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Feb 28, 2018
@ianlancetaylor
Copy link
Contributor

I think that for these conversions the result should be a slice with cap == len. I see no benefit to leaving it unspecified.

@mdempsky
Copy link
Member

mdempsky commented Feb 28, 2018

I see no benefit to leaving it unspecified.

I think there's a small performance benefit. Suppose something like:

var s string = f()
return append([]byte(s), "suffix"...)

For the conversion, the runtime will need to allocate memory. Also, it will pad the allocation up to the next allocation bucket size. If we expose this capacity to the user via the resulting slice's capacity, then the append might be able to fit into the existing space without requiring a second allocation.

@griesemer
Copy link
Contributor

As bad as it sounds, there may be existing code that relies on this feature...

Perhaps the spec just needs to say that it doesn't say anything about the capacity in these cases.

@dominikh
Copy link
Member

Related: #14232 and #14235

@bradfitz
Copy link
Contributor

As bad as it sounds, there may be existing code that relies on this feature...

Considering that escape analysis changes somewhat regularly, I kinda doubt anybody's relying on this. We've never heard anything about it.

@randall77
Copy link
Contributor

For conversions from string to both []byte and []rune, we allocate extra capacity. This happens for both stack-allocated (round up to the 32-byte stack buffer) and heap-allocated (round up to the next size class) results:

func main() {
	foo := "foo"
	bar := []rune(foo)
	fmt.Printf("%d %d\n", len(bar), cap(bar))

	baz := []rune(foo)
	sink = baz
	fmt.Printf("%d %d\n", len(baz), cap(baz))
}

var sink []rune

Prints:

3 32
3 4

So escape analysis doesn't really figure into this discussion, it just affects the constants.

IMO when Go does an allocation and the user didn't specify a capacity, the runtime should be free to allocate capacity somewhat beyond the requested size. Calling make gives you exactly the capacity requested*. Calling append or doing a cast need not.

*Although you could argue that only the 3-arg version of make requests an explicit capacity, the 2-arg version could allow cap>len.

@mdempsky
Copy link
Member

mdempsky commented Mar 1, 2018

So escape analysis doesn't really figure into this discussion, it just affects the constants.

I think @bradfitz's point was that because escape analysis changes, the constants particular user code sees sometimes changes too. Thus it's arguably safe if we want to intentionally change the constants.

*Although you could argue that only the 3-arg version of make requests an explicit capacity, the 2-arg version could allow cap>len.

I'd argue even for the 3-arg version, that the explicit cap argument should just be a lower bound on what the runtime returns. But unfortunately these seems out of scope for Go 1.

@randall77
Copy link
Contributor

@mdempsky Wow, you're even more of a rebel than I am.

@josharian
Copy link
Contributor

Although you could argue that only the 3-arg version of make requests an explicit capacity, the 2-arg version could allow cap>len.

To be clear, the spec does currently appear to forbid this.

I'd argue even for the 3-arg version, that the explicit cap argument should just be a lower bound on what the runtime returns. But unfortunately these seems out of scope for Go 1.

There's an analogy with making maps, where the len arg is just a hint. On the other hand, we definitely wouldn't want the len arg to buffered channels to be just a hint, since they are used to strictly control concurrency.

Another option is to add a "round up" function to package runtime that accepts a size (obtained presumably by unsafe.Sizeof) and a quantity and returns a recommended quantity. Then in cases in which it really matters, people would have the tools to do it--at the cost of using both package unsafe and package runtime. Combine with generics and you could even build append as a library function. :)

(Note that if one wanted to go really nuts and build this yourself without package runtime support, you could, by writing an init function that makes byte slices of various sizes and calls append on them and records the cap you get back, and inferring malloc slab sizes from them.)

@mdempsky
Copy link
Member

mdempsky commented Mar 1, 2018

I think the discussions about make are interesting, but increasingly off topic. I've opened #24204 to discuss them further. :)

@bradfitz bradfitz changed the title cmd/compile: inconsistent panics on out-of-bounds slice index spec: clarify that conversions to slices don't guarantee slice capacity? Apr 21, 2018
@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jun 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation LanguageChange NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

9 participants