Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bytes: Runtime mutations of variables without pointers and unsafe.* operations in user code. #28554

Closed
llorephie opened this issue Nov 2, 2018 · 6 comments

Comments

@llorephie
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.11.1 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/llorephie/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/llorephie/.go"
GOPROXY=""
GORACE=""
GOROOT="/usr/lib/go"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build857919622=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Variant 1: https://play.golang.org/p/FUSIhyjg2y2
Variant 2 (almost same, another way to initialize readBufferCopy4 variable): https://play.golang.org/p/0aB0m2XMH4D

What did you expect to see?

readBufferCopy1 == readBufferCopy2 == readBufferCopy3 == readBufferCopy4 == readBufferCopy5 == readBuffer == contentToWriteOut (untouched encrypted content)

What did you see instead?

Almost all variables decrypted with addition of InitVector at beginning, excluding readBufferCopy5, readBufferCopy4 in make() case;

=====
Additionaly mentioning @andrcmdr for additional comments by this case and watching.

@ianlancetaylor
Copy link
Contributor

Yes, this is how slices work. See https://blog.golang.org/go-slices-usage-and-internals .

@andrcmdr
Copy link

andrcmdr commented Nov 2, 2018

Hi Ian! First of all - thank you for your time!

I think it's not so obvious behaviour, 'cause it feels like (and work so) passing slice elements (of an array) by reference, but Dave Cheney (@davecheney) talking to us that Go is a "pass by value language" and "there is no pass-by-reference in Go" (https://dave.cheney.net/2017/04/29/there-is-no-pass-by-reference-in-go). This got stuck in my memory. =)

https://blog.golang.org/go-slices-usage-and-internals

Slicing does not copy the slice's data. It creates a new slice value that points to the original array. This makes slice operations as efficient as manipulating array indices.

Therefore, modifying the elements (not the slice itself) of a re-slice modifies the elements of the original slice

d := []byte{'r', 'o', 'a', 'd'}
e := d[2:] 
// e == []byte{'a', 'd'}
e[1] = 'm'
// e == []byte{'a', 'm'}
// d == []byte{'r', 'o', 'a', 'm'}

So, to prevent this implicit behaviour we must explicitly allocate array in slice via make() built-in function (func make([]T, len, cap) []T).

But, to prevent such a "Whoa!" effect for newbie Gophers this implicit behavior of slices must be reflect in documentstion of the Go language.

https://golang.org/ref/spec#Slice_types
https://golang.org/ref/spec#Slice_expressions

Honestly, for me, with even more than three years of practitioning in Go and build projects with Go, doesn't understand what's wrong and doesn't spot this behaviour from the first sight — "simplicity is not so simple" as a Rob Pike (@robpike) said, but also simplicity is not so obvious!

This behaviour reflects in "Effective Go" only, but also needs to be present in language specification.

https://golang.org/doc/effective_go.html#slices

About an arrays itself:

Arrays are values. Assigning one array to another copies all the elements.
In particular, if you pass an array to a function, it will receive a copy of the array, not a pointer to it.

About slices:

Slices hold references to an underlying array, and if you assign one slice to another, both refer to the same array. If a function takes a slice argument, changes it makes to the elements of the slice will be visible to the caller, analogous to passing a pointer to the underlying array. A Read function can therefore accept a slice argument rather than a pointer and a count

Append can modify the elements of slice, the slice itself (the run-time data structure holding the pointer, length, and capacity) is passed by value.

And this is true and applied to the maps also:

Like slices, maps hold references to an underlying data structure. If you pass a map to a function that changes the contents of the map, the changes will be visible in the caller.

https://golang.org/doc/effective_go.html#allocation_new
https://golang.org/doc/effective_go.html#allocation_make

Allocation with make
The built-in function make(T, args) serves a purpose different from new(T). It creates slices, maps, and channels only, and it returns an initialized (not zeroed) value of type T (not *T). The reason for the distinction is that these three types represent, under the covers, references to data structures that must be initialized before use. A slice, for example, is a three-item descriptor containing a pointer to the data (inside an array), the length, and the capacity, and until those items are initialized, the slice is nil.

Remember that make applies only to maps, slices and channels and does not return a pointer. To obtain an explicit pointer allocate with new or take the address of a variable explicitly.

But, I think, if we need such eficiency, we always can use a pointer to an array, slice or map — nevertheless slices and maps passed by reference for efficiency by default, and can mutate. Very controversial decision by design.

The value property can be useful but also expensive; if you want C-like behavior and efficiency, you can pass a pointer to the array.

So, I think it could be potentially the root of many misunderstandings and bugs (especially in cryptography work, like in our project), not only for newbies, but also for more experienced Go developers with a background in other languages and polyglot in mind (like me), 'cause when we pass slice into the function argument, or assign a slice into other variable (and here must be possibility of const for slices), we expect that slice will be passed by value and thus primary slice doesn't mutating, and function returns new slice of bytes to us, which we can to assign to the other variable.

But for now we have a such economy on slice and maps variables and bytes, much mutation, just for efficiency.

A possible "gotcha"
As mentioned earlier, re-slicing a slice doesn't make a copy of the underlying array. The full array will be kept in memory until it is no longer referenced. Occasionally this can cause the program to hold all the data in memory when only a small piece of it is needed.

Because of such things, Go doesn't follow the "principle of least astonishment" (https://en.wikipedia.org/wiki/Principle_of_least_astonishment).

So, thus I think this behaviour of slices must be at least cover by statically checking and compier warning (like "use make() to allocate the slices, little gopher!" =), and also intergrated into the LSP and debugging tools for IDEs, like https://github.com/sourcegraph/go-langserver

And guys, please, take a look at the on-line Go playground - there's something really strange happening with the program output of the first example (in contrast to the local terminal output):
https://play.golang.org/p/mxTdqUNneYZ
https://play.golang.org/p/7S_-ERcWWAq

============
Just to mention: @ianlancetaylor @llorephie @luckyrider @Parabellum1905y @sabbakumov @KanybekMomukeyev

@ianlancetaylor
Copy link
Contributor

It's far too late to change how slices work in Go today.

You say that it should be documented in the language spec, and it is: you are quoting the exact lines where it is documented.

I'm sorry you found the behavior confusing, but I'm not sure what we should do here. Thousands of existing Go programs rely on the current semantics. We aren't going to add a compiler warning. It's quite unlikely that we would add a vet warning, even if we could figure out what to warn about.

@andrcmdr
Copy link

andrcmdr commented Nov 3, 2018

Thanks for an answer Ian!
So, I think it's possible to add warning ("make() slice to allocate!" & "make() map to allocate!", or something like this =) in statical verification phase when we passing slice or map into function - it will be very helpful.
Nevertheless, I hope the core team may revise this compiler behaviour closer to the Go 2.0 release.

============
@ianlancetaylor

@andrcmdr
Copy link

andrcmdr commented Nov 4, 2018

Ian, please, take a look at this two examples.

Here's the earlier two simplified examples (based on production code, in which we observed such unexpected behaviour initially) - this code use file IO, instead of buffers (as in examples above).

When we use file IO (io/ioutil) to read encrypted content we have mutation of this data after decryption.
It's maybe ReadFile() returned slice, passed by reference, through pointer, as usually, but I'm not sure.

https://play.golang.org/p/G4uMh6pCwOx

Otherwise, when we use encryption without file IO and encrypt data in-place or just use in-source hard-coded slice of an array of bytes in hex representation (of encrypted data) - we don't have a mutation of the source encrypted data in EncryptedContentMut variable after decryption routine returns:

https://play.golang.org/p/TnUmiHFy7iC

https://play.golang.org/p/3kfHf1NpE2I

In both cases we use slices of byte arrays, similar data-types, but have different semantics - mutation of source data and not.

Is it correct semantics? And if so - please, can you explain why and how does it happen? 'Cause I feels strong confusing and misunderstanding of this.

============
@ianlancetaylor

@agnivade
Copy link
Contributor

agnivade commented Nov 4, 2018

Hi @andrcmdr - I am very sorry, but we only use the issue tracker for bugs and feature proposals.

For general questions like these, please feel free to make good use of the forums mentioned in the Questions wiki page. That has much more reach and is more suitable for general discussions like these.

Alternatively, if you have a new suggestion or proposal, please do open a new issue.

Thank you !

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants