Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slices of an array of bytes: runtime mutation of source data variables when using file IO (ReadFile()) without pointers and unsafe.* operations in user code #28585

Closed
andrcmdr opened this issue Nov 4, 2018 · 9 comments

Comments

@andrcmdr
Copy link

andrcmdr commented Nov 4, 2018

What did you do?

Here's the earlier two simplified examples (based on production code, in which we observed such unexpected behaviour initially) - this code use file IO, instead of buffers (as in examples in previous issue - #28554).

When we use file IO (ioutil.ReadFile()) to read encrypted content we have mutation of source data (EncryptedContent and EncryptedContentMut) after decryption.
It's maybe because ReadFile() returns slice, passed by reference, through pointer, as usually (#28554) and source data mutate over slices through pointers, but I'm not sure. So maybe it's depend on internal working of ReadFile() routine.
I think this is a bug, 'cause it' has incorrect and unexpected semantics. And if not so - please, can you explain why and how does it happen, why it's work as it does and data mutates?

https://play.golang.org/p/G4uMh6pCwOx

Otherwise, when we use encryption without file IO (ioutil.ReadFile()) and encrypt data in-place (encryptAES256CBC()) or just use in-source hard-coded slice of an array of bytes in hex representation (of encrypted data) - we don't have a mutation of the source encrypted data in EncryptedContent and EncryptedContentMut variables after decryption routine returns:

https://play.golang.org/p/TnUmiHFy7iC

https://play.golang.org/p/3kfHf1NpE2I

What did you expect to see?

Source encrypted data in EncryptedContent and in EncryptedContentMut variables must not mutate after passing to decryptAES256CBC() and decryption operations, 'cause mutation happen only when source encrypted data gets from file through ioutil.ReadFile() routine.

What did you see instead?

In both cases we use slices of byte arrays, i.e. similar data-types, but have different semantics - mutation of source encrypted data (EncryptedContent, EncryptedContentMut variables) in case of reading them from file using ioutil.ReadFile() in first example, and in second example we doesn't observe mutation (which is normal behaviour) of EncryptedContent and EncryptedContentMut variables in case of using in-place encryption of source data with encryptAES256CBC() or using in-source hard-coded encrypted slice of an array of bytes.

What impact this induce?

The open decrypted text in DecryptedContent are gets into encrypted data (EncryptedContent and EncryptedContentMut variables) and replaces upper bytes. This cause malfunction of cryptosystem of our product and even goes to malicious using of this data (personal information, secret data, etc.).

============
Just to mention:
@ianlancetaylor @dvyukov @davecheney @rsc @bradfitz @adg @aclements @griesemer @robpike
@llorephie @luckyrider @Parabellum1905y @sabbakumov @KanybekMomukeyev

@andrcmdr andrcmdr changed the title slices of array of bytes: runtime mutation of source data variables when using file IO (ReadFile()) without pointers and unsafe.* operations in user code slices of an array of bytes: runtime mutation of source data variables when using file IO (ReadFile()) without pointers and unsafe.* operations in user code Nov 4, 2018
@cznic
Copy link
Contributor

cznic commented Nov 4, 2018

From https://play.golang.org/p/G4uMh6pCwOx, lines 21 and 22

EncryptedContent := EncryptedContentMut
copy(EncryptedContent, EncryptedContentMut)

The type of EncryptedContentMut is []byte. The above code makes the backing array of EncryptedContent the same as that of EncryptedContentMut. The subsequent copy is a no-operation.

I guess the intent of this code is to have the two variables not share the backing array. To do that:

EncryptedContent := append([]byte(nil), EncryptedContentMut...)

@andrcmdr
Copy link
Author

andrcmdr commented Nov 4, 2018

I guess the intent of this code is to have the two variables not share the backing array.

EncryptedContent := EncryptedContentMut
copy(EncryptedContent, EncryptedContentMut)

This code is just for experimenting to check the behaviour/semantics if we apply the assignment operation or even copy the slice of bytes (I guess in this operation Go only creates poiter to the same array of bytes).

So, if we even disable this code and rename EncryptedContentMut to EncryptedContent - we can observe the same behaviour/semantics, i.e. mutation of source data in EncryptedContent when we use file IO, but not when we use hard-coded slice of and array of bytes or take this slice from encryptAES256CBC() returns:

https://play.golang.org/p/5_dcDwd1JOB

@cznic
Copy link
Contributor

cznic commented Nov 4, 2018

The point is that the code assumes slices work in a way they actually don't.

I don't think there's any bug in Go involved here.

At least, can you please make the reproducer smaller, simpler and remove all the printing noise? (Byte literals are also usually written as eg. 0x42, not '\x42'.)

And you must clearly state where you see the bug. As in 'What happens' vs 'What do you expect to happen instead?'. Nowhere I found any indication of this. You're familiar with the code, others are not.

@bradfitz
Copy link
Contributor

bradfitz commented Nov 4, 2018

It sounds like there's a misunderstanding of this package's API and/or the Go language here, and not a bug in Go.

Unfortunately we don't have the resources in this bug tracker to help with all problems.

Let's bring this to golang-nuts or another forum for now (see https://golang.org/wiki/Questions). If/when a concrete, small problem is identified (even if it's an omission in the docs), then feel free to file a new bug here or comment on this one.

Thanks.

@bradfitz bradfitz closed this as completed Nov 4, 2018
@andrcmdr
Copy link
Author

andrcmdr commented Nov 4, 2018

At least, can you please make the reproducer smaller, simpler and remove all the printing noise? (Byte literals are also usually written as e.g. 0x42, not '\x42')

And you must clearly state where you see the bug. As in 'What happens' vs 'What do you expect to happen instead?'. Nowhere I found any indication of this. You're familiar with the code, others are not.

Here's an even more simplified two examples, and now we have mutation of source data in EncryptedContent in both cases, in case of using ioutil.ReadFile() for encrypted data and in case of using in-place encryption with encryptAES256CBC() or in-source hard-coded encrypted slice of an array of bytes:

https://play.golang.org/p/K5_XMqTh9LA
https://play.golang.org/p/DWSv3sYKJAl

https://play.golang.org/p/gRcqFywbHL6

Idk, how to make it simpler and explain the problem in a even more concise way - I edited and structured all description of the problem in the first message, now it's stated in a more precise manner, so you may read it carefully above.

About notation:

0x may used to int type notation or used as rune type notation symbols and in conversion operations of rune type to/from stringtype

https://golang.org/ref/spec#Conversions_to_and_from_a_string_type

'\x' and '\u' — is a hex and unicode (respectively) representation of char/byte in a string and in a rune type, so here we use []byte{'\xAB', '\xCD'} notation to convert string literals to byte, so this is correct

https://golang.org/ref/spec#Rune_literals
https://golang.org/ref/spec#String_literals

@cznic
Copy link
Contributor

cznic commented Nov 4, 2018

IDK why three different playground links, I'd expect just one. Still a lot of noisy prints yet not a single one of them ever cares to reveal what's perceived wrong. Can there be at least/just one that prints something like "Here we expect 0x42 0x31 but we got 0x12 0x13 instead"? (The % x verb is your friend in printing human readable dumps of byte slices.)

'\x' and '\u' — is a hex and unicode (respectively) representation of char/byte in a string and in a rune type, so here we use []byte{'\xAB', '\xCD'} notation to convert string literals to byte, so this is correct

No string literal is converted in the above quoted code. The '\xAB' is a rune not a string. Yes, it is correct, but no one said otherwise. The problem is that's unnecessarily verbose. Writing instead

[]byte{0xAB, 0xCD} 

is completely equivalent, just shorter and thus arguably more readable.

@andrcmdr
Copy link
Author

andrcmdr commented Nov 4, 2018

Solution

So, now in nearly every function we create and use with passing []byte or other slices and maps parameters into function - we will use append([]byte(nil), data...) for slice/map variables in local function scopes/namespaces.

It's most common pattern for Go semantics for slices and maps, which is passed by reference.

In the beginning of decryptAES256CBC() and encryptAES256CBC() functions we added this:

data = append([]byte(nil), data...)

And all issues gone, 'cause we explicitly instantiate new array with slice and pointer to this new instance.

Thanks a lot for help!

PS:

This implicit semantics and behaviour is not so obvious from the first sight. Especially when you expect passing by value (#28554) and immutability, 'cause has a common experience in functional languages with immutability in mind and system languages where we use pointers explicitly to pass by reference.

#28554 - here we must use make() to allocate new instance of an array

https://golang.org/pkg/builtin/#make

The make built-in function allocates and initializes an object of type slice, map, or chan (only).
Unlike new, make's return type is the same as the type of its argument, not a pointer to it.

#28585 - and here we must use append() to clone and allocate new instance of array in function scope

https://golang.org/pkg/builtin/#append

The append built-in function appends elements to the end of a slice. If it has sufficient capacity, the destination is resliced to accommodate the new elements. If it does not, a new underlying array will be allocated. Append returns the updated slice. It is therefore necessary to store the result of append, often in the variable holding the slice itself

Also there's a copy()

https://golang.org/pkg/builtin/#copy

The copy built-in function copies elements from a source slice into a destination slice. (As a special case, it also will copy bytes from a string to a slice of bytes.) The source and destination may overlap. Copy returns the number of elements copied, which will be the minimum of len(src) and len(dst).

And new()

https://golang.org/pkg/builtin/#new

The new built-in function allocates memory. The first argument is a type, not a value, and the value returned is a pointer to a newly allocated zero value of that type.

@cznic
Copy link
Contributor

cznic commented Nov 5, 2018

So, now in nearly every function we create and use with passing []byte or other slices and maps parameters into function - we will use append([]byte(nil), data...) for slice/map variables in local function scopes/namespaces.

That's not commonly seen and it's probably not necessary. It will have negative performance impact.

There's possibly a better solution for this particular issue, but sadly you have never shown us the exact problem and its location (as in here we expected X but got Y instead).

@agnivade
Copy link
Contributor

agnivade commented Nov 5, 2018

As pointed above, let us use golang-nuts or another forum, which is more suited for conversations such as these. And when there is a concrete bug request, then file an issue. Thank you.

@golang golang locked as resolved and limited conversation to collaborators Nov 5, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants