Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bytes: TrimXXX makes it easy to unintentionally overwrite slice #43341

Open
shmsr opened this issue Dec 23, 2020 · 10 comments
Open

bytes: TrimXXX makes it easy to unintentionally overwrite slice #43341

shmsr opened this issue Dec 23, 2020 · 10 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@shmsr
Copy link
Contributor

shmsr commented Dec 23, 2020

What version of Go are you using (go version)?

$ go version

go version go1.15.5 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/subhamsarkar/Library/Caches/go-build"
GOENV="/Users/subhamsarkar/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/subhamsarkar/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/subhamsarkar/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/Cellar/go/1.15.5/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.15.5/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/n2/w00w99g93cj26xl42msl6lc80000gp/T/go-build468451499=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Trim and friends (TrimXXX) in the bytes package seem to unintentionally overwrite slices. Well, this could be working as designed but do refer:

  1. bytes: appending to a single slice from Split output can affect other slices of the output #21149
  2. regexp: Find makes it easy to unintentionally overwrite slices #30169

So, based on CLs that fixed the issues listed above, I think it'd be appropriate to match the capacity to the length so that it could be avoided and match the behaviour of other functions.

package main

import (
	"bytes"
	"fmt"
)

// func FixTrimSuffix(s, suffix []byte) []byte {
// 	if bytes.HasSuffix(s, suffix) {
// 		// Match capacity and length. Fixed bytes.TrimSuffix
// 		return s[: len(s)-len(suffix) : len(s)-len(suffix)]
// 	}
// 	return s
// }

func main() {
	b := []byte{'h', 'e', 'l', 'l', 'o'}
	t := []byte{'l', 'o'}

	fmt.Printf("b: %s\nt: %s\n", b, t)

	fmt.Printf("\n~ TrimSuffix t from b\n")
	bytesTrim := bytes.TrimSuffix(b, t)
	fmt.Printf("b: %s\nbytesTrim: %s\n", b, bytesTrim)

	fmt.Printf("\n~ Append xy to bytesTrim\n")
	bytesTrim = append(bytesTrim, 'x', 'y')
	fmt.Printf("b: %s\nbytesTrim: %s\n", b, bytesTrim)
}

Link to the playground: https://play.golang.org/p/yGJxI_Jnkk6

What did you expect to see?

b: hello
t: lo

~ TrimSuffix t from b
b: hello
bytesTrim: hel

~ Append xy to bytesTrim
b: hello
bytesTrim: helxy

What did you see instead?

b: hello
t: lo

~ TrimSuffix t from b
b: hello
bytesTrim: hel

~ Append xy to bytesTrim
b: helxy
bytesTrim: helxy
@shmsr
Copy link
Contributor Author

shmsr commented Dec 23, 2020

If this issue needs to be fixed then I'd like to send a CL for this if it's fine.

@seankhliao

This comment has been minimized.

@shmsr
Copy link
Contributor Author

shmsr commented Dec 23, 2020

@seankhliao I know how slices work. Look at CL for #30169!

@seankhliao
Copy link
Member

I see, after reading the linked issues, 👍

@ianlancetaylor
Copy link
Contributor

It would help to see the kind of analysis that was done in #21149 (comment). My intuition is that TrimSuffix in particular is a case where people might want to write things like b = append(bytes.TrimSuffix(b, ".go"), ".o"...) in which case forcing a new allocation would be undesirable.

@shmsr
Copy link
Contributor Author

shmsr commented Dec 23, 2020

Yeah, right. Okay, I'll do some research on this!

@mvdan
Copy link
Member

mvdan commented Dec 25, 2020

Just to give an extra data point - I've used Trim funcs many times with the knowledge that they do not allocate. Changing that would be surprising to me, and would make perfectly valid code suddenly allocate more. That could very well mean regressions if the call is in performance-sensitive code.

@go101
Copy link

go101 commented Dec 25, 2020

If the syntax s[::] is supported, this can be easily written as

bytesTrim = append(bytesTrim[::], 'x', 'y')

b = append(bytes.TrimSuffix(b, ".go")[::], ".o"...)

to always make an allocation.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 30, 2020
@dmitshur dmitshur added this to the Backlog milestone Dec 30, 2020
@shmsr
Copy link
Contributor Author

shmsr commented Jan 16, 2021

Sorry for keep you guys waiting. So I'm skimmed through a lot of projects which were using popular packages (mostly vendored) and I think it should be safe to do the change that I was suggesting earlier. Although there's some disagreement and that's totally valid. So here I'm attaching some links where you can see the usage of bytes's Trim and friends.

In my opinion, the performance aspect of limiting the capacity might affect a small number of projects but most of the projects shouldn't see any difference. Also, it'd match the behaviour done in #21149 #30169

Here are some links:

vendor: (github.com/go-git/go-git/v5/plumbing/object/commit.go)
https://github.com/GoogleContainerTools/kaniko/blob/master/vendor/github.com/go-git/go-git/v5/plumbing/object/commit.go#L198

vendor: (github.com/go-git/go-git/v5/plumbing/object/commit.go)
https://github.com/GoogleContainerTools/kaniko/blob/ece215c18113020f9151fb25e69fc4ecc157c395/vendor/github.com/go-git/go-git/v5/plumbing/object/commit.go#L207

vendor: (github.com/moby/buildkit/frontend/dockerfile/parser/parser.go)
https://github.com/GoogleContainerTools/kaniko/blob/master/vendor/github.com/moby/buildkit/frontend/dockerfile/parser/parser.go#L228

vendor: (golang.org/x/crypto/openpgp/armor/armor.go)
https://github.com/GoogleContainerTools/skaffold/blob/master/hack/tools/vendor/golang.org/x/crypto/openpgp/armor/armor.go#L175

vendor: (golang.org/x/crypto/openpgp/armor/armor.go)
https://github.com/GoogleContainerTools/skaffold/blob/master/hack/tools/vendor/golang.org/x/crypto/openpgp/armor/armor.go#L199

https://github.com/GoogleContainerTools/skaffold/blob/master/pkg/skaffold/build/tag/git_commit.go#L166

https://github.com/Jeffail/benthos/blob/master/lib/processor/text.go#L130

vendor: (golang.org/x/crypto/ssh/keys.go)
https://github.com/PursuanceProject/pursuance/blob/develop/vendor/golang.org/x/crypto/ssh/keys.go#L67

vendor: (golang.org/x/crypto/ssh/keys.go)
https://github.com/PursuanceProject/pursuance/blob/develop/vendor/golang.org/x/crypto/ssh/keys.go#L85

vendor: (github.com/russross/blackfriday/v2/block.go)
https://github.com/docker/cli/blob/master/vendor/github.com/russross/blackfriday/v2/block.go#L310

https://github.com/jcmvbkbc/gcc-xtensa/blob/call0-4.8.2/libgo/go/encoding/pem/pem.go#L48

Do let me know if you need more examples.

@dsnet
Copy link
Member

dsnet commented Jul 28, 2021

I'm not sure if isolated examples are helpful (it's fairly easy to find examples that support either semantic). What would be needed is a thorough analysis that answers something like:

  • Of all the usages of bytes.TrimXXX on the module proxy, what percentage provably do not depend the extra capacity?

Currently, based on the 👍 on comments above, it seems that people generally prefer the current semantic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants