Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strings: ToLower not returning a copy for strings already in lower case #30987

Closed
Feandil opened this issue Mar 21, 2019 · 6 comments
Closed
Labels
Documentation FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@Feandil
Copy link

Feandil commented Mar 21, 2019

What version of Go are you using (go version)?

1.12.1

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

linux/amd64

What did you do?

Using strings.ToLower on strings in upper or lower case:
https://play.golang.org/p/eyPCCj_zCtF

What did you expect to see?

As per the documentation (https://github.com/golang/go/blob/master/src/strings/strings.go#L583), have the same behaviour for both cases: getting a new copy of the string

What did you see instead?

Strings with only lower case do not get copied, the original string is kept

Comments

This seems to be a result of an optimisation, 13cfb15, and in particular the fully optimized path of lower case ASCII: https://github.com/golang/go/blob/master/src/strings/strings.go#L597

As string are immutable, it doesn't seem to be a huge issue (except that the documentation is wrong), but as far as I understand this can result in pseudo memory leaks if one relies (as per the documentation) on ToLower to do a copy of small slices of larger strings (and only keeping references to the lower case slice)

@agnivade
Copy link
Contributor

Nice catch. Although the commit you pointed to is for the ToUpper one.

My vote is to document this behavior. We don't need to unnecessarily copy the string when it's not required. Even Map has a fast path that returns the original string instead of a copy. https://github.com/golang/go/blob/master/src/strings/strings.go#L497

@bradfitz @dsnet

@agnivade agnivade added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Mar 21, 2019
@agnivade agnivade changed the title strings: strings.ToLower not returning a copy for strings already in lower case strings: ToLower not returning a copy for strings already in lower case Mar 21, 2019
@randall77
Copy link
Contributor

At the language level, there's no difference between a string and a copy of that string. They may retain different amounts of memory for their backing store, but memory usage is not a language-level concept.

So I don' think it warrants changing the docs for strings.ToLower. I'm sure there are 10 other functions that do similar optimizations. We don't want people using strings.ToLower because of its copying side-effect.

That said, we understand that it's occasionally necessary to copy strings to free a large backing store. We've been musing about a way to do that. I think we probably need a runtime function which guarantees a new backing store (or knows that the current one is small enough).

Currently, string([]byte(s)) makes a new string. Probably not wise to rely on that behavior.

@agnivade
Copy link
Contributor

Makes sense. But the OP has a point. If the docs say that a function returns a copy, but it actually doesn't in some cases, the the docs are technically incorrect.

We don't want people using strings.ToLower because of its copying side-effect.

Agree, but this needs to be documented somewhere as a general pointer. Perhaps in the package overview or somewhere else.

@ianlancetaylor ianlancetaylor added Documentation NeedsFix The path to resolution is known, but the work has not been done. labels Mar 21, 2019
@ianlancetaylor ianlancetaylor added this to the Go1.13 milestone Mar 21, 2019
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Mar 21, 2019
@go101
Copy link

go101 commented Mar 21, 2019

Maybe the docs of strings.ToLower is copied from bytes.ToLower. ;D

@bradfitz
Copy link
Contributor

Maybe the docs of strings.ToLower is copied from bytes.ToLower. ;D

Yes, I think that's the case. We should remove "a copy" from the strings version, as it doesn't mean anything.

@gopherbot
Copy link

Change https://golang.org/cl/171735 mentions this issue: strings: documenting the case when ToUpper/ToLower can return original string

@golang golang locked and limited conversation to collaborators Apr 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Documentation FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

7 participants