Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strings: add CutPrefix and CutSuffix #42537

Closed
cespare opened this issue Nov 12, 2020 · 23 comments
Closed

strings: add CutPrefix and CutSuffix #42537

cespare opened this issue Nov 12, 2020 · 23 comments

Comments

@cespare
Copy link
Contributor

cespare commented Nov 12, 2020

Inspired by #40135, I have another proposal for an addition to the strings package (and by extension, I suppose, the bytes package).

The strings functions I most frequently wish for are variants of TrimPrefix and TrimSuffix which report whether they did any trimming.

// TrimmedPrefix returns s without the provided leading prefix string
// and reports whether it found the prefix.
// If s doesn't start with prefix, TrimmedPrefix returns s, false.
// If prefix is the empty string, TrimmedPrefix returns s, true.
func TrimmedPrefix(s, prefix string) (trimmed string, ok bool)

Lacking these functions, Go authors resort to fragile code, often writing a prefix/suffix literal twice or hard-coding its length:

if strings.HasPrefix(s, "myprefix") {
	s = strings.TrimPrefix(s, "myprefix")
	...
}
if strings.HasPrefix(s, "myprefix") {
	s = s[len("myprefix"):]
	...
}
if strings.HasPrefix(s, "myprefix") {
	s = s[8:]
	...
}
if t := strings.TrimPrefix(s, "myprefix"); s != t {
	// had prefix
	s = t
	...
}

Of course, a function like TrimmedPrefix is easy to write and can exist outside of the standard library. At my company we have these functions in an internal string helper package and they see regular use. But certainly I wouldn't pull in a dependency for such a tiny helper function and so when I'm working on my own projects I generally just use the above workarounds.


Here are some examples from the Go source tree along with how they could be altered to use TrimmedPrefix/TrimmedSuffix:

  • src/cmd/go/internal/modload/load.go:

    if strings.HasPrefix(suffix, "/vendor/") {
    	return strings.TrimPrefix(suffix, "/vendor/")
    }
    

    becomes

    if v, ok := strings.TrimmedPrefix(suffix, "/vendor/"); ok {
    	return v
    }
    
  • src/cmd/go/proxy_test.go:

    if !strings.HasSuffix(name, ".txt") {
    	continue
    }
    name = strings.TrimSuffix(name, ".txt")
    

    becomes

    name, ok := strings.TrimmedSuffix(name, ".txt")
    if !ok {
    	continue
    }
    
  • src/testing/benchmark.go:

    if strings.HasSuffix(s, "x") {
    	n, err := strconv.ParseInt(s[:len(s)-1], 10, 0)
    	...
    }
    

    becomes

    if s, ok := strings.TrimmedSuffix(s, "x"); ok {
    	n, err := strconv.ParseInt(s, 10, 0)
    	...
    }
    
  • src/testing/fstest/mapfs.go:

    if strings.HasPrefix(fname, prefix) {
    	felem := fname[len(prefix):]
    	...
    }
    

    becomes

    if felem, ok := strings.TrimmedPrefix(fname, prefix); ok {
    	...
    }
    
  • src/mime/mediatype.go:

    if !strings.HasPrefix(rest, ";") {
    	return "", "", v
    }
    
    rest = rest[1:] // consume semicolon
    

    becomes

    rest, ok := strings.TrimmedPrefix(rest, ";")
    if !ok {
    	return "", "", v
    }
    
  • test/run.go:

    if strings.HasPrefix(line, "//") {
    	line = line[2:]
    } else {
    	continue
    }
    

    becomes

    line, ok := strings.TrimmedPrefix(line, "//")
    if !ok {
    	continue
    }
    
  • test/run.go:

    if strings.HasPrefix(m, "LINE+") {
    	delta, _ := strconv.Atoi(m[5:])
    	n += delta
    	...
    }
    

    becomes

    if d, ok := strings.TrimmedPrefix(m, "LINE+"); ok {
    	delta, _ := strconv.Atoi(d)
    	n += delta
    	...
    }
    
  • src/runtime/testdata/testprog/traceback_ancestors.go:

    if strings.HasPrefix(tb[pos:], "goroutine ") {
    	id := tb[pos+len("goroutine "):]
    	...
    }
    

    becomes

    if id, ok := strings.TrimmedPrefix(tb[pos:], "goroutine "); ok {
    	...
    }
    

Update 2022-03-26: Changed proposed names to TrimmedPrefix/TrimmedSuffix (per @ianlancetaylor's suggestion).

@gopherbot gopherbot added this to the Proposal milestone Nov 12, 2020
@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Nov 12, 2020
@ianlancetaylor
Copy link
Contributor

TrimmedPrefix and TrimmedSuffix ?

@mvdan
Copy link
Member

mvdan commented Nov 12, 2020

I find if t := strings.TrimPrefix(s, "myprefix"); s != t { reasonable, but that might just be because I've gotten used to the pattern. One argument in favor of it not being descriptive/intuitive enough is that it's not completely winning over alternatives like HasPrefix+TrimPrefix.

@jimmyfrasche
Copy link
Member

CutPrefix and CutSuffix so they group with the proposed Cut in docs?

@rsc rsc moved this from Incoming to Active in Proposals (old) Nov 18, 2020
@rsc
Copy link
Contributor

rsc commented Nov 18, 2020

It's unclear whether we need these, but at the very least we'd need better names.
Adding to the minutes in hopes that someone will come up with better names.

@narqo
Copy link
Contributor

narqo commented Nov 20, 2020

(With all respect) It feels the proposal lacks the use cases where the new functions required. Knowing that the current implementation of both strings.TrimPrefix and strings.TrimSuffix is the following

func TrimPrefix(s, prefix string) string {
	if HasPrefix(s, prefix) {
		return s[len(prefix):]
	}
	return s

}

can't one always use TrimPrefix for all the samples from the issue's description, dropping Has-checks? I.e.

- if strings.HasPrefix(suffix, "/vendor/") {
- 	return strings.TrimPrefix(suffix, "/vendor/")
- }
+ return strings.TrimPrefix(suffix, "/vendor/")

@cespare
Copy link
Contributor Author

cespare commented Nov 20, 2020

@narqo no. The places where you use TrimSuffix2 do something different if the test string contains the prefix/suffix.

For example, you suggested

- if strings.HasPrefix(suffix, "/vendor/") {
- 	return strings.TrimPrefix(suffix, "/vendor/")
- }
+ return strings.TrimPrefix(suffix, "/vendor/")

but those are not the same thing, because your version returns if suffix does not start with "/vendor/" and the original does not.

I listed the filenames that all my examples come from. You can go check out that example in context and you'll see it looks like this:

if strings.HasPrefix(suffix, "/vendor/") {
	return strings.TrimPrefix(suffix, "/vendor/")
}
return targetPrefix + suffix

@rsc
Copy link
Contributor

rsc commented Dec 2, 2020

Putting on hold with Cut.

@rsc rsc moved this from Active to Hold in Proposals (old) Dec 2, 2020
@cespare cespare changed the title proposal: strings: add TrimPrefix2 and TrimSuffix2, variants which report whether they trimmed proposal: strings: add TrimmedPrefix and TrimmedSuffix, variants which report whether they trimmed Mar 26, 2022
@cespare cespare changed the title proposal: strings: add TrimmedPrefix and TrimmedSuffix, variants which report whether they trimmed proposal: strings: add TrimmedPrefix and TrimmedSuffix which report whether they trimmed Mar 26, 2022
@cespare
Copy link
Contributor Author

cespare commented Mar 26, 2022

I adopted @ianlancetaylor's suggested names.

In the intervening time since filing this proposal, I haven't thought of a better name. But I've used these functions constantly, and I've ended up copy-pasting them into many of my projects.

@cespare
Copy link
Contributor Author

cespare commented Mar 26, 2022

@rsc Cut has been proposed, accepted, and implemented. Can we un-hold this proposal now?

@ianlancetaylor ianlancetaylor moved this from Hold to Incoming in Proposals (old) Mar 26, 2022
@ianlancetaylor
Copy link
Contributor

Moved out of proposal-hold.

Do these add enough value over just using Cut?

@cespare
Copy link
Contributor Author

cespare commented Mar 27, 2022

@ianlancetaylor I'm not sure I see how Cut changes things here.

To me, using Cut for any of these purposes seems pretty awkward. Taking one of the examples (from src/cmd/go/proxy_test.go):

if !strings.HasSuffix(name, ".txt") {
	continue
}
name = strings.TrimSuffix(name, ".txt")

With TrimmedSuffix, it is:

name, ok := strings.TrimmedSuffix(name, ".txt")
if !ok {
	continue
}

Using Cut, I guess it would be

name, extra, ok := strings.Cut(name, ".txt")
if !ok || extra != "" {
	continue
}

Is that what you had in mind? To me, that is much less clear (since it doesn't say "suffix").

If I had to write this using only the strings functions that exist today, I wouldn't use Cut. I'd probably do something like this:

trimmed := strings.TrimSuffix(name, ".txt")
if trimmed == name {
	continue
}
name = trimmed

That at least says "suffix" and it avoids writing the suffix string (".txt") twice, but it requires an extra line and a separate string variable in comparison with the TrimmedSuffix version.

@rsc
Copy link
Contributor

rsc commented Mar 30, 2022

Cut doesn't change things but these seem remarkably minor. HasPrefix works well for this use case.

@rsc rsc moved this from Incoming to Active in Proposals (old) Mar 30, 2022
@rsc
Copy link
Contributor

rsc commented Mar 30, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Apr 6, 2022

It doesn't seem like these are providing enough value to add to the standard library. Does anyone want to make a case for accepting?

@cespare
Copy link
Contributor Author

cespare commented Apr 8, 2022

The reasoning for why this belongs in the standard library has two parts.

These functions are improvements over various poor patterns that people write in their absence.

Without Trimmed(Prefix|Suffix), people write code that is less clear, longer, more repetitive, and more prone to errors when being changed. The proposal at the top gives many concrete examples so I won't rehash them here. It seems obvious that writing the same string constant multiple times, or hardcoding its length, is not great. HasPrefix does not always "work well for this use case."

These functions improve a large number of sites.

I didn't address frequency of use in my original proposal so I took a quick skim now. Using the Go source tree (and just grep, so these numbers are a little rough):

Calls Count
All strings function calls 4843
strings.Has(Prefix|Suffix) 1029
strings.Trim(Prefix|Suffix) 192

So (Has|Trim)(Prefix|Suffix) make up a quarter of all strings function calls. (I checked a large private code base and it was fairly similar; about 22%.)

Then I examined a random sampling of 50 strings.Has(Prefix|Suffix) calls and 50 strings.Trim(Prefix|Suffix) calls.

For strings.Has(Prefix|Suffix):

  • 4/50 use strings.Trim(Prefix|Suffix) as well
  • 10/50 use other separate trimming code
  • A total of 14/50 = 28% of sites can be improved with Trimmed(Prefix|Suffix).

For strings.Trim(Prefix|Suffix):

  • 12/50 use strings.Has(Prefix|Suffix) as well
  • 3/50 use other separate checking code
  • A total of 15/50 = 30% of sites can be improved with Trimmed(Prefix|Suffix).

Extrapolating from these samples, I estimate that 288/1029 calls to Has(Prefix|Suffix) and 58/192 calls to Trim(Prefix|Suffix) can be improved with Trimmed(Prefix|Suffix). This totals to a little over 7% of all uses of strings.

By way of comparison, the repo contains 115 uses of strings.Cut (this is after https://go.dev/cl/351711 which rewrote Index/IndexByte/IndexRune/Split/SplitN calls to use Cut as much as possible).

@rsc
Copy link
Contributor

rsc commented Apr 13, 2022

These statistics do make a good case. Thank you for taking the time to gather them.

strings.TrimmedPrefix is a bit too long a name. Given that we have strings.Cut that returns a 'found' boolean already, which is what we want to add here, maybe strings.CutPrefix and strings.CutSuffix?

@rsc rsc changed the title proposal: strings: add TrimmedPrefix and TrimmedSuffix which report whether they trimmed proposal: strings: add CutPrefix and CutSuffix May 4, 2022
@rsc
Copy link
Contributor

rsc commented May 4, 2022

Retitled, and will move to likely accept.

@rsc
Copy link
Contributor

rsc commented May 4, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Accept in Proposals (old) May 4, 2022
@rsc rsc moved this from Likely Accept to Accepted in Proposals (old) May 11, 2022
@rsc
Copy link
Contributor

rsc commented May 11, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: strings: add CutPrefix and CutSuffix strings: add CutPrefix and CutSuffix May 11, 2022
@rsc rsc modified the milestones: Proposal, Backlog May 11, 2022
@gopherbot
Copy link

Change https://go.dev/cl/407176 mentions this issue: strings: add CutPrefix and CutSuffix

@cespare
Copy link
Contributor Author

cespare commented May 19, 2022

I forgot to mention it in the proposal, but we should also add these functions to bytes, correct? To keep parity?

@gopherbot
Copy link

Change https://go.dev/cl/432895 mentions this issue: all: use the newly added strings.CutPrefix method to optimize the code

@gopherbot
Copy link

Change https://go.dev/cl/433275 mentions this issue: cmd/go/internal: use the newly added strings.CutSuffix method to opti…

gopherbot pushed a commit that referenced this issue Sep 27, 2022
Updates #42537

Change-Id: Ice23d7d36bbede27551cbc086119694f6a3b5e4a
GitHub-Last-Rev: 0d65208
GitHub-Pull-Request: #55347
Reviewed-on: https://go-review.googlesource.com/c/go/+/432895
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Sep 27, 2022
Updates #42537

Change-Id: I2d4c5e911c8a2ddfe9a976896b05d3cd8be61f6b
GitHub-Last-Rev: a87597d
GitHub-Pull-Request: #55830
Reviewed-on: https://go-review.googlesource.com/c/go/+/433275
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
tmm1 pushed a commit to fancybits/go that referenced this issue May 26, 2023
Fixes golang#42537

Change-Id: Ie03c2614ffee30ebe707acad6b9f6c28fb134a45
Reviewed-on: https://go-review.googlesource.com/c/go/+/407176
Reviewed-by: Benny Siegert <bsiegert@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Changkun Ou <mail@changkun.de>
Reviewed-by: Ian Lance Taylor <iant@google.com>
(cherry picked from commit 6800559)
@golang golang locked and limited conversation to collaborators Sep 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
No open projects
Development

No branches or pull requests

9 participants