Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: strings: add (*Builder).ResetKeepCapacity method #60768

Open
mateusz834 opened this issue Jun 13, 2023 · 5 comments
Open

proposal: strings: add (*Builder).ResetKeepCapacity method #60768

mateusz834 opened this issue Jun 13, 2023 · 5 comments
Labels
Milestone

Comments

@mateusz834
Copy link
Member

Currently the strings.Builder wastes the remaining capacity when (*Builder).Reset is being called, it works just the same way as creating a new empty strings.Builder.

// Reset resets the Builder to be empty.
func (b *Builder) Reset() {
	b.addr = nil
	b.buf = nil
}

Currently it is not possible in a backwards compatibility manner to keep the remaining capacity in the Builder after calling Reset(), because it would require keeping the b.addr pointer, so that it is not possible to copy the Builder.

I propose adding a simple (*Builder).ResetKeepCapacity method with following signature:

// ResetKeepCapacity resets the builder and keeps the remaining
// capacity (b.Cap() - b.Len()) for future use.
func (b *Builder) ResetKeepCapacity() {
	b.buf = b.buf[len(b.buf):]
}
@gopherbot gopherbot added this to the Proposal milestone Jun 13, 2023
@ianlancetaylor
Copy link
Contributor

This seems like a special purpose operation. When does it come up in practice?

@mateusz834
Copy link
Member Author

When allocating a bunch of small strings when the total size of all strings in more or less known.
This is currently kind of possible using the current API, but not really convenient to use.

func main() {
	var b strings.Builder
	// Case 1: Size known upfront.
	// The size must be known upfront, so that it doesn't grow
	// anymore, because it will unecessary reallocate the previous strings
	// in the builder.
	b.Grow(len("test1") + len("test2"))
	str1 := newStringSizeKnown(&b, []byte("test1"))
	str2 := newStringSizeKnown(&b, []byte("test2"))
	fmt.Println(str1, str2) // test1 test2

	b.Reset()

	// Case 1: Size not known upfront.
	b.Grow(len("test1"))
	str1 = newString(&b, []byte("test1"))
	str2 = newString(&b, []byte("test2"))
	fmt.Println(str1, str2) // test1 test2
}

func newStringSizeKnown(b *strings.Builder, buf []byte) string {
	l := b.Len()
	b.Write(buf)
	return b.String()[l:]
}

func newString(b *strings.Builder, buf []byte) string {
	if b.Cap()-b.Len() < len(buf) {
		b.Reset()
	}
	return newStringSizeKnown(b, buf)
}

@ayang64
Copy link
Member

ayang64 commented Jun 14, 2023

Can you explain what this is trying to achieve? What problem does calling .Reset() cause that this fixes?

@mateusz834
Copy link
Member Author

Can you explain what this is trying to achieve?

Just performance optimizations, reduce a bunch of small string allocations into one bigger memory allocation.

What problem does calling .Reset() cause that this fixes?

With the .Reset() it is currently possible to simulate the .ResetKeepCapacity() method, but the .ResetKeepCapacity() is just simpler and it does not require workarounds as in the examples above (#60768 (comment)).

@ianlancetaylor
Copy link
Contributor

You could keep track of your offsets, call the String method once, and slice the resulting string.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

4 participants