Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime,cmd/compile: string concatenation specializations for performance #27801

Open
quasilyte opened this issue Sep 21, 2018 · 4 comments
Open
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@quasilyte
Copy link
Contributor

CL123256 started a discussion on which strings concatenation specialization we should choose (if any).

String concatenation can be made faster if we specialize concatenation with escaping result and len(args)<=5 (N) arguments. Escaping result means that we can avoid passing buf that is always nil as well as use rawstring instead of rawstringtmp. Known N means that we can unroll loops.

There are several ways (and these are not all of them):

  1. Specialize for N=2 x+y. Less code, seems like most frequent, the boost is maximal, but does not improve any other concatenation (like x+y+z which uses concatstring3).
  2. Specialize for all N, but for marginal performance boost that may not even worth it.
  3. Specialize for N={2,3,4,5}. More code.
  4. Specialize for N={2,3} covers most concatenations. Speeds up concat2 and concat3 measurably.

I've started from (1) since it's the easiest change that requires less amount of changes and gives significant performance gain. But in order to have decision extra feedback is required.

Here is comparison of (1) against unoptimized concat:

name            old time/op    new time/op    delta
Concat2-8         74.2ns ± 0%    53.5ns ± 1%  -27.95%  (p=0.000 n=9+15)
Concat3-8         94.9ns ± 0%    94.8ns ± 0%     ~     (p=0.082 n=14+15)
Concat2Empty-8    21.4ns ± 1%    14.2ns ± 0%  -33.75%  (p=0.000 n=15+14)
Concat3Empty-8    23.9ns ± 1%    23.9ns ± 1%     ~     (p=0.756 n=15+15)
[Geo mean]        43.6ns         36.2ns       -16.88%

Comparison for (4) implementation against go tip:

name            old time/op    new time/op    delta
Concat2-8         74.2ns ± 0%    66.1ns ± 1%  -10.95%  (p=0.000 n=9+15)
Concat3-8         94.9ns ± 0%    71.9ns ± 1%  -24.22%  (p=0.000 n=14+15)
Concat2Empty-8    21.4ns ± 1%    21.1ns ± 0%   -1.56%  (p=0.000 n=15+15)
Concat3Empty-8    23.9ns ± 1%    16.6ns ± 1%  -30.63%  (p=0.000 n=15+12)
[Geo mean]        43.6ns         35.9ns       -17.61%

Note that these numbers represent overhead savings, not general case strings concatenation performance boost, since length of strings determine these timings significantly.

Benchmarks:

package foo

import "testing"

//go:noinline
func concat2(x, y string) string {
	return x + y
}

//go:noinline
func concat3(x, y, z string) string {
	return x + y + z
}

var x = "foor"
var y = "2ews"
var z = ""

func BenchmarkConcat2(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = concat2(x, "abc")
	}
}

func BenchmarkConcat3(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = concat3(x, "abc", y)
	}
}

func BenchmarkConcat2Empty(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = concat2(x, "")
	}
}

func BenchmarkConcat3Empty(b *testing.B) {
	for i := 0; i < b.N; i++ {
		_ = concat3("", x, z)
	}
}

I would like to know:

  1. That this kind of optimization is welcome.
  2. If it is desired, we need to choose which approach to take.

CC @martisch

@quasilyte
Copy link
Contributor Author

quasilyte commented Sep 21, 2018

СС @TocarIP @josharian @mvdan

@bcmills bcmills added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Sep 22, 2018
@bcmills bcmills added this to the Unplanned milestone Sep 22, 2018
@TocarIP
Copy link
Contributor

TocarIP commented Sep 24, 2018

Just adding data from CL:

For go executable, there are:

	501 concatstring2
	194 concatstring3
	55  concatstring4

@odeke-em
Copy link
Member

odeke-em commented Nov 9, 2018

Kindly paging @randall77 @ianlancetaylor in case of some more ideas too

@randall77
Copy link
Contributor

I'm not particularly worried about the extra code. If this means duplicating concatstrings and concatstring[2345] for the escape vs. nonescape case, that's not a big deal.

Are you also planning on unrolling the loop in concatstrings? Also here I'm not worried about binary code size, but I worry about about maintainability of lots of copy-pasted inner loop bodies.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
None yet
Development

No branches or pull requests

6 participants