Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: "redo growslice calling convention" caused a 14.7% regression in AppendMsgReplicateDecision #56440

Closed
mknyszek opened this issue Oct 26, 2022 · 10 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge Performance
Milestone

Comments

@mknyszek
Copy link
Contributor

We should determine if this is worth fixing, if the benchmark is too noisy (or not really useful), or if there's a real bug here.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Oct 26, 2022
@mknyszek mknyszek added this to the Go1.20 milestone Oct 26, 2022
@randall77
Copy link
Contributor

Definitely not a great benchmark. With a sub-1ns running time, lots of noise. I think we can probably get rid of it. (Anyone know how to do that? Where's the config of what is run?) @dr2chase

That said, it does look like there is something happening here. The inner loop looks like it has an extra bounds check in it. That may be fixable, I'd need to extract a repro from this giant repository first.

@randall77
Copy link
Contributor

I can reproduce with:

package p

import (
	"testing"
)

func BenchmarkAppendMsgReplicateDecision(b *testing.B) {
	v := ReplicateDecision{}
	bts := make([]byte, 0, v.Msgsize())
	bts, _ = v.MarshalMsg(bts[0:0])
	b.SetBytes(int64(len(bts)))
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		bts, _ = v.MarshalMsg(bts[0:0])
	}
}

func (z ReplicateDecision) MarshalMsg(b []byte) (o []byte, err error) {
	o = Require(b, z.Msgsize())
	// map header, size 0
	o = append(o, 0x80)
	return
}

type ReplicateDecision struct {
	targetsMap map[string]int64
}

func Require(old []byte, extra int) []byte {
	l := len(old)
	c := cap(old)
	r := l + extra
	if c >= r {
		return old
	} else if l == 0 {
		return make([]byte, 0, extra)
	}
	c <<= 1
	if c < r {
		c = r
	}
	n := make([]byte, l, c)
	copy(n, old)
	return n
}

func (z ReplicateDecision) Msgsize() (s int) {
	s = 1
	return
}

@dr2chase
Copy link
Contributor

dr2chase commented Oct 27, 2022

The default bent benchmarks are here: https://cs.opensource.google/go/x/benchmarks/+/3255f073:cmd/bent/configs/benchmarks-50.toml

It needs a decoder ring to tell you what benchmark+package that is, but it's in minio.
(Decoder ring, scroll down: https://perf.golang.org/search?q=upload:20221026.20 )

A nanosecond's not great, but this was a regression, right? I try to have a diversity of benchmarks, but also, I am willing to ignore a few regressions if the big picture is okay.

@randall77
Copy link
Contributor

Simple repro:

package p

import (
	"testing"
)

func BenchmarkFoo(b *testing.B) {
	s := make([]byte, 1)
	for i := 0; i < b.N; i++ {
		s = append(s[:0], 0)
	}
}

The generated code is a bit worse after my CL because the compiler doesn't realize that growslice definitely makes the returned slice have length 1.
Previously, the new length was computed as 0+1, which makes the length known after the append. But my change reloads the length from the result of growslice so we don't have to spill/restore the length. But if the length is constant, spill/restore isn't really a worry. This is probably fixable with a special case. More invasive may be teaching the prove pass about the equality of length between the input and output of growslice.

@gopherbot
Copy link

Change https://go.dev/cl/445875 mentions this issue: cmd/compile: recognize when the result of append has a constant length

@prattmic
Copy link
Member

FWIW, this did not seem to fix the benchmark: https://perf.golang.org/dashboard/?benchmark=AppendMsgReplicateDecision-8&days=90&end=2022-10-28T15%3A45

The most recent run is for https://go.dev/cl/444715, which was submitted after https://go.dev/cl/445875.

@randall77
Copy link
Contributor

You're right, it fixes my simple repro but not the original repro.

@randall77 randall77 reopened this Oct 28, 2022
@gopherbot
Copy link

Change https://go.dev/cl/446277 mentions this issue: cmd/compile: add rule for post-decomposed growslice optimization

romaindoumenc pushed a commit to TroutSoftware/go that referenced this issue Nov 3, 2022
Fixes a performance regression due to CL 418554.

Fixes golang#56440

Change-Id: I6ff152e9b83084756363f49ee6b0844a7a284880
Reviewed-on: https://go-review.googlesource.com/c/go/+/445875
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
romaindoumenc pushed a commit to TroutSoftware/go that referenced this issue Nov 3, 2022
The recently added rule only works before decomposing slices.
Add a rule that works after decomposing slices.

The reason we need the latter is because although the length may
be a constant, it can be hidden inside a slice that is not constant
(its pointer or capacity might be changing). By applying this
optimization after decomposing slices, we can find more cases
where it applies.

Fixes golang#56440

Change-Id: I0094e59eee3065ab4d210defdda8227a6e897420
Reviewed-on: https://go-review.googlesource.com/c/go/+/446277
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
@prattmic
Copy link
Member

@golang golang locked and limited conversation to collaborators Dec 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge Performance
Projects
None yet
Development

No branches or pull requests

5 participants