Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: unexpected memequal call in short string comparison #24765

Open
bradfitz opened this issue Apr 8, 2018 · 11 comments
Open

cmd/compile: unexpected memequal call in short string comparison #24765

bradfitz opened this issue Apr 8, 2018 · 11 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance Suggested Issues that may be good for new contributors looking for work to do.
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Apr 8, 2018

(Using Go tip, 8818b4d)

Compiling this code on amd64,

type BoxType [4]byte

func (t BoxType) EqualString(s string) bool {
        return string(t[:]) == s
}

I would expect that to compile to something pretty minimal, since the memory comparison must always be 4 bytes, but instead I see a call to memequal:

"".BoxType.EqualString STEXT size=103 args=0x20 locals=0x28
        0x0000 00000 (./bmff/bmff.go:62)        TEXT    "".BoxType.EqualString(SB), $40-32
        0x0000 00000 (./bmff/bmff.go:62)        MOVQ    (TLS), CX
        0x0009 00009 (./bmff/bmff.go:62)        CMPQ    SP, 16(CX)
        0x000d 00013 (./bmff/bmff.go:62)        JLS     96
        0x000f 00015 (./bmff/bmff.go:62)        SUBQ    $40, SP
        0x0013 00019 (./bmff/bmff.go:62)        MOVQ    BP, 32(SP)
        0x0018 00024 (./bmff/bmff.go:62)        LEAQ    32(SP), BP
        0x001d 00029 (./bmff/bmff.go:62)        FUNCDATA        $0, gclocals·a20105803dd226ab8faa525f9ceddf12(SB)
        0x001d 00029 (./bmff/bmff.go:62)        FUNCDATA        $1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
        0x001d 00029 (./bmff/bmff.go:63)        MOVQ    "".s+64(SP), AX
        0x0022 00034 (./bmff/bmff.go:63)        CMPQ    AX, $4
        0x0026 00038 (./bmff/bmff.go:63)        JEQ     56
        0x0028 00040 (./bmff/bmff.go:63)        XORL    AX, AX
        0x002a 00042 (./bmff/bmff.go:63)        MOVB    AL, "".~r1+72(SP)
        0x002e 00046 (./bmff/bmff.go:63)        MOVQ    32(SP), BP
        0x0033 00051 (./bmff/bmff.go:63)        ADDQ    $40, SP
        0x0037 00055 (./bmff/bmff.go:63)        RET
        0x0038 00056 (./bmff/bmff.go:63)        LEAQ    "".t+48(SP), AX
        0x003d 00061 (./bmff/bmff.go:63)        MOVQ    AX, (SP)
        0x0041 00065 (./bmff/bmff.go:63)        MOVQ    "".s+56(SP), AX
        0x0046 00070 (./bmff/bmff.go:63)        MOVQ    AX, 8(SP)
        0x004b 00075 (./bmff/bmff.go:63)        MOVQ    $4, 16(SP)
        0x0054 00084 (./bmff/bmff.go:63)        PCDATA  $0, $1
        0x0054 00084 (./bmff/bmff.go:63)        CALL    runtime.memequal(SB)
        0x0059 00089 (./bmff/bmff.go:63)        MOVBLZX 24(SP), AX
        0x005e 00094 (./bmff/bmff.go:63)        JMP     42
        0x0060 00096 (./bmff/bmff.go:63)        NOP
        0x0060 00096 (./bmff/bmff.go:62)        PCDATA  $0, $-1
        0x0060 00096 (./bmff/bmff.go:62)        CALL    runtime.morestack_noctxt(SB)
        0x0065 00101 (./bmff/bmff.go:62)        JMP     0

Ideally we'd just do a 4 byte (potentially unaligned) load from the string into a register and compare with the receiver?

/cc @josharian @randall77 @rasky

@bradfitz bradfitz added this to the Unplanned milestone Apr 8, 2018
@josharian
Copy link
Contributor

In case someone wants to look into this, the relevant bit of code to soup up is from walk.go, func walkexpr, near the beginning of case OCMPSTR:

		// Rewrite comparisons to short constant strings as length+byte-wise comparisons.
		var cs, ncs *Node // const string, non-const string
		switch {
		case Isconst(n.Left, CTSTR) && Isconst(n.Right, CTSTR):
			// ignore; will be constant evaluated
		case Isconst(n.Left, CTSTR):
			cs = n.Left
			ncs = n.Right
		case Isconst(n.Right, CTSTR):
			cs = n.Right
			ncs = n.Left
		}

This code tries to detect strings with known length by looking for constant strings. Fixing this issue would involve adding detection of strings converted from byte (and rune) slices of known length.

Might (maybe) be a decent starter compiler issue.

@josharian josharian added Suggested Issues that may be good for new contributors looking for work to do. help wanted labels Apr 11, 2018
@ChrisALiles
Copy link
Contributor

I’m continuing to try to get started (or maybe continually trying) so I’d like to have a look at this.

@josharian
Copy link
Contributor

It might make sense to do this via SSA rewrite rules. See the memmove rewrite rules for reference. Without some digging, I couldn't say offhand which is best.

One argument for putting it somewhere suitable in the front-end is that this code should boil down to returning a constant:

package p

func f() int {
	var a [4]byte
	return len(string(a[:]))
}

@TocarIP
Copy link
Contributor

TocarIP commented Jun 28, 2018

@ChrisALiles are you working on this? I've encountered this pattern in internal code, so I was thinking on implementing this optimization.

@bradfitz bradfitz added the NeedsFix The path to resolution is known, but the work has not been done. label Jun 28, 2018
@ChrisALiles
Copy link
Contributor

ChrisALiles commented Jun 28, 2018 via email

@TocarIP
Copy link
Contributor

TocarIP commented Jun 29, 2018

@ChrisALiles are you using walk.go or ssa rules based approach? I've implemented something to see if it will speed-up internal code and going to mail it for trybot testing on other architectures, but please continue working on your prototype, as there is plenty of time left until code unfreeze.

@gopherbot
Copy link

Change https://golang.org/cl/121697 mentions this issue: cmd/compile: inline runtime.memequal if possible

@ChrisALiles
Copy link
Contributor

ChrisALiles commented Jun 30, 2018 via email

@gopherbot
Copy link

Change https://golang.org/cl/130255 mentions this issue: cmd/compile: optimise some small equality comparisons

@ChrisALiles
Copy link
Contributor

CL 130255 is probably only of academic interest at best seeing that CL 121697 has already been submitted, but I thought I would send it anyway because it cost me a fair amount of effort. A couple of notes: 1. I originally tried to find a way to fix this in walk and found a way to handle Josharian's "len" example above, so I left that in. 2. Don't send it to the bots - it fails in test/codegen because I wasn't brave/silly enough to try anything with architectures other than amd64/386.

@TocarIP
Copy link
Contributor

TocarIP commented Aug 23, 2018

@ChrisALiles CL 121697 is not submitted, due to trybot failures. My understanding is that main problem is that replacing call to memequal with store to the stack may change stack layout and cause write after stack end. I didn't had time to fix that.

CAFxX pushed a commit to CAFxX/go that referenced this issue Oct 19, 2018
For calls to runtime.memequal with known size, we can just
generate single compare instruction. Do this on all architectures, that
have appropriate sized unaligned load and compare instructions.

Shaves ~8kb from go tool:
go_old 12092977
/localdisk/itocar/golang/bin/go 12080689 [-12288 bytes]

section differences
global text (code) = -5904 bytes (-0.134493%)
read-only data = -2839 bytes (-0.098694%)
Total difference -8743 bytes (-0.112717%)

Fixes golang#24765

Change-Id: I13fcb7ac046df6915521340ec6321465c7f4e93c
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. help wanted NeedsFix The path to resolution is known, but the work has not been done. Performance Suggested Issues that may be good for new contributors looking for work to do.
Projects
None yet
Development

No branches or pull requests

5 participants