Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: Switch on four-byte string constants repeats comparisons #53333

Closed
greatroar opened this issue Jun 11, 2022 · 5 comments
Closed
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Milestone

Comments

@greatroar
Copy link

greatroar commented Jun 11, 2022

What version of Go are you using (go version)?

$ gotip version
go version devel go1.19-55590f3a2b Fri Jun 10 23:30:35 2022 +0000 linux/amd64

What operating system and processor architecture are you using (go env)?

GOAMD64="v2"
GOARCH="amd64"

What did you do?

func mimetype(ext string) string {
	switch ext {
	case ".htm":
		return "text/html; charset=utf-8"
	case ".eot":
		return "application/vnd.ms-fontobject"
	case ".svg":
		return "image/svg+xml; charset=utf-8"
	case ".ttf":
		return "font/ttf"
	default:
		return ""
	}
}

What did you expect to see?

I expected the switch to be compiled to a check for len(ext) == 4 followed by a binary search that has a single CMPL for each case.

What did you see instead?

A binary search where each case has a length check against four, followed by a CMPL; worse, the binary search is preceded by a runtime.cmpstring against ".htm", so that comparison occurs twice (it also occurs as a CMPL against $1836345390).

The generated code is closer to expected when there is one fewer case. Then it only repeats the length check.

@greatroar greatroar changed the title cmd/compile: Switch one four-byte strings repeats comparisons cmd/compile: Switch on four-byte strings repeats comparisons Jun 11, 2022
@greatroar greatroar changed the title cmd/compile: Switch on four-byte strings repeats comparisons cmd/compile: Switch on four-byte string constants repeats comparisons Jun 11, 2022
@Jorropo
Copy link
Member

Jorropo commented Jun 11, 2022

We should also do that with 8 bytes on 64 bits architectures with support for 8 bytes immediates in comparisons.

@greatroar
Copy link
Author

The GOSSAFUNC dump already shows both the literal comparisons and the runtime.cmpstring call in the AST repr:

. . IF-Cond
. . . LE bool tc(1) # sw.go:13:2
. . . . CALLFUNC Walked int tc(1) # sw.go:13:2
. . . . . NAME-runtime.cmpstring Class:PFUNC Offset:0 Used FUNC-func(string, string) int tc(1)
. . . . CALLFUNC-Args
. . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . LITERAL-".htm" string tc(1) # sw.go:14:7
. . . . LITERAL-0 int tc(1) # sw.go:13:2
. . IF-Body
. . . IF # _sw.go:13:2
. . . IF-Cond
. . . . ANDAND bool tc(1) # sw.go:16:2
. . . . . EQ bool tc(1) # sw.go:16:2
. . . . . . LEN int tc(1) # sw.go:16:2
. . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . LITERAL-4 int tc(1) # sw.go:16:2
. . . . . EQ bool tc(1) # sw.go:16:2
. . . . . . LITERAL-1953457454 uint32 tc(1) # sw.go:16:2
. . . . . . OR uint32 tc(1) # sw.go:16:2
. . . . . . . OR uint32 tc(1) # sw.go:16:2
. . . . . . . . OR uint32 tc(1) # sw.go:16:2
. . . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . . LITERAL-0 int tc(1) # sw.go:16:2
. . . . . . . . . LSH uint32 tc(1) # sw.go:16:2
. . . . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . . . LITERAL-1 int tc(1) # sw.go:16:2
. . . . . . . . . . LITERAL-8 uint tc(1) # sw.go:16:2
. . . . . . . . LSH uint32 tc(1) # sw.go:16:2
. . . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . . LITERAL-2 int tc(1) # sw.go:16:2
. . . . . . . . . LITERAL-16 uint tc(1) # sw.go:16:2
. . . . . . . LSH uint32 tc(1) # sw.go:16:2
. . . . . . . . CONV uint32 tc(1) # sw.go:16:2
. . . . . . . . . INDEX byte tc(1) # sw.go:16:2
. . . . . . . . . . NAME-main..autotmp_2 esc(N) Class:PAUTO Offset:0 AutoTemp OnStack Used string tc(1) # sw.go:13:2
. . . . . . . . . . LITERAL-3 int tc(1) # sw.go:16:2
. . . . . . . . LITERAL-24 uint tc(1) # sw.go:16:2

@seankhliao seankhliao added Performance NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 11, 2022
@randall77
Copy link
Contributor

I think the reason it is this way is that the top clause of the binary search is a <= on strings, whereas the other comparisons are == on strings.

The relevant code is in cmd/compile/internal/walk/compare.go. it says:

// Don't do byte-wise comparisons for <, <=, etc.
// They're fairly complicated.

I think we'd just have to implement that. It will be somewhat complicated I think (e.g. endianness matters), but not too bad.

@randall77 randall77 added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Jun 13, 2022
@randall77 randall77 added this to the Unplanned milestone Jun 13, 2022
@randall77
Copy link
Contributor

I looked at this a bit more. I originally thought we could just optimize generic string vs. constant string ordered comparisons, and that would fix this issue also. But those comparisons are very tough, mostly because we don't know anything about the length of the nonconstant string. We can't easily read the nonconstant string because we need to check the length to be sure we're still in bounds. ==/!= comparisons avoid that by comparing the length first.
However, in switch statements, we've already determined that the lengths are equal. So we could do something simpler there. It would probably require a new comparison op that's "less than on a string and constant string with equal length".

@gopherbot
Copy link

Change https://go.dev/cl/414894 mentions this issue: cmd/compile: use better splitting condition for string binary search

@joedian joedian added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 16, 2022
@dmitshur dmitshur modified the milestones: Unplanned, Go1.20 Jan 23, 2023
@golang golang locked and limited conversation to collaborators Jan 23, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Performance
Projects
None yet
Development

No branches or pull requests

7 participants