Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexp: doOnePass does unnecessary check when regexp has a prefix #48891

Open
bboreham opened this issue Oct 9, 2021 · 2 comments · May be fixed by #48892
Open

regexp: doOnePass does unnecessary check when regexp has a prefix #48891

bboreham opened this issue Oct 9, 2021 · 2 comments · May be fixed by #48892
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@bboreham
Copy link
Contributor

bboreham commented Oct 9, 2021

What version of Go are you using (go version)?

$ go version
go version devel go1.18-541ef34ada Sun Oct 3 18:09:11 2021 +0100 darwin/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/bryan/Library/Caches/go-build"
GOENV="/Users/bryan/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/bryan/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/bryan/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/bryan/src/github.com/golang/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/bryan/src/github.com/golang/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="devel go1.18-541ef34ada Sun Oct 3 18:09:11 2021 +0100"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/bryan/src/github.com/golang/go/src/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/bp/y4j6bp157f52wv55knd_j1fc0000gn/T/go-build3663975253=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Small example: https://play.golang.org/p/y46WA2Dc2dI - adapted from regexp BenchmarkAnchoredLiteralShortNonMatch;
the expression has .*$ added to make it go down the onepass branch.

This statement:

go/src/regexp/exec.go

Lines 431 to 433 in e1c294a

// If there is a simple literal prefix, skip over it.
if pos == 0 && flag.match(syntax.EmptyOp(inst.Arg)) &&
len(re.prefix) > 0 && i.canCheckPrefix() {

checks that the current runes match the op, but at regexp compile-time it is established that the start op must be EmptyBeginText:

go/src/regexp/onepass.go

Lines 470 to 473 in e1c294a

if prog.Inst[prog.Start].Op != syntax.InstEmptyWidth ||
syntax.EmptyOp(prog.Inst[prog.Start].Arg)&syntax.EmptyBeginText != syntax.EmptyBeginText {
return nil
}

EmptyBeginText will always match at position 0 of the input

What did you expect to see?

The check flag.match(syntax.EmptyOp(inst.Arg)) can be removed from run-time execution of the regexp, which means i.hasPrefix can be moved before the instruction and first runes are unpacked.

What did you see instead?

In the case where a onepass regexp has a prefix which does not match, a bit of unnecessary time is spent initializing.

@bboreham bboreham linked a pull request Oct 9, 2021 that will close this issue
@gopherbot
Copy link

Change https://golang.org/cl/354909 mentions this issue: regexp: speed up onepass prefix check

@mvdan
Copy link
Member

mvdan commented Oct 9, 2021

cc @sylvinus

@toothrot toothrot added this to the Backlog milestone Oct 12, 2021
@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants