Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexp: onepass regexp.Match allocates memory unnecessarily #19573

Closed
haya14busa opened this issue Mar 16, 2017 · 3 comments
Closed

regexp: onepass regexp.Match allocates memory unnecessarily #19573

haya14busa opened this issue Mar 16, 2017 · 3 comments

Comments

@haya14busa
Copy link
Contributor

haya14busa commented Mar 16, 2017

Please answer these questions before submitting your issue. Thanks!

What did you do?

I noticed that regexp.Match allocates memory unnecessarily while profiling a
program which uses "onepass regex" a lot (e.g. ^[a-z]$).

ROUTINE ======================== regexp.(*Regexp).doExecute in /usr/lib/go/src/regexp/exec.go
   95.50MB    96.02MB (flat, cum)  7.37% of Total
         .          .    434:           }
         .          .    435:   } else if size < m.maxBitStateLen && r == nil {
         .          .    436:           if m.b == nil {
         .          .    437:                   m.b = newBitState(m.p)
         .          .    438:           }
         .   528.17kB    439:           if !m.backtrack(i, pos, size, ncap) {
         .          .    440:                   re.put(m)
         .          .    441:                   return nil
         .          .    442:           }
         .          .    443:   } else {
         .          .    444:           m.init(ncap)
         .          .    445:           if !m.match(i, pos) {
         .          .    446:                   re.put(m)
         .          .    447:                   return nil
         .          .    448:           }
         .          .    449:   }
   95.50MB    95.50MB    450:   dstCap = append(dstCap, m.matchcap...)
         .          .    451:   if dstCap == nil {
         .          .    452:           // Keep the promise of returning non-nil value on match.
         .          .    453:           dstCap = arrayNoInts[:0]
         .          .    454:   }
         .          .    455:   re.put(m)

(This is go tool pprof output profiling against my program, so it contains m.backtrack allocations too, but please ignore it (L439).)

For example, regexp.MustCompile("^[a-z]$").Match([]byte("a")) should not allocate memory unnecessarily at L450 (dstCap = append(dstCap, m.matchcap...)), but allocation occurs because len(m.matchcap) == 2.

As for matching of non onepass regex (e.g. .*[a-z] or something), m.backtrack(i, pos, size, ncap) or m.init(ncap) reset m.matchcap length to zero (ncap==0 for regexp.Match) and doesn't allocate memory at L450.

What did you expect to see?

No allocation for regexp.Match of onepass regexp.

See CL for benchmark script.

benchmark                                    old ns/op      new ns/op      delta
BenchmarkMatch_match_onepass_regex/32-4      6465           4628           -28.41%
BenchmarkMatch_match_onepass_regex/1K-4      208324         151558         -27.25%
BenchmarkMatch_match_onepass_regex/32K-4     7230259        5834492        -19.30%
BenchmarkMatch_match_onepass_regex/1M-4      234379810      166310682      -29.04%
BenchmarkMatch_match_onepass_regex/32M-4     7903529363     4981119950     -36.98%

benchmark                                    old MB/s     new MB/s     speedup
BenchmarkMatch_match_onepass_regex/32-4      4.95         6.91         1.40x
BenchmarkMatch_match_onepass_regex/1K-4      4.92         6.76         1.37x
BenchmarkMatch_match_onepass_regex/32K-4     4.53         5.62         1.24x
BenchmarkMatch_match_onepass_regex/1M-4      4.47         6.30         1.41x
BenchmarkMatch_match_onepass_regex/32M-4     4.25         6.74         1.59x

benchmark                                    old allocs     new allocs     delta
BenchmarkMatch_match_onepass_regex/32-4      32             0              -100.00%
BenchmarkMatch_match_onepass_regex/1K-4      1024           0              -100.00%
BenchmarkMatch_match_onepass_regex/32K-4     32768          0              -100.00%
BenchmarkMatch_match_onepass_regex/1M-4      1048576        0              -100.00%
BenchmarkMatch_match_onepass_regex/32M-4     104559255      0              -100.00%

benchmark                                    old bytes      new bytes     delta
BenchmarkMatch_match_onepass_regex/32-4      512            0             -100.00%
BenchmarkMatch_match_onepass_regex/1K-4      16384          0             -100.00%
BenchmarkMatch_match_onepass_regex/32K-4     524288         0             -100.00%
BenchmarkMatch_match_onepass_regex/1M-4      16777216       0             -100.00%
BenchmarkMatch_match_onepass_regex/32M-4     2019458128     0             -100.00%

What did you see instead?

1 allocation for regexp.Match of onepass regexp at L450 (dstCap = append(dstCap, m.matchcap...)).
It's only one needless allocation, but regex will be often called a lot of times.
My program which heavily uses onepass regexp Match actually speedup by about 10% by removing the needless allocation.

System details

go version go1.8 linux/amd64
GOARCH="amd64"
GOBIN="/home/haya14busa/go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/haya14busa"
GORACE=""
GOROOT="/usr/lib/go"
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build371996710=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
GOROOT/bin/go version: go version go1.8 linux/amd64
GOROOT/bin/go tool compile -V: compile version go1.8 X:framepointer
uname -sr: Linux 4.4.48-1-MANJARO
LSB Version:	n/a
Distributor ID:	ManjaroLinux
Description:	Manjaro Linux
Release:	16.10
Codename:	Fringilla
/usr/lib/libc.so.6: GNU C Library (GNU libc) stable release version 2.24, by Roland McGrath et al.
gdb --version: GNU gdb (GDB) 7.12.1
@bradfitz bradfitz added this to the Go1.9Maybe milestone Mar 16, 2017
@bradfitz
Copy link
Contributor

/cc @matloob @rsc

@gopherbot
Copy link

CL https://golang.org/cl/38270 mentions this issue.

@haya14busa
Copy link
Contributor Author

Thank you for including it 👍

@golang golang locked and limited conversation to collaborators Mar 23, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants