
Performance of the map[string]uintptr for keys distributed #27770

Closed
larytet opened this issue Sep 20, 2018 · 2 comments

larytet commented Sep 20, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.10.1 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOCACHE="/home/arkady/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/arkady/go"
GORACE=""
GOROOT="/usr/lib/go-1.10"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.10/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build702714205=/tmp/go-build -gno-record-gcc-switches"

What did you do?

https://gist.github.com/larytet/b864982d32ac31fedbbed2ffdac9dc85

What did you expect to see?

Roughly similar performance of the map for any key distribution

What did you see instead?

Insertion is visibly slower for the uniform key distribution. Can it be that replacing an existing entry in the map is faster than inserting a new one? I assumed DDR memory was the bottleneck, since random memory access is the dominating factor in the performance of large maps, so I did not expect the key distribution to affect map performance.

ROUTINE ======================== mcachego/hashtable.BenchmarkMapStore in /home/arkady      80ms      3.11s (flat, cum) 10.27% of Total
         .          .     62:
         .          .     63:// Run the same test with the Go map API for comparison
         .          .     64:func BenchmarkMapStore(b *testing.B) {
         .          .     65:	b.ReportAllocs()
         .          .     66:	//b.N = 100 * 1000
         .       50ms     67:	keys := make([]string, b.N, b.N)
         .          .     68:	for i := 0; i < b.N; i++ {
      40ms      1.14s     69:		keys[i] = fmt.Sprintf("%d", b.N-i)
         .          .     70:	}
         .       30ms     71:	m := make(map[string]uintptr, b.N)
         .          .     72:	b.ResetTimer()
      20ms       20ms     73:	for i := 0; i < b.N; i++ {
         .          .     74:		key := keys[i]
      20ms      1.87s     75:		m[key] = uintptr(i)
         .          .     76:	}
         .          .     77:}
         .          .     78:
         .          .     79:func BenchmarkMapStoreNonuniform(b *testing.B) {
         .          .     80:	b.ReportAllocs()
ROUTINE ======================== mcachego/hashtable.BenchmarkMapStoreNonuniform in      310ms      3.03s (flat, cum) 10.01% of Total
         .          .     77:}
         .          .     78:
         .          .     79:func BenchmarkMapStoreNonuniform(b *testing.B) {
         .          .     80:	b.ReportAllocs()
         .          .     81:	//b.N = 100 * 1000
         .       10ms     82:	keys := make([]string, b.N, b.N)
         .          .     83:	for i := 0; i < b.N; i++ {
      10ms      650ms     84:		keys[i] = fmt.Sprintf("%d", b.N-i)
         .          .     85:	}
         .       10ms     86:	samples := make([]int, b.N, b.N)
      10ms       10ms     87:	for i := 0; i < b.N; i++ {
         .      660ms     88:		grg := gaussian()
      20ms       20ms     89:		index := ((2.0 + grg) / 4.0) * float64(b.N-1)
         .          .     90:		samples[i] = int(index)
         .          .     91:	}
         .          .     92:
         .       40ms     93:	m := make(map[string]uintptr, b.N)
         .          .     94:	b.ResetTimer()
         .          .     95:	for i := 0; i < b.N; i++ {
         .          .     96:		sample := samples[i]
         .          .     97:		if sample < 0 || sample >= b.N {
         .          .     98:			continue
         .          .     99:		}
     210ms      210ms    100:		key := keys[sample]
      60ms      1.42s    101:		m[key] = uintptr(i)
         .          .    102:	}
         .          .    103:}

(pprof)
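A side note on the insert-versus-replace question above: a minimal sketch that separates the two cases, with an assumed fixed key-set size of 1000 for the overwrite case (not a value from the gist):

// map_insert_vs_overwrite_test.go (sketch)
package hashtable

import (
	"fmt"
	"testing"
)

// Every iteration stores a distinct key, so each assignment inserts a new entry.
func BenchmarkMapInsertNew(b *testing.B) {
	keys := make([]string, b.N)
	for i := range keys {
		keys[i] = fmt.Sprintf("%d", i)
	}
	m := make(map[string]uintptr, b.N)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		m[keys[i]] = uintptr(i)
	}
}

// A small fixed key set is reused, so after the first 1000 iterations
// every assignment overwrites an existing entry instead of inserting one.
func BenchmarkMapOverwrite(b *testing.B) {
	const distinct = 1000 // assumed key-set size, for illustration only
	keys := make([]string, distinct)
	for i := range keys {
		keys[i] = fmt.Sprintf("%d", i)
	}
	m := make(map[string]uintptr, distinct)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		m[keys[i%distinct]] = uintptr(i)
	}
}

If overwrites are meaningfully cheaper than fresh inserts, a key distribution that repeats keys will look faster even though the map itself has not changed.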

larytet commented Sep 20, 2018

My bad - bug in the test
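The comment does not say what the bug was; one quick way to see how differently the two benchmarks exercise the map is to count how many of the Gaussian-sampled keys are unique and how many samples are skipped. A minimal standalone sketch, using math/rand's NormFloat64 as a stand-in for the gist's gaussian helper:

// samplecheck.go (sketch)
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	const n = 1000000 // assumed sample count, for illustration only
	unique := make(map[int]struct{}, n)
	skipped := 0
	for i := 0; i < n; i++ {
		grg := rand.NormFloat64() // stand-in for the gist's gaussian()
		index := int(((2.0 + grg) / 4.0) * float64(n-1))
		if index < 0 || index >= n {
			skipped++ // the benchmark skips these samples entirely
			continue
		}
		unique[index] = struct{}{}
	}
	// Every repeated index means the benchmark overwrites an existing
	// map entry instead of inserting a new one.
	fmt.Printf("samples: %d, unique keys: %d, skipped: %d\n", n, len(unique), skipped)
}
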

larytet closed this as completed Sep 20, 2018

martisch commented Sep 20, 2018

For future performance discussions, could you please change the benchmarks to:

  • not have the number of keys inserted / the initial map size (and thereby map load and resizes) depend on b.N; e.g. always insert 1000 or 10000 keys (optionally into a presized map) in each loop iteration, then reset the map (see the sketch after this list)
  • remove ReportAllocs if only CPU time is compared
  • run each benchmark 10 or more times with go test -bench on a quiet system
  • compare the output using benchstat (https://godoc.org/golang.org/x/perf/cmd/benchstat) and post those numbers
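A minimal sketch of a benchmark restructured along those lines; the key count of 1000, the presizing, and the file and function names are assumptions for illustration, not values from the thread:

// map_store_fixed_test.go (sketch)
package hashtable

import (
	"fmt"
	"testing"
)

const numKeys = 1000 // fixed number of keys, independent of b.N

func BenchmarkMapStoreFixed(b *testing.B) {
	keys := make([]string, numKeys)
	for i := range keys {
		keys[i] = fmt.Sprintf("%d", i)
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// A fresh, presized map per iteration keeps the load factor and
		// the number of map growths constant regardless of b.N.
		m := make(map[string]uintptr, numKeys)
		for j, k := range keys {
			m[k] = uintptr(j)
		}
	}
}

Repeated runs on a quiet machine (for example go test -bench=MapStoreFixed -count=10 > old.txt) can then be compared across variants with benchstat old.txt new.txt.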

golang locked and limited conversation to collaborators Sep 20, 2019