runtime: cgo performance tracking bug #9704

minux · 2015-01-28T00:27:11Z

Running this stupid microbenchmark on linux/amd64 with different versions of Go:
http://play.golang.org/p/5U0i26sA8U

package main

// int rand() { return 42; }
import "C"

import "testing"

func BenchmarkCgo(b *testing.B) {
    for i := 0; i < b.N; i++ {
        C.rand()
    }
}

func main() {
    testing.Main(func(string, string) (bool, error) {
        return true, nil
    }, nil, []testing.InternalBenchmark{
        {"BenchmarkCgo", BenchmarkCgo},
    }, nil)
}

$ go1 run cgobench.go -test.bench=.
testing: warning: no tests to run
PASS
BenchmarkCgo    50000000            30.8 ns/op
$ go112 run cgobench.go -test.bench=.
testing: warning: no tests to run
PASS
BenchmarkCgo    50000000            40.9 ns/op
$ go121 run cgobench.go -test.bench=.
testing: warning: no tests to run
PASS
BenchmarkCgo    50000000            46.1 ns/op
$ go133 run cgobench.go -test.bench=.
testing: warning: no tests to run
PASS
BenchmarkCgo    50000000            48.3 ns/op
$ go141 run cgobench.go -test.bench=.
testing: warning: no tests to run
PASS
BenchmarkCgo    10000000           160 ns/op
$ go run cgobench.go -test.bench=. # today's Go tip, f4a2617
testing: warning: no tests to run
PASS
BenchmarkCgo    10000000           203 ns/op

Why? Go 1.4 is much worse than any of the previous releases.
And Go tip is even worse than Go 1.4. This might be understandable,
but I wonder why Go 1.4 is that much slower than 1.3.3?
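
To put the figures above in proportion, here is a minimal sketch that computes each run's slowdown relative to go1 and relative to the run before it. The ns/op values are copied from the output above; the `slowdown` helper and the layout are illustrative only, not part of the original report. By this arithmetic the step from go133 to go141 is roughly 3.3x, which is the jump the question is about.

```go
package main

import "fmt"

// slowdown reports how many times slower v is than base (both in ns/op).
// Hypothetical helper, introduced only for this illustration.
func slowdown(base, v float64) float64 {
	return v / base
}

func main() {
	// ns/op figures copied from the runs above (linux/amd64).
	// Names are the command names used in the report.
	results := []struct {
		name    string
		nsPerOp float64
	}{
		{"go1", 30.8},
		{"go112", 40.9},
		{"go121", 46.1},
		{"go133", 48.3},
		{"go141", 160},
		{"tip f4a2617", 203},
	}

	base := results[0].nsPerOp
	prev := base
	for _, r := range results {
		fmt.Printf("%-12s %6.1f ns/op  %.2fx vs go1  %.2fx vs previous\n",
			r.name, r.nsPerOp, slowdown(base, r.nsPerOp), slowdown(prev, r.nsPerOp))
		prev = r.nsPerOp
	}
}
```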

dvyukov · 2015-01-28T07:25:48Z

@RLH @randall77 @rsc

It is the combined effect of the C->Go conversion, write barriers, and atomic manipulation of goroutine statuses:

10.45% cgo cgo [.] runtime.cas
8.23% cgo cgo [.] runtime.deferreturn
7.21% cgo cgo [.] runtime.writebarrierptr
6.60% cgo cgo [.] runtime.reentersyscall
6.29% cgo cgo [.] runtime.newdefer
4.07% cgo cgo [.] runtime.getg
3.77% cgo cgo [.] runtime.exitsyscall
3.53% cgo cgo [.] main.BenchmarkCgo
3.50% cgo cgo [.] runtime.systemstack
3.20% cgo cgo [.] runtime.cgocall_errno
3.16% cgo cgo [.] main._Cfunc_rand
2.94% cgo cgo [.] runtime.freedefer
2.93% cgo cgo [.] runtime.deferproc
2.84% cgo cgo [.] runtime.exitsyscallfast
2.69% cgo cgo [.] runtime.casgstatus
2.58% cgo cgo [.] runtime.func.042
2.53% cgo cgo [.] runtime.atomicstore
2.50% cgo cgo [.] runtime.acquirem
2.42% cgo cgo [.] runtime.releasem
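
The four defer-related entries in the profile above (runtime.deferreturn, runtime.newdefer, runtime.freedefer, runtime.deferproc) add up to roughly 20% of samples. A rough, pure-Go way to get a feel for what a defer costs on a given toolchain is to time the same function with and without it. This is only a sketch under stated assumptions: it does not go through cgo at all, the names `incDirect`, `incDefer` and `nsPerOp` are made up for illustration, and on recent Go releases the gap may be small.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

var (
	mu      sync.Mutex
	counter int
)

// incDirect unlocks explicitly, so no defer is involved.
func incDirect() {
	mu.Lock()
	counter++
	mu.Unlock()
}

// incDefer does the same work but releases the lock through defer.
func incDefer() {
	mu.Lock()
	defer mu.Unlock()
	counter++
}

// nsPerOp runs f under testing.Benchmark, which works outside "go test",
// and returns the measured time per call in nanoseconds.
func nsPerOp(f func()) float64 {
	r := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			f()
		}
	})
	return float64(r.T.Nanoseconds()) / float64(r.N)
}

func main() {
	fmt.Printf("direct: %.1f ns/op\n", nsPerOp(incDirect))
	fmt.Printf("defer:  %.1f ns/op\n", nsPerOp(incDefer))
}
```

Both variants are called through the same function value, so the indirect-call overhead is the same on each side and only the defer differs.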

rsc · 2015-06-08T06:13:16Z

Or maybe it's a sinister plot to encourage people to write Go code.

Either way, too late for Go 1.5.

capnm · 2015-08-26T19:52:13Z

linux/amd64, linux/arm:

amd64  go-13      BenchmarkCgo  10000000               179 ns/op
amd64  go-14      BenchmarkCgo   3000000               494 ns/op
amd64  go-15      BenchmarkCgo   5000000               354 ns/op ~2x

arm go-13         BenchmarkCgo   2000000               821 ns/op
arm go-14         BenchmarkCgo   1000000              2359 ns/op
arm go-15         BenchmarkCgo    500000              2570 ns/op ~3x

mwhudson · 2016-10-02T21:21:16Z

Going by BenchmarkCgoCall in misc/cgo/test, on my linux/amd64 system, current tip is about twice as fast as go 1.7. Go 1.7, 1.6 and 1.5 are all about the same and Go 1.4 is slightly slower (I lack the energy to get 1.3 cgo working on this system). So the thing the bug specifically complains about ("decrease of cgocall performance") is probably fixed, but is this fast enough? (It's always hard to know when a bug like this can be closed).

crawshaw · 2016-10-02T21:36:42Z

http://golang.org/cl/30080 ~halved the cgo overhead between 1.7 and tip.

As reported this can be closed, but how about we keep it as a cgo performance tracking bug? I don't expect any more improvements for 1.8, but in the future a runtime accounting overhaul could make it cheaper.

minux · 2016-10-02T23:26:41Z

New benchmark result on the same machine:

go version go1
BenchmarkCgo 50000000 31.5 ns/op
go version go1.1.2 linux/amd64
BenchmarkCgo 50000000 42.5 ns/op
go version go1.2.1 linux/amd64
BenchmarkCgo 50000000 48.4 ns/op
go version go1.2.2 linux/amd64
BenchmarkCgo 50000000 47.6 ns/op
go version go1.3 linux/amd64
BenchmarkCgo 50000000 47.6 ns/op
go version go1.3.1 linux/amd64
BenchmarkCgo 50000000 48.8 ns/op
go version go1.3.2 linux/amd64
BenchmarkCgo 50000000 52.9 ns/op
go version go1.3.3 linux/amd64
BenchmarkCgo 50000000 50.5 ns/op
go version go1.4 linux/amd64
BenchmarkCgo 10000000 167 ns/op
go version go1.4.1 linux/amd64
BenchmarkCgo 10000000 169 ns/op
go version go1.4.2 linux/amd64
BenchmarkCgo 10000000 170 ns/op
go version go1.4.3 linux/amd64
BenchmarkCgo 10000000 172 ns/op
go version go1.5 linux/amd64
BenchmarkCgo-4 10000000 162 ns/op
go version go1.5.1 linux/amd64
BenchmarkCgo-4 10000000 159 ns/op
go version go1.5.2 linux/amd64
BenchmarkCgo-4 10000000 169 ns/op
go version go1.5.3 linux/amd64
BenchmarkCgo-4 10000000 169 ns/op
go version go1.5.4 linux/amd64
BenchmarkCgo-4 10000000 170 ns/op
go version go1.6 linux/amd64
BenchmarkCgo-4 10000000 160 ns/op
go version go1.6.1 linux/amd64
BenchmarkCgo-4 10000000 157 ns/op
go version go1.6.2 linux/amd64
BenchmarkCgo-4 10000000 158 ns/op
go version go1.6.3 linux/amd64
BenchmarkCgo-4 10000000 158 ns/op
go version go1.7 linux/amd64
BenchmarkCgo-4 10000000 162 ns/op
go version go1.7.1 linux/amd64
BenchmarkCgo-4 10000000 164 ns/op
go version devel +9984195 Sun Oct 2 19:38:37 2016 +0000 linux/amd64
BenchmarkCgo-4 20000000 62.1 ns/op

aclements · 2016-10-03T15:25:24Z

@minux, what benchmark were you running?

dgryski · 2018-01-03T05:48:00Z

@aclements I think @minux was running the Go program at the top of the bug that just calls into a C function returning an int.

navytux · 2018-02-18T12:54:24Z

Note that in addition to serial cgo slowness (i.e. ~60ns per call), making several cgo calls in sequence in the presence of other goroutines can cause further slowdown: #19574 (comment).

thepudds · 2019-06-04T18:14:08Z

Now that https://golang.org/cl/171758 is merged for 1.13 fixing #6980, is that expected to help here as well?

When a defer is executed at most once in a function body,
we can allocate the defer record for it on the stack instead
of on the heap.
...
name     old time/op  new time/op  delta
Defer-4  52.2ns ± 5%  36.2ns ± 3%  -30.70%  (p=0.000 n=10+10)

thepudds · 2019-06-04T19:36:18Z

FWIW, from a very quick test, using tip does not seem to be significantly faster than 1.12.5. Also, I don't know if this is already tracked elsewhere or perhaps not a meaningful result, but this quick test seems to show go1.10 improving, but go1.12 slowing down again.

===============
go1.7.6
BenchmarkCgo-8          100000000              197 ns/op
===============
go1.8.7
BenchmarkCgo-8          200000000               79.7 ns/op
===============
go1.9.4
BenchmarkCgo-8          200000000               81.3 ns/op
===============
go1.10.4
BenchmarkCgo-8          200000000               73.4 ns/op
===============
go1.11.4
BenchmarkCgo-8          200000000               73.2 ns/op
===============
go1.12.5
BenchmarkCgo-8          200000000               81.7 ns/op
===============
gotip
BenchmarkCgo-8          152019986               79.0 ns/op

This is running the benchmark from #9704 (comment) via
go run cgobench.go -test.bench=. -test.benchtime=10s on linux/amd64.

gotip is devel +fff4f59 Tue Jun 4 17:35:20 2019 +0000

navytux · 2019-06-05T08:49:22Z

I confirm what @thepudds reports: there is an improvement with go1.10, a slowdown again with go1.12, and things are not getting faster with tip:

$ benchstat go17.txt go18.txt go19.txt go110.txt go111.txt go112.txt gotip.txt 
name \ time/op  go17.txt    go18.txt   go19.txt     go110.txt    go111.txt    go112.txt    gotip.txt
Cgo-4           186ns ± 1%  82ns ± 0%
goos:linux goarch:amd64
Cgo-4                                  84.5ns ± 0%  74.9ns ± 0%  76.6ns ± 0%  81.1ns ± 0%  81.4ns ± 0%

All measurements were done on an unloaded machine with the CPU frequency fixed and all CPU idle states except C1 disabled (see http://navytux.spb.ru/~kirr/neo.html#measurements-stability for details).

$ ./neotest info-local
date:   Wed, 05 Jun 2019 11:44:55 +0300
xnode:  ...
uname:  Linux deco 4.19.0-5-amd64 #1 SMP Debian 4.19.37-3 (2019-05-15) x86_64 GNU/Linux
cpu:    Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
cpu/[0-3]/freq: intel_pstate/performance [2.60GHz - 2.60GHz]
cpu/[0-3]/idle: intel_idle/menu: POLL·0/0 C1·2/2 !C1E·10/20 !C3·70/100 !C6·85/200 !C7s·124/800 !C8·200/800 !C9·480/5000 !C10·890/5000 # elat/tres µs
...

$ gotip version
go version devel +5f509148b1 Wed Jun 5 00:53:25 2019 +0000 linux/amd64

minux added repo-main labels Jan 28, 2015

minux added this to the Go1.5 milestone Jan 28, 2015

rsc removed cgo labels Apr 10, 2015

rsc modified the milestones: Unplanned, Go1.5 Jun 8, 2015

bradfitz mentioned this issue Jun 8, 2015

runtime: go1.5 works slower than go1.4? #11114

Closed

minux mentioned this issue Mar 24, 2016

runtime: defer is slow #14939

Closed

ALTree mentioned this issue Jun 13, 2016

proposal: a faster C-call mechanism for non-blocking C functions #16051

Closed

minux changed the title ~~runtime: steady decrease of cgocall performance~~ runtime: cgo performance tracking bug Oct 2, 2016

minux added the Performance label Oct 2, 2016

mostlygeek mentioned this issue Dec 2, 2016

Refactor to use go 1.8 mozilla-services/go-syncstorage#154

Closed

ALTree mentioned this issue Feb 27, 2022

cgo: zero cost cgo calling #51380

Closed
