Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syscall: Go 1.12 MacOS syscall/IO perf regression (solved: fsync root cause) #30497

Closed
jared2501 opened this issue Mar 1, 2019 · 6 comments
Closed

Comments

@jared2501
Copy link

jared2501 commented Mar 1, 2019

Version: go version go1.12 darwin/amd64 (MacOS 10.13.6)

After upgrading from go 1.11.5 to go 1.12 I'm seeing a very large performance regression in https://github.com/etcd-io/bbolt tests (see etcd-io/bbolt#149). BoltDB is a memory-mapped BTree database used in projects such as etcd.

Filing an issue here since the only change that has happened is I have changed Go versions.

Steps to reproduce:

  1. Clone https://github.com/boltdb/bolt
  2. Run go test -run TestSimulate_1000op_1p with Go 1.11.5
  3. Run go test -run TestSimulate_1000op_1p with Go 1.12

Observe that the the time to run (3) is about 30x slower than the time to run (2). On my machine step (3) takes ~0.09s and step (2) takes ~2.6s. I cannot reproduce this issue on a Linux machine.

@ianlancetaylor
Copy link
Contributor

30x slower is worse than I would have hoped, but some slowdown is expected. See the release notes, specifically https://golang.org/doc/go1.12#darwin. We now make all system calls through the C library on Darwin, because it is the only way to be forward compatible to future releases.

@ianlancetaylor ianlancetaylor changed the title Go 1.12 MacOS syscall/IO perf regression syscall: Go 1.12 MacOS syscall/IO perf regression Mar 1, 2019
@jared2501
Copy link
Author

jared2501 commented Mar 1, 2019

Yeah I figured it might be related to #17490. Would it be possible to get this behaviour flagged on/off by any chance? I'd be willing to not have forward-compatible builds, i.e. for when running tests.

Also, when I used pprof to debug this, I couldn't actually identify the cause of the slow down. Is this expected?

For example, here's the blockprofile graph:

image

And here's the cpuprofile graph:

image

Both were computed by running go test -run TestSimulate_1000op_1p -blockprofile block.out -cpuprofile profile.out. Note how the cpuprofile graph only has a total of 120ms spent in syscalls, while the total test time was ~2.6s and the blockprofile graph shows ~2.6s spent blocking on a mutex.

@ianlancetaylor
Copy link
Contributor

It's not realistically possible to use a flag. The changes are extensive. We can't support both versions, especially since Apple has explicitly told us to stop using the old version.

I don't really know the answers to your profiling questions. But when I look at your profile, one thing jumps out at me: the use of File.Sync. Another change in 1.12, mentioned at https://golang.org/doc/go1.12#os, is that File.Sync now works correctly on Darwin. As a consequence, it is much slower than it used to be. It is possible that that change, rather than the change to use the C library, is the bulk of the slowdown that you are seeing. Although File.Sync is now much slower on Darwin, it's important to understand that the implementation in Go 1.11 and earlier was simply incorrect, and did not provide the protections expected from the use of File.Sync. There is no way to recover that time other than to stop calling File.Sync. See also #26650 and #27415.

@jared2501
Copy link
Author

Makes sense regarding the flag. And ah interesting, thanks for pointing this out @ianlancetaylor. Good to know (and luckily our production workloads all run on Linux).

I tested this hypothesis of yours by commenting out the fsync calls. When I did this, and ran with go 1.12 darwin/amd64, the tests complete in ~0.1s. Much closer to the go 1.11 speed of 0.08s!

✓ jnewman@Jareds-MacBook-Pro:~/go/src/go.etcd.io/bbolt | master● $ git --no-pager diff
diff --git a/boltsync_unix.go b/boltsync_unix.go
index 9587afe..ed070be 100644
--- a/boltsync_unix.go
+++ b/boltsync_unix.go
@@ -4,5 +4,5 @@ package bbolt

 // fdatasync flushes written data to a file descriptor.
 func fdatasync(db *DB) error {
-	return db.file.Sync()
+	return nil
 }
✓ jnewman@Jareds-MacBook-Pro:~/go/src/go.etcd.io/bbolt | master● $ go test -run TestSimulate_1000op_1p
seed: 48000
quick settings: count=5, items=1000, ksize=1024, vsize=1024
PASS
ok  	go.etcd.io/bbolt	0.100s

@ianlancetaylor
Copy link
Contributor

OK, thanks, I don't think there is anything to do here.

@jared2501
Copy link
Author

jared2501 commented Mar 1, 2019

When I run the same test on a linux machine, I see it take ~0.27s:

centos:/home/user/go/src/go.etcd.io/bbolt
$ go test -run TestSimulate_1000op_1p
seed: 8609
quick settings: count=5, items=1000, ksize=1024, vsize=1024
PASS
ok  	go.etcd.io/bbolt	0.270s
centos:/home/user/go/src/go.etcd.io/bbolt
$ go version
go version go1.12 linux/amd64

This means that the macos fsync implementation is still 10x slower than linux's (!!). I guess since the FD.Fsync implementation literally just performs a F_FULLFSYNC syscall, I guess yeah there isn't really much you can do...

@jared2501 jared2501 changed the title syscall: Go 1.12 MacOS syscall/IO perf regression syscall: Go 1.12 MacOS syscall/IO perf regression (solved: fsync root cause) Mar 1, 2019
@golang golang locked and limited conversation to collaborators Feb 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants