runtime: use vDSO to accelerate time.Now on linux/386 #22190
Comments
Change https://golang.org/cl/69390 mentions this issue:
Change https://golang.org/cl/69391 mentions this issue:
This seems obviously desirable to me. Some concrete microbenchmark numbers showing that time.Now() is actually faster on linux/386 would be good though.
Yes, I'm going to drop this out of the proposal process and make it an ordinary issue. Thanks for tackling this.
I have a VM-based benchmark to hand. I will try to follow up with more comprehensive results tomorrow. Ubuntu 32-bit running under VMWare on macOS. Proposed patches applied to go.
As promised, here are some indicative benchmark results, comparing the go 1.9.1 release and a version with the proposed vDSO changes. The benchmark code is very simple:

    package benchmark

    import (
        "testing"
        "time"
    )

    func BenchmarkTimeNow(b *testing.B) {
        for i := 0; i < b.N; i++ {
            time.Now()
        }
    }

Binaries are pre-built with their respective compiler versions, and each test is run on a single CPU with:

    $ options="-test.cpu=1 -test.count=20 -test.bench=."
    $ ./timetest.191 $options > results.191
    $ ./timetest.vdso $options > results.vdso
    $ benchstat results.191 results.vdso

Intel Celeron
This shows the performance increase on i686, and that amd64 performance on the same hardware is unchanged.

    $ cat /proc/cpuinfo | grep "model name"
    model name : Intel(R) Celeron(R) CPU J1900 @ 1.99GHz
    (4x)

Kernel: linux/i686,
Kernel: linux/x86_64,

Virtualized
The performance increase in a virtualized environment is more dramatic, presumably due to the overheads of virtualizing the syscall used in go 1.9.1.

VMWare
Host: darwin/amd64

    $ cat /proc/cpuinfo | grep "model name"
    model name : Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz

Guest Kernel: linux/i686,
Guest Kernel: linux/x86_64,

Docker
Host Kernel: linux/x86_64

    $ cat /proc/cpuinfo | grep "model name"
    model name : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
    (32x)

Guest Kernel: linux/i686,
This is gonna be a long shot: what are the odds we can get this before Go 1.10, i.e. in a possible 1.9.2? 😅
@tsuna None. Point releases are for security fixes and important bug fixes only.
Alright alright, I figured that would be the answer but at least I asked 😅 Thanks for getting this in though, we have some linux/i386 environments where this change is more than welcome.
Last updated: 2017-10-09
Abstract
Use the Linux vDSO __vdso_clock_gettime function (if available) to accelerate calls to time.Now() on linux/386.
Background
The Linux kernel can provide a "fast path" for some heavily used system calls which can be satisfied in user space more efficiently. The vDSO is an ELF-formatted virtual dynamic shared library injected into a process address space by the kernel, usually provided through an auxv entry at process startup. Several clock and time-related functions are included in this set of functions. When the vDSO is not present, normal syscalls must be used.
This mechanism is already in use on linux/amd64 to accelerate time.Now().
Proposal
The proposal is to use the same approach as used on linux/amd64 to locate the relevant vDSO function, and use it if available, to accelerate time.Now() on linux/386.
The proposal is to only accelerate the clock functions required to implement time.Now(). No other calls will be affected.
Rationale
There is a significant performance difference between a syscall to obtain a clock, and a corresponding vDSO-based call.
A prototype implementation found that the vDSO path is 5x to 10x faster than the syscall equivalent, depending on processor, virtualization etc.
For certain applications that make heavy use of timestamping (for example, metrics and telemetry), improving the performance of timestamping can make a significant performance improvement overall.
As of go 1.9, time.Now() on linux/386 requires two syscalls, which has doubled the call cost over previous versions. Adding vDSO support would more than pay for this.
Compatibility
If the vDSO-accelerated function is not found at runtime, then the existing syscall implementation will automatically be used as fallback.
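The fallback behaviour can be pictured with a small user-space sketch. This is only an analogy: the real check lives in the runtime's assembly, and the names fastClock, slowClock and now below are hypothetical stand-ins, not runtime symbols.

```go
package main

import (
	"fmt"
	"time"
)

// clockFunc returns nanoseconds since the Unix epoch.
type clockFunc func() int64

// fastClock stands in for the vDSO entry point (hypothetical). It stays nil
// when the symbol was not found at startup.
var fastClock clockFunc

// slowClock stands in for the existing syscall-based implementation.
func slowClock() int64 { return time.Now().UnixNano() }

// now prefers the fast path and falls back to the slow one when the fast
// path is unavailable. The second result reports which path was taken.
func now() (ns int64, usedFast bool) {
	if fastClock != nil {
		return fastClock(), true
	}
	return slowClock(), false
}

func main() {
	_, fast := now()
	fmt.Println("fast path used:", fast) // nothing registered yet

	// Simulate the symbol having been found during startup.
	fastClock = func() int64 { return time.Now().UnixNano() }
	_, fast = now()
	fmt.Println("fast path used:", fast)
}
```

Callers see the same result type either way, which is why the change can stay invisible outside the time functions.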
The change will be limited to the time functions provided internally in runtime, and used by time.Now(), so that other calls will not be affected.
Implementation
Adapt the code currently in src/runtime/vdso_linux_amd64.go so that it can also be used for ELF32 on linux/386. The initial implementation will be based on a code copy-and-edit so that only linux/386 is affected by the change.
Adapt the runtime.walltime() and runtime.nanotime() functions (in src/runtime/sys_linux_386.s) to check for and use __vdso_clock_gettime if it was found during startup, or fall back to the existing syscall if not.
Refactor the vDSO ELF symbol lookup code to eliminate duplication between linux/386 and linux/amd64. The ELF structure definitions and required symbols differ between 32-bit and 64-bit, but the lookup code is the same.
Open issues
Number of changesets
I propose implementing this with two changesets:
1. Add vDSO support for linux/386 by duplicating code from linux/amd64, so that 386 support can be reviewed/added without disturbing code for other platforms.
2. Refactor linux/amd64 and linux/386 to eliminate code duplication.
Is this OK, or should a single changeset be used?
Tests
There don't appear to be any explicit tests for linux/amd64 to verify that the fallback path can be called. I'll include a basic test for this covering linux/386 and linux/amd64, though I am unsure whether it is necessary or, alternatively, whether the test should be enhanced further.