Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: nanosecond precision lost since mac os x high sierra update #22037

Closed
mgom opened this issue Sep 26, 2017 · 24 comments
Closed

runtime: nanosecond precision lost since mac os x high sierra update #22037

mgom opened this issue Sep 26, 2017 · 24 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. OS-Darwin
Milestone

Comments

@mgom
Copy link

mgom commented Sep 26, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.9 darwin/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

Mac OS X High Sierra

What did you do?

fmt.Println(time.Now().UnixNano())

What did you expect to see?

i would expect to display amount of nanoseconds

What did you see instead?

nanosecond part is not filled
some results
1506416154931120000
1506416154931136000
1506416154931138000
1506416154931139000
1506416154931140000

@ianlancetaylor ianlancetaylor changed the title nanosecond precision lost since mac os x high sierra update runtime: nanosecond precision lost since mac os x high sierra update Sep 26, 2017
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Darwin labels Sep 26, 2017
@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Sep 26, 2017
@ianlancetaylor
Copy link
Contributor

This implies that High Sierra has changed the implementation of the commpage described in http://www.opensource.apple.com/source/xnu/xnu-1699.26.8/osfmk/i386/cpu_capabilities.h in such a way that the Go code (line 109 of https://tip.golang.org/src/runtime/sys_darwin_amd64.s) falls back to calling gettimeofday.

Can you find the cpu_capabilities.h file that applies to High Sierra? Or post a disassembly of the clock_gettime function from the standard library? Thanks.

It would also be useful to know whether the runtime is fetching useful monotonic time readings. Can you fmt.Println(time.Now()) a few times so that we can see the m= portion? Thanks.

I very much hope this is not another #16570.

@mfrw
Copy link
Contributor

mfrw commented Sep 26, 2017

@ianlancetaylor Here is the ouptut for fmt.Println(time.Now())

2017-09-26 20:48:07.875345 +0530 IST m=+0.000358664
2017-09-26 20:48:07.875498 +0530 IST m=+0.000512769
2017-09-26 20:48:07.875503 +0530 IST m=+0.000517350
2017-09-26 20:48:07.875506 +0530 IST m=+0.000519942
2017-09-26 20:48:07.875508 +0530 IST m=+0.000522385
2017-09-26 20:48:07.87551 +0530 IST m=+0.000524843
2017-09-26 20:48:07.875513 +0530 IST m=+0.000527322
2017-09-26 20:48:07.875515 +0530 IST m=+0.000529614
2017-09-26 20:48:07.875518 +0530 IST m=+0.000531897
2017-09-26 20:48:07.87552 +0530 IST m=+0.000534161

@ianlancetaylor
Copy link
Contributor

Thanks. My vague guess is that the time information on the commpage is present but the generation field is always zero. Or something. The other info I requested might tell us.

@mfrw
Copy link
Contributor

mfrw commented Sep 26, 2017

@ianlancetaylor Here is the dump of the clock_gettime function from the lib. I also saw some clock_gettime_nsec_np and included that too.

@ianlancetaylor
Copy link
Contributor

Thanks! Unfortunately the key steps seem to be in a function called via callq 0x10871. Can you also send the disassembly of the function at that address?

@mfrw
Copy link
Contributor

mfrw commented Sep 26, 2017

@ianlancetaylor Sure.. here you go. It is btw the gettimeofday function.
Just in case you might need any other, I am attaching the full dump of the library.
otool -vt libsystem_c.dylib > libc_macos-10.13_dump.txt
libc_macos-10.13_dump.txt

@ianlancetaylor
Copy link
Contributor

Thanks. Now it looks like I need something that appears in a different library. Looking at the older xnu sources (I can't find the 10.13 sources online anywhere) it looks like I might need libsyscall. Do you have that library?

@mfrw
Copy link
Contributor

mfrw commented Sep 27, 2017

@ianlancetaylor sorry I dont seem to find that library.. Can you give me some pointers on finding it.

@ianlancetaylor
Copy link
Contributor

Sorry, based on Mac OS 10.12 the library I'm looking for is called libsys_kernel.dylib. In particular I'm looking for the functions __commpage_gettimeofday and _mach_absolute_time. Though of course 10.13 may be different.

@rasky
Copy link
Member

rasky commented Sep 27, 2017

Attached
libsystem_kernel.txt

@ianlancetaylor
Copy link
Contributor

Thanks for all the info. The _mach_absolute_time function is unchanged since 10.12. However, ___commpage_gettimeofday seems to have changed. It is now loading values from address 0x7fffffe000d0 rather than from 0x7fffffe00050. It looks like it may now be computing picoseconds somehow--at least clock_gettime_nsec_np seems to multiply the second result by 1000. I'm not at all sure, though.

Here is an annotated disassembly of commpage_gettimeofday_internal showing the use of 0x7fffffe000d0. The annotations may be completely wrong. I'm not completely sure what this code is doing.

___commpage_gettimeofday_internal:
	pushq	%rbp
	movq	%rsp, %rbp
	pushq	%r15
	pushq	%r14
	pushq	%r13
	pushq	%r12
	pushq	%rbx
	subq	$0x18, %rsp
	movq	%rsi, -0x38(%rbp)	# Save tz argument.
	movq	%rdi, -0x30(%rbp)	# Save second argument.
130a0:
	movabsq	$0x7fffffe000d0, %rbx	# Magic address.
	movq	(%rbx), %r14		# Fetch gtod_ns_base
	movq	0x8(%rbx), %rax		# Fetch gtod offset 8
	movq	%rax, -0x40(%rbp)
	movq	0x10(%rbx), %r13	# Fetch gtod offset 0x10
	movq	0x18(%rbx), %r15	# Fetch gtod_scale
	movq	0x20(%rbx), %r12	# Fetch gtod offset 0x20
	callq	0x1230b			# Call _mach_absolute_time.
	movq	%rax, %rcx
	cmpq	(%rbx), %r14		# Check gtod_ns_base.
	jne	0x130a0			# Loop if changed.
	movq	-0x40(%rbp), %rdi	# Load gtod offset 8.
	movl	$0x1, %eax		# Set return value to 1.
	testq	%r14, %r14		# Check for gtod_ns_base == 0.
	je	0x13122			# Return if gtod_ns_base == 0.
	movq	%rcx, %rsi		# Mach absolute time.
	subq	%r14, %rsi		# Subtract gtod_ns_base.
	cmpq	%r12, %rsi		# Compare to gtod offset 0x20.
	jae	0x13122			# Return if larger.
	movq	%rsi, %rax
	mulq	%r15			# Multiply by gtod_scale (set %rdx:%rax)
	addq	%rdi, %rdx		# Add gtod offset 8 to high result
	imulq	%r15, %rsi		# %rsi *= gtod_scale
	addq	%r13, %rsi		# Add gtod offset 0x10
	adcq	$0x0, %rdx		# Add carry to high result
	movq	-0x30(%rbp), %rdi	# tv argument.
	movq	%rdx, (%rdi)		# Set seconds.
	shrq	$0x20, %rsi
	imulq	$0xf4240, %rsi, %rax	# Multiply by 1000000.
	shrq	$0x20, %rax
	movl	%eax, 0x8(%rdi)		# Set picoseconds?
	xorl	%eax, %eax		# Set return value to 0
	movq	-0x38(%rbp), %rdx	# Second argument.
	testq	%rdx, %rdx
	je	0x13122
	movq	%rcx, (%rdx)		# Set second argument to Mach time.
13122:
	addq	$0x18, %rsp
	popq	%rbx
	popq	%r12
	popq	%r13
	popq	%r14
	popq	%r15
	popq	%rbp
	retq

@rasky
Copy link
Member

rasky commented Sep 27, 2017

I dumped values from my Mac right now (~01:36 CEST):

0xD0: d1cc511a4d35 (gtod_ns_base)
0xD8: 59cc35fd (gtod_high_offset?)
0xE0: 6094ecdfd163c916 (gotd_low_offset?)
0xE8: 44b8e1a7e (gotd_ns_scale)
0xF0: 3b9aca00 (gotd_max_value?)

@rasky
Copy link
Member

rasky commented Sep 27, 2017

Updated with more values (~01:42 CEST):

0xD0: d21b247d1256 (gtod_ns_base)
0xD8: 59cc374f (gtod_high_offset?)
0xE0: f0abf03231aa3f42 (gotd_low_offset?)
0xE8: 44b89e2c8 (gotd_ns_scale)
0xF0: 3b9aca00 (gotd_max_value?)
mach_absolute_time(): d21b25eb4b98
gettimeofday(): 59cc374f,000eb622

@jaredborner
Copy link

doesn't ___commpage_gettimeofday_internal take a timeval struct and set seconds/microseconds (not nanoseconds)?

i'm a bit rusty with my assembly. but, after multiplying by ns_scale (at 130ea/130f0), the upper 64bits (rdx register) is used as the "seconds" part of the time. this indicates to me that the units after the multiplication is 1/2^64 of a second. to convert the fractional part (lower 64bits) to microseconds the equation is value * 10^6 / 2^64. to avoid division/overflow it's doing a right shift of 32, the multiplication, then another shift of 32. (ie (value >> 32 * 10^6) >> 32). to get nanoseconds, it would need to be multiplied by 10^9 instead of 10^6.

not sure if i missed anything from the assembly but here's the c code i used to try and get a better grasp of it.

#include <stdio.h>
#include <inttypes.h>
#include <mach/mach_time.h>
#include <sys/time.h>

int main(int argc, char **argv) {
    uint64_t tsc = 0;
    uint64_t ns_base, hi_ofs, lo_ofs, ns_scale, max_val;
    do {
        ns_base = *(uint64_t *)0x7fffffe000d0;
        hi_ofs = *(uint64_t *)0x7fffffe000d8;
        lo_ofs = *(uint64_t *)0x7fffffe000e0;
        ns_scale = *(uint64_t *)0x7fffffe000e8;
        max_val = *(uint64_t *)0x7fffffe000f0;
        tsc = (uint64_t)mach_absolute_time();
    } while (ns_base != *(uint64_t *)0x7fffffe000d0);
    printf("0x7fffffe000d0 ns_base  = %llu\n", ns_base);
    printf("0x7fffffe000d8 hi_ofs   = %llu\n", hi_ofs);
    printf("0x7fffffe000e0 lo_ofs   = %llu\n", lo_ofs);
    printf("0x7fffffe000e8 ns_scale = %llu\n", ns_scale);
    printf("0x7fffffe000f0 max_val  = %llu\n", max_val);

    if (ns_base == 0) {
        return 1;
    }
    __uint128_t curr_time = (__uint128_t)(tsc - ns_base);
    if (curr_time > max_val) {
        return 1;
    }

    curr_time *= (__uint128_t)ns_scale;
    uint64_t seconds = (uint64_t)(curr_time >> 64) + hi_ofs;
    __uint128_t sub_seconds = (curr_time & UINT64_MAX) + lo_ofs;
    if ((sub_seconds >> 64) > 0) {
        seconds += 1;
    }
    sub_seconds &= UINT64_MAX;

    uint64_t nano_seconds = (uint64_t)((sub_seconds >> 32) * 1000000000) >> 32;
    uint64_t micro_seconds = (uint64_t)((sub_seconds >> 32) * 1000000) >> 32;
    printf("tsc=%llu, sec=%llu, us=%llu, ns=%llu\n", tsc, seconds, micro_seconds, nano_seconds);
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    printf("     clock_gettime() sec=%ld, ns=%ld\n", ts.tv_sec, ts.tv_nsec);

    struct timeval tv;
    gettimeofday(&tv, NULL);
    printf("      gettimeofday() sec=%ld, us=%d\n", tv.tv_sec, tv.tv_usec);

    return 0;
}

@bmerrill42
Copy link
Contributor

so python is able to do this just fine with time.time() by making it into a float

static double
floattime(void)
{
    /* There are three ways to get the time:
      (1) gettimeofday() -- resolution in microseconds
      (2) ftime() -- resolution in milliseconds
      (3) time() -- resolution in seconds
      In all cases the return value is a float in seconds.
      Since on some systems (e.g. SCO ODT 3.0) gettimeofday() may
      fail, so we fall back on ftime() or time().
      Note: clock resolution does not imply clock accuracy! */
#ifdef HAVE_GETTIMEOFDAY
    {
        struct timeval t;
#ifdef GETTIMEOFDAY_NO_TZ
        if (gettimeofday(&t) == 0)
            return (double)t.tv_sec + t.tv_usec*0.000001;
#else /* !GETTIMEOFDAY_NO_TZ */
        if (gettimeofday(&t, (struct timezone *)NULL) == 0)
            return (double)t.tv_sec + t.tv_usec*0.000001;
#endif /* !GETTIMEOFDAY_NO_TZ */
    }

#endif /* !HAVE_GETTIMEOFDAY */
    {
#if defined(HAVE_FTIME)
        struct timeb t;
        ftime(&t);
        return (double)t.time + (double)t.millitm * (double)0.001;
#else /* !HAVE_FTIME */
        time_t secs;
        time(&secs);
        return (double)secs;
#endif /* !HAVE_FTIME */
    }
}

but I don't think it's a fix
I made a little test main in c that and the last 3 numbers aren't 0's

#include <stdio.h>
#include <sys/time.h>
#include <inttypes.h>

int main(void)
{
  struct timeval t;
  double now;
  gettimeofday(&t, (struct timezone *)NULL);
  now = (double)t.tv_sec + t.tv_usec*0.000001;
  printf("%.8f", now);
}

there's also the clock_gettime_nsec_np function but it has the same problem as time.Now().UnixNano()

@huguesb
Copy link
Contributor

huguesb commented Sep 30, 2017

Relevant snippet from XNU sources:

https://opensource.apple.com/source/xnu/xnu-4570.1.46/osfmk/i386/cpu_capabilities.h

#define _COMM_PAGE_NEWTIMEOFDAY_DATA	(_COMM_PAGE_START_ADDRESS+0x0D0) 	/* used by gettimeofday(). Currently, sizeof(new_commpage_timeofday_data_t) = 40*/

https://opensource.apple.com/source/xnu/xnu-4570.1.46/bsd/sys/commpage.h

typedef volatile struct commpage_timeofday_data{
    uint64_t    TimeStamp_tick;
    uint64_t    TimeStamp_sec;
    uint64_t    TimeStamp_frac;
    uint64_t    Ticks_scale;
    uint64_t    Ticks_per_sec;
} new_commpage_timeofday_data_t;

https://opensource.apple.com/source/xnu/xnu-4570.1.46/libsyscall/wrappers/__commpage_gettimeofday.c

int
__commpage_gettimeofday_internal(struct timeval *tp, uint64_t *tbr_out)
{
	uint64_t now, over;
	uint64_t delta,frac;
	uint64_t TimeStamp_tick;
	uint64_t TimeStamp_sec;
	uint64_t TimeStamp_frac;
	uint64_t Tick_scale;
	uint64_t Ticks_per_sec;

	volatile uint64_t *gtod_TimeStamp_tick_p;
	volatile uint64_t *gtod_TimeStamp_sec_p;
	volatile uint64_t *gtod_TimeStamp_frac_p;
	volatile uint64_t *gtod_Ticks_scale_p;
	volatile uint64_t *gtod_Ticks_per_sec_p;

	new_commpage_timeofday_data_t *commpage_timeofday_datap;

	commpage_timeofday_datap =  (new_commpage_timeofday_data_t *)_COMM_PAGE_NEWTIMEOFDAY_DATA;

	gtod_TimeStamp_tick_p = &commpage_timeofday_datap->TimeStamp_tick;
	gtod_TimeStamp_sec_p = &commpage_timeofday_datap->TimeStamp_sec;
	gtod_TimeStamp_frac_p = &commpage_timeofday_datap->TimeStamp_frac;
	gtod_Ticks_scale_p = &commpage_timeofday_datap->Ticks_scale;
	gtod_Ticks_per_sec_p = &commpage_timeofday_datap->Ticks_per_sec;

	do {
		TimeStamp_tick = *gtod_TimeStamp_tick_p;
		TimeStamp_sec = *gtod_TimeStamp_sec_p;
		TimeStamp_frac = *gtod_TimeStamp_frac_p;
		Tick_scale = *gtod_Ticks_scale_p;
		Ticks_per_sec = *gtod_Ticks_per_sec_p;

		/*
		 * This call contains an instruction barrier which will ensure that the
		 * second read of the abs time isn't speculated above the reads of the
		 * other values above
		 */
		now = mach_absolute_time();
	} while (TimeStamp_tick != *gtod_TimeStamp_tick_p);

	if (TimeStamp_tick == 0)
		return(1);

	delta = now - TimeStamp_tick;

	/* If more than one second force a syscall */
	if (delta >= Ticks_per_sec)
		return(1);

	tp->tv_sec = TimeStamp_sec;

	over = multi_overflow(Tick_scale, delta);
	if(over){
		tp->tv_sec += over;
	}

	/* Sum scale*delta to TimeStamp_frac, if it overflows increment sec */
	frac = TimeStamp_frac;
	frac += Tick_scale * delta;
	if( TimeStamp_frac > frac )
		tp->tv_sec++;

	/*
	 * Convert frac (64 bit frac of a sec) to usec
	 * usec = frac * USEC_PER_SEC / 2^64
	 */
	tp->tv_usec = ((uint64_t)1000000 * (uint32_t)(frac >> 32)) >> 32;

	if (tbr_out) {
		*tbr_out = now;
	}

	return(0);
}

@rasky
Copy link
Member

rasky commented Sep 30, 2017

I'm working on a fix for this

@gopherbot
Copy link

Change https://golang.org/cl/67331 mentions this issue: runtime: rename offsets macros to prepare for multiple versions

@gopherbot
Copy link

Change https://golang.org/cl/67332 mentions this issue: runtime: fix wall time computation in macOS High Sierra

@rasky
Copy link
Member

rasky commented Sep 30, 2017

so python is able to do this just fine with time.time() by making it into a float

Python is invoking syscalls that return values with microsecond accuracy. The fact that they get then changed into float is immaterial, there is still no nanosecond accuracy. Basically Python does what Go happens to do now on High Sierra: fallback to the slower path (syscalls) with lower accuracy.

Go accesses the so-called commpage, which is a special area that the kernel maps into every process at the same virtual address, and contains useful information that can be accessed directly without the overhead of a syscall. In High Sierra, the commpage format has been bumped (from 12 to 13) changing the way the wall/real time is returned; the monotonic time is not changed. So what happens right now is that time.Now() on current Go versions is able to read the monotonic timer with nanosecond precision, but then fails to convert it to real time; it does realize there is something wrong, and falls back to calling a syscall to get the real time (but with microsecond precision).

My CL adds support for detecting commpage v13, and implements the new required formulas to convert the monotonic time read into real time, with full nanosecond accuracy. The assembly code is basically the equivalent of the above snippet of code coming from XNU sources, but it's made easier by the fact that we can do some 128-bit calculations in Assembly, while it would not possible in Go, nor it is possible in the dialect of C used for XNU (some GCC versions do support 128-bit calculations on some platforms).

gopherbot pushed a commit that referenced this issue Oct 3, 2017
High Sierra has a new commpage layout (this is issue #3188), so
we need to adjust the code to handle multiple versions of the
layout.

In preparation for this change, we rename the existing offset
macros with a prefix that identifies the commpage version they
refer to.

Updates #22037

Change-Id: Idca4b7a855a2ff6dbc434cd12453fc3194707aa8
Reviewed-on: https://go-review.googlesource.com/67331
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@ianlancetaylor ianlancetaylor modified the milestones: Go1.10, Go1.9.1 Oct 3, 2017
@ianlancetaylor
Copy link
Contributor

Reopen for 1.9 backport.

@ianlancetaylor ianlancetaylor reopened this Oct 3, 2017
@rsc rsc modified the milestones: Go1.9.1, Go1.9.2 Oct 4, 2017
@ianlancetaylor ianlancetaylor added release-blocker and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Oct 13, 2017
@rsc
Copy link
Contributor

rsc commented Oct 13, 2017

This change is a lot of new assembly and scares me. The old code was correctly falling back to calling into the kernel to get the time, which seems OK enough. I'm not even convinced the nanoseconds are accurate on macOS anyway - last time I looked, the kernel was setting the "call into the kernel anyway" bit in the commpage about once a second to slow down time because the commpage version of time was running too fast, consistently. So the nanoseconds were probably racing ahead quite a bit.

Can we leave this out of Go 1.9.2?

@rsc rsc added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed release-blocker labels Oct 13, 2017
@rasky
Copy link
Member

rasky commented Oct 13, 2017

his change is a lot of new assembly and scares me. The old code was correctly falling back to calling into the kernel to get the time, which seems OK enough.

I think it's ok not to backport it, and reevaluate the decision if somebody comes screaming with real code being broken. Given that time measures use monotonic support, I don't think that real-time nanoseconds are that useful.

I'm not even convinced the nanoseconds are accurate on macOS anyway - last time I looked, the kernel was setting the "call into the kernel anyway" bit in the commpage about once a second to slow down time because the commpage version of time was running too fast, consistently.

I don't see that happening now but I'm not sure I'm testing it correctly.

@rsc
Copy link
Contributor

rsc commented Oct 15, 2017

@rasky, thanks for pointing out that monotonic time is unaffected. Precise timing of small intervals was the only thing I was worried about, so with that unaffected I agree this can wait.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. OS-Darwin
Projects
None yet
Development

No branches or pull requests

9 participants