Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: usleep does not check for early wakeup #40653

Open
ruyi789 opened this issue Aug 8, 2020 · 10 comments
Open

runtime: usleep does not check for early wakeup #40653

ruyi789 opened this issue Aug 8, 2020 · 10 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@ruyi789
Copy link

ruyi789 commented Aug 8, 2020

After calling nanosleep, it did not check whether the return value is 0, resulting in inaccurate waiting time

TEXT runtime·usleep(SB),NOSPLIT,$16
	MOVL	$0, DX
	MOVL	usec+0(FP), AX
	MOVL	$1000000, CX
	DIVL	CX
	MOVQ	AX, 0(SP)
	MOVL	$1000, AX	// usec to nsec
	MULL	DX
	MOVQ	AX, 8(SP)

	// nanosleep(&ts, 0)
	MOVQ	SP, DI
	MOVL	$0, SI
	MOVL	$SYS_nanosleep, AX 
	SYSCALL
	RET
@ALTree ALTree changed the title runtime·usleep bug runtime: usleep bug Aug 8, 2020
@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 8, 2020
@acln0
Copy link
Contributor

acln0 commented Aug 9, 2020

The runtime uses SA_RESTART when installing signal handlers, which I am pretty sure prevents nanosleep from ever returning EINTR. Also, note the comment just before making the system call: the invocation is equivalent to nanosleep(&ts, NULL), therefore, I don't think we could ever get the remaining time written back into the rem parameter anyway.

Have you observed shorter sleeps due to signal interruptions? I am reading the man page and I cannot see other circumstances for an error: can you think of any other reason why nanosleep might return an error, considering the fact that signal handlers are installed with SA_RESTART? In the man page, I see EFAULT for a bad req pointer, EINTR for a signal interruption, and EINVAL for invalid timespec values, which I imagine the surrounding runtime code guards against.

@networkimprov
Copy link

@ianlancetaylor
Copy link
Contributor

Does this ever cause a problem?

The usleep function is internal to the runtime. The exact length of time that it sleeps should never matter.

@ianlancetaylor ianlancetaylor added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 10, 2020
@rsc
Copy link
Contributor

rsc commented Aug 11, 2020

The only places where usleep needs to sleep for the expected length are:

proc.go:4600:		usleep(delay)
runtime.go:33:		usleep(100 * 1000)
signal_unix.go:675:			usleep(5 * 1000 * 1000)

The first is in sysmon, where shortened sleep would result in background work happening faster than expected.

The second is in tickspersecond, which is used by profiles to turn CPU time into CPU cycles.

The third is when a process crashes: we sleep 5 seconds while the SIGQUIT signals are going around causing stack traces. Assuming this actually works, it seems like strong proof that sleeping through signals is fine.

As Cherry said, have you ever observed a short sleep? Does this actually happen?

@rsc rsc changed the title runtime: usleep bug runtime: usleep does not check for early wakeup Aug 12, 2020
@rsc
Copy link
Contributor

rsc commented Aug 12, 2020

I retitled this; the question of whether an early wakeup is even possible remains.

@ruyi789
Copy link
Author

ruyi789 commented Aug 12, 2020

// runtime.Test() ,test in runtime package, on linux system
// print info:

//linux: in usleep
//timeout: 11679078
//linux: in usleep
//timeout: 5000616671

//usleep just only run on the system stack
func Test() {

	systemstack(func() {
		print("linux: in usleep\n")
		t1 := nanotime()
		usleep(1000 * 1000 * 5)
		t2 := nanotime()
		print("timeout: ", t2-t1, "\n")
	})

	entersyscall()
	systemstack(func() {
		print("linux: in usleep\n")
		t1 := nanotime()
		usleep(1000 * 1000 * 5)
		t2 := nanotime()
		print("timeout: ", t2-t1, "\n")
	})
	exitsyscall()
}

@ianlancetaylor ianlancetaylor removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 12, 2020
@ianlancetaylor
Copy link
Contributor

@acln0 The runtime.usleep function calls the nanosleep system call. When a signal is caught, the nanosleep call is always interrupted, even if the signal handler has SA_RESTART set. An early wakeup is definitely possible.

@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone Aug 12, 2020
@ianlancetaylor
Copy link
Contributor

The 5 second sleep in sighandler is called with all signals blocked, so that sleep should normally complete.

We could block signals around the call to usleep in tickspersecond; this call happens at most once per execution.

We could also block signals around the call to usleep in sysmon. Or we could keep calling sleep until the full delay occurred. Or we could decide that if a signal was received, it's worth checking whether something happened.

@aclements aclements modified the milestones: Go1.16, Unplanned Dec 8, 2020
@aclements
Copy link
Member

Moving to Unplanned since it's still not clear this is actually a problem.

@felixge
Copy link
Contributor

felixge commented Jan 24, 2021

I would also like to confirm that early wakeup is possible. Here is a simple test case:

func main() {
	start := time.Now()
	runtime.SetBlockProfileRate(42)
	fmt.Println(time.Since(start))
}

Without wakeup, one would expect this to take at least 100ms due to the usleep(100 * 1000) in tickspersecond(). And up to go 1.13 that is indeed the case:

$ go version
go version go1.13.15 darwin/amd64
$ go run main.go
101.339548ms

However, for later go versions I see this:

$ go version
go version go1.15.6 darwin/amd64
$ go run main.go 
13.873457ms

I suspect that this is due to async preemption signals introduced in Go 1.14.

As far as impact is concerned, this could reduce the accuracy of block profiling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

8 participants