runtime: TestVDSO failures on linux-arm64-packet #35473

Closed
bcmills opened this issue Nov 8, 2019 · 4 comments
Labels
FrozenDueToAge, NeedsInvestigation, release-blocker

@bcmills
Contributor

bcmills commented Nov 8, 2019

--- FAIL: TestVDSO (71.53s)
    crash_test.go:95: testprog SignalInVDSO exit status: exit status 2
    crash_test.go:149: output:
        SIGQUIT: quit
        PC=0x6b4c8 m=1 sigcode=0
        
        goroutine 0 [idle]:
        runtime.usleep()
        	/workdir/go/src/runtime/sys_linux_arm64.s:148 +0x48
        runtime.sysmon()
        	/workdir/go/src/runtime/proc.go:4461 +0x9c
        runtime.mstart1()
        	/workdir/go/src/runtime/proc.go:1125 +0xb0
        runtime.mstart()
        	/workdir/go/src/runtime/proc.go:1090 +0x60
        
        goroutine 1 [running]:
        	goroutine running on other thread; stack unavailable
        
        goroutine 18 [sleep]:
        time.Sleep(0x5f5e100)
        	/workdir/go/src/runtime/time.go:247 +0xc0
        runtime/pprof.profileWriter(0x12c880, 0x40001b0018)
        	/workdir/go/src/runtime/pprof/pprof.go:765 +0x60
        created by runtime/pprof.StartCPUProfile
        	/workdir/go/src/runtime/pprof/pprof.go:750 +0x128

2019-11-08T21:27:51-b7d097a/linux-arm64-packet
2019-11-08T18:11:01-1fd3f8b/linux-arm64-packet
2019-11-08T16:20:17-4208dbe/linux-arm64-packet
2019-11-07T20:34:27-4751db9/linux-arm64-packet

See also #33574.

CC @ianlancetaylor @mengzhuo @nyuichi

@bcmills added the NeedsInvestigation label Nov 8, 2019
@bcmills added this to the Go1.14 milestone Nov 8, 2019
@ianlancetaylor
Contributor

As far as I can see this means that this loop in runtime/testdata/testprog/vdso.go

	t0 := time.Now()
	t1 := t0
	for t1.Sub(t0) < time.Second {
		t1 = time.Now()
	}

ran for more than one minute. Hmmm.

@ianlancetaylor
Contributor

Given that this just started to appear, I'm naturally suspicious of CL 203461 == 1b0b980.

CC @cherrymui

@cherrymui
Member

I think I have a guess:

  • A goroutine is running in VDSO. It saved its g on the signal stack before entering VDSO.
  • A profiling signal arrives. While handling it, the handler calls nanotime, which saves g on the same signal stack before entering VDSO and clears it afterward.
  • While the goroutine is still in VDSO, a preemption signal arrives. Now sigFetchG fetches a nil G from the signal stack (it was cleared in the previous step), so the handler calls badsignal, which deadlocks in lockextra (as before CL http://golang.org/cl/202759).

I think we don't want to save G if we're already on the signal stack. With that change the test seems to pass: 1000 iterations ran without failure.
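The sequence above can be sketched as a toy model in plain Go. This is not runtime code: the types thread and sigStack, the inHandler flag, and the skipSaveInHandler switch are all made up for illustration, and the real save and clear happen in assembly around the VDSO call. The model only shows why a nested save and clear on the same slot leaves nothing for the later signal to find, and why skipping the save inside a handler avoids that.

```go
package main

import "fmt"

// g stands in for a goroutine descriptor; only its identity matters.
type g struct{ id int }

// sigStack models the single slot on the signal stack where g is saved
// around a VDSO call (hypothetical type, for illustration only).
type sigStack struct{ savedG *g }

// thread models one OS thread with its signal stack.
type thread struct {
	stack             sigStack
	inHandler         bool // true while a signal handler is running
	skipSaveInHandler bool // models the proposed fix
}

// vdsoCall models a VDSO call: save g in the slot, run body, clear the slot.
// With the fix enabled, the slot is left alone inside a signal handler.
func (t *thread) vdsoCall(cur *g, body func()) {
	save := !(t.skipSaveInHandler && t.inHandler)
	if save {
		t.stack.savedG = cur
	}
	body()
	if save {
		t.stack.savedG = nil
	}
}

// profilingSignal models a handler that calls nanotime, itself a VDSO call.
func (t *thread) profilingSignal(sig *g) {
	t.inHandler = true
	t.vdsoCall(sig, func() {})
	t.inHandler = false
}

// preemptSignal models sigFetchG reading the slot on the signal stack.
func (t *thread) preemptSignal() *g { return t.stack.savedG }

// simulate replays the three steps and reports whether the preemption
// handler recovered the goroutine that was interrupted inside VDSO.
func simulate(fix bool) bool {
	t := &thread{skipSaveInHandler: fix}
	user := &g{id: 1}
	sig := &g{id: 0}
	var fetched *g
	t.vdsoCall(user, func() {
		t.profilingSignal(sig)      // step 2: nested save and clear
		fetched = t.preemptSignal() // step 3: still inside VDSO
	})
	return fetched == user
}

func main() {
	fmt.Println("without fix, g recovered:", simulate(false))
	fmt.Println("with fix, g recovered:", simulate(true))
}
```

In this model the first line reports false (the slot was cleared by the nested call, which corresponds to the nil G and the badsignal path) and the second reports true.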

@gopherbot

Change https://golang.org/cl/206397 mentions this issue: runtime: don't save G during VDSO if we're handling signal
