runtime: "fatal error: unexpected signal during runtime execution" on windows-amd64-longtest builder of Go 1.15.2 commit #41296

Closed
dmitshur opened this issue Sep 9, 2020 · 12 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. Testing An issue that has been verified to require only test changes, not just a test failure.
@dmitshur
Contributor

dmitshur commented Sep 9, 2020

We've observed the following failure during runtime tests on a windows-amd64-longtest slowbot on a commit for the upcoming Go 1.15.2:

fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x0 addr=0x0 pc=0x12efa46]

runtime stack:
runtime.throw(0x151e8d7, 0x2a)
	C:/workdir/go/src/runtime/panic.go:1116 +0x79 fp=0xc0709ff8f8 sp=0xc0709ff8c8 pc=0x12bc659
runtime.sigpanic()
	C:/workdir/go/src/runtime/signal_windows.go:240 +0x285 fp=0xc0709ff928 sp=0xc0709ff8f8 pc=0x12d1745
runtime.(*pageAlloc).chunkOf(...)
	C:/workdir/go/src/runtime/mpagealloc.go:331
runtime.CheckScavengedBitsCleared.func1()
	C:/workdir/go/src/runtime/export_test.go:905 +0x186 fp=0xc0709ff960 sp=0xc0709ff928 pc=0x12efa46
runtime.systemstack(0x0)
	C:/workdir/go/src/runtime/asm_amd64.s:370 +0x6b fp=0xc0709ff968 sp=0xc0709ff960 pc=0x12f4e0b
runtime.mstart()
	C:/workdir/go/src/runtime/proc.go:1116 fp=0xc0709ff970 sp=0xc0709ff968 pc=0x12c18e0

goroutine 20213428 [running]:
runtime.systemstack_switch()
	C:/workdir/go/src/runtime/asm_amd64.s:330 fp=0xc0001a92c0 sp=0xc0001a92b8 pc=0x12f4d80
[...]
FAIL	runtime	143.911s

See here for more context and the complete build log.

The windows-amd64-longtest slowbot passed all tests on the second try, so this failure is intermittent. It doesn't seem to happen often, but it may be connected to (or the same as) issues #41285 and #41099.

I've tried running go test -timeout=0 -count=5 runtime on a machine with Windows 10, and it passed.

We should see if there's any useful information in this particular instance, and if not, this issue can probably be closed as a duplicate.

/cc @randall77 @ianlancetaylor @toothrot @andybons

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 9, 2020
@dmitshur dmitshur added this to the Backlog milestone Sep 9, 2020
@dmitshur
Contributor Author

dmitshur commented Sep 9, 2020

This must've happened during execution of the TestScavengedBitsCleared test, since that's the only place CheckScavengedBitsCleared is called.

/cc @mknyszek FYI, in case this is related to CL 201764.

@mknyszek
Contributor

mknyszek commented Sep 9, 2020

Oh, yeah, CheckScavengedBitsCleared is just wrong. There's also another place where this could happen. We should just ignore a nil result from chunkOf because the chunks index is sparse.
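
To illustrate the idea of ignoring a nil result when the chunks index is sparse, here is a minimal sketch. It is a simplified model, not the runtime's actual code: `sparseIndex`, `grow`, `countScavenged`, the `chunk` layout, and the tiny table sizes are all hypothetical names and values chosen for the example. `tryChunkOf` borrows the name used by the fix discussed in this thread, but its body here is only an assumption about the general shape of such a helper.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	l1Entries = 8 // first-level slots (tiny, for demonstration only)
	l2Entries = 4 // chunks per second-level block (hypothetical size)
)

// chunk stands in for per-chunk bitmap data; one bit per scavenged page.
type chunk struct {
	scavenged uint64
}

// sparseIndex is a two-level table whose second-level blocks are allocated
// lazily, so regions that were never mapped keep a nil first-level entry.
type sparseIndex struct {
	l1 [l1Entries]*[l2Entries]chunk
}

// grow makes sure the second-level block holding chunk index ci exists.
func (s *sparseIndex) grow(ci int) {
	if s.l1[ci/l2Entries] == nil {
		s.l1[ci/l2Entries] = new([l2Entries]chunk)
	}
}

// tryChunkOf returns the chunk at index ci, or nil if its second-level
// block was never allocated. It never dereferences a nil block.
func (s *sparseIndex) tryChunkOf(ci int) *chunk {
	l2 := s.l1[ci/l2Entries]
	if l2 == nil {
		return nil
	}
	return &l2[ci%l2Entries]
}

// countScavenged walks a range of chunk indices and skips the holes
// instead of assuming every index in the range is backed by memory.
func countScavenged(s *sparseIndex, start, end int) int {
	n := 0
	for ci := start; ci < end; ci++ {
		c := s.tryChunkOf(ci)
		if c == nil {
			continue // hole in the index: nothing to inspect
		}
		n += bits.OnesCount64(c.scavenged)
	}
	return n
}

func main() {
	idx := &sparseIndex{}
	idx.grow(1)  // allocates the block for indices 0..3
	idx.grow(30) // allocates the block for indices 28..31; 4..27 stay nil
	idx.tryChunkOf(1).scavenged = 0b101
	idx.tryChunkOf(30).scavenged = 0b1
	fmt.Println(countScavenged(idx, 0, l1Entries*l2Entries)) // prints 3
}
```

A walk over the same range that indexed the second-level block directly, without the nil check, would panic on the first hole, which mirrors the intermittent crash reported above.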

@dmitshur dmitshur added the Testing An issue that has been verified to require only test changes, not just a test failure. label Sep 9, 2020
@gopherbot

Change https://golang.org/cl/253777 mentions this issue: runtime: fix ReadMemStatsSlow's and CheckScavengedBits' chunk iteration

@mknyszek mknyszek removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 9, 2020
@mknyszek mknyszek self-assigned this Sep 9, 2020
@dmitshur dmitshur added the NeedsFix The path to resolution is known, but the work has not been done. label Sep 9, 2020
@dmitshur dmitshur modified the milestones: Backlog, Go1.16 Sep 9, 2020
@randall77
Contributor

@mknyszek Does this need backporting?

@mknyszek
Contributor

mknyszek commented Sep 9, 2020

It's only an issue in our own tests, and it's going to be really rare because it requires the address space to be discontiguous (we do have tests that intentionally make the address space discontiguous, but those don't run these particular functions).

But, maybe that's a good reason to backport it too, to at least rule out this failure on release branches or folks running all.bash. It's a fairly safe change.

@andybons
Member

andybons commented Sep 9, 2020

Let’s backport :)

@mknyszek
Contributor

> Let’s backport :)

Sounds good!

@gopherbot Please open a backport issue for 1.15.

@gopherbot

Backport issue(s) opened: #41317 (for 1.15).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.

@gopherbot

Change https://golang.org/cl/253917 mentions this issue: [release-branch.go1.15] runtime: fix ReadMemStatsSlow's and CheckScavengedBits' chunk iteration

@mknyszek
Contributor

@gopherbot Please open a backport issue for 1.14.

@dmitshur
Contributor Author

@mknyszek GopherBot is lazy (#25574), so you'll need to create it manually.

@gopherbot

Change https://golang.org/cl/253922 mentions this issue: [release-branch.go1.14] runtime: fix ReadMemStatsSlow's and CheckScavengedBits' chunk iteration

gopherbot pushed a commit that referenced this issue Sep 10, 2020
…engedBits' chunk iteration

Both ReadMemStatsSlow and CheckScavengedBits iterate over the page
allocator's chunks but don't actually check if they exist. During the
development process the chunks index became sparse, so now this was a
possibility. If the runtime tests' heap is sparse we might end up
segfaulting in either one of these functions, though this will generally
be very rare.

The pattern here to return nil for a nonexistent chunk is also useful
elsewhere, so this change introduces tryChunkOf which won't throw, but
might return nil. It also updates the documentation of chunkOf.

For #41296.
Fixes #41317.

Change-Id: Id5ae0ca3234480de1724fdf2e3677eeedcf76fa0
Reviewed-on: https://go-review.googlesource.com/c/go/+/253777
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
(cherry picked from commit 34835df)
Reviewed-on: https://go-review.googlesource.com/c/go/+/253917
gopherbot pushed a commit that referenced this issue Sep 10, 2020
…engedBits' chunk iteration

Both ReadMemStatsSlow and CheckScavengedBits iterate over the page
allocator's chunks but don't actually check if they exist. During the
development process the chunks index became sparse, so now this was a
possibility. If the runtime tests' heap is sparse we might end up
segfaulting in either one of these functions, though this will generally
be very rare.

The pattern here to return nil for a nonexistent chunk is also useful
elsewhere, so this change introduces tryChunkOf which won't throw, but
might return nil. It also updates the documentation of chunkOf.

For #41296.
Fixes #41322.

Change-Id: Id5ae0ca3234480de1724fdf2e3677eeedcf76fa0
Reviewed-on: https://go-review.googlesource.com/c/go/+/253777
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
(cherry picked from commit 34835df)
Reviewed-on: https://go-review.googlesource.com/c/go/+/253922
Run-TryBot: Dmitri Shuralyov <dmitshur@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
claucece pushed a commit to claucece/go that referenced this issue Oct 22, 2020
@golang golang locked and limited conversation to collaborators Sep 10, 2021