runtime: TestGdbBacktrace failures due to GDB "internal-error: wait returned unexpected status 0x0" #43068

Closed
bcmills opened this issue Dec 8, 2020 · 12 comments
Labels
FrozenDueToAge
NeedsInvestigation: Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
release-blocker
Testing: An issue that has been verified to require only test changes, not just a test failure.

@bcmills
Contributor

bcmills commented Dec 8, 2020

2020-12-07T21:01:46-7ad6596/linux-mips64le-mengzhuo
2020-10-23T15:11:15-646531c/linux-mips64le-mengzhuo

--- FAIL: TestGdbBacktrace (1.51s)
    runtime-gdb_test.go:77: gdb version 8.1
    runtime-gdb_test.go:437: gdb output:
        Loading Go Runtime support.
        Breakpoint 1 at 0x7ece0: file /tmp/farm/tmp/go-build405377235/main.go, line 17.
        [New LWP 18663]
        [New LWP 18664]
        [New LWP 18665]
        [New LWP 18666]
        
        Thread 1 "a.exe" hit Breakpoint 1, 0x000000000007ece0 in main.eee (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:17
        17	func eee() bool { return true }
        #0  0x000000000007ece0 in main.eee (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:17
        #1  0x000000000007ecbc in main.ddd (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:14
        #2  0x000000000007ec5c in main.ccc (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:11
        #3  0x000000000007ec0c in main.bbb (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:8
        #4  0x000000000007ebbc in main.aaa (~r0=<optimized out>) at /tmp/farm/tmp/go-build405377235/main.go:5
        #5  0x000000000007ed1c in main.main () at /tmp/farm/tmp/go-build405377235/main.go:22
        ../../gdb/linux-nat.c:2081: internal-error: wait returned unexpected status 0x0
        A problem internal to GDB has been detected,
        further debugging may prove unreliable.
        Quit this debugging session? (y or n) [answered Y; input not from terminal]
        
        This is a bug, please report it.  For instructions, see:
        <http://www.gnu.org/software/gdb/bugs/>.
        
        ../../gdb/linux-nat.c:2081: internal-error: wait returned unexpected status 0x0
        A problem internal to GDB has been detected,
        further debugging may prove unreliable.
        Create a core file of GDB? (y or n) [answered Y; input not from terminal]
    runtime-gdb_test.go:439: gdb exited with error: signal: aborted (core dumped)
FAIL
FAIL	runtime	44.687s

See previously #37405 (TestGdbBacktrace hanging on the same builder), #39228 (TestGdbBacktrace failures on Linux), #39204 (meta-bug about runtime GDB test flakiness).

CC @mengzhuo @ianlancetaylor

@bcmills bcmills added the Builders and NeedsInvestigation labels Dec 8, 2020
@bcmills bcmills added this to the Backlog milestone Dec 8, 2020
@mengzhuo
Contributor

mengzhuo commented Dec 9, 2020

https://sourceware.org/bugzilla/show_bug.cgi?id=20301

Can finally reproduce after a few tries.

What happens is that another thread calls exit() while I "next" this thread.
Most of the time gdb handles it nicely, but it also happens that I get the internal-error immediately:

@bcmills
Contributor Author

bcmills commented May 11, 2021

Looks like the same underlying GDB bug on linux-riscv64-jsing:

2021-05-10T19:19:34-dc50683/linux-riscv64-jsing

--- FAIL: TestGdbBacktrace (6.72s)
    runtime-gdb_test.go:76: gdb version 9.2
    runtime-gdb_test.go:428: gdb output:
        Loading Go Runtime support.
        Breakpoint 1 at 0x719a8: file /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go, line 17.
        [New LWP 2737319]
        [New LWP 2737320]
        [New LWP 2737321]
        
        Thread 1 "a.exe" hit Breakpoint 1, main.eee (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:17
        17	func eee() bool { return true }
        #0  main.eee (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:17
        #1  0x0000000000071994 in main.ddd (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:14
        #2  0x0000000000071954 in main.ccc (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:11
        #3  0x000000000007191c in main.bbb (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:8
        #4  0x00000000000718e4 in main.aaa (~r0=<optimized out>) at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:5
        #5  0x00000000000719dc in main.main () at /home/gopher/build/tmp/TestGdbBacktrace2306548485/001/main.go:21
        /build/gdb-D4eJJR/gdb-9.2/gdb/linux-nat.c:1963: internal-error: wait returned unexpected status 0x0
        A problem internal to GDB has been detected,
        further debugging may prove unreliable.
        Quit this debugging session? (y or n) [answered Y; input not from terminal]
        
        This is a bug, please report it.  For instructions, see:
        <http://www.gnu.org/software/gdb/bugs/>.
        
    runtime-gdb_test.go:430: gdb exited with error: signal: aborted
FAIL
FAIL	runtime	294.989s

CC @4a6f656c

@bcmills bcmills changed the title from 'runtime: TestGdbBacktrace failures due to GDB internal error on linux-mips64le-mengzhuo builder' to 'runtime: TestGdbBacktrace failures due to GDB "internal-error: wait returned unexpected status 0x0"' May 11, 2021
@bcmills
Contributor Author

bcmills commented Jan 26, 2022

greplogs --dashboard -md -l -e 'FAIL: TestGdbBacktrace .*(?:\n .*)*: wait returned unexpected status' --since=2021-05-12

2022-01-12T00:01:48-8070e70/linux-riscv64-jsing
2021-10-26T22:05:53-80be4a4/linux-riscv64-unmatched
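
A minimal sketch of what the `-e` pattern above selects, assuming greplogs interprets it with Go's `regexp` syntax (an assumption, not confirmed in this thread). The helper name `matchesKnownFailure` and the sample log are illustrative only; the sample is abbreviated from the failure output quoted earlier in this issue.

```go
package main

import (
	"fmt"
	"regexp"
)

// failureRE is the pattern passed to greplogs -e above. It matches the
// "FAIL: TestGdbBacktrace" line, then any number of following lines that
// begin with a space (the indented test output), and finally the GDB
// internal-error text on one of those lines.
var failureRE = regexp.MustCompile(
	`FAIL: TestGdbBacktrace .*(?:\n .*)*: wait returned unexpected status`)

// matchesKnownFailure (hypothetical helper) reports whether a builder log
// contains a TestGdbBacktrace failure caused by the GDB internal error.
func matchesKnownFailure(log string) bool {
	return failureRE.MatchString(log)
}

func main() {
	log := "--- FAIL: TestGdbBacktrace (1.51s)\n" +
		"    runtime-gdb_test.go:77: gdb version 8.1\n" +
		"        ../../gdb/linux-nat.c:2081: internal-error: wait returned unexpected status 0x0\n" +
		"FAIL\n"
	fmt.Println(matchesKnownFailure(log))
}
```

Because every continuation line must start with a space, the pattern stops at the first unindented line (such as the final `FAIL`), so an internal error printed under a different test does not count as a TestGdbBacktrace failure.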

@bcmills
Contributor Author

bcmills commented Feb 8, 2022

greplogs --dashboard -md -l -e 'FAIL: TestGdbBacktrace .*(?:\n .*)*: wait returned unexpected status' --since=2022-01-27

2022-02-07T21:57:29-911c78f/linux-386-longtest

@bcmills
Contributor Author

bcmills commented Feb 8, 2022

Note that the most recent failure is on linux/386, which is a first-class port.

Given how little time is left in the Go 1.18 release cycle, marking as release-blocker for Go 1.19.

@bcmills bcmills modified the milestones: Backlog, Go1.19 Feb 8, 2022
@bcmills bcmills added the release-blocker label and removed the Builders label Feb 8, 2022
@bcmills
Contributor Author

bcmills commented Feb 8, 2022

Given that the failure is within GDB itself, we could resolve this issue by doing one or more of the following:

  • Getting an upstream fix in GDB and updating the builders to pull it in.
  • Identifying affected GDB versions and skipping the test on those versions.
  • Modifying the test to skip itself if it detects this failure mode.
  • Determining that the scenario under test cannot work reliably and removing the test for it.
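
As a rough illustration of the second option (skipping on specific GDB versions), here is a sketch that parses the first line of `gdb --version` output and checks it against a set of versions. The names `parseGDBVersion`, `shouldSkip`, and `observedBad` are hypothetical. The set contains only the two versions that appear in the failing logs quoted in this issue (8.1 and 9.2); it is not a confirmed list of affected releases, and the GDB version on the linux-386-longtest builder is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// gdbVersionRE extracts major.minor from a line such as "GNU gdb (GDB) 9.2".
var gdbVersionRE = regexp.MustCompile(`GNU gdb .*?([0-9]+)\.([0-9]+)`)

type gdbVersion struct{ major, minor int }

// observedBad holds the versions seen in failing logs in this issue.
// Assumption: this is illustrative, not a verified list of affected versions.
var observedBad = map[gdbVersion]bool{
	gdbVersion{8, 1}: true,
	gdbVersion{9, 2}: true,
}

// parseGDBVersion returns the version found in out, or false if none is found.
func parseGDBVersion(out string) (gdbVersion, bool) {
	m := gdbVersionRE.FindStringSubmatch(out)
	if m == nil {
		return gdbVersion{}, false
	}
	major, err1 := strconv.Atoi(m[1])
	minor, err2 := strconv.Atoi(m[2])
	if err1 != nil || err2 != nil {
		return gdbVersion{}, false
	}
	return gdbVersion{major, minor}, true
}

// shouldSkip reports whether the version in out is in observedBad.
func shouldSkip(out string) bool {
	v, ok := parseGDBVersion(out)
	return ok && observedBad[v]
}

func main() {
	fmt.Println(shouldSkip("GNU gdb (GDB) 9.2"))
	fmt.Println(shouldSkip("GNU gdb (GDB) 12.1"))
}
```

A version list like this is only as good as the data behind it, which is one reason detecting the failure text in the output may be the more practical option.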

@bcmills
Contributor Author

bcmills commented Feb 8, 2022

There are at least two upstream bugs reported against GDB for this symptom:
https://sourceware.org/bugzilla/show_bug.cgi?id=24628
https://sourceware.org/bugzilla/show_bug.cgi?id=28551

@gopherbot

Change https://go.dev/cl/384234 mentions this issue: runtime: skip TestGdbBacktrace flakes matching a known GDB internal error

@bcmills
Contributor Author

bcmills commented Feb 8, 2022

> Modifying the test to skip itself if it detects this failure mode.

Actually, we can do that one now, and we should: otherwise, this test may flake for Go users when they run go test all in their own module.

@bcmills bcmills modified the milestones: Go1.19, Go1.18 Feb 8, 2022
@bcmills bcmills self-assigned this Feb 8, 2022
@bcmills bcmills added the Testing label Feb 8, 2022
@bcmills
Contributor Author

bcmills commented May 24, 2022

@gopherbot, please backport to Go 1.17. This test still fails intermittently on the release branch, and the patch to skip for that failure mode is small and test-only.

@gopherbot

Backport issue(s) opened: #53049 (for 1.17).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot

Change https://go.dev/cl/408054 mentions this issue: [release-branch.go1.17] runtime: skip TestGdbBacktrace flakes matching a known GDB internal error

gopherbot pushed a commit that referenced this issue May 25, 2022
…g a known GDB internal error

TestGdbBacktrace occasionally fails due to a GDB internal error.
We have observed the error on various linux builders since at least
October 2020, and it has been reported upstream at least twice.¹²

Since the bug is external to the Go project and does not appear to be
fixed upstream, this failure mode can only add noise.

¹https://sourceware.org/bugzilla/show_bug.cgi?id=24628
²https://sourceware.org/bugzilla/show_bug.cgi?id=28551

Fixes #53049
Updates #43068

Change-Id: I6c92006a5d730f1c4df54b0307f080b3d643cc6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/384234
Trust: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
(cherry picked from commit 275aedc)
Reviewed-on: https://go-review.googlesource.com/c/go/+/408054
Reviewed-by: Alex Rakoczy <alex@golang.org>
@rsc rsc unassigned bcmills Jun 23, 2022
@golang golang locked and limited conversation to collaborators Jun 23, 2023