runtime: gdb command "goroutine 1 bt" fails on core file #17575

Open
kayuuzu opened this issue Oct 25, 2016 · 14 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Debugging NeedsFix The path to resolution is known, but the work has not been done.

@kayuuzu

kayuuzu commented Oct 25, 2016

What version of Go are you using (go version)?

go version go1.5.3 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/ldata/comp/project/go"
GORACE=""
GOROOT="/ldata/bin/go"
GOTOOLDIR="/ldata/bin/go/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT=""
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="g++"
CGO_ENABLED="1"

What did you do?

  1. export GOTRACEBACK=crash
  2. ulimit -c unlimited
  3. write a buggy program and run it to generate a core file
  4. use gdb to load core file
  5. source runtime-gdb.py
  6. run goroutine 1 bt

What did you expect to see?

print stack trace of goroutine 1

What did you see instead?

It prints an error message:
Python Exception <class 'gdb.error'> You can't do that without a process to debug.:
Error occurred in Python command: You can't do that without a process to debug.

@kayuuzu
Author

kayuuzu commented Oct 25, 2016

runtime-gdb.py seems to set the $pc and $sp registers when you run goroutine 1 bt, but registers are read-only once a core file is loaded.

@quentinmit quentinmit changed the title fail to print stack trace of goroutine when using core file runtime: gdb command "goroutine 1 bt" fails on core file Oct 28, 2016
@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 28, 2016
@quentinmit
Contributor

I'm not sure what option we have here instead of setting $pc and $sp.

/cc @aclements @cherrymui

@quentinmit quentinmit added this to the Go1.9Maybe milestone Oct 28, 2016
@aclements
Member

Unfortunately, before GDB 7.10, setting $pc and $sp is our only option. For GDB 7.10 and up, I believe we can use "unwinder" support, which is a much better way to do this. Not having GDB 7.10, I can't easily try this out. :) Ubuntu 16.04LTS has GDB 7.11, so this is probably becoming more widely available.
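
A minimal sketch of how a script might gate on that version, assuming the version string looks like the one GDB exposes to Python as `gdb.VERSION` (for example "7.11.1"). The helper name `supports_unwinders` is hypothetical, and the 7.10 threshold is taken from the comment above rather than independently verified.

```python
import re

# Threshold taken from the discussion above: unwinder support is believed
# to exist in GDB 7.10 and later.
MIN_UNWINDER_VERSION = (7, 10)


def supports_unwinders(version_string):
    """Return True if version_string names GDB 7.10 or newer.

    Hypothetical helper: inside GDB you would pass gdb.VERSION.
    """
    # Take the first "major.minor" pair found anywhere in the string.
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_UNWINDER_VERSION
```

Attempting the import and catching `ImportError` is arguably the more robust check, since it does not depend on how a distribution formats its version string.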

@aclements
Member

BTW, if someone does want to try this out with GDB 7.10+, I think the code would be something like:

try:
    from gdb.unwinder import Unwinder, register_unwinder
except ImportError:
    Unwinder = None

if Unwinder is not None:
    class FrameID(object):
        def __init__(self, sp, pc):
            self.sp = sp
            self.pc = pc

    class GoUnwinder(Unwinder):
        def __init__(self):
            super(GoUnwinder, self).__init__("go-unwinder")
            # Stay disabled until GoroutineCmd stashes a frame ID.
            self.enabled = False

        def __call__(self, pending_frame):
            # This only applies to the first frame.
            self.enabled = False

            # Ignore registers in pending_frame. Use stashed PC/SP.
            return pending_frame.create_unwind_info(self.frame_id)
    goUnwinder = GoUnwinder()
    # Register globally so that GDB consults the unwinder at all.
    register_unwinder(None, goUnwinder)

# ... in GoroutineCmd ...
goUnwinder.frame_id = FrameID(sp, pc)
goUnwinder.enabled = True
try:
    gdb.execute(cmd)
finally:
    goUnwinder.enabled = False

This is completely untested.

@aclements
Member

FWIW, I recently wrote a gdb script that addresses this by copying the core file and rewriting the PC/SP saved in the core file for thread 1. You can then open this new core file and backtrace from it. It's awful, but it works (at least, on linux/amd64). https://gist.github.com/aclements/8d2d6a1d1ade4bc4fd492db6d3eb07a5
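
To make the register-rewriting idea concrete, here is a minimal sketch of the patching step only. It is not the code from the linked gist. It assumes the linux/amd64 layout of `struct elf_prstatus` (336 bytes, with the register array at byte 112, `rip` in slot 16 and `rsp` in slot 19), and the name `patch_prstatus` is hypothetical. Locating the NT_PRSTATUS note inside the core file's PT_NOTE segment is not shown.

```python
import struct

# Assumed linux/amd64 layout of struct elf_prstatus: the general-purpose
# register array (pr_reg) starts at byte 112, and within it rip is slot 16
# and rsp is slot 19 (the field order of struct user_regs_struct).
PR_REG_OFFSET = 112
RIP_SLOT = 16
RSP_SLOT = 19
PRSTATUS_SIZE = 336


def patch_prstatus(desc, pc, sp):
    """Return a copy of an NT_PRSTATUS descriptor with rip/rsp replaced."""
    if len(desc) != PRSTATUS_SIZE:
        raise ValueError("unexpected prstatus size: %d" % len(desc))
    out = bytearray(desc)
    # Registers are stored as little-endian 64-bit values on amd64.
    struct.pack_into("<Q", out, PR_REG_OFFSET + 8 * RIP_SLOT, pc)
    struct.pack_into("<Q", out, PR_REG_OFFSET + 8 * RSP_SLOT, sp)
    return bytes(out)
```

A rewriter along these lines would copy the core file, apply the patch to the first thread's descriptor using the goroutine's saved PC and SP, and write the result out unchanged elsewhere.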

@kayuuzu
Author

kayuuzu commented Dec 26, 2016

@aclements Good job. I will try it.

@mail2fish

Is there any update for this issue?

@bradfitz
Contributor

@mail2fish, any updates would be posted here. This is not happening for Go 1.9 apparently.

@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe May 16, 2017
@aclements
Member

I got my hands on a more recent version of GDB that supports the frame unwinders API and completely failed to get it to do anything useful. If anyone wants to pick up where I left off, here's the code that monkey-patches in the unwinder: https://gist.github.com/aclements/e8b4b3d887ccc9fd2907e696a8b4b2a2. It gets into the custom unwinder, but GDB seems to ignore the updated unwind info it returns. My next step would probably be to run GDB under GDB and figure out what's going on inside.

@occia

occia commented Aug 24, 2017

So, as of now, is there no way to view the stack trace from a core dump of an executable written in Go? I really need an answer; a more detailed question is here. Thanks : )

@aclements
Member

@occia, the closest thing we have right now is the hack in #17575 (comment). Unfortunately, this is a limitation of GDB and there's not much we can do about it. In theory a custom unwinder should make it possible, but all of my experiments with custom unwinding failed to produce any results at all.

@heschi
Contributor

heschi commented Oct 17, 2017

For posterity, dlv should work fine for this. dlv core <binary> <core file>.

@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017
@gopherbot gopherbot modified the milestones: Go1.11, Unplanned May 23, 2018
@ptsneves

I have been burning my few remaining hairs trying to get a meaningful bt. gdb displays a bt, but it is useless because it mostly shows OS thread backtraces, not goroutines.

I was able to get an unwinder working, but I think the way @aclements did it is not how unwinders are supposed to work. We are supposed to process the frames passed to us, not completely ignore them, so I do not think that approach is feasible.
I was able to get the core rewriter mentioned by @aclements working, but there are some points to mention:

  • One needs to manually get the sp and pc from 'runtime.allgs' and plug them into the core generator.
  • The script is amd64 only.
  • I think it does not take the thread into account, so the registers may not match the goroutine's thread.

If I make further progress, I will report back.
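
A minimal sketch of the manual lookup described in the first bullet, written against plain dicts so it runs outside GDB. The field names (`syscallsp`, `syscallpc`, `sched`, `stack`) are the runtime.g fields used by the script posted later in this thread; the function names `saved_sp_pc` and `find_g_by_sp` are hypothetical, and inside GDB the same fields would be read from `gdb.Value` objects instead.

```python
def saved_sp_pc(g):
    """Pick the saved SP/PC for a goroutine that is not currently running.

    g is a plain dict standing in for a runtime.g value.
    """
    if g["syscallsp"] != 0:
        # Goroutine is blocked in a system call: use the syscall save area.
        return g["syscallsp"], g["syscallpc"]
    # Otherwise use the registers saved by the scheduler.
    return g["sched"]["sp"], g["sched"]["pc"]


def find_g_by_sp(gs, sp):
    """Return the goroutine whose stack range (lo, hi] contains sp, or None."""
    for g in gs:
        if g["stack"]["lo"] < sp <= g["stack"]["hi"]:
            return g
    return None
```

The second helper addresses the third bullet: comparing a thread's current SP against each goroutine's stack bounds is one way to tell which goroutine, if any, that thread was running.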

@ptsneves

I was finally able to get a stack trace from a core dump using the unwinder work @aclements did. I am now trying to find out whether it can be used on other architectures. Unfortunately, when I use the "pc" mnemonic, it crashes GDB.

Usage: btg <address of the g>

import gdb
import gdb.xmethod
from gdb.unwinder import Unwinder, register_unwinder


def command(fn):
    name = fn.__name__.replace('_', '-')
    dct = {
        '__doc__': fn.__doc__,
        '__init__': lambda self: super(cls, self).__init__(name, gdb.COMMAND_USER),
        'invoke': lambda self, *args, **kw: fn(*args, **kw),
    }
    cls = type(fn.__name__ + 'Cls', (gdb.Command,), dct)
    cls()
    return fn


class FrameId(object):
    def __init__(self, sp, pc):
        self._sp = sp
        self._pc = pc

    @property
    def sp(self):
        return self._sp

    @property
    def pc(self):
        return self._pc


class GoUnwinder(Unwinder):
    frameId = None

    def __init__(self):
        super(GoUnwinder, self).__init__("go-unwinder")
        self.enabled = False
        gdb.invalidate_cached_frames()

    def setFrameId(self, frame_id):
        self.enabled = True
        GoUnwinder.frameId = frame_id

    def __call__(self, pending_frame):
        pc = pending_frame.read_register("pc")
        sp = pending_frame.read_register("sp")

        # FrameId takes (sp, pc), in that order.
        unwind_info = pending_frame.create_unwind_info(FrameId(sp, pc))

        unwind_info.add_saved_register('rip', GoUnwinder.frameId.pc)
        unwind_info.add_saved_register('sp', GoUnwinder.frameId.sp)

        self.enabled = False
        return unwind_info


goUnwinder = GoUnwinder()
register_unwinder(None, goUnwinder, replace=True)


def btg1(g, m):
    # If the G is active on an M, find the thread of that M.
    if m is not None:
        for thr in gdb.selected_inferior().threads():
            if thr.ptid[1] == m['procid']:
                break
        else:
            thr = None
        if thr:
            # If this is the current goroutine, use a regular
            # backtrace since the saved state may be stale.
            curthr = gdb.selected_thread()
            try:
                thr.switch()
                cursp = gdb.parse_and_eval('$sp')
            finally:
                curthr.switch()
            if g['stack']['lo'] < cursp <= g['stack']['hi']:
                # if g['syscallsp'] != 0:
                #     sp, pc = g['syscallsp'], g['syscallpc']
                # else:
                #     sp, pc = g['sched']['sp'], g['sched']['pc']
                # goUnwinder.setFrameId(FrameId(sp, pc))
                gdb.execute('thread apply %d backtrace' % thr.num)
                return
        else:
            print("thread %d not found; stack may be incorrect (try import _ \"runtime/cgo\")" % m['procid'])

    # TODO: LR register on LR machines.
    if g['syscallsp'] != 0:
        sp, pc = g['syscallsp'], g['syscallpc']
    else:
        sp, pc = g['sched']['sp'], g['sched']['pc']

    goUnwinder.setFrameId(FrameId(sp, pc))
    gdb.invalidate_cached_frames()
    gdb.execute('backtrace')


#
# Go runtime helpers
#

class SliceValue:
    """Wrapper for slice values."""

    def __init__(self, val):
        self.val = val

    @property
    def len(self):
        return int(self.val['len'])

    @property
    def cap(self):
        return int(self.val['cap'])

    def __getitem__(self, i):
        if i < 0 or i >= self.len:
            raise IndexError(i)
        ptr = self.val["array"]
        return (ptr + i).dereference()


_Gdead = 6


def getg(sp=None, n=None):
    def checkg(GProc):
        if GProc == 0:
            return
        if GProc['atomicstatus'] == _Gdead:
            return
        if sp is not None and GProc['stack']['lo'] < sp <= GProc['stack']['hi']:
            if found[0] is not None:
                raise gdb.GdbError('multiple Gs with overlapping stacks!')
            found[0] = GProc
        if n is not None and GProc['goid'] == n:
            if found[0] is not None:
                raise gdb.GdbError('multiple Gs with same goid!')
            found[0] = GProc

    if sp is None and n is None:
        sp = gdb.parse_and_eval('$sp')

    found = [None]

    # Check allgs.
    for gp in SliceValue(gdb.parse_and_eval("'runtime.allgs'")):
        checkg(gp)

    # Check g0s and gsignals, which aren't on allgs.
    if sp is not None:
        mp = gdb.parse_and_eval("'runtime.allm'")
        while mp != 0:
            checkg(mp['g0'])
            checkg(mp['gsignal'])
            mp = mp['alllink']

    return found[0]


@command
def btg(arg, from_tty):
    """btg [g]: print a backtrace for G."""
    g = gdb.parse_and_eval(arg)

    if g is None or g == 0:
        print("no goroutine")
        return

    m = g['m']
    if m == 0:
        m = None
    else:
        if g == m['g0']:
            print("g0 stack:")
            btg1(g, m)
            print()
            g, m = m['curg'], None
        elif g == m['gsignal']:
            print("gsignal stack:")
            btg1(g, m)
            print()
            g, m = m['curg'], None

        if g == 0:
            print("no user goroutine")
            return

    print("goroutine %d stack:" % g['goid'])
    btg1(g, m)

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022