runtime: invalid GC instruction 0xd183f0 at 0xc208217f28 #7748

Closed
davecheney opened this issue Apr 10, 2014 · 8 comments

@davecheney
Contributor

What steps will reproduce the problem?
1. sudo apt-get install mongodb-server
2. go get -u launchpad.net/juju-core/...
3. go test launchpad.net/juju-core/worker/upgrader

What is the expected output? What do you see instead?

runtime: invalid GC instruction 0xd183f0 at 0xc208217f28
fatal error: scanblock: invalid GC instruction

runtime stack:
runtime.throw(0x11c00e9)
    /home/dfc/go/src/pkg/runtime/panic.c:520 +0x69 fp=0x7fc6f4f8f8d8
scanblock(0x7fc6f71c0000, 0x7fc6f4f8fc00)
    /home/dfc/go/src/pkg/runtime/mgc0.c:1139 +0x375 fp=0x7fc6f4f8fc00
markroot(0xc20800c900, 0x54)
    /home/dfc/go/src/pkg/runtime/mgc0.c:1333 +0xd9 fp=0x7fc6f4f8fc80
runtime.parfordo(0xc20800c900)
    /home/dfc/go/src/pkg/runtime/parfor.c:88 +0xa3 fp=0x7fc6f4f8fcf8
gc(0x7fc6f477e278)
    /home/dfc/go/src/pkg/runtime/mgc0.c:2403 +0x1d6 fp=0x7fc6f4f8fe20
mgc(0xc2081018c0)
    /home/dfc/go/src/pkg/runtime/mgc0.c:2345 +0x2e fp=0x7fc6f4f8fe30
runtime.mcall(0x9a56cc)
    /home/dfc/go/src/pkg/runtime/asm_amd64.s:181 +0x4b fp=0x7fc6f4f8fe40

Please use labels and text to provide additional information.

Full panic message: http://paste.ubuntu.com/7229208/

% uname -a
Linux lucky 3.11.0-19-generic #33-Ubuntu SMP Tue Mar 11 18:48:34 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
% go version
go version devel +94d84d24086b Wed Apr 09 18:23:53 2014 -0700 + linux/amd64

@dvyukov
Member

dvyukov commented Apr 15, 2014

Comment 3:

Currently the build fails with:
$ go test launchpad.net/juju-core/cmd/jujud
# launchpad.net/juju-core/utils/ssh
/ssd/src/gopath/src/launchpad.net/juju-core/utils/ssh/ssh_gocrypto.go:84: undefined: ssh.ClientConn
FAIL    launchpad.net/juju-core/cmd/jujud [build failed]
I've updated go.crypto to r191, which roughly relates to the report date.
Then it started failing with:
$ go test launchpad.net/juju-core/cmd/jujud
# launchpad.net/juju-core/worker/instancepoller
/ssd/src/gopath/src/launchpad.net/juju-core/worker/instancepoller/aggregate.go:67: undefined: ratelimit.New
FAIL    launchpad.net/juju-core/cmd/jujud [build failed]
I've reverted launchpad.net/juju-core to r2597, which roughly relates to the report date.
But it still fails to build.
Please provide exact revisions that build.

@davecheney
Contributor Author

Comment 4:

I'm sorry. I neglected to mention that we use a tool we developed to pin dependency
versions. 
Please try
$ go get launchpad.net/godeps
$ cd $GOPATH/src/launchpad.net/juju-core
$ godeps -u dependencies.tsv 
There will be a bunch of output as godeps updates each dependency to the correct revision.

@gopherbot
Contributor

Comment 5:

CL https://golang.org/cl/88100048 mentions this issue.

@davecheney
Contributor Author

Comment 6:

LGTM. This fixes the observed crash.

@rsc
Contributor

rsc commented Apr 16, 2014

Comment 7:

I narrowed down the problem by recompiling the runtime with Debug=3 in mgc0.c and then I
ran
     ./upgrader.test 2>&1 | gzip > upgrader.log.gz
With Debug=3 there's a lot of repetitive output; the gzip is a trick to reduce I/O and
disk requirements, often dramatically. This one didn't generate too much output in the
end. I wanted the full scan trace, but setting Debug=3 also enabled some sanity checks
that found the mismatched type information even earlier and gave a mostly reasonable
message. With some tweaks, the message is:
scanblock 0xc2084d0120 32 type 0xc1a960 common.ToolsGetter pc=0xa2aca0
gc_iface @0xc2084d0130: 0x7f8c9c97a7d8/0x0 0xc208213200
gc_aptr @0xc2084d0130: 0xc2081cc140
scanblock 0xc208090190 80 type 0xc208182e80 methodargs(upgrader.Upgrader)(func(params.Entities) (params.ToolsResults, error)) pc=0xc208094b00
gc_ptr @0xc208090190: 0xc208748090 ti=0x9f1160
invalid gc type info for 'upgrader.UpgraderAPI', type info 0x9f1160 [1]=0x9, block info 0xa90680 [1]=0x1
fatal error: invalid gc type info
There's no direct way to map from the gc program to the name of the enclosing type, but
the enclosing type structure holds the gc pointer next to the string name pointer, so if
you can find the gc pointer in the type structure, the string pointer is next:
(gdb) find &rodata, &erodata, 0x9f1160
0xa0c5f8
0xc313b8
2 patterns found.
(gdb) x/2xg 0xa0c5f8
0xa0c5f8:   0x00000000009f1160  0x0000000000000000 # not this one
(gdb) x/2xg 0xc313b8
0xc313b8:   0x00000000009f1160  0x0000000000d9a6e0 # must be this one
(gdb) x/2xg 0xd9a6e0
0xd9a6e0:   0x0000000000d9a6f0  0x0000000000000011 # there's the string
(gdb) x/17xc 0xd9a6f0
0xd9a6f0:   117 'u' 112 'p' 103 'g' 114 'r' 97 'a'  100 'd' 101 'e' 114 'r'
0xd9a6f8:   46 '.'  85 'U'  112 'p' 103 'g' 114 'r' 97 'a'  100 'd' 101 'e'
0xd9a700:   114 'r'
(gdb)
(gdb) find &rodata, &erodata, 0xa90680
0xa0c638
0xcac138
2 patterns found.
(gdb) x/2xg 0xa0c638
0xa0c638:   0x0000000000a90680  0x0000000000000000 # again, not this one
(gdb) x/2xg 0xcac138
0xcac138:   0x0000000000a90680  0x0000000000d9a720 # yes, this one (it's almost always the last one)
(gdb) x/2xg 0xd9a720
0xd9a720:   0x0000000000d9a730  0x0000000000000014
(gdb) x/20xc 0xd9a730
0xd9a730:   117 'u' 112 'p' 103 'g' 114 'r' 97 'a'  100 'd' 101 'e' 114 'r'
0xd9a738:   46 '.'  85 'U'  112 'p' 103 'g' 114 'r' 97 'a'  100 'd' 101 'e'
0xd9a740:   114 'r' 65 'A'  80 'P'  73 'I'
(gdb)
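
The two-word values dumped above (a data pointer followed by a length) have the shape of a Go string header, which is why the name can be read back with x/17xc and x/20xc. The following is a minimal sketch of that layout, not code from the runtime: the stringHeader type and headerOf helper are hypothetical names introduced here for illustration, and the struct cast assumes the gc toolchain's pointer-then-length representation of a string.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringHeader is a hypothetical mirror of the two words seen in the gdb
// dump: a pointer to the bytes, then the length. This layout is an
// assumption about the gc toolchain, not a documented API.
type stringHeader struct {
	Data unsafe.Pointer
	Len  int
}

// headerOf reinterprets a string variable as its two-word header.
func headerOf(s *string) stringHeader {
	return *(*stringHeader)(unsafe.Pointer(s))
}

func main() {
	for _, name := range []string{"upgrader.Upgrader", "upgrader.UpgraderAPI"} {
		h := headerOf(&name)
		// The length word is what gdb showed as 0x11 and 0x14.
		fmt.Printf("%s len=%#x data=%p\n", name, h.Len, h.Data)
	}
}
```

The lengths come out as 0x11 and 0x14, matching the second word at 0xd9a6e0 and 0xd9a720 in the session above.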
As it happens, I trust the 'block info', which is registered during malloc, much more
these days than I trust the 'type info' obtained directly from the GC program, which has
had a variety of bugs in corner cases. The block info says this pointer is an
upgrader.UpgraderAPI, while the type info says it is an upgrader.Upgrader. Looking at the
code, the former is a struct and the latter is an interface. Knowing that
upgrader.Upgrader is an interface makes the lines above the crash suspect:
methodargs(upgrader.Upgrader)(func(params.Entities) is strange because most methods take
a concrete value as the receiver, not an interface. (The exception is something like
io.Reader.Read, which is a func(io.Reader, []byte) (int, error).) And that's the bug:
reflect is writing down a function frame with an interface-typed receiver instead of
using the type of the thing in the interface.
On the Ubuntu VM you set up, if I run the test it fails >>50% of the time. With CL
88100048 applied, it passes 20x in a row. Pretty sure that's the fix.

Status changed to Started.

@gopherbot
Contributor

Comment 8:

CL https://golang.org/cl/88090045 mentions this issue.

@dvyukov
Member

dvyukov commented Apr 16, 2014

Comment 9:

> I trust the 'block info', which is registered during malloc, much more these days than
> I trust the 'type info' obtained directly from the GC program

Seconded.
I added that type info check when I was debugging a series of similar divergences.
Unfortunately the check has false positives: there are legal divergences (legal in the
sense that the current implementation allows those divergences and then specifically
works around them).

@rsc
Contributor

rsc commented Apr 16, 2014

Comment 10:

This issue was closed by revision fcf8a77.

Status changed to Fixed.

@rsc rsc added this to the Go1.3 milestone Apr 14, 2015
@rsc rsc removed the release-go1.3 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.