runtime: GC: unexpected fault address 0x0 #5074

Closed
ugorji opened this issue Mar 18, 2013 · 17 comments

@ugorji
Contributor

ugorji commented Mar 18, 2013

I updated to tip yesterday, for the first time in a few days, and have been getting the
following error since. I just synced again and the error is still there.

This was happening before and then got fixed. It seems to have regressed.

unexpected fault address 0x0
fatal error: fault
[signal 0xb code=0x80 addr=0x0 pc=0x40c9ff]

goroutine 37 [running]:
[fp=0xc2000c3d20] runtime.throw(0x91e757)
    /opt/go-tip/src/pkg/runtime/panic.c:473 +0x67
[fp=0xc2000c3d38] runtime.sigpanic()
    /opt/go-tip/src/pkg/runtime/os_linux.c:237 +0xe7
[fp=0xc2000c3df8] flushptrbuf(0x7f6bc1e41390, 0xc2000c3f38, 0xc2000c41c8, 0xc2000c41c0, 0xc2000c41d0, ...)
    /opt/go-tip/src/pkg/runtime/mgc0.c:383 +0x13f
[fp=0xc2000c41c0] scanblock(0x7f6bc1ccc000, 0x7f6bc1ccc018, 0x0, 0xc2000c4100)
    /opt/go-tip/src/pkg/runtime/mgc0.c:997 +0x203
[fp=0xc2000c4210] markroot(0xc2000c6000, 0x1000000150)
    /opt/go-tip/src/pkg/runtime/mgc0.c:1175 +0xab
[fp=0xc2000c4288] runtime.parfordo(0xc2000c6000)
    /opt/go-tip/src/pkg/runtime/parfor.c:105 +0x9b
[fp=0xc2000c43b8] gc(0x7f6bc1dcd31c)
    /opt/go-tip/src/pkg/runtime/mgc0.c:1856 +0x298
----- stack segment boundary -----
[fp=0x7f6bc1dcd330] runtime.gc(0xc200000000)
    /opt/go-tip/src/pkg/runtime/mgc0.c:1784 +0x11b
[fp=0x7f6bc1dcd388] runtime.mallocgc(0x200, 0x100000000, 0xb0ab36e300000001)
    /opt/go-tip/src/pkg/runtime/zmalloc_linux_amd64.c:101 +0x1e4
[fp=0x7f6bc1dcd3a8] runtime.mal(0x1f0)
    /opt/go-tip/src/pkg/runtime/zmalloc_linux_amd64.c:611 +0x3f
[fp=0x7f6bc1dcd3e0] runtime.mapassign(0x64c480, 0xc20029d100, 0xc20029e1f0, 0xc2001bfe00)
    /opt/go-tip/src/pkg/runtime/hashmap.c:1022 +0xfd
[fp=0x7f6bc1dcd418] reflect.mapassign(0x64c480, 0xc20029d100, 0xc20029e1f0, 0xc2001bfe00, 0x1, ...)
    /opt/go-tip/src/pkg/runtime/hashmap.c:1107 +0x88
[fp=0x7f6bc1dcd488] reflect.Value.SetMapIndex(0x64c480, 0xc2001bf9c8, 0x156, 0x655b40, 0xc20029e1f0, ...)
    /opt/go-tip/src/pkg/reflect/value.go:1429 +0x24c

<<snip>>

go version: devel +bcb5f45aa10e Mon Mar 18 09:52:39 2013 -0700 linux/amd64

I can reproduce it consistently when running a benchmark on my msgpack decoder.

I wanted to file it now, sooner rather than later.

OS: Linux 3.5.0-25-generic #39-Ubuntu SMP x86_64 GNU/Linux

Thanks.
@minux
Member

minux commented Mar 18, 2013

Comment 1:

Could you please provide a small test case that demonstrates this problem?

@ugorji
Contributor Author

ugorji commented Mar 18, 2013

Comment 2:

I tried writing a small reproducer, but I can't get it to fail. I'm not sure what
triggers the failure.
It only occurs when running through the whole msgpack.Decode benchmark. All tests pass,
but running the benchmark fails. The error occurs in the benchmark after the 92nd
iteration, i.e.
    for i := 0; i < b.N; i++ {
        log(">>>> i: %v", i)
        // <snip>
    }
I get the error only when i=92, so it seems to be an issue that shows up under some kind
of load.
Rather than wait until I could create a small reproducer, I decided to file it now, hoping
that something may jump out to folks familiar with the code.
If running the benchmark directly will help, you can use the following steps to
reproduce:
  git clone https://github.com/ugorji/go-msgpack .
  git checkout dev
  go test -bench '_Msgpack__Decode'
You may not see the error on your first run. The second time you run it (and
thereafter), you will.
Note: The benchmark above depends on:
    vmsgpack "github.com/vmihailenco/msgpack"
    "launchpad.net/mgo/bson"
You should 'go get' these libraries. Alternatively, you can edit the benchmark_test.go
to remove references to bson and vmsgpack.
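
For anyone trying to shrink this into a standalone reproducer, here is a minimal sketch of one possible starting point. It exercises the same path as the bottom of the stack trace (reflect.Value.SetMapIndex, then map assignment, then allocation) while forcing collections with runtime.GC. The function name fillMapViaReflect, the key and value shapes, and the loop counts are all hypothetical, and this program is not known to trigger the fault.

```go
package main

import (
	"fmt"
	"reflect"
	"runtime"
)

// fillMapViaReflect builds a map with n entries through
// reflect.Value.SetMapIndex, the call at the bottom of the reported trace.
// The name and the key/value shapes are hypothetical, not taken from
// the go-msgpack code.
func fillMapViaReflect(n int) map[string]interface{} {
	m := make(map[string]interface{})
	rv := reflect.ValueOf(m)
	for i := 0; i < n; i++ {
		key := reflect.ValueOf(fmt.Sprintf("key-%d", i))
		val := reflect.ValueOf([]byte(fmt.Sprintf("value-%d", i)))
		rv.SetMapIndex(key, val)
	}
	return m
}

func main() {
	// The iteration count is arbitrary; the report saw the fault around
	// iteration 92 of a much heavier benchmark loop.
	for i := 0; i < 200; i++ {
		m := fillMapViaReflect(64)
		runtime.GC() // force a collection while the fresh map is still live
		if len(m) != 64 {
			panic("unexpected map size")
		}
	}
	fmt.Println("completed without a fault")
}
```

If a loop like this never faults, that would at least suggest the trigger depends on the specific types the decoder allocates rather than on map assignment alone.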

@ianlancetaylor
Contributor

Comment 3:

Labels changed: added priority-soon, go1.1, removed priority-triage.

@remyoudompheng
Contributor

Comment 4:

ugorji: you should still provide the already-edited version of benchmark_test.go that
doesn't reference bson and vmsgpack, to make debugging easier.
I don't think anything jumps out: segfaults in the new GC or the runtime have been common
lately and don't point to anything in particular.

@davecheney
Contributor

@remyoudompheng
Contributor

Comment 6:

The tests want to run a Python script that doesn't work on my machine, and the correct
import path for mgo is labix.org/v2/mgo, not launchpad.net/mgo. Can you correct that?

@ugorji
Contributor Author

ugorji commented Mar 19, 2013

Comment 7:

Hi Remy,
I just saw your message. I had already started separating both of those out. I'll update
this issue in a few minutes, once it is uploaded.

@ugorji
Contributor Author

ugorji commented Mar 19, 2013

Comment 8:

Hi Remy,
Just updated. Could you please try again?
I moved all the tests which require a local installation or external setup (go get, etc)
to an ext_dep_test.go, which is excluded by default (//+build ignore).
You should be able to re-fetch and run as before.
Thanks.

@remyoudompheng
Contributor

Comment 9:

After a bisection session, I found the following:
* It has been buggy for a long time, at least since ~13 Feb.
* For some reason the network poller change hid the problem:
WORKS:
16202   a45e271add6c   2013-03-12 21:39 +0400   dvyukov
  runtime: fix build
DOES NOT WORK:
16200   a364be6bc34f   2013-03-12 11:46 -0400   rsc
  encoding/xml: name space bug fixes
(16201 does not build)
* Then the problem reappeared as reported by ugorji:
DOES NOT WORK:
16243   7bcfc5879223   2013-03-14 13:48 +0400   dvyukov
  runtime: revert UseSpanType back to 1
WORKS:
16241   5af2130aec77   2013-03-14 10:38 +0400   dvyukov
  runtime: integrated network poller for darwin
(16242 does not build).
The revisions that work are exactly those where UseSpanType = 0 in runtime.h
(accidentally set by Dmitry when submitting runtime polling).
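
The search above had to step around revisions that do not build (16201, 16242). A minimal sketch of that kind of bisection follows, similar in spirit to `hg bisect --skip`. The bisect function, the verdict type, and the range endpoints used in main are hypothetical. It also assumes a single good-to-bad transition inside the range, which was not true of the whole history here, so each stretch would need to be bisected separately.

```go
package main

import "fmt"

// verdict is the outcome of testing one revision.
type verdict int

const (
	good verdict = iota // benchmark runs cleanly
	bad                 // benchmark faults
	skip                // revision does not build, so it cannot be judged
)

// bisect narrows (lo, hi) until no testable revision remains between them.
// It assumes lo is known good, hi is known bad, and there is exactly one
// good-to-bad transition in between. It returns the last known good and
// the first known bad revision; any revisions left between them were skipped.
func bisect(lo, hi int, test func(rev int) verdict) (lastGood, firstBad int) {
	for {
		mid := lo + (hi-lo)/2
		found := false
		// Probe the midpoint first, then move outward past untestable revisions.
		for d := 0; !found && (mid-d > lo || mid+d < hi); d++ {
			candidates := []int{mid - d, mid + d}
			if d == 0 {
				candidates = candidates[:1]
			}
			for _, rev := range candidates {
				if rev <= lo || rev >= hi {
					continue
				}
				switch test(rev) {
				case good:
					lo, found = rev, true
				case bad:
					hi, found = rev, true
				}
				if found {
					break
				}
			}
		}
		if !found {
			return lo, hi
		}
	}
}

func main() {
	// Stand-in for "build this revision and run the benchmark": models the
	// second transition reported above (16241 works, 16242 does not build,
	// 16243 fails).
	test := func(rev int) verdict {
		switch {
		case rev < 16242:
			return good
		case rev == 16242:
			return skip
		}
		return bad
	}
	g, b := bisect(16230, 16250, test) // endpoints are made up for illustration
	fmt.Println("last good:", g, "first bad:", b)
}
```

When the two returned revisions are not adjacent, everything between them was untestable, which is the situation the "(16242 does not build)" note describes.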

@gopherbot

Comment 10:

Fix: https://golang.org/cl/7913043/
Please test.

Status changed to WaitingForReply.

@ugorji
Contributor Author

ugorji commented Mar 19, 2013

Comment 11:

Tested a few times. Everything works now without errors.
LGTM.

@gopherbot

Comment 12:

This issue was closed by revision 54dffda.

Status changed to Fixed.

@remyoudompheng
Contributor

@rsc
Contributor

rsc commented Mar 21, 2013

Comment 14:

Status changed to Accepted.

@gopherbot

Comment 15:

This issue was fixed in two steps.
#13 has been fixed by revision 61fa5c7d741f.
What is the proper procedure in a situation like this? Reopen the issue after seeing
#13 and then submit a CL containing "Fixes issue NNN" to close it again?

@gopherbot

Comment 16:

This issue was closed by revision bf1f461.

Status changed to Fixed.

@rsc
Contributor

rsc commented Mar 25, 2013

Comment 17:

Re comment #15, it is okay to check in multiple CLs with 'Fixes issue NNNN.' The issue
tracker will attach them to the issue correctly, though of course only the first will
change the status to Fixed. It is also okay to reopen the issue if you want a reminder
that something needs fixing, but if you've already got the fix ready it's okay to skip
that step and just mail the fix.
Thanks.

@rsc rsc added this to the Go1.1 milestone Apr 14, 2015
@rsc rsc removed the go1.1 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 24, 2016