runtime: 1.6rc1: bad pointer in write barrier on Linux64 #14149

Closed
snadrus opened this issue Jan 29, 2016 · 6 comments
@snadrus

snadrus commented Jan 29, 2016

This is from an HTTP server being stress-tested for data retrieval (300 connections, 18 hours). It has never happened on any previous version of Go (the same codebase works fine on 1.5.2).

runtime: writebarrierptr *0x7fea78276540 = 0xc
fatal error: bad pointer in write barrier
runtime: writebarrierptr *0x7feb89f7b460 = 0xc
fatal error: bad pointer in write barrier

runtime stack:
runtime.throw(0xe447e0, 0x1c)
/usr/local/go/src/runtime/panic.go:530 +0x90 fp=0x7fea78276420 sp=0x7fea78276408
runtime.writebarrierptr.func1()
/usr/local/go/src/runtime/mbarrier.go:140 +0xb3 fp=0x7fea78276448 sp=0x7fea78276420
runtime.systemstack(0x7fea78276460)
/usr/local/go/src/runtime/asm_amd64.s:307 +0xab fp=0x7fea78276450 sp=0x7fea78276448
runtime.writebarrierptr(0x7fea78276540, 0xc)
/usr/local/go/src/runtime/mbarrier.go:141 +0x97 fp=0x7fea78276480 sp=0x7fea78276450
runtime.mmap.func1()
/usr/local/go/src/runtime/cgo_mmap.go:21 +0x78 fp=0x7fea782764b8 sp=0x7fea78276480
runtime.systemstack(0x7fea782764e8)
/usr/local/go/src/runtime/asm_amd64.s:307 +0xab fp=0x7fea782764c0 sp=0x7fea782764b8
runtime.mmap(0x0, 0x40000, 0x2200000003, 0xffffffff, 0xc)
/usr/local/go/src/runtime/cgo_mmap.go:22 +0x8f fp=0x7fea78276520 sp=0x7fea782764c0
runtime.sysAlloc(0x40000, 0x141a380, 0x7fcc0000003b)
/usr/local/go/src/runtime/mem_linux.go:58 +0x3b fp=0x7fea78276558 sp=0x7fea78276520
runtime.persistentalloc1(0x800, 0x40, 0x141a378, 0x7fcc0db00100)
/usr/local/go/src/runtime/malloc.go:932 +0x284 fp=0x7fea78276590 sp=0x7fea78276558
runtime.persistentalloc.func1()
/usr/local/go/src/runtime/malloc.go:890 +0x3b fp=0x7fea782765c0 sp=0x7fea78276590
runtime.systemstack(0x7fea782765d8)
/usr/local/go/src/runtime/asm_amd64.s:307 +0xab fp=0x7fea782765c8 sp=0x7fea782765c0
runtime.persistentalloc(0x800, 0x40, 0x141a378, 0x42ce6f)
/usr/local/go/src/runtime/malloc.go:891 +0x58 fp=0x7fea78276608 sp=0x7fea782765c8
runtime.getempty(0x65, 0x0)
/usr/local/go/src/runtime/mgcwork.go:347 +0x8c fp=0x7fea78276638 sp=0x7fea78276608
runtime.(*gcWork).init(0xc820016720)
/usr/local/go/src/runtime/mgcwork.go:97 +0x20 fp=0x7fea78276650 sp=0x7fea78276638
runtime.(*gcWork).put(0xc820016720, 0xc823a390a0)
/usr/local/go/src/runtime/mgcwork.go:113 +0x38 fp=0x7fea78276670 sp=0x7fea78276650
runtime.greyobject(0xc823a390a0, 0xc83d25c9b0, 0x0, 0xc81fe2e37a, 0xc800000000, 0x7feb9212d258, 0xc820016720)
/usr/local/go/src/runtime/mgcmark.go:1090 +0x2f1 fp=0x7fea782766f8 sp=0x7fea78276670
runtime.scanblock(0xc83d25c9b0, 0x20, 0xed59a0, 0xc820016720)
/usr/local/go/src/runtime/mgcmark.go:947 +0x15e fp=0x7fea78276780 sp=0x7fea782766f8
runtime.scanframeworker(0x7fea78276a68, 0x7fea78276b60, 0xc820016720)
/usr/local/go/src/runtime/mgcmark.go:762 +0x1cc fp=0x7fea78276820 sp=0x7fea78276780
runtime.scanstack.func1(0x7fea78276a68, 0x0, 0x7fea78276801)
/usr/local/go/src/runtime/mgcmark.go:659 +0x64 fp=0x7fea78276868 sp=0x7fea78276820
runtime.gentraceback(0x432ef3, 0xc83d25c900, 0x0, 0xc82bd2b200, 0x0, 0x0, 0x7fffffff, 0x7fea78276c88, 0x0, 0x0, ...)
/usr/local/go/src/runtime/traceback.go:369 +0xd95 fp=0x7fea78276ac0 sp=0x7fea78276868
runtime.scanstack(0xc82bd2b200)
/usr/local/go/src/runtime/mgcmark.go:682 +0x3cc fp=0x7fea78276cd0 sp=0x7fea78276ac0
runtime.scang(0xc82bd2b200)
/usr/local/go/src/runtime/proc.go:797 +0x96 fp=0x7fea78276cf0 sp=0x7fea78276cd0
runtime.markroot.func1()
/usr/local/go/src/runtime/mgcmark.go:189 +0xac fp=0x7fea78276d20 sp=0x7fea78276cf0
runtime.systemstack(0x7fea78276da0)
/usr/local/go/src/runtime/asm_amd64.s:307 +0xab fp=0x7fea78276d28 sp=0x7fea78276d20
runtime.markroot(0x9f9)
/usr/local/go/src/runtime/mgcmark.go:194 +0x31d fp=0x7fea78276db8 sp=0x7fea78276d28
runtime.gcDrain(0x7fea78276e28, 0x0)
/usr/local/go/src/runtime/mgcmark.go:812 +0x241 fp=0x7fea78276df0 sp=0x7fea78276db8
runtime.gcMark(0x6fd4ef1a5f09c)
/usr/local/go/src/runtime/mgc.go:1559 +0xf9 fp=0x7fea78276e50 sp=0x7fea78276df0
runtime.gcMarkTermination.func1()
/usr/local/go/src/runtime/mgc.go:1173 +0x23 fp=0x7fea78276e60 sp=0x7fea78276e50
runtime.systemstack(0xc820015500)
/usr/local/go/src/runtime/asm_amd64.s:291 +0x79 fp=0x7fea78276e68 sp=0x7fea78276e60
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1048 fp=0x7fea78276e70 sp=0x7fea78276e68

goroutine 20 [garbage collection]:
runtime.systemstack_switch()
/usr/local/go/src/runtime/asm_amd64.s:245 fp=0xc820018d48 sp=0xc820018d40
runtime.gcMarkTermination()
/usr/local/go/src/runtime/mgc.go:1181 +0x136 fp=0xc820018f58 sp=0xc820018d48
runtime.gcStart(0x2, 0xadc400)
/usr/local/go/src/runtime/mgc.go:1017 +0x399 fp=0xc820018f80 sp=0xc820018f58
runtime/debug.freeOSMemory()
/usr/local/go/src/runtime/mheap.go:874 +0x25 fp=0xc820018f98 sp=0xc820018f80
runtime/debug.FreeOSMemory()
/usr/local/go/src/runtime/debug/garbage.go:108 +0x14 fp=0xc820018fa0 sp=0xc820018f98
main.main.func1(0xc820256000)
/var/lib/jenkins/workspace/nebula/.build/go/src/git2.co.bitcasa.com/go/nebula.git/cmd/nebula-supervisor/nebula-supervisor.go:92 +0x50 fp=0xc820018fb8 sp=0xc820018fa0
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc820018fc0 sp=0xc820018fb8
created by main.main
/var/lib/jenkins/workspace/nebula/.build/go/src/git2.co.bitcasa.com/go/nebula.git/cmd/nebula-supervisor/nebula-supervisor.go:95 +0xf4b

@bradfitz
Contributor

Is the code available somewhere?

Do you use unsafe? Do you use cgo?

Does the code pass with the race detector?

@snadrus
Author

snadrus commented Jan 29, 2016

It's proprietary code, but I'm the lead and have considerable latitude for
whatever is needed.

I haven't stress-tested with the race detector, but it reported no problems during unit
tests or light server usage.

The only cgo call is to zlib compress, which writes to Go-allocated
memory synchronously.

No unsafe other than at server start (changing the process name).

I can test whatever you recommend.

Thanks,
Andrew Jackson.

@ianlancetaylor ianlancetaylor changed the title from "Seeing a crash of 1.6rc1 Linux64" to "runtime: 1.6rc1: bad pointer in write barrier on Linux64" Jan 29, 2016
@ianlancetaylor
Contributor

Does Linux64 mean linux-amd64?

@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Jan 29, 2016
@ianlancetaylor
Contributor

This error means that you have somehow gotten a pointer whose value is 0xc. The Go runtime is rejecting that pointer value as being invalid.

OK, I think this is a bug in the support I added for calling mmap when using cgo (https://golang.org/cl/15170). The call to mmap is failing, with errno 12 (ENOMEM). But rather than being handled in a reasonable way, it is causing your program to crash.

Note that even when we fix this, your program is still going to crash, with "runtime: cannot allocate memory". I don't know what changed in Go 1.6 to cause your program to run out of memory.
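To illustrate the failure mode described above: the value 0xc in the trace is errno 12 (ENOMEM) sitting where a pointer was expected. The following is a minimal sketch, not the runtime's actual code. The helper `decodeMmapResult`, the constant names, and the 4096 cutoff are assumptions made for illustration. The idea is to keep the raw result as an integer and check it for an error code before anything treats it as a pointer.

```go
package main

import "fmt"

// enomem is the Linux errno for "cannot allocate memory" (12, i.e. 0xc).
const enomem = 12

// minValidAddr is an assumed cutoff for this sketch: results below one
// 4096-byte page are treated as errno values rather than addresses.
const minValidAddr = 4096

// decodeMmapResult is a hypothetical helper. It takes the raw result as a
// uintptr (an integer, not a pointer) and separates error codes from addresses.
func decodeMmapResult(ret uintptr) (addr uintptr, errno int) {
	if ret < minValidAddr {
		return 0, int(ret)
	}
	return ret, 0
}

func main() {
	// 0xc is the value from the crash; 0x40000000 is an arbitrary example address.
	for _, ret := range []uintptr{0xc, 0x40000000} {
		addr, errno := decodeMmapResult(ret)
		switch {
		case errno == enomem:
			fmt.Printf("result %#x: mmap failed, cannot allocate memory\n", ret)
		case errno != 0:
			fmt.Printf("result %#x: mmap failed, errno %d\n", ret, errno)
		default:
			fmt.Printf("result %#x: mapped at %#x\n", ret, addr)
		}
	}
}
```

Because the result stays a `uintptr` until it has been validated, a failed call surfaces as an ordinary error path instead of an invalid pointer value.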

@snadrus
Author

snadrus commented Jan 29, 2016

Yes, amd64. More specifically, this is an AWS EC2 instance running an Ubuntu AMI.

I checked the pprof stats 1 hour before the crash, and the process was using only 480 MB
of 16 GB, with nothing else running on the machine. The load was
consistent, so I'm surprised to hear it's an OOM situation. Though it did
hit a ulimit crash during stress testing on Go 1.5.
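
One way to cross-check figures like these is to compare the bytes in live heap objects with the total the Go runtime has obtained from the OS. The sketch below is an assumption about how that could be logged, not something taken from the server in question; `memSummary` is a hypothetical helper. Note that `runtime.MemStats` does not cover memory allocated by C code called through cgo, and an mmap failure with ENOMEM can also come from a process limit such as a ulimit rather than from exhausted physical memory.

```go
package main

import (
	"fmt"
	"runtime"
)

// memSummary is a hypothetical helper. It returns the bytes held by allocated
// heap objects and the total bytes the Go runtime has obtained from the OS.
func memSummary() (heapAlloc, fromOS uint64) {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.HeapAlloc, ms.Sys
}

func main() {
	heap, sys := memSummary()
	fmt.Printf("heap objects: %d KiB, obtained from OS: %d KiB\n", heap>>10, sys>>10)
}
```

Logging both numbers periodically during a long stress run would show whether the runtime's mappings grow even while the heap profile stays flat.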


@gopherbot

CL https://golang.org/cl/19084 mentions this issue.

@golang golang locked and limited conversation to collaborators Feb 3, 2017