runtime, x/mobile: go1.9 causes memory issue on Android #21044

Closed
achojoao opened this issue Jul 17, 2017 · 20 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@achojoao

What version of Go are you using (go version)?

go version go1.9beta2 darwin/amd64 (same problem detected using go version go1.9beta1 darwin/amd64)

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/jvicente/go"
GORACE=""
GOROOT="/Users/jvicente/sdk/go1.9beta2"
GOTOOLDIR="/Users/jvicente/sdk/go1.9beta2/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/hk/wxyjvmbs4j3bbh_cfpr6j0c80000gn/T/go-build179128729=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

We have a custom native library implemented in Go, and we use it (via NDK/JNI) in an Android library project built with gomobile. The final library is used in two completely different apps. We were using Go 1.8.3 and it worked fine, until we hit a problem on several Android 7 devices (mostly Samsung) related to PTHREAD_MAX. That problem was fixed in #19472, so we tried the new betas, 1.9beta1 and later 1.9beta2. The error on those Samsung Android 7 devices has disappeared, but we now have reports from users on Android 5 to 6.0.1 devices hitting memory issues in one of the apps.
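
For context, here is a minimal sketch of the kind of package we expose through gomobile bind (the package and function names are illustrative placeholders, not our actual library):

// Package mylib is a hypothetical stand-in for our real library.
package mylib

// Greet becomes a Java-callable method in the generated .aar binding.
func Greet(name string) string {
	return "Hello, " + name
}

We build it with something like gomobile bind -target=android -o mylib.aar ./mylib and consume the generated binding from the Java side.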

List of devices with reported failures:

Samsung Galaxy S5 - Android 6.0.1 - ARMv7 32-bit
Samsung Galaxy A3 - Android 6.0.1 - ARMv8-A 64-bit
BQ Aquaris M5 - Android 6.0.1 - ARMv8-A 64-bit
Motorola XT1562 - Android 6.0.1 - ARMv8-A 64-bit
Samsung S4 - Android 5.0.1 - ARMv7 32-bit

As you can see below, we are getting two different errors: "index out of range" and "out of memory". Both errors appear the first time I call any method of the Go library via JNI. I'm certain we are calling into the native library from the Java side. I have put some logging in the Go code, but none of it shows up in the Android console, so no Go code from my library is actually being executed. I suspect the errors are thrown during the initialization of the Go runtime.

I've managed to get hold of the failing Samsung Galaxy A3. The weird thing, from my point of view, is that I've tried both apps on this device and only one of them fails, even though both use the same library. The failing app is heavier than the other (both in RAM use and on disk), but analyzing the memory dump and other data I can't see anything out of the ordinary (see attached files); it looks like there is enough free memory.

failing_app
working_app

What did you expect to see?

Normal execution of the app

What did you see instead?

After the error occurs, the app crashes and closes.

For Samsung A3 (Android 6.0.1)

07-11 11:05:45.439 24692 25169 E Go : fatal error: index out of range
07-11 11:05:45.439 24692 25169 E Go : runtime stack:
07-11 11:05:45.449 24692 25169 E Go : runtime.throw(0x90074853, 0x12)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:605 +0x78 fp=0x8ff89680 sp=0x8ff89674 pc=0x8ffe0554
07-11 11:05:45.449 24692 25169 E Go : runtime.panicCheckMalloc(0x9017c870, 0x90179430)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:20 +0x60 fp=0x8ff89690 sp=0x8ff89680 pc=0x8ffdeacc
07-11 11:05:45.449 24692 25169 E Go : runtime.panicindex()
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:27 +0x34 fp=0x8ff8969c sp=0x8ff89690 pc=0x8ffdeb44
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mheap).grow(0x90191558, 0x8, 0x0)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:905 +0x2b0 fp=0x8ff896d0 sp=0x8ff8969c pc=0x8ffd8e38
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mheap).allocSpanLocked(0x90191558, 0x1, 0x9019dae0, 0x8fffd1a4)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:795 +0x32c fp=0x8ff896f0 sp=0x8ff896d0 pc=0x8ffd8a08
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mheap).alloc_m(0x90191558, 0x1, 0x9017001e, 0x3cc3)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:661 +0x144 fp=0x8ff89710 sp=0x8ff896f0 pc=0x8ffd8198
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mheap).alloc.func1()
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:728 +0x3c fp=0x8ff89728 sp=0x8ff89710 pc=0x90009034
07-11 11:05:45.449 24692 25169 E Go : runtime.systemstack(0x8ff8973c)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:278 +0xb4 fp=0x8ff8972c sp=0x8ff89728 pc=0x9000abf4
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mheap).alloc(0x90191558, 0x1, 0x1001e, 0x1)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:727 +0x64 fp=0x8ff89750 sp=0x8ff8972c pc=0x8ffd8570
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mcentral).grow(0x901927b8, 0x0)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mcentral.go:232 +0x98 fp=0x8ff89774 sp=0x8ff89750 pc=0x8ffc8ae0
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mcentral).cacheSpan(0x901927b8, 0x8ffb5c04)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mcentral.go:106 +0x374 fp=0x8ff897a0 sp=0x8ff89774 pc=0x8ffc85a0
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mcache).refill(0x994c7000, 0x3c01e, 0x8ffc5148)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mcache.go:123 +0x84 fp=0x8ff897b8 sp=0x8ff897a0 pc=0x8ffc80ac
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mcache).nextFree.func1()
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:553 +0x24 fp=0x8ff897c8 sp=0x8ff897b8 pc=0x90007f58
07-11 11:05:45.449 24692 25169 E Go : runtime.systemstack(0x8ff897ec)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:278 +0xb4 fp=0x8ff897cc sp=0x8ff897c8 pc=0x9000abf4
07-11 11:05:45.449 24692 25169 E Go : runtime.(*mcache).nextFree(0x994c7000, 0x1e, 0x0, 0x9019db20, 0x994c7000)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:552 +0x90 fp=0x8ff897f8 sp=0x8ff897cc pc=0x8ffc40c4
07-11 11:05:45.449 24692 25169 E Go : runtime.mallocgc(0xe0, 0x90118660, 0x1, 0x1)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:707 +0x658 fp=0x8ff89854 sp=0x8ff897f8 pc=0x8ffc490c
07-11 11:05:45.449 24692 25169 E Go : runtime.newobject(0x90118660, 0x8ffe3348)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:836 +0x2c fp=0x8ff89868 sp=0x8ff89854 pc=0x8ffc4c1c
07-11 11:05:45.449 24692 25169 E Go : runtime.malg(0x8000, 0x4ca9094d)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:2884 +0x20 fp=0x8ff89884 sp=0x8ff89868 pc=0x8ffe939c
07-11 11:05:45.449 24692 25169 E Go : runtime.mpreinit(0x9018bf38)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/os_linux.go:302 +0x1c fp=0x8ff89894 sp=0x8ff89884 pc=0x8ffde6fc
07-11 11:05:45.449 24692 25169 E Go : runtime.mcommoninit(0x9018bf38)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:536 +0xa4 fp=0x8ff898ac sp=0x8ff89894 pc=0x8ffe3388
07-11 11:05:45.449 24692 25169 E Go : runtime.schedinit()
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:472 +0x44 fp=0x8ff898c8 sp=0x8ff898ac pc=0x8ffe2e90
07-11 11:05:45.449 24692 25169 E Go : runtime.rt0_go(0x1, 0x9017c3e8, 0xb6c8680f, 0xb6c867f5, 0x8ff89938, 0xb6c867f5, 0xb6c5e317, 0x8ff89978, 0x0, 0x929cf930, ...)
07-11 11:05:45.449 24692 25169 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:61 +0x8c fp=0x8ff89908 sp=0x8ff898c8 pc=0x9000a9b4

For Samsung Galaxy S5 (Android 6.0.1)

07-11 11:41:18.474 23471 24803 E Go : runtime: out of memory: cannot allocate 65536-byte block (0 in use)
07-11 11:41:18.474 23471 24803 E Go : fatal error: out of memory
07-11 11:41:18.544 23471 24803 E Go : runtime stack:
07-11 11:41:18.544 23471 24803 E Go : runtime.throw(0x89983cef, 0xd)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:605 +0x78 fp=0x884927a0 sp=0x88492794 pc=0x898f0554
07-11 11:41:18.544 23471 24803 E Go : runtime.(*mcache).refill(0x91b33000, 0x3c01e, 0x898d5148)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/mcache.go:125 +0xfc fp=0x884927b8 sp=0x884927a0 pc=0x898d8124
07-11 11:41:18.544 23471 24803 E Go : runtime.(*mcache).nextFree.func1()
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:553 +0x24 fp=0x884927c8 sp=0x884927b8 pc=0x89917f58
07-11 11:41:18.544 23471 24803 E Go : runtime.systemstack(0x884927ec)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:278 +0xb4 fp=0x884927cc sp=0x884927c8 pc=0x8991abf4
07-11 11:41:18.544 23471 24803 E Go : runtime.(*mcache).nextFree(0x91b33000, 0x1e, 0x0, 0x89aadb20, 0x91b33000)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:552 +0x90 fp=0x884927f8 sp=0x884927cc pc=0x898d40c4
07-11 11:41:18.544 23471 24803 E Go : runtime.mallocgc(0xe0, 0x89a28660, 0x1, 0x1)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:707 +0x658 fp=0x88492854 sp=0x884927f8 pc=0x898d490c
07-11 11:41:18.544 23471 24803 E Go : runtime.newobject(0x89a28660, 0x898f3348)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:836 +0x2c fp=0x88492868 sp=0x88492854 pc=0x898d4c1c
07-11 11:41:18.544 23471 24803 E Go : runtime.malg(0x8000, 0x14996bc1)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:2884 +0x20 fp=0x88492884 sp=0x88492868 pc=0x898f939c
07-11 11:41:18.544 23471 24803 E Go : runtime.mpreinit(0x89a9bf38)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/os_linux.go:302 +0x1c fp=0x88492894 sp=0x88492884 pc=0x898ee6fc
07-11 11:41:18.544 23471 24803 E Go : runtime.mcommoninit(0x89a9bf38)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:536 +0xa4 fp=0x884928ac sp=0x88492894 pc=0x898f3388
07-11 11:41:18.544 23471 24803 E Go : runtime.schedinit()
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:472 +0x44 fp=0x884928c8 sp=0x884928ac pc=0x898f2e90
07-11 11:41:18.544 23471 24803 E Go : runtime.rt0_go(0x1, 0x89a8c3e8, 0xb6c9750b, 0xb6c974f1, 0x88492938, 0xb6c974f1, 0xb6c71c95, 0x88492978, 0x0, 0x82703930, ...)
07-11 11:41:18.544 23471 24803 E Go : /Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:61 +0x8c fp=0x88492908 sp=0x884928c8 pc=0x8991a9b4

@ALTree ALTree changed the title go 1.9beta1/1.9beta2 and gomobile cause memory issue on android runtime: go1.9 and gomobile cause memory issue on android Jul 17, 2017
@ALTree ALTree added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 17, 2017
@achojoao
Author

achojoao commented Jul 17, 2017

I've just run another test, this time in the working app: I load a background image and increase its size each time. When the image grows above 1.8 MB, or the app above 10.5 MB, this app also crashes with the same error when built against the library compiled with 1.9betaX. The library compiled with 1.8.3 works perfectly.

@odeke-em
Member

So this is a regression from Go 1.8; I'll ping the gomobile and runtime folks.
/cc @hyangah @rakyll @crawshaw @aclements @RLH

@odeke-em odeke-em modified the milestones: Go1.9, Go1.9Maybe Jul 21, 2017
@odeke-em
Member

I've marked it for Go1.9Maybe. @achojoao, it would have been nice to report such bugs earlier on :) That way we wouldn't be this close to Go 1.9's release.

@odeke-em odeke-em changed the title runtime: go1.9 and gomobile cause memory issue on android runtime, x/mobile: go1.9 causes memory issue on Android Jul 21, 2017
@aclements
Member

For ease of reading, here are the cleaned-up tracebacks:

fatal error: index out of range
runtime stack:
runtime.throw(0x90074853, 0x12)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:605 +0x78 fp=0x8ff89680 sp=0x8ff89674 pc=0x8ffe0554
runtime.panicCheckMalloc(0x9017c870, 0x90179430)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:20 +0x60 fp=0x8ff89690 sp=0x8ff89680 pc=0x8ffdeacc
runtime.panicindex()
	/Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:27 +0x34 fp=0x8ff8969c sp=0x8ff89690 pc=0x8ffdeb44
runtime.(*mheap).grow(0x90191558, 0x8, 0x0)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:905 +0x2b0 fp=0x8ff896d0 sp=0x8ff8969c pc=0x8ffd8e38
runtime.(*mheap).allocSpanLocked(0x90191558, 0x1, 0x9019dae0, 0x8fffd1a4)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:795 +0x32c fp=0x8ff896f0 sp=0x8ff896d0 pc=0x8ffd8a08
runtime.(*mheap).alloc_m(0x90191558, 0x1, 0x9017001e, 0x3cc3)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:661 +0x144 fp=0x8ff89710 sp=0x8ff896f0 pc=0x8ffd8198
runtime.(*mheap).alloc.func1()
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:728 +0x3c fp=0x8ff89728 sp=0x8ff89710 pc=0x90009034
runtime.systemstack(0x8ff8973c)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:278 +0xb4 fp=0x8ff8972c sp=0x8ff89728 pc=0x9000abf4
runtime.(*mheap).alloc(0x90191558, 0x1, 0x1001e, 0x1)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mheap.go:727 +0x64 fp=0x8ff89750 sp=0x8ff8972c pc=0x8ffd8570
runtime.(*mcentral).grow(0x901927b8, 0x0)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mcentral.go:232 +0x98 fp=0x8ff89774 sp=0x8ff89750 pc=0x8ffc8ae0
runtime.(*mcentral).cacheSpan(0x901927b8, 0x8ffb5c04)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mcentral.go:106 +0x374 fp=0x8ff897a0 sp=0x8ff89774 pc=0x8ffc85a0
runtime.(*mcache).refill(0x994c7000, 0x3c01e, 0x8ffc5148)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mcache.go:123 +0x84 fp=0x8ff897b8 sp=0x8ff897a0 pc=0x8ffc80ac
runtime.(*mcache).nextFree.func1()
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:553 +0x24 fp=0x8ff897c8 sp=0x8ff897b8 pc=0x90007f58
runtime.systemstack(0x8ff897ec)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:278 +0xb4 fp=0x8ff897cc sp=0x8ff897c8 pc=0x9000abf4
runtime.(*mcache).nextFree(0x994c7000, 0x1e, 0x0, 0x9019db20, 0x994c7000)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:552 +0x90 fp=0x8ff897f8 sp=0x8ff897cc pc=0x8ffc40c4
runtime.mallocgc(0xe0, 0x90118660, 0x1, 0x1)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:707 +0x658 fp=0x8ff89854 sp=0x8ff897f8 pc=0x8ffc490c
runtime.newobject(0x90118660, 0x8ffe3348)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:836 +0x2c fp=0x8ff89868 sp=0x8ff89854 pc=0x8ffc4c1c
runtime.malg(0x8000, 0x4ca9094d)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:2884 +0x20 fp=0x8ff89884 sp=0x8ff89868 pc=0x8ffe939c
runtime.mpreinit(0x9018bf38)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/os_linux.go:302 +0x1c fp=0x8ff89894 sp=0x8ff89884 pc=0x8ffde6fc
runtime.mcommoninit(0x9018bf38)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:536 +0xa4 fp=0x8ff898ac sp=0x8ff89894 pc=0x8ffe3388
runtime.schedinit()
	/Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:472 +0x44 fp=0x8ff898c8 sp=0x8ff898ac pc=0x8ffe2e90
runtime.rt0_go(0x1, 0x9017c3e8, 0xb6c8680f, 0xb6c867f5, 0x8ff89938, 0xb6c867f5, 0xb6c5e317, 0x8ff89978, 0x0, 0x929cf930, ...)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:61 +0x8c fp=0x8ff89908 sp=0x8ff898c8 pc=0x9000a9b4
runtime: out of memory: cannot allocate 65536-byte block (0 in use)
fatal error: out of memory
runtime stack:
runtime.throw(0x89983cef, 0xd)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/panic.go:605 +0x78 fp=0x884927a0 sp=0x88492794 pc=0x898f0554
runtime.(*mcache).refill(0x91b33000, 0x3c01e, 0x898d5148)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/mcache.go:125 +0xfc fp=0x884927b8 sp=0x884927a0 pc=0x898d8124
runtime.(*mcache).nextFree.func1()
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:553 +0x24 fp=0x884927c8 sp=0x884927b8 pc=0x89917f58
runtime.systemstack(0x884927ec)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:278 +0xb4 fp=0x884927cc sp=0x884927c8 pc=0x8991abf4
runtime.(*mcache).nextFree(0x91b33000, 0x1e, 0x0, 0x89aadb20, 0x91b33000)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:552 +0x90 fp=0x884927f8 sp=0x884927cc pc=0x898d40c4
runtime.mallocgc(0xe0, 0x89a28660, 0x1, 0x1)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:707 +0x658 fp=0x88492854 sp=0x884927f8 pc=0x898d490c
runtime.newobject(0x89a28660, 0x898f3348)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/malloc.go:836 +0x2c fp=0x88492868 sp=0x88492854 pc=0x898d4c1c
runtime.malg(0x8000, 0x14996bc1)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:2884 +0x20 fp=0x88492884 sp=0x88492868 pc=0x898f939c
runtime.mpreinit(0x89a9bf38)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/os_linux.go:302 +0x1c fp=0x88492894 sp=0x88492884 pc=0x898ee6fc
runtime.mcommoninit(0x89a9bf38)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:536 +0xa4 fp=0x884928ac sp=0x88492894 pc=0x898f3388
runtime.schedinit()
	/Users/jvicente/sdk/go1.9beta2/src/runtime/proc.go:472 +0x44 fp=0x884928c8 sp=0x884928ac pc=0x898f2e90
runtime.rt0_go(0x1, 0x89a8c3e8, 0xb6c9750b, 0xb6c974f1, 0x88492938, 0xb6c974f1, 0xb6c71c95, 0x88492978, 0x0, 0x82703930, ...)
	/Users/jvicente/sdk/go1.9beta2/src/runtime/asm_arm.s:61 +0x8c fp=0x88492908 sp=0x884928c8 pc=0x8991a9b4

@aclements
Member

I'm not sure what could lead to the "index out of range" panic, but the "out of memory" panic seems like an address space conflict.

@achojoao are you able to attach a debugger and/or try out custom patches to the Go tree? If you can attach a debugger, it would be great if you could break on runtime.throw and dump the value of the 'runtime.mheap_' variable (getting this from a core dump would work, too).

@achojoao
Author

@odeke-em Sorry for the delay in reporting this problem, but I reported it as soon as we noticed it ;)

@aclements Regarding the debugger, I've never used GDB or LLDB, so it'll be difficult for me, but if you think this is the only solution, I'll try. What method do you recommend? Do you have a nice tutorial or a magic link? Which specific file and line should I look at?

On the other hand, I can try custom patches. In fact, what I'm using right now (until we find the solution) is a custom build, starting from the 1.8.3 source code and applying only the #19472 patch.

@aclements
Member

aclements commented Jul 25, 2017

It sounds like gdb may be tricky on Android (I've never done this myself), but if you can find a way to get gdb running, just do

break 'runtime.throw'
run # or "continue" if the program is already running
print 'runtime.mheap_'

Alternatively, here's a patch to add debugging prints for both panics:

--- a/src/runtime/mheap.go
+++ b/src/runtime/mheap.go
@@ -891,6 +891,22 @@ func (h *mheap) grow(npage uintptr) bool {
                        v = h.sysAlloc(ask)
                }
                if v == nil {
+                       println(h.bitmap, h.bitmap_mapped, h.arena_start, h.arena_used, h.arena_alloc, h.arena_end, h.spans)
+                       var path = []byte("/proc/self/maps\x00")
+                       fd := open(&path[0], 0, 0)
+                       if fd < 0 {
+                               println("failed to open /proc/self/maps")
+                       } else {
+                               var buf [64]byte
+                               for {
+                                       n := read(fd, noescape(unsafe.Pointer(&buf[0])), int32(len(buf)))
+                                       if n < 0 {
+                                               break
+                                       }
+                                       gwrite(buf[:n])
+                               }
+                               closefd(fd)
+                       }
                        print("runtime: out of memory: cannot allocate ", ask, "-byte block (", memstats.heap_sys, " in use)\n")
                        return false
                }
@@ -902,6 +918,10 @@ func (h *mheap) grow(npage uintptr) bool {
        s.init(uintptr(v), ask>>_PageShift)
        p := (s.base() - h.arena_start) >> _PageShift
        for i := p; i < p+s.npages; i++ {
+               if i >= uintptr(len(h.spans)) {
+                       println(i, p, s.npages)
+                       println(h.bitmap, h.bitmap_mapped, h.arena_start, h.arena_used, h.arena_alloc, h.arena_end, h.spans)
+               }
                h.spans[i] = s
        }
        atomic.Store(&s.sweepgen, h.sweepgen)

@achojoao
Author

achojoao commented Jul 27, 2017

@aclements While trying to get the Samsung A3 back to test again, I've tried the app on the Samsung S5. Please find the logs attached. I haven't had time to set up gdb, so I used the patch you posted. I'll try to include the Samsung A3 logs as soon as possible. Thanks!

samsung_s6.txt

@aclements
Member

Thanks. A few observations:

07-27 17:07:11.885 15750-0/es.mytestapp.develop E/Go: 314564608 0 0 314564608 314564608 314572800 [0/524288]0x29fe000
# bitmap=arena_used=arena_alloc=0x12bfe000
# arena_end=0x12c00000
# bitmap_mapped=arena_start=0

07-27 17:07:11.886 15750-0/es.mytestapp.develop E/Go: 029fe000-12c00000 ---p 00000000 00:00 0 
07-27 17:07:11.886 15750-0/es.mytestapp.develop E/Go: 12c00000-12ddf000 rw-p 00000000 00:04 11784      /dev/ashmem/dalvik-main space (deleted)

This first mapping is exactly the Go spans and bitmap. It ends at arena_end, which means we fell all the way back to reserving a 0-byte arena after failing to reserve 512MB, 256MB, or 128MB. This is a little surprising because the largest hole in your address space is 253MB (at 0x60c00000--0x7099f000). Maybe something was in the hole a moment before the crash.

There's a Dalvik mapping immediately after Go's mapping, so Go can't grow that mapping. This isn't great, but Go should fall back to taking whatever addresses the kernel returns.

The address space is decidedly crunched. It has 1,998 mappings (a suspiciously round number). The next largest hole is 113MB (at 0xb6f6e000), then 78MB (at 0x78ada000), etc.

This shouldn't stop it from finding space for the 64kB allocation that's failing, though.
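
As an aside, a quick way to see how many mappings a process has is to count the lines of /proc/self/maps. This is only an illustrative, self-contained sketch (not part of the debugging patches above):

package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	f, err := os.Open("/proc/self/maps")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()

	count := 0
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		count++ // one line per mapping
	}
	fmt.Println("mappings:", count) // compare against vm.max_map_count
}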

@achojoao, could you try two things:

  1. On the device, bring up a local terminal and run sysctl vm.max_map_count. What's the output?

  2. Apply the following patch (plus the patch above) and reproduce:

--- a/src/runtime/malloc.go
+++ b/src/runtime/malloc.go
@@ -416,6 +416,7 @@ func (h *mheap) sysAlloc(n uintptr) unsafe.Pointer {
 			var reserved bool
 			p := uintptr(sysReserve(unsafe.Pointer(h.arena_end), p_size, &reserved))
 			if p == 0 {
+				println("sysReserve", hex(h.arena_end), p_size, "failed to grow arena")
 				return nil
 			}
 			// p can be just about anywhere in the address
@@ -486,6 +487,7 @@ func (h *mheap) sysAlloc(n uintptr) unsafe.Pointer {
 	p_size := round(n, _PageSize) + _PageSize
 	p := uintptr(sysAlloc(p_size, &memstats.heap_sys))
 	if p == 0 {
+		println("sysAlloc", p_size, "failed")
 		return nil
 	}
 
diff --git a/src/runtime/mem_linux.go b/src/runtime/mem_linux.go
index 094658de51..ea9c9c2381 100644
--- a/src/runtime/mem_linux.go
+++ b/src/runtime/mem_linux.go
@@ -68,6 +68,7 @@ func sysAlloc(n uintptr, sysStat *uint64) unsafe.Pointer {
 			print("runtime: mmap: too much locked memory (check 'ulimit -l').\n")
 			exit(2)
 		}
+		println("mmap", n, "failed:", uintptr(p))
 		return nil
 	}
 	mSysStatInc(sysStat, n)

@achojoao
Author

This is the log for the Samsung A3 with the first patch. I'll try to test the new patch today.

samsung_a3.txt

@achojoao
Author

achojoao commented Jul 28, 2017

For both devices the result is vm.max_map_count = 65530
I can't reproduce the "out of memory" error anymore on the Samsung S5. Now, every time I try, I'm getting the "index out of range" error, so I'm afraid we have to follow that trace...

Anyway, here is the log file for the Samsung S5 with the second patch.

samsung_s6_2.txt

@aclements
Member

samsung_a3.txt

07-28 08:15:32.927 21833 22375 E Go      : 198064 198064 128
07-28 08:15:32.927 21833 22375 E Go      : 2416705536 0 0 2416705536 1623588864 1890971648 [0/524288]0x7fec0000

We're installing a 128 page (1MB) span at page 198064 (address 0x6c65fea0) and crashing on the very first page we try to install because len(h.spans) is 0. Similar to the first crash, we haven't mapped any of the bitmap or the spans array. arena_used is 0x900c0000, which is also where the bitmap ends. However, this time arena_alloc is 0x60c60000 and arena_end is 0x70b5f000, so we should be allocating from [0x60c60000, 0x70b5f000). The fact that this range is just shy of 256MB suggests that we took the path in mheap.sysAlloc that tries to grow the arena, but gets discontiguous memory.

I think what's happening is, because this is the very first allocation, we've never called mheap.setArenaUsed. Furthermore, because the address space is swiss cheese, we got an allocation hole that was lower than the initial arena_used, so we failed to take the h.arena_alloc > h.arena_used path in mheap.sysAlloc that would have patched things up by calling setArenaUsed.

@achojoao, this patch might fix things:

--- a/src/runtime/mheap.go
+++ b/src/runtime/mheap.go
@@ -503,6 +503,9 @@ func (h *mheap) init(spansStart, spansBytes uintptr) {
 	sp.array = unsafe.Pointer(spansStart)
 	sp.len = 0
 	sp.cap = int(spansBytes / sys.PtrSize)
+
+	// Map metadata structures.
+	h.setArenaUsed(h.arena_used, true)
 }
 
 // setArenaUsed extends the usable arena to address arena_used and

@aclements
Member

Oh, I think I understand the "out of memory" crash, too. It's a slightly different cause, though the initial circumstances are similar. Right now, if we're out of arena reservation, we try to allocate another 256MB and if that fails, rather than falling back to smaller mmaps, we simply fail. That should be easy to fix.
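
To illustrate the idea (this is only a standalone sketch of the strategy on linux/android, not the runtime's actual code), the fix amounts to: try the large reservation first, and if the kernel can't find a 256MB hole, fall back to mapping just the small block at whatever address the kernel picks:

package main

import (
	"fmt"
	"syscall"
)

// mmapAnon asks the kernel for an anonymous private mapping of the given
// size at an address of its choosing.
func mmapAnon(size, prot int) ([]byte, error) {
	return syscall.Mmap(-1, 0, size, prot, syscall.MAP_PRIVATE|syscall.MAP_ANON)
}

func main() {
	const large = 256 << 20 // the 256MB arena reservation growth attempt
	const small = 64 << 10  // the 64kB block that actually failed to allocate

	if m, err := mmapAnon(large, syscall.PROT_NONE); err == nil {
		fmt.Println("large reservation succeeded")
		syscall.Munmap(m)
		return
	}
	// Fallback: in a crowded 32-bit address space the 256MB reservation can
	// fail even though plenty of small holes remain; a small mapping at
	// whatever address the kernel returns still succeeds.
	if m, err := mmapAnon(small, syscall.PROT_READ|syscall.PROT_WRITE); err == nil {
		fmt.Println("fell back to a small mapping")
		syscall.Munmap(m)
	}
}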

@gopherbot

Change https://golang.org/cl/51714 mentions this issue: runtime: map bitmap and spans during heap initialization

@gopherbot

Change https://golang.org/cl/51713 mentions this issue: runtime: fall back to small mmaps if we fail to grow reservation

gopherbot pushed a commit that referenced this issue Jul 31, 2017
Right now, if it's possible to grow the arena reservation but
mheap.sysAlloc fails to get 256MB more of memory, it simply fails.
However, on 32-bit we have a fallback path that uses much smaller
mmaps that could take in this situation, but fail to.

This commit fixes mheap.sysAlloc to use a common failure path in case
it can't grow the reservation. On 32-bit, this path includes the
fallback.

Ideally, mheap.sysAlloc would attempt smaller reservation growths
first, but taking the fallback path is a simple change for Go 1.9.

Updates #21044 (fixes one of two issues).

Change-Id: I1e0035ffba986c3551479d5742809e43da5e7c73
Reviewed-on: https://go-review.googlesource.com/51713
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@gopherbot

Change https://golang.org/cl/52190 mentions this issue: [release-branch.go1.9] runtime: fall back to small mmaps if we fail to grow reservation

@gopherbot

Change https://golang.org/cl/52191 mentions this issue: [release-branch.go1.9] runtime: map bitmap and spans during heap initialization

gopherbot pushed a commit that referenced this issue Jul 31, 2017
…o grow reservation

Right now, if it's possible to grow the arena reservation but
mheap.sysAlloc fails to get 256MB more of memory, it simply fails.
However, on 32-bit we have a fallback path that uses much smaller
mmaps that could take in this situation, but fail to.

This commit fixes mheap.sysAlloc to use a common failure path in case
it can't grow the reservation. On 32-bit, this path includes the
fallback.

Ideally, mheap.sysAlloc would attempt smaller reservation growths
first, but taking the fallback path is a simple change for Go 1.9.

Updates #21044 (fixes one of two issues).

Cherry-pick of CL 51713. Updates #21234.

Change-Id: I1e0035ffba986c3551479d5742809e43da5e7c73
Reviewed-on: https://go-review.googlesource.com/52190
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
gopherbot pushed a commit that referenced this issue Jul 31, 2017
…ialization

We lazily map the bitmap and spans areas as the heap grows. However,
right now we're very slightly too lazy. Specifically, the following
can happen on 32-bit:

1. mallocinit fails to allocate any heap arena, so
   arena_used == arena_alloc == arena_end == bitmap.

2. There's less than 256MB between the end of the bitmap mapping and
   the next mapping.

3. On the first allocation, mheap.sysAlloc sees that there's not
   enough room in [arena_alloc, arena_end) because there's no room at
   all. It gets a 256MB mapping from somewhere *lower* in the address
   space than arena_used and sets arena_alloc and arena_end to this
   hole.

4. Since the new arena_alloc is lower than arena_used, mheap.sysAlloc
   doesn't bother to call mheap.setArenaUsed, so we still don't have a
   bitmap mapping or a spans array mapping.

5. mheap.grow, which called mheap.sysAlloc, attempts to fill in the
   spans array and crashes.

Fix this by mapping the metadata regions for the initial arena_used
when the heap is initialized, rather than trying to wait for an
allocation. This maintains the intended invariant that the structures
are always mapped for [arena_start, arena_used).

Fixes #21044.

Cherry-pick of CL 51714. Fixes #21234.

Change-Id: I4422375a6e234b9f979d22135fc63ae3395946b0
Reviewed-on: https://go-review.googlesource.com/52191
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@achojoao
Author

achojoao commented Aug 1, 2017

@aclements Just to let you know that I've tested the previous fixes on both devices and everything is working fine! :) Thank you very much for your help.

@aclements
Member

Great! Thanks for the follow-up.

@gopherbot

Change https://golang.org/cl/85887 mentions this issue: runtime: use sparse mappings for the heap

gopherbot pushed a commit that referenced this issue Feb 15, 2018
This replaces the contiguous heap arena mapping with a potentially
sparse mapping that can support heap mappings anywhere in the address
space.

This has several advantages over the current approach:

* There is no longer any limit on the size of the Go heap. (Currently
  it's limited to 512GB.) Hence, this fixes #10460.

* It eliminates many failure modes of heap initialization and
  growing. In particular it eliminates any possibility of panicking
  with an address space conflict. This can happen for many reasons and
  even causes a low but steady rate of TSAN test failures because of
  conflicts with the TSAN runtime. See #16936 and #11993.

* It eliminates the notion of "non-reserved" heap, which was added
  because creating huge address space reservations (particularly on
  64-bit) led to huge process VSIZE. This was at best confusing and at
  worst conflicted badly with ulimit -v. However, the non-reserved
  heap logic is complicated, can race with other mappings in non-pure
  Go binaries (e.g., #18976), and requires that the entire heap be
  either reserved or non-reserved. We currently maintain the latter
  property, but it's quite difficult to convince yourself of that, and
  hence difficult to keep correct. This logic is still present, but
  will be removed in the next CL.

* It fixes problems on 32-bit where skipping over parts of the address
  space leads to mapping huge (and never-to-be-used) metadata
  structures. See #19831.

This also completely rewrites and significantly simplifies
mheap.sysAlloc, which has been a source of many bugs. E.g., #21044,
 #20259, #18651, and #13143 (and maybe #23222).

This change also makes it possible to allocate individual objects
larger than 512GB. As a result, a few tests that expected huge
allocations to fail needed to be changed to make even larger
allocations. However, at the moment attempting to allocate a humongous
object may cause the program to freeze for several minutes on Linux as
we fall back to probing every page with addrspace_free. That logic
(and this failure mode) will be removed in the next CL.

Fixes #10460.
Fixes #22204 (since it rewrites the code involved).

This slightly slows down compilebench and the x/benchmarks garbage
benchmark.

name       old time/op     new time/op     delta
Template       184ms ± 1%      185ms ± 1%    ~     (p=0.065 n=10+9)
Unicode       86.9ms ± 3%     86.3ms ± 1%    ~     (p=0.631 n=10+10)
GoTypes        599ms ± 0%      602ms ± 0%  +0.56%  (p=0.000 n=10+9)
Compiler       2.87s ± 1%      2.89s ± 1%  +0.51%  (p=0.002 n=9+10)
SSA            7.29s ± 1%      7.25s ± 1%    ~     (p=0.182 n=10+9)
Flate          118ms ± 2%      118ms ± 1%    ~     (p=0.113 n=9+9)
GoParser       147ms ± 1%      148ms ± 1%  +1.07%  (p=0.003 n=9+10)
Reflect        401ms ± 1%      404ms ± 1%  +0.71%  (p=0.003 n=10+9)
Tar            175ms ± 1%      175ms ± 1%    ~     (p=0.604 n=9+10)
XML            209ms ± 1%      210ms ± 1%    ~     (p=0.052 n=10+10)

(https://perf.golang.org/search?q=upload:20171231.4)

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.23ms ± 1%  2.25ms ± 1%  +0.84%  (p=0.000 n=19+19)

(https://perf.golang.org/search?q=upload:20171231.3)

Relative to the start of the sparse heap changes (starting at and
including "runtime: fix various contiguous bitmap assumptions"),
overall slowdown is roughly 1% on GC-intensive benchmarks:

name        old time/op     new time/op     delta
Template        183ms ± 1%      185ms ± 1%  +1.32%  (p=0.000 n=9+9)
Unicode        84.9ms ± 2%     86.3ms ± 1%  +1.65%  (p=0.000 n=9+10)
GoTypes         595ms ± 1%      602ms ± 0%  +1.19%  (p=0.000 n=9+9)
Compiler        2.86s ± 0%      2.89s ± 1%  +0.91%  (p=0.000 n=9+10)
SSA             7.19s ± 0%      7.25s ± 1%  +0.75%  (p=0.000 n=8+9)
Flate           117ms ± 1%      118ms ± 1%  +1.10%  (p=0.000 n=10+9)
GoParser        146ms ± 2%      148ms ± 1%  +1.48%  (p=0.002 n=10+10)
Reflect         398ms ± 1%      404ms ± 1%  +1.51%  (p=0.000 n=10+9)
Tar             173ms ± 1%      175ms ± 1%  +1.17%  (p=0.000 n=10+10)
XML             208ms ± 1%      210ms ± 1%  +0.62%  (p=0.011 n=10+10)
[Geo mean]      369ms           373ms       +1.17%

(https://perf.golang.org/search?q=upload:20180101.2)

name                       old time/op  new time/op  delta
Garbage/benchmem-MB=64-12  2.22ms ± 1%  2.25ms ± 1%  +1.51%  (p=0.000 n=20+19)

(https://perf.golang.org/search?q=upload:20180101.3)

Change-Id: I5daf4cfec24b252e5a57001f0a6c03f22479d0f0
Reviewed-on: https://go-review.googlesource.com/85887
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
@golang golang locked and limited conversation to collaborators Jan 2, 2019