Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/mobile: Binding go mobile framework on iOS 9 with golang1.7rc6 crash when call debug.FreeOSMemory() #16644

Closed
bronze1man opened this issue Aug 9, 2016 · 32 comments

Comments

@bronze1man
Copy link
Contributor

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    go version go1.7rc5 darwin/amd64
  2. What operating system and processor architecture are you using (go env)?
    GOARCH="amd64"
    GOBIN=""
    GOEXE=""
    GOHOSTARCH="amd64"
    GOHOSTOS="darwin"
    GOOS="darwin"
    GOPATH=""
    GORACE=""
    GOROOT="/usr/local/go"
    GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
    CC="clang"
    GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/m9/qtbxkp6s3p96fk54rln7qhj80000gp/T/go-build097177340=/tmp/go-build -gno-record-gcc-switches -fno-common"
    CXX="clang++"
    CGO_ENABLED="1"
  3. What did you do?
    I create a simple one page ios project.
    I write a simple package with follow code:
package iosGomobileLib

import (
    "runtime/debug"
    "fmt"
)

func RunFromGo(){
    fmt.Println("before debug.FreeOSMemory()")
    debug.FreeOSMemory()
    fmt.Println("after debug.FreeOSMemory()")
}

change the ViewController.m with follow code:

#import "ViewController.h"
@import IosGomobileLib;

@interface ViewController ()

@end

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    GoIosGomobileLibRunFromGo();
    // Do any additional setup after loading the view, typically from a nib.
}

- (void)didReceiveMemoryWarning {
    [super didReceiveMemoryWarning];
    // Dispose of any resources that can be recreated.
}

@end

I build the framework with follow code(the kmg stuff should be easy to understand):

package main

import (
    "github.com/bronze1man/kmg/kmgCmd"
    "github.com/bronze1man/kmg/kmgConfig"
    "os"
)

func main(){
    goPath := kmgConfig.DefaultEnv().GOPATHToString()
    androidSdkPath := "/usr/local/AndroidSdk"
    kmgCmd.CmdString("gomobile bind -target ios -o client/ios-gomobile/ufr/iosGomobileLib.framework make/tplIosGomobile/iosGomobileLib").
    MustSetEnvMap(map[string]string{
        "GOPATH":                     "/usr/local/gopath:"+goPath,
        "PATH":                           os.Getenv("PATH"),
        "ANDROID_HOME":               androidSdkPath,
        "IPHONEOS_DEPLOYMENT_TARGET": "8.0",
    }).
    RunAndReturnOutput()
}
  1. What did you expect to see?
    see following in ios console:
before debug.FreeOSMemory()
after debug.FreeOSMemory()
  1. What did you see instead?
    see following in ios console:
before debug.FreeOSMemory()

and crash with EXC_BAD_ACCESS

* thread #1: tid = 0x6dd76, 0x00000001000c8cec ufr`notok + 4, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x00000001000c8cec ufr`notok + 4
    frame #1: 0x00000001000c8e48 ufr`runtime.madvise + 32
    frame #2: 0x0000000100080fdc ufr`runtime.sysUnused + 52
    frame #3: 0x00000001000901c4 ufr`runtime.scavengelist + 332
    frame #4: 0x000000010009027c ufr`runtime.(*mheap).scavenge + 132
    frame #5: 0x00000001000bfd68 ufr`runtime.runtime_debug_freeOSMemory.func1 + 64
    frame #6: 0x00000001000c6178 ufr`runtime.systemstack + 144
    frame #7: 0x000000010009d8e8 ufr`runtime.startTheWorldWithSema + 800
    frame #8: 0x0000000100101dbc ufr`GoIosGomobileLibRunFromGo + 12
  * frame #9: 0x000000010006bba8 ufr`-[ViewController viewDidLoad](self=0x0000000127da0c70, _cmd="viewDidLoad") + 72 at ViewController.m:20
    frame #10: 0x00000001898bf610 UIKit`-[UIViewController loadViewIfRequired] + 1124
    frame #11: 0x00000001898bf194 UIKit`-[UIViewController view] + 28
    frame #12: 0x00000001898c5f70 UIKit`-[UIWindow addRootViewControllerViewIfPossible] + 76
    frame #13: 0x00000001898c3454 UIKit`-[UIWindow _setHidden:forced:] + 264
    frame #14: 0x000000018993775c UIKit`-[UIWindow makeKeyAndVisible] + 48
    frame #15: 0x0000000189b5c088 UIKit`-[UIApplication _callInitializationDelegatesForMainScene:transitionContext:] + 3456
    frame #16: 0x0000000189b601a4 UIKit`-[UIApplication _runWithMainScene:transitionContext:completion:] + 1660
    frame #17: 0x0000000189b5d2e4 UIKit`-[UIApplication workspaceDidEndTransaction:] + 168
    frame #18: 0x000000018e3f77ec FrontBoardServices`-[FBSSerialQueue _performNext] + 184
    frame #19: 0x000000018e3f7b6c FrontBoardServices`-[FBSSerialQueue _performNextFromRunLoopSource] + 56
    frame #20: 0x00000001843205a4 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
    frame #21: 0x0000000184320038 CoreFoundation`__CFRunLoopDoSources0 + 540
    frame #22: 0x000000018431dd38 CoreFoundation`__CFRunLoopRun + 724
    frame #23: 0x000000018424cdc0 CoreFoundation`CFRunLoopRunSpecific + 384
    frame #24: 0x000000018992c0ac UIKit`-[UIApplication _run] + 460
    frame #25: 0x0000000189926f44 UIKit`UIApplicationMain + 204
    frame #26: 0x000000010006bf40 ufr`main(argc=1, argv=0x000000016fd9bb18) + 124 at main.m:14
    frame #27: 0x000000019a0768b8 libdyld.dylib`start + 4

gomobile version

gomobile version +8ab5dbb Thu Aug 4 01:15:58 2016 +0000 (android,ios); androidSDK=

ios version: 9.0.2
ios device name: iphone 6s

It works ok with golang1.6.
I have uploaded several apps with gomobile and golang1.6 to apple app store.

I can use golang1.6 to work around this problem.
This issue may be same as #16598

@dcu
Copy link

dcu commented Aug 9, 2016

I'm seeing a random crash from time to time as well. I'll confirm if it is the same. /cc @crawshaw

@bronze1man
Copy link
Contributor Author

bronze1man commented Aug 10, 2016

I think this bug is emergency for me.
Ios10 is not working with golang1.6 because of the ios syscall change of time.Now().
It looks like golang1.7 fixed "the ios syscall change bug".
But golang1.7 also has this gc crash bug.

If the golang team do not fix this bug before ios10 public release.I have to fix it myself after ios10 public release in the golang source code, and maintain my own version of golang.

@crawshaw
Copy link
Member

cc @aclements @RLH for the runtime crash

@bronze1man Do you see the crash when running the app without XCode? With XCode, can you replicate the crash with -tags lldb so it produces a Go stack trace?

@bronze1man
Copy link
Contributor Author

bronze1man commented Aug 11, 2016

@crawshaw I do not know how to run the app without XCode.
I just tried the -tags lldb and go version go1.7rc6 darwin/amd64 it looks the same as old one.
Another thing, I can replicate this crash 100% when the debug.FreeOSMemory() was called.

before debug.FreeOSMemory()
(lldb) bt
* thread #1: tid = 0x19759, 0x0000000100130cac ufr`notok + 4, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000100130cac ufr`notok + 4
    frame #1: 0x0000000100130e08 ufr`runtime.madvise + 32
    frame #2: 0x00000001000e8fac ufr`runtime.sysUnused + 52
    frame #3: 0x00000001000f8194 ufr`runtime.scavengelist + 332
    frame #4: 0x00000001000f824c ufr`runtime.(*mheap).scavenge + 132
    frame #5: 0x0000000100127d38 ufr`runtime.runtime_debug_freeOSMemory.func1 + 64
    frame #6: 0x000000010012e138 ufr`runtime.systemstack + 144
    frame #7: 0x00000001001058b8 ufr`runtime.startTheWorldWithSema + 800
    frame #8: 0x0000000100169dbc ufr`GoIosGomobileLibRunFromGo + 12
  * frame #9: 0x00000001000d3b78 ufr`-[ViewController viewDidLoad](self=0x0000000145da5850, _cmd="viewDidLoad") + 72 at ViewController.m:12
    frame #10: 0x00000001899cf610 UIKit`-[UIViewController loadViewIfRequired] + 1124
    frame #11: 0x00000001899cf194 UIKit`-[UIViewController view] + 28
    frame #12: 0x00000001899d5f70 UIKit`-[UIWindow addRootViewControllerViewIfPossible] + 76
    frame #13: 0x00000001899d3454 UIKit`-[UIWindow _setHidden:forced:] + 264
    frame #14: 0x0000000189a4775c UIKit`-[UIWindow makeKeyAndVisible] + 48
    frame #15: 0x0000000189c6c088 UIKit`-[UIApplication _callInitializationDelegatesForMainScene:transitionContext:] + 3456
    frame #16: 0x0000000189c701a4 UIKit`-[UIApplication _runWithMainScene:transitionContext:completion:] + 1660
    frame #17: 0x0000000189c6d2e4 UIKit`-[UIApplication workspaceDidEndTransaction:] + 168
    frame #18: 0x000000018e5077ec FrontBoardServices`-[FBSSerialQueue _performNext] + 184
    frame #19: 0x000000018e507b6c FrontBoardServices`-[FBSSerialQueue _performNextFromRunLoopSource] + 56
    frame #20: 0x00000001844305a4 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
    frame #21: 0x0000000184430038 CoreFoundation`__CFRunLoopDoSources0 + 540
    frame #22: 0x000000018442dd38 CoreFoundation`__CFRunLoopRun + 724
    frame #23: 0x000000018435cdc0 CoreFoundation`CFRunLoopRunSpecific + 384
    frame #24: 0x0000000189a3c0ac UIKit`-[UIApplication _run] + 460
    frame #25: 0x0000000189a36f44 UIKit`UIApplicationMain + 204
    frame #26: 0x00000001000d3f10 ufr`main(argc=1, argv=0x000000016fd33b18) + 124 at main.m:14
    frame #27: 0x000000019a1868b8 libdyld.dylib`start + 4

xcode also get some asm code:

ufr`notok:
    0x100130ca8 <+0>:  movz   x8, #0
->  0x100130cac <+4>:  str    x8, [x8]  // this one has red backgroud
    0x100130cb0 <+8>:  b      0x100130cb0               ; <+8> 
    0x100130cb4 <+12>: .long  0x00000000                ; unknown opcode
ufr`runtime.madvise:
    0x100130de8 <+0>:  str    x30, [sp, #-16]!
    0x100130dec <+4>:  ldr    x0, [sp, #24]
    0x100130df0 <+8>:  ldr    x1, [sp, #32]
    0x100130df4 <+12>: ldrsw  x2, [sp, #40]
    0x100130df8 <+16>: movz   w16, #0x4b
    0x100130dfc <+20>: svc    #0x80
    0x100130e00 <+24>: b.lo   0x100130e08               ; <+32>
    0x100130e04 <+28>: bl     0x100130ca8               ; notok
    0x100130e08 <+32>: ldr    x30, [sp], #16 // this line has red backgroud
    0x100130e0c <+36>: ret    
    0x100130e10 <+40>: .long  0x00000000                ; unknown opcode
    0x100130e14 <+44>: .long  0x00000000                ; unknown opcode
ufr`runtime.sysUnused:
    0x1000e8f78 <+0>:  ldr    x1, [x28, #16]
    0x1000e8f7c <+4>:  mov    x2, sp
    0x1000e8f80 <+8>:  cmp    x2, x1
    0x1000e8f84 <+12>: b.ls   0x1000e8fb4               ; <+60>
    0x1000e8f88 <+16>: str    x30, [sp, #-32]!
    0x1000e8f8c <+20>: ldr    x0, [sp, #40]
    0x1000e8f90 <+24>: str    x0, [sp, #8]
    0x1000e8f94 <+28>: ldr    x0, [sp, #48]
    0x1000e8f98 <+32>: str    x0, [sp, #16]
    0x1000e8f9c <+36>: movz   x0, #0x5
    0x1000e8fa0 <+40>: sxtw   x0, w0
    0x1000e8fa4 <+44>: str    w0, [sp, #24]
    0x1000e8fa8 <+48>: bl     0x100130de8               ; runtime.madvise
    0x1000e8fac <+52>: ldr    x30, [sp], #32   // this line has red backgroud
    0x1000e8fb0 <+56>: ret    
    0x1000e8fb4 <+60>: mov    x3, x30
    0x1000e8fb8 <+64>: bl     0x10012e1d8               ; runtime.morestack_noctxt
    0x1000e8fbc <+68>: b      0x1000e8f78               ; <+0>
    0x1000e8fc0 <+72>: .long  0x00000000                ; unknown opcode
    0x1000e8fc4 <+76>: .long  0x00000000                ; unknown opcode
ufr`runtime.scavengelist:
    0x1000f8048 <+0>:   ldr    x1, [x28, #16]
    0x1000f804c <+4>:   mov    x2, sp
    0x1000f8050 <+8>:   cmp    x2, x1
    0x1000f8054 <+12>:  b.ls   0x1000f81ac               ; <+356>
    0x1000f8058 <+16>:  str    x30, [sp, #-48]!
    0x1000f805c <+20>:  ldr    x2, [sp, #56]
    0x1000f8060 <+24>:  movz   x0, #0
    0x1000f8064 <+28>:  movz   x0, #0
    0x1000f8068 <+32>:  ubfx   x0, x0, #0, #8
    0x1000f806c <+36>:  ldr    x0, [x2]
    0x1000f8070 <+40>:  movz   x1, #0
    0x1000f8074 <+44>:  cmp    x0, x1
    0x1000f8078 <+48>:  b.eq   0x1000f81a0               ; <+344>
    0x1000f807c <+52>:  movz   x0, #0
    0x1000f8080 <+56>:  ubfx   x1, x0, #0, #8
    0x1000f8084 <+60>:  ubfx   x0, x1, #0, #8
    0x1000f8088 <+64>:  ubfx   x1, x0, #0, #8
    0x1000f808c <+68>:  ubfx   x0, x1, #0, #8
    0x1000f8090 <+72>:  cmp    x0, xzr
    0x1000f8094 <+76>:  b.eq   0x1000f80a8               ; <+96>
    0x1000f8098 <+80>:  movz   x0, #0
    0x1000f809c <+84>:  str    x0, [sp, #80]
    0x1000f80a0 <+88>:  ldr    x30, [sp], #48
    0x1000f80a4 <+92>:  ret    
    0x1000f80a8 <+96>:  movz   x7, #0
    0x1000f80ac <+100>: ldr    x2, [x2]
    0x1000f80b0 <+104>: movz   x1, #0
    0x1000f80b4 <+108>: cmp    x2, x1
    0x1000f80b8 <+112>: b.eq   0x1000f8130               ; <+232>
    0x1000f80bc <+116>: ldr    x0, [x2, #112]
    0x1000f80c0 <+120>: ldr    x1, [sp, #64]
    0x1000f80c4 <+124>: sub    x1, x1, x0
    0x1000f80c8 <+128>: mov    x0, x1
    0x1000f80cc <+132>: ldr    x1, [sp, #72]
    0x1000f80d0 <+136>: cmp    x0, x1
    0x1000f80d4 <+140>: b.ls   0x1000f8120               ; <+216>
    0x1000f80d8 <+144>: ldr    x0, [x2, #120]
    0x1000f80dc <+148>: ldr    x1, [x2, #32]
    0x1000f80e0 <+152>: cmp    x0, x1
    0x1000f80e4 <+156>: b.eq   0x1000f8120               ; <+216>
    0x1000f80e8 <+160>: movz   x5, #0
    0x1000f80ec <+164>: ldr    x3, [x2, #24]
    0x1000f80f0 <+168>: ldr    x4, [x2, #32]
    0x1000f80f4 <+172>: lsl    x4, x4, #13
    0x1000f80f8 <+176>: add    x4, x4, x3
    0x1000f80fc <+180>: ldr    x1, #196                  ; <+376>
    0x1000f8100 <+184>: add    x5, x3, x1
    0x1000f8104 <+188>: sub    x5, x5, #1                ; =1 
    0x1000f8108 <+192>: movn   x1, #0xffff
    0x1000f810c <+196>: and    x5, x5, x1
    0x1000f8110 <+200>: movn   x3, #0xffff
    0x1000f8114 <+204>: and    x3, x3, x4
    0x1000f8118 <+208>: cmp    x5, x3
    0x1000f811c <+212>: b.ne   0x1000f813c               ; <+244>
    0x1000f8120 <+216>: ldr    x2, [x2]
    0x1000f8124 <+220>: movz   x1, #0
    0x1000f8128 <+224>: cmp    x2, x1
    0x1000f812c <+228>: b.ne   0x1000f80bc               ; <+116>
    0x1000f8130 <+232>: str    x7, [sp, #80]
    0x1000f8134 <+236>: ldr    x30, [sp], #48
    0x1000f8138 <+240>: ret    
    0x1000f813c <+244>: sub    x0, x3, x5
    0x1000f8140 <+248>: mov    x4, x0
    0x1000f8144 <+252>: ldr    x0, [x2, #120]
    0x1000f8148 <+256>: lsl    x0, x0, #13
    0x1000f814c <+260>: sub    x1, x4, x0
    0x1000f8150 <+264>: cmp    x1, xzr
    0x1000f8154 <+268>: b.eq   0x1000f8120               ; <+216>
    0x1000f8158 <+272>: adrp   x27, 277
    0x1000f815c <+276>: add    x27, x27, #1072           ; =1072 
    0x1000f8160 <+280>: ldr    x6, [x27]
    0x1000f8164 <+284>: add    x0, x1, x6
    0x1000f8168 <+288>: adrp   x27, 277
    0x1000f816c <+292>: add    x27, x27, #1072           ; =1072 
    0x1000f8170 <+296>: str    x0, [x27]
    0x1000f8174 <+300>: add    x0, x7, x1
    0x1000f8178 <+304>: str    x0, [sp, #32]
    0x1000f817c <+308>: str    x2, [sp, #40]
    0x1000f8180 <+312>: lsr    x1, x4, #13
    0x1000f8184 <+316>: str    x1, [x2, #120]
    0x1000f8188 <+320>: str    x5, [sp, #8]
    0x1000f818c <+324>: str    x4, [sp, #16]
    0x1000f8190 <+328>: bl     0x1000e8f78               ; runtime.sysUnused
    0x1000f8194 <+332>: ldr    x7, [sp, #32]   // this line has red background
    0x1000f8198 <+336>: ldr    x2, [sp, #40]
    0x1000f819c <+340>: b      0x1000f8120               ; <+216>
    0x1000f81a0 <+344>: movz   x0, #0x1
    0x1000f81a4 <+348>: ubfx   x1, x0, #0, #8
    0x1000f81a8 <+352>: b      0x1000f8084               ; <+60>
    0x1000f81ac <+356>: mov    x3, x30
    0x1000f81b0 <+360>: bl     0x10012e1d8               ; runtime.morestack_noctxt
    0x1000f81b4 <+364>: b      0x1000f8048               ; <+0>
    0x1000f81b8 <+368>: b      0x1000f81b8               ; <+368>
    0x1000f81bc <+372>: .long  0x00000000                ; unknown opcode
    0x1000f81c0 <+376>: .long  0x00010000                ; unknown opcode
    0x1000f81c4 <+380>: .long  0x00000000                ; unknown opcode

@quentinmit quentinmit added this to the Unreleased milestone Aug 11, 2016
@minux
Copy link
Member

minux commented Aug 14, 2016 via email

@bronze1man
Copy link
Contributor Author

"A safe fix is to disable FreeOSMemory on iOS or ignore the error from
madvise (which effectively disabled FreeOSMemory too)"

I think that disabling FreeOSMemory is not an option for me.
I am using the golang as a background process of ios,which ios only give me 10MB memory.
I have to free os memory to the system periodly to prevent the process been killed by the ios system.

@minux
Copy link
Member

minux commented Aug 15, 2016 via email

@minux
Copy link
Member

minux commented Aug 15, 2016 via email

@bronze1man
Copy link
Contributor Author

(lldb) register read
General Purpose Registers:
        x0 = 0x0000000000000016
        x1 = 0x0000000000000000
        x2 = 0x0000000000000005
        x3 = 0x00000001b0440000
        x4 = 0xffffffffffff0000
        x5 = 0x00000001b0450000
        x6 = 0x0000000000000000
        x7 = 0x0000000000000000
        x8 = 0x0000000000000000
        x9 = 0x000000016fdd1cf0
       x10 = 0x00000001b0416100
       x11 = 0x0000000000000000
       x12 = 0x0000000000000008
       x13 = 0x0000000000000000
       x14 = 0x0000000000000000
       x15 = 0x00000000000000c0
       x16 = 0x000000000000004b
       x17 = 0x00000001b0436c30
       x18 = 0x0000000000000000
       x19 = 0x0000000100093c08  ufr`_cgoexp_5c61409660f0_proxyiosGomobileLib__RunFromGo
       x20 = 0x00000001456b26f0
       x21 = 0x0000000000000018
       x22 = 0x000000018ac81f5a  "count"
       x23 = 0x00000001a08c1000  UIKit`_UIBlendingHighlightView._colorBurnView
       x24 = 0x0000000000000000
       x25 = 0x0000000000000010
       x26 = 0x000000010010a6c0  ufr`runtime.runtime_debug_freeOSMemory.func1.f
       x27 = 0x000000010016d430  ufr`runtime.memstats + 80
       x28 = 0x00000001b04004e0
        fp = 0x000000016fdd1f80
        lr = 0x0000000100090e08  ufr`runtime.madvise + 32
        sp = 0x000000016fdd1d20
        pc = 0x0000000100090cac  ufr`notok + 4
      cpsr = 0x20000000

@minux
Copy link
Member

minux commented Aug 16, 2016 via email

@quentinmit
Copy link
Contributor

If you need a workaround, you can apply the patch for time.Now to Go 1.6; I believed it should apply cleanly.

@bronze1man
Copy link
Contributor Author

bronze1man commented Aug 16, 2016

"I'm puzzled by this. Perhaps iOS 10 doesn't support MADV_FREE
anymore? Will need to get access to iOS 10 to further debug this...
"
I am testing on iOS 9 and golang1.7 ,not on iOS 10.
It is ok with iOS 9 and golang1.6.
Maybe you solve this bug on iOS 9 and golang1.7 ,this bug will disappear on iOS10 and golang1.7.
On iOS 10, golang 1.6 time.Now() always return 1970-01-01.

@bronze1man
Copy link
Contributor Author

bronze1man commented Aug 16, 2016

(lldb) register read sp
      sp = 0x000000016fd29d20
(lldb) memory read --size 8 --format x --count 16 0x000000016fd29d20
0x16fd29d20: 0x00000001000f0fac 0x0000000100201488
0x16fd29d30: 0x0000000100100194 0x000000013fc50000
0x16fd29d40: 0xffffffffffff0000 0x0000000000000005
0x16fd29d50: 0x000000010010024c 0x000000013fc50000
0x16fd29d60: 0xffffffffffff0000 0x0000000000000000
0x16fd29d70: 0xffffffffffff0000 0x0000000102170aa0
0x16fd29d80: 0x000000010012fd38 0x0000000100201488
0x16fd29d90: 0xffffffffffffffff 0x0000000000000000
int madvise(void *addr, size_t len, int advice);
addr=0x000000013fc50000
len=0xffffffffffff0000  -65536 // size_t should be an unsigned int,this is wired.
advice=0x0000000000000005  _MADV_FREE

Then I add one line here:
image
/usr/local/go/src/runtime/mheap.go:933

Then I got following from xcode:

before debug.FreeOSMemory()
-65536(lldb) 

So I found the reason of EINVAL, The value of advice is incorrect.


let's print again:

println("[scavengelist]",len,start,end,s.base(),s.npages,sys.PhysPageSize,_PageSize)

Then I got following:

[scavengelist] -65536 4972019712 4971954176 4972003328 1 65536 8192

@bronze1man
Copy link
Contributor Author

bronze1man commented Aug 16, 2016

I found a patch that can fix my first test case:

/usr/local/go/src > git diff
diff --git a/src/runtime/internal/sys/arch_arm64.go b/src/runtime/internal/sys/arch_arm64.go
index aaaa4b0..95cba6a 100644
--- a/src/runtime/internal/sys/arch_arm64.go
+++ b/src/runtime/internal/sys/arch_arm64.go
@@ -8,7 +8,7 @@ const (
        ArchFamily    = ARM64
        BigEndian     = 0
        CacheLineSize = 32
-       PhysPageSize  = 65536
+       PhysPageSize  = 8*1024
        PCQuantum     = 4
        Int64Align    = 8
        HugePageSize  = 0

This patch also make the app which uploaded to appStore works.
But I do not know the side effort of this patch.

@minux
Copy link
Member

minux commented Aug 16, 2016 via email

@scosman
Copy link

scosman commented Aug 16, 2016

I'm seeing a similar bug (notok crashes on go 1.7 rc6). However, I'm not calling FreeOSMemory. This is (unfortunately) in the wild, so I have a wealth of crash reports in Crashlytics (hundreds).

Details:

  • happening on all version of iOS (7 though 10 at least)
  • bare bones stack trace in Crashlytics (right after a new thread is created, no go dump, only the usual system trace)
  • didn't occur on go 1.6 (exact same library)
  • I'm seeing hundreds of these crashes, but I don't have a consistent retro. It may have something to do with the app being resumed after being suspended, but I am still trying to verify that.
  • It may only happen when a device has low RAM (usually < 10%, never over 30%). Not sure if this is related, but did seem like a major pattern in the crash reports.

@crawshaw
Copy link
Member

I can't work out why a smaller PhysPageSize would fix this either. But if it does, we could set it just for iOS.

@dcu
Copy link

dcu commented Aug 16, 2016

this is the crash we are getting randomly, not sure if it is related:

runtime: unexpected return pc for runtime.cgocallback_gofunc called from 0x0
fatal error: unknown caller pc
runtime stack:
runtime.throw(0x5bca2e, 0x11)
/opt/go/src/runtime/panic.go:566 +0x80
runtime.gentraceback(0x2eaf9c, 0x503db8e8, 0x0, 0x503983c0, 0x0, 0x0, 0x7fffffff, 0x70726efc, 0x0, 0x0, ...)
/opt/go/src/runtime/traceback.go:317 +0x1620
runtime.scanstack(0x503983c0, 0x503bb32c)
/opt/go/src/runtime/mgcmark.go:770 +0x344
runtime.scang(0x503983c0, 0x503bb32c)
/opt/go/src/runtime/proc.go:830 +0xb0
runtime.markroot.func1()
/opt/go/src/runtime/mgcmark.go:234 +0xac
runtime.systemstack(0x503ba000)
/opt/go/src/runtime/asm_arm.s:247 +0x8c
runtime.mstart()
/opt/go/src/runtime/proc.go:1081
goroutine 29 [running]:
runtime.systemstack_switch()
/opt/go/src/runtime/asm_arm.s:192 +0x4 fp=0x50492f24 sp=0x50492f20
runtime.markroot(0x503bb32c, 0x7)
/opt/go/src/runtime/mgcmark.go:239 +0x34c fp=0x50492f60 sp=0x50492f24
runtime.gcDrain(0x503bb32c, 0x5)
/opt/go/src/runtime/mgcmark.go:956 +0x394 fp=0x50492f90 sp=0x50492f60
runtime.gcBgMarkWorker(0x503baa00)
/opt/go/src/runtime/mgc.go:1453 +0x66c fp=0x50492fcc sp=0x50492f90
runtime.goexit()
/opt/go/src/runtime/asm_arm.s:998 +0x4 fp=0x50492fcc sp=0x50492fcc
created by runtime.gcBgMarkStartWorkers
/opt/go/src/runtime/mgc.go:1342 +0x8c
goroutine 17 [GC assist wait (scan), locked to thread]:
oursdk/vendor/github.com/boltdb/bolt.(*meta).sum64(0x50830580, 0x4b3141ff, 0x0)
oursdk/vendor/github.com/boltdb/bolt/db.go:968 +0x34
oursdk/vendor/github.com/boltdb/bolt.(*meta).write(0x50830580, 0x5083e000)

@bronze1man
Copy link
Contributor Author

"I can't work out why a smaller PhysPageSize would fix this either. But if it does, we could set it just for iOS."
I think that len==-65536 and that if sys. PhysPageSize> _PageSize{} around /usr/local/go/src/runtime/mheap.go:933 should be digged.

@aclements
Copy link
Member

let's print again:
println("[scavengelist]",len,start,end,s.base(),s.npages,sys.PhysPageSize,_PageSize)
Then I got following:
[scavengelist] -65536 4972019712 4971954176 4972003328 1 65536 8192

@bronze1man, this is very helpful. Could you try changing mheap.go line 920 to say "if start >= end" (and reverting your change to PhysPageSize)?

I am testing on iOS 9 and golang1.7 ,not on iOS 10.
It is ok with iOS 9 and golang1.6.

Note that Go 1.6 simply never used MADV_FREE or returned memory to the OS on ARM64. That's all new in Go 1.7.

I'm even more puzzled. Why decreasing PhysPageSize fixes
the problem? With higher PhysPageSize, we always align
the address to madvise using the higher value, which should
also be aligned to any smaller power-of-two too.

Am I missing something?

@minux, I think the rounding check at mheap.go:920 is wrong (oops). Decreasing PhysPageSize to 8K makes the rounding a no-op, which covers up the bad check.

@aclements aclements modified the milestones: Go1.7.1, Unreleased Aug 17, 2016
@aclements
Copy link
Member

I mailed https://go-review.googlesource.com/27230, which implements the "if start >= end" fix I asked @bronze1man to try. @scosman, would you also be able to try this change? I don't know if you can reproduce it locally.

I'm seeing a similar bug (notok crashes on go 1.7 rc6). However, I'm not calling FreeOSMemory.

@scosman, if this is what I think it is, FreeOSMemory isn't necessary. FreeOSMemory forces the scavenger to run, but it runs on its own periodically and will start releasing memory after it's gone unused for ~5 minutes.

I can't work out why a smaller PhysPageSize would fix this either. But if it does, we could set it just for iOS.

@crawshaw, it's not safe to reduce PhysPageSize below the actual physical page size of the system (see #9993). I don't know what the actual physical page size of iOS on ARM64 is. (https://go-review.googlesource.com/25022 will make the runtime fetch this from the OS, but I wouldn't want to put that in 1.7.) Plus, if I understand this bug, reducing it to anything more than 8K won't fix it.

@gopherbot
Copy link

CL https://golang.org/cl/27230 mentions this issue.

@scosman
Copy link

scosman commented Aug 17, 2016

@aclements I cherry-picked 25022 on top of the release-branch.go1.7 branch, but I reproduced the issue with that build. Again, not sure it's the same issue, but it does sound similar. As you mentioned, I need to wait a few minutes (with the app in the background) to repro.

Stack trace

Test App`notok:
    0x10077fc28 <+0>:  movz   x8, #0
->  0x10077fc2c <+4>:  str    x8, [x8]
    0x10077fc30 <+8>:  b      0x10077fc30               ; <+8>
    0x10077fc34 <+12>: .long  0x00000000                ; unknown opcode

@bronze1man
Copy link
Contributor Author

bronze1man commented Aug 17, 2016

Sorry for not replying in time.I just have other stuff to do.

Following patch works for me:

/usr/local/go/src > git diff
diff --git a/src/runtime/mheap.go b/src/runtime/mheap.go
index db60f7a..77fd7d3 100644
--- a/src/runtime/mheap.go
+++ b/src/runtime/mheap.go
@@ -917,7 +917,7 @@ func scavengelist(list *mSpanList, now, limit uint64) uintptr {
                                // more memory than we want.)
                                start = (start + sys.PhysPageSize - 1) &^ (sys.PhysPageSize - 1)
                                end &^= sys.PhysPageSize - 1
-                               if start == end {
+                               if start >= end {
                                        continue
                                }
                        }

My first test case and the bigger app both work.
I have tested the bigger app for an hour. This patch is good for me.

@eliasnaur
Copy link
Contributor

@dcu I don't think your "runtime: unexpected return pc..." crash is related. Please file a new issue for that. Thank you.

@dcu
Copy link

dcu commented Aug 17, 2016

@eliasnaur I created #16760 I'm sorry I don't have an easy way to reproduce the error

@aclements
Copy link
Member

@bronze1man, great! I'll push that fix shortly and it should appear in Go 1.7.1 (which we already have a few bugs for, so I suspect that won't be too far off).

@scosman, it sounds like you have a different bug, then. The crash in "notok" just means some system call failed, which could have any number of causes. Could you open a new bug and CC me, @minux, and @crawshaw? Please include any additional information you have. Unfortunately, the one stack frame isn't enough to figure much out. More frames would be great. Even just a dump of the registers would be helpful.

@bronze1man
Copy link
Contributor Author

I do not think this bug is fixed totally.
You just fix this situation.It may crash again with another situation.
The compute process of value "start" and "end" should have some bugs that need to be fixed.
Do we need a range which is 64KB align that is smaller than the input range? or larger?
Both kind of compute should not get a negative range.
Do we need a test to ensure we will never get a negative/meaningless range.

@aclements
Copy link
Member

@bronze1man, I'm sorry, but I don't understand what "other situation" you mean. What bugs in "start" and "end" are you referring to?

Do we need a range which is 64KB align that is smaller than the input range? or larger?

Smaller. Specifically, we need the 64KB-aligned range that is contained in the range [start, end). It was possible to get a negative range because we compute this by rounding start up to the next 64KB boundary and end down to the previous 64KB boundary. If they're already in the same 64KB region, this can cause start and end to cross over, leading to the negative range.

@bronze1man
Copy link
Contributor Author

@aclements Thanks, I got it.
Thanks for your explanation.I know what this function is doing now.
More comments in that function may be better.

gopherbot pushed a commit that referenced this issue Aug 19, 2016
The previous fix for this, commit 336dad2, had everything right in
the commit message, but reversed the test in the code. Fix the test in
the code.

This reversal effectively disabled the scavenger on large page systems
*except* in the rare cases where this code was originally wrong, which
is why it didn't obviously show up in testing.

Fixes #16644. Again. :(

Change-Id: I27cce4aea13de217197db4b628f17860f27ce83e
Reviewed-on: https://go-review.googlesource.com/27402
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@scosman
Copy link

scosman commented Aug 19, 2016

@aclements FYI, I won't open the issue as you requested because I retested the patch and couldn't repro that issue. I must have tested with the wrong file :(. Thanks!

@gopherbot
Copy link

CL https://golang.org/cl/28631 mentions this issue.

gopherbot pushed a commit that referenced this issue Sep 7, 2016
…ounding again

The previous fix for this, commit 336dad2, had everything right in
the commit message, but reversed the test in the code. Fix the test in
the code.

This reversal effectively disabled the scavenger on large page systems
*except* in the rare cases where this code was originally wrong, which
is why it didn't obviously show up in testing.

Fixes #16644. Again. :(

Change-Id: I27cce4aea13de217197db4b628f17860f27ce83e
Reviewed-on: https://go-review.googlesource.com/27402
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-on: https://go-review.googlesource.com/28631
Run-TryBot: Chris Broadfoot <cbro@golang.org>
@golang golang locked and limited conversation to collaborators Sep 7, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants