runtime: android/arm and android/arm64 builders hang in bootstrap #35554

Closed
eliasnaur opened this issue Nov 13, 2019 · 29 comments
Labels
FrozenDueToAge, NeedsInvestigation, release-blocker

@eliasnaur
Contributor

eliasnaur commented Nov 13, 2019

The android arm builders have hung in the bootstrap. I used to think it was caused by a flaky network, because of #35553. Example:

https://farmer.golang.org/temporarylogs?name=android-arm-corellium&rev=99957b6930c76b683dbca1ff4bcdd56e59b1e035&st=0xc009856f20

Killing the build with SIGQUIT gives various stack traces:

https://build.golang.org/log/e41631f86f761d215a8a9282ab67af9dbd6397d5
https://build.golang.org/log/f878ba2305112c955d167419607afb76f8ffa78f
https://build.golang.org/log/49023d89fec5e6e62cdda9cd37bdf1d1ea9962e3

I bisected the arm64 hangs to start at https://golang.org/cl/203461, which is the enablement of arm64 preemption.

CC @cherrymui.
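
For readers unfamiliar with what the bisected CL enables, here is a minimal sketch of the behavior asynchronous preemption provides. It assumes Go 1.14 or later on a platform where async preemption is supported; the function name `resumesDespiteTightLoop` is hypothetical and not part of the runtime or of this issue.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// resumesDespiteTightLoop starts a goroutine that spins in a tight loop and
// reports whether the calling goroutine got to run again afterwards.
// If the spinner is never preempted, this function does not return.
func resumesDespiteTightLoop() bool {
	// Restrict the scheduler to one P so the spinner and the caller compete
	// for it; the deferred call restores the previous setting.
	defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1))

	var stop int32
	done := make(chan struct{})
	go func() {
		defer close(done)
		// The atomic load is typically compiled to a plain load with no
		// stack check, so this loop has no cooperative preemption point.
		for atomic.LoadInt32(&stop) == 0 {
		}
	}()

	// Sleeping parks the caller and lets the spinner take the only P.
	time.Sleep(20 * time.Millisecond)

	// Reaching this line means the spinner was preempted.
	atomic.StoreInt32(&stop, 1)
	<-done
	return true
}

func main() {
	fmt.Println("caller resumed:", resumesDespiteTightLoop())
}
```

With async preemption the runtime interrupts the spinner with a signal, which is why the signal-handling path matters for this issue. Running the same program with GODEBUG=asyncpreemptoff=1 would be expected to spin without ever printing.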

@cherrymui
Member

Hmmm. I tested that CL on android/arm64 manually and with trybots, both worked fine. Not sure why it starts to fail now...

@eliasnaur
Contributor Author

Note the android/arm* builders have been timing out for a week:

https://build.golang.org/?page=4

@cherrymui
Member

Yeah, I submitted that CL roughly a week ago, but I tested it before that, and it worked fine. I also tested the corresponding ARM32 CL (https://go-review.googlesource.com/c/go/+/202338) and it worked fine. And the android/arm builder showed ok after that CL was submitted.

@eliasnaur
Contributor Author

FWIW, darwin/arm64 can also hang. I killed this one with SIGQUIT:

https://build.golang.org/log/294e3f1975bd1cd17bd05e8cf4c0f4ec83dd3a00

@cherrymui
Member

@eliasnaur Could you test whether disabling async preemption fixes the timeout?
Change https://go.googlesource.com/go/+/refs/heads/master/src/runtime/signal_arm64.go#82 to false.

@networkimprov

There's also the env variable GODEBUG=asyncpreemptoff=1

@eliasnaur
Contributor Author

eliasnaur commented Nov 13, 2019

Thank you. Setting GODEBUG=asyncpreemptoff=1 seems to get rid of the hangs on android/arm64. I haven't tried android/arm yet.
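
For anyone who wants to apply the same workaround from a wrapper rather than a shell, the following is a minimal sketch that runs a child command with the setting added to its environment. The helper names `withAsyncPreemptOff` and `addSetting` are hypothetical; only `GODEBUG=asyncpreemptoff=1` comes from this thread.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

const setting = "asyncpreemptoff=1"

// addSetting appends asyncpreemptoff=1 to a comma-separated GODEBUG value
// unless that exact setting is already present.
func addSetting(val string) string {
	if val == "" {
		return setting
	}
	for _, s := range strings.Split(val, ",") {
		if s == setting {
			return val
		}
	}
	return val + "," + setting
}

// withAsyncPreemptOff returns a copy of env whose GODEBUG value includes
// asyncpreemptoff=1, keeping any other GODEBUG settings already present.
func withAsyncPreemptOff(env []string) []string {
	out := make([]string, 0, len(env)+1)
	seen := false
	for _, kv := range env {
		if strings.HasPrefix(kv, "GODEBUG=") {
			seen = true
			kv = "GODEBUG=" + addSetting(strings.TrimPrefix(kv, "GODEBUG="))
		}
		out = append(out, kv)
	}
	if !seen {
		out = append(out, "GODEBUG="+setting)
	}
	return out
}

func main() {
	if len(os.Args) < 2 {
		// No command given: just show what would be added.
		fmt.Println(strings.Join(withAsyncPreemptOff(nil), "\n"))
		return
	}
	// Run the given command with the modified environment.
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Env = withAsyncPreemptOff(os.Environ())
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Passing a build script such as ./make.bash as the argument would run it with async preemption disabled for the Go programs it starts, assuming nothing in between resets GODEBUG.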

@cherrymui
Member

Thanks. Could you get a stack trace with GOTRACEBACK=crash set?

@cherrymui
Member

On the android/arm builder, do we build an ARM64 toolchain first? (It looks that way from the log.) Is this the tip version of Go?

On https://build.golang.org/?page=3&branch=master , there are some ok results on android/arm after the ARM64 preemption CL was submitted.

Also, on either builder, in the build log it says

GOROOT_BOOTSTRAP=/data/data/com.termux/files/home/go-android-arm64-bootstrap"

but also

Building Go cmd/dist using /data/data/com.termux/files/usr/lib/go.

I'm not sure I understand how the build process works on the android builder.

@eliasnaur
Contributor Author

eliasnaur commented Nov 13, 2019

There are currently two slightly different builders because I'm reconstructing them to be built from a setup script. You may disregard android/arm for now; it doesn't work on the new builder types yet, and is a cross compile from arm64. The android/arm64 builds are native.

The old builder uses a prebuilt toolchain I made (go-android-arm64-bootstrap). The new builder uses Go from the golang package in Termux.

@cherrymui
Member

Thanks @eliasnaur

@eliasnaur
Contributor Author

For some reason I'm now failing to reproduce the hangs :( Another crash appeared, but it seemed unrelated:

https://farmer.golang.org/temporarylogs?name=android-arm64-corellium&rev=54cf7760203c2b138d9ecf653cd3b2402444cf9b&st=0xc0037da160

panic: runtime error: slice bounds out of range [6:5]

goroutine 51 [running]:
runtime/pprof.(*profileBuilder).appendLocsForStack(0x40000a5ce0, 0x400001cca0, 0x0, 0x4, 0x4000286050, 0x5, 0x3f6, 0x0, 0x1, 0x4)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/pprof/proto.go:397 +0x5ac
runtime/pprof.(*profileBuilder).build(0x40000a5ce0)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/pprof/proto.go:362 +0xf4
runtime/pprof.profileWriter(0x771c67e380, 0x400034a4e0)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/pprof/pprof.go:779 +0xd0
created by runtime/pprof.StartCPUProfile
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/pprof/pprof.go:750 +0x114

I'll file a separate issue.

@cherrymui
Member

This looks like #35538

@cherrymui
Member

Good to see it doesn't hang, at least not always. Thanks!

@eliasnaur
Contributor Author

Got one with GOTRACEBACK=crash on android/arm:

https://build.golang.org/log/aa5855e5d65bbe8361fa9b79d53c1f9be22f1804

SIGQUIT: quit
PC=0x72e3f82ab8 m=1 sigcode=0

goroutine 0 [idle]:
runtime.usleep()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/sys_linux_arm64.s:148 +0x40 fp=0x4000045f20 sp=0x4000045ef0 pc=0x72e3f82ab8
runtime.sysmon()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:4491 +0x90 fp=0x4000045fa0 sp=0x4000045f20 pc=0x72e3f61e48
runtime.mstart1()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:1133 +0xa4 fp=0x4000045fd0 sp=0x4000045fa0 pc=0x72e3f5a17c
runtime.mstart()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:1098 +0x58 fp=0x4000046000 sp=0x4000045fd0 pc=0x72e3f5a0b0

goroutine 1 [running]:
	goroutine running on other thread; stack unavailable

goroutine 2 [runnable]:
runtime.gopark(0x72e4413908, 0x72e4663bf0, 0x1411, 0x1)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:304 +0xc8 fp=0x4000032fa0 sp=0x4000032f80 pc=0x72e3f57d70
runtime.goparkunlock(...)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:310
runtime.forcegchelper()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:253 +0xb0 fp=0x4000032fd0 sp=0x4000032fa0 pc=0x72e3f57c28
runtime.goexit()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/asm_arm64.s:1148 +0x4 fp=0x4000032fd0 sp=0x4000032fd0 pc=0x72e3f81f2c
created by runtime.init.5
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:242 +0x28

goroutine 3 [GC sweep wait]:
runtime.gopark(0x72e4413908, 0x72e4663e20, 0x140c, 0x1)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:304 +0xc8 fp=0x40000337a0 sp=0x4000033780 pc=0x72e3f57d70
runtime.goparkunlock(...)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:310
runtime.bgsweep(0x4000020150)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mgcsweep.go:89 +0x170 fp=0x40000337d0 sp=0x40000337a0 pc=0x72e3f47208
runtime.goexit()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/asm_arm64.s:1148 +0x4 fp=0x40000337d0 sp=0x40000337d0 pc=0x72e3f81f2c
created by runtime.gcenable
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mgc.go:214 +0x4c

goroutine 4 [sleep]:
runtime.gopark(0x72e4413908, 0x72e4663de0, 0x1313, 0x2)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:304 +0xc8 fp=0x4000033f00 sp=0x4000033ee0 pc=0x72e3f57d70
runtime.goparkunlock(...)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:310
runtime.scavengeSleep(0x40895, 0x171a42)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mgcscavenge.go:197 +0x9c fp=0x4000033f40 sp=0x4000033f00 pc=0x72e3f45b44
runtime.bgscavenge(0x4000020150)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mgcscavenge.go:296 +0x2cc fp=0x4000033fd0 sp=0x4000033f40 pc=0x72e3f45e44
runtime.goexit()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/asm_arm64.s:1148 +0x4 fp=0x4000033fd0 sp=0x4000033fd0 pc=0x72e3f81f2c
created by runtime.gcenable
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mgc.go:215 +0x6c

goroutine 5 [finalizer wait, 14 minutes]:
runtime.gopark(0x72e4413908, 0x72e467ecd0, 0x4000021410, 0x1)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:304 +0xc8 fp=0x4000032730 sp=0x4000032710 pc=0x72e3f57d70
runtime.goparkunlock(...)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:310
runtime.runfinq()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mfinal.go:175 +0xac fp=0x40000327d0 sp=0x4000032730 pc=0x72e3f3d714
runtime.goexit()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/asm_arm64.s:1148 +0x4 fp=0x40000327d0 sp=0x40000327d0 pc=0x72e3f81f2c
created by runtime.createfing
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mfinal.go:156 +0x64

goroutine 6 [GC worker (idle), 14 minutes]:
runtime.gopark(0x72e44137a8, 0x40003cd5a0, 0x1418, 0x0)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:304 +0xc8 fp=0x4000034750 sp=0x4000034730 pc=0x72e3f57d70
runtime.gcBgMarkWorker(0x4000024000)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mgc.go:1874 +0xe0 fp=0x40000347d0 sp=0x4000034750 pc=0x72e3f40e58
runtime.goexit()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/asm_arm64.s:1148 +0x4 fp=0x40000347d0 sp=0x40000347d0 pc=0x72e3f81f2c
created by runtime.gcBgMarkStartWorkers
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/mgc.go:1822 +0x70

r0      0xfffffffffffffffc
r1      0x0
r2      0x0
r3      0x0
r4      0x3e8
r5      0x989680
r6      0x72e464b8d0
r7      0x1bf08eb000
r8      0x65
r9      0xc1f5e213020
r10     0x3410
r11     0x336d788c
r12     0x18
r13     0x5dcc761c
r14     0x6052340000000
r15     0x68507e000000
r16     0x4000045f18
r17     0x0
r18     0xffffffff
r19     0x30
r20     0x4000045ef0
r21     0x4000036000
r22     0x4000038000
r23     0x72e467f335
r24     0x72e4663ac0
r25     0x18
r26     0x72e44139a8
r27     0x2710
r28     0x4000000480
r29     0x7fc0eba6f8
lr      0x72e3f61e48
sp      0x4000045ef0
pc      0x72e3f82ab8
fault   0x0

-----

SIGQUIT: quit
PC=0x72e3f82fc4 m=2 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x40000364c8, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x72e3f5b66c, 0x0, 0x72e3f5b6cc, ...)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/sys_linux_arm64.s:536 +0x1c fp=0x4000047d50 sp=0x4000047d50 pc=0x72e3f82fc4
runtime.futexsleep(0x40000364c8, 0x0, 0xffffffffffffffff)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/os_linux.go:44 +0x30 fp=0x4000047da0 sp=0x4000047d50 pc=0x72e3f52918
runtime.notesleep(0x40000364c8)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/lock_futex.go:151 +0x90 fp=0x4000047de0 sp=0x4000047da0 pc=0x72e3f31098
runtime.stopm()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:1864 +0xa4 fp=0x4000047e10 sp=0x4000047de0 pc=0x72e3f5b6cc
runtime.findrunnable(0x4000024000, 0x0)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:2396 +0xa20 fp=0x4000047f10 sp=0x4000047e10 pc=0x72e3f5ccd8
runtime.schedule()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:2556 +0x2c4 fp=0x4000047f90 sp=0x4000047f10 pc=0x72e3f5d8dc
runtime.park_m(0x4000000900)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:2696 +0x80 fp=0x4000047fc0 sp=0x4000047f90 pc=0x72e3f5dcb8
runtime.mcall(0x0)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/asm_arm64.s:174 +0x58 fp=0x4000047fd0 sp=0x4000047fc0 pc=0x72e3f7f880
r0      0x40000364c8
r1      0x80
r2      0x0
r3      0x0
r4      0x0
r5      0x0
r6      0x1
r7      0x4
r8      0x62
r9      0x3b9aca00000000
r10     0x3098
r11     0x10e5eac4
r12     0x18
r13     0x5dcc72a3
r14     0x1f0dd440000000
r15     0x72dbb6000000
r16     0x4000047e08
r17     0x0
r18     0xffffffff
r19     0x8
r20     0x4000047de0
r21     0x4000036380
r22     0x400004a000
r23     0x72e467f335
r24     0x72e4663ac0
r25     0x18
r26     0x72e4413908
r27     0x0
r28     0x4000000c00
r29     0x0
lr      0x72e3f52918
sp      0x4000047d50
pc      0x72e3f82fc4
fault   0x0

-----

SIGQUIT: quit
PC=0x72e3f82fc4 m=3 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x4000036848, 0x80, 0x0, 0x0, 0x0, 0x72e3f5db70, 0x4000024000, 0x72e3f5b66c, 0xb507a562252, 0x72e3f5b6cc, ...)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/sys_linux_arm64.s:536 +0x1c fp=0x4000041d50 sp=0x4000041d50 pc=0x72e3f82fc4
runtime.futexsleep(0x4000036848, 0x7200000000, 0xffffffffffffffff)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/os_linux.go:44 +0x30 fp=0x4000041da0 sp=0x4000041d50 pc=0x72e3f52918
runtime.notesleep(0x4000036848)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/lock_futex.go:151 +0x90 fp=0x4000041de0 sp=0x4000041da0 pc=0x72e3f31098
runtime.stopm()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:1864 +0xa4 fp=0x4000041e10 sp=0x4000041de0 pc=0x72e3f5b6cc
runtime.findrunnable(0x4000024000, 0x0)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:2396 +0xa20 fp=0x4000041f10 sp=0x4000041e10 pc=0x72e3f5ccd8
runtime.schedule()
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:2556 +0x2c4 fp=0x4000041f90 sp=0x4000041f10 pc=0x72e3f5d8dc
runtime.park_m(0x4000000900)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:2696 +0x80 fp=0x4000041fc0 sp=0x4000041f90 pc=0x72e3f5dcb8
runtime.mcall(0x0)
	/data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/src/runtime/asm_arm64.s:174 +0x58 fp=0x4000041fd0 sp=0x4000041fc0 pc=0x72e3f7f880
r0      0x4000036848
r1      0x80
r2      0x0
r3      0x0
r4      0x0
r5      0x0
r6      0x1
r7      0x4
r8      0x62
r9      0xb507a5d906e
r10     0x1
r11     0x1
r12     0x72e4663e00
r13     0x72e4663b90
r14     0x1
r15     0x1
r16     0x0
r17     0x2
r18     0xffffffff
r19     0x72bed00000
r20     0x4000041d90
r21     0x4000036700
r22     0x4000052000
r23     0x72e467f335
r24     0x72e4663ac0
r25     0x18
r26     0x72e4413908
r27     0x0
r28     0x4000000f00
r29     0x0
lr      0x72e3f52918
sp      0x4000041d50
pc      0x72e3f82fc4
fault   0x0

-----

go tool dist: FAILED: /data/data/com.termux/files/home/tmpdir/workdir-host-android-arm64-corellium-android/go/pkg/tool/android_arm64/go_bootstrap install -gcflags=all= -ldflags=all= std cmd: signal: killed


Error: build failed: make script failed: exit status 2

@eliasnaur
Contributor Author

And another android/arm on the old builder, but without GOTRACEBACK=crash:

https://farmer.golang.org/temporarylogs?name=android-arm-corellium&rev=bf4990522263503a1219372cd8f1ee9422b51324&st=0xc00bc78c60

SIGQUIT: quit
PC=0x79f9e44a58 m=1 sigcode=0

goroutine 0 [idle]:
runtime.usleep()
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/sys_linux_arm64.s:148 +0x40
runtime.sysmon()
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:4484 +0x90
runtime.mstart1()
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:1133 +0xa4
runtime.mstart()
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/proc.go:1098 +0x58

goroutine 1 [semacquire, 23 minutes]:
sync.runtime_Semacquire(0x40006dea18)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/runtime/sema.go:56 +0x30
sync.(*WaitGroup).Wait(0x40006dea10)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/sync/waitgroup.go:130 +0x64
cmd/go/internal/work.(*Builder).Do(0x4000cde5a0, 0x4000d78140)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/work/exec.go:187 +0x340
cmd/go/internal/list.runList(0x79fa505740, 0x40000100c0, 0x2, 0x2)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/list/list.go:531 +0x1ab8
main.main()
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/main.go:189 +0x50c

goroutine 3768 [runnable]:
os.(*File).Read(0x4000dde370, 0x4000bde000, 0x8000, 0x8000, 0x8000, 0x0, 0x0)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/os/file.go:112 +0x1f0
io.copyBuffer(0x77d278b700, 0x400079ae00, 0x79fa2ddf60, 0x4000dde370, 0x4000bde000, 0x8000, 0x8000, 0x79f9ec0c20, 0x0, 0x79f9ec0c0c)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/io/io.go:405 +0xcc
io.Copy(...)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/io/io.go:364
cmd/go/internal/cache.FileHash(0x400079ab00, 0x80, 0x0, 0x0, 0x0, 0x0, 0x400079ab00, 0x80)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/cache/hash.go:149 +0x2ac
cmd/go/internal/work.(*Builder).fileHash(0x4000cde5a0, 0x400079ab00, 0x80, 0x400079ab00, 0x80)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/work/buildid.go:403 +0x28
cmd/go/internal/work.(*Builder).buildActionID(0x4000cde5a0, 0x4000db48c0, 0x0, 0x0, 0x0, 0x0)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/work/exec.go:317 +0x888
cmd/go/internal/work.(*Builder).build(0x4000cde5a0, 0x4000db48c0, 0x0, 0x0)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/work/exec.go:398 +0x3bd4
cmd/go/internal/work.(*Builder).Do.func2(0x4000db48c0)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/work/exec.go:118 +0x2cc
cmd/go/internal/work.(*Builder).Do.func3(0x40006dea10, 0x4000cde5a0, 0x400013a400)
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/work/exec.go:178 +0x50
created by cmd/go/internal/work.(*Builder).Do
	/data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/src/cmd/go/internal/work/exec.go:165 +0x31c

r0      0xfffffffffffffffc
r1      0x0
r2      0x0
r3      0x0
r4      0x3e8
r5      0x989680
r6      0x79fa50c8d0
r7      0x1bf08eb000
r8      0x65
r9      0x30fd4430e90
r10     0xd26
r11     0x1f077f8c
r12     0x18
r13     0x5dcc772a
r14     0x112a8800000000
r15     0x4d9298000000
r16     0x4000045f18
r17     0x0
r18     0xffffffff
r19     0x30
r20     0x4000045ef0
r21     0x4000036000
r22     0x4000038000
r23     0x79fa540335
r24     0x79fa524ac0
r25     0x18
r26     0x79fa2d51a8
r27     0x2710
r28     0x4000000480
r29     0x7fd4d954e8
lr      0x79f9e23de8
sp      0x4000045ef0
pc      0x79f9e44a58
fault   0x0

go tool dist: FAILED: /data/data/com.termux/files/usr/tmp/workdir-host-android-arm64-corellium-android/go/pkg/tool/android_arm64/go_bootstrap list -gcflags=all= -ldflags=all= -f={{if .Stale}}	STALE {{.ImportPath}}: {{.StaleReason}}{{end}} std cmd: exit status 2


Error: build failed: make script failed: exit status 2

@andybons added the NeedsInvestigation and release-blocker labels Nov 13, 2019
@andybons added this to the Go1.14 milestone Nov 13, 2019
@eliasnaur
Contributor Author

eliasnaur commented Nov 13, 2019

Finally, a GOTRACEBACK=crash enabled log from the new android/arm64 builder:

https://build.golang.org/log/d1bc80f88a28797a24773d9a98525e9ce8db9aa2

EDIT: and another from android/arm:

https://build.golang.org/log/71a889d265d2fd092aa0bcb5cb6109eb02405764

@gopherbot

Change https://golang.org/cl/206959 mentions this issue: runtime: crash if a signal is received with bad G and no extra M

@cherrymui
Member

Thanks for the stack traces. These are to some degree similar to the ones I saw when I debugged #35473. If so, it indicates that a signal is received when the G is bad.

Could you try https://go-review.googlesource.com/c/go/+/206959 ? This doesn't fix anything but it will make it crash instead of hanging if my assumption is right.

@eliasnaur
Contributor Author

Thank you, Cherry. With your CL, I got a "bad g in signal handler" for an android/arm build:

$ (unset LD_PRELOAD && GOARCH=arm CGO_ENABLED=1 GOTRACEBACK=crash ./all.bash )
Building Go cmd/dist using /data/data/com.termux/files/usr/lib/go. (go1.13.3 android/arm64)
Building Go toolchain1 using /data/data/com.termux/files/usr/lib/go.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
warning: unable to find runtime/cgo.a
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for host, android/arm64.
fatal: bad g in signal handler
go tool dist: FAILED: /data/data/com.termux/files/home/goroot/pkg/tool/android_arm64/go_bootstrap install -gcflags=all= -ldflags=all= std cmd: exit status 2

And a similar one for android/arm64:

$ (unset LD_PRELOAD && GOARCH=arm64 CGO_ENABLED=1 GOTRACEBACK=crash ./all.bash )
Building Go cmd/dist using /data/data/com.termux/files/usr/lib/go. (go1.13.3 android/arm64)
Building Go toolchain1 using /data/data/com.termux/files/usr/lib/go.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
warning: unable to find runtime/cgo.a
Building Go toolchain2 using go_bootstrap and Go toolchain1.
fatal: bad g in signal handler
go tool dist: FAILED: /data/data/com.termux/files/home/goroot/pkg/tool/android_arm64/go_bootstrap install -gcflags=all= -ldflags=all= -i cmd/asm cmd/cgo cmd/compile cmd/link: exit status 2

No stack traces though, but perhaps you expected that?

@cherrymui
Member

Thank you. This confirms my assumption, although I don't yet know what causes the bad g. I'll keep looking.

I also want to note that all the hangs seem to happen in go_bootstrap. If I understand correctly, on Android we default to PIE, which defaults to external linking, which brings in cgo. So all binaries are effectively cgo binaries. But go_bootstrap is the only exception, due to

warning: unable to find runtime/cgo.a

It attempted to bring in cgo but failed. A non-cgo externally linked PIE may still work, but it is an unusual configuration. Cgo affects whether we use TLS to save the G.

gopherbot pushed a commit that referenced this issue Nov 15, 2019
When we receive a signal, if G is nil we call badsignal, which
calls needm. When cgo is not used, there is no extra M, so needm
will just hang. In this situation, even GOTRACEBACK=crash cannot
get a stack trace, as we're in the signal handler and cannot
receive another signal (SIGQUIT).

Instead, just crash.

For #35554.
Updates #34391.

Change-Id: I061ac43fc0ac480435c050083096d126b149d21f
Reviewed-on: https://go-review.googlesource.com/c/go/+/206959
Run-TryBot: Cherry Zhang <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
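
The decision described in the commit message above can be sketched as a small model. This is an illustration only, not the runtime's source: the type `outcome` and the function `onSignal` are hypothetical names, and the real handler does considerably more. The fatal message text is the one reported earlier in this thread.

```go
package main

import "fmt"

// outcome names what the modeled signal handler does next.
type outcome string

const (
	handleOnG outcome = "handle the signal on the current g"
	borrowM   outcome = "badsignal: call needm to borrow an extra M"
	fatalBadG outcome = "fatal: bad g in signal handler"
)

// onSignal models the choice made when a signal arrives.
// haveG reports whether the handler found a usable g; isCgo reports whether
// the program uses cgo, which per the commit message is when extra Ms exist.
func onSignal(haveG, isCgo bool) outcome {
	if haveG {
		return handleOnG
	}
	if !isCgo {
		// No extra M exists, so needm would block forever. Crash instead.
		return fatalBadG
	}
	return borrowM
}

func main() {
	cases := []struct{ haveG, isCgo bool }{
		{true, false},
		{false, true},
		{false, false}, // the go_bootstrap situation discussed in this thread
	}
	for _, c := range cases {
		fmt.Printf("haveG=%v isCgo=%v -> %s\n", c.haveG, c.isCgo, onSignal(c.haveG, c.isCgo))
	}
}
```

The last case is the one the CL changes: before it, the model's third branch would have been a hang rather than a crash.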
@eliasnaur
Contributor Author

That's #31343. Do you have an idea of the scope to implement it? I don't mind working on it this cycle if it helps async preemption.

@gopherbot

Change https://golang.org/cl/207299 mentions this issue: cmd/link: bootstrap android/arm64 in internal linking mode

@cherrymui
Member

Thanks, @eliasnaur .

Does android require buildmode=pie? Does buildmode=exe work? The go_bootstrap program is only used for building the toolchain and the actual go command, and is deleted afterwards. If buildmode=exe works, we could just use that. It won't affect any user program.

If not, could we try internal linking? Internal linking PIE works well on Linux/ARM64. It may work just fine on Android as well. (It doesn't work on ARM32 though. But you bootstrap from ARM64 anyway.)

@eliasnaur
Contributor Author

Android refuses to run non-pie programs. See https://go-review.googlesource.com/c/go/+/170943. Fortunately it seems internal linking of non-cgo PIE programs works on android/arm64.

@gopherbot

Change https://golang.org/cl/207445 mentions this issue: runtime: always use Go signal stack in non-cgo program

@eliasnaur
Contributor Author

Thank you very much, Cherry! I'll keep working on CL 207299 because it's a gain independent of this issue.

@cherrymui
Member

Thanks @eliasnaur . The builders are happy now.

@golang locked and limited conversation to collaborators Nov 17, 2020