runtime/cgo: arm5 sigill #19674
Comments
For example, linux-arm-arm5spacemonkey at d0ff9ec. Notably, these are the first real ARM5 builders we've ever had. The old "arm5" builders we had were just modern-ish ARMv6-or-7-or-something (Scaleway C1) machines running the ARM5 binaries, which meant we got away with cheating (illegal instructions) without realizing it. So this is probably some codegen not respecting the $GOARM=="5". |
(That's one example, but the whole build column is like that.) |
Oh, looking closer finally, this is cgo/testplugin, and during a cgo call, so the C compiler is not generating the ARM5 code. I guess we might need to pass down special flags to the C compiler to change its target processor version? Is this something that cmd/go needs to do? |
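To make the idea in the comment above concrete, here is a minimal sketch of deriving C compiler target flags from GOARM. The helper name `armCFlags` and the exact `-march` values are assumptions chosen for illustration; this is not what cmd/go actually does, and the thread later finds that the C compiler flags were not the cause of the crash.

```go
package main

import (
	"fmt"
	"os"
)

// armCFlags is a hypothetical helper: it maps a GOARM value to a GCC
// -march flag. The mapping is an illustrative assumption, not the
// toolchain's real behavior.
func armCFlags(goarm string) []string {
	switch goarm {
	case "5":
		return []string{"-march=armv5t"}
	case "6":
		return []string{"-march=armv6"}
	case "7":
		return []string{"-march=armv7-a"}
	}
	// Unknown or unset GOARM: add nothing and let the C compiler
	// use its own default target.
	return nil
}

func main() {
	goarm := os.Getenv("GOARM")
	fmt.Println("GOARM:", goarm, "extra C flags:", armCFlags(goarm))
}
```

Returning nil for an unrecognized value keeps the sketch conservative: it never forces a target that the user did not ask for.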
There are a few places in the toolchain that pass architecture-specific options to the C compiler:
I think it would be reasonable to modify those places (or unify them!) and arrange that if |
@zeebo, interested in tackling this? |
Sure. |
We have an initial patchset ready for this but even though
Here is the diff (with some stuff elided, like the deps changes):
|
What is the instruction at PC value 0x1bb608? |
Is this the right thing?
I figured out how to keep the binary/so files around. Would a copy of those be helpful? |
Poking around with gdb some more, I can't get the program to finish the topmost stack frame at
This seems to be the jump that causes stuff to go wrong (it picks a different address to jump to when I have the breakpoint set. No idea):
Let me know if there's anything useful I can do to help debug this more, because I don't really know what I'm looking at or if any of this is helpful. 😃 |
@ianlancetaylor, would you like SSH access to the builder? We're happy to get you on there if that will help. If so, get me your SSH pubkey and I'll get an account set up. |
Ping? We're stuck on this. |
If you are stuck on this because misc/cgo/testplugin is failing on ARM5, then I think the answer is to disable that test. It's clear that plugin support is very patchy at the moment, and we shouldn't let plugin test failures hold us up anywhere. (Does the test fail on normal ARM? I know very little about Go on ARM myself.) The comment above suggests that the program is somehow trying to execute the code at |
CC @crawshaw because this appears to be plugin related. |
CL https://golang.org/cl/39716 mentions this issue. |
Plugin support is patchy at the moment, so disable the test for now until the test can be fixed. This way, we can get builders for ARMv5 running for the rest of the code.

Updates #19674

Change-Id: I08aa211c08a85688656afe2ad2e680a2a6e5dfac
Reviewed-on: https://go-review.googlesource.com/39716
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
On Mar 25, 2017 5:58 PM, "Jeff" <notifications@github.com> wrote:
Is this the right thing?
(gdb) disas 0x1bb608
Dump of assembler code for function runtime.finalizer1:
=> 0x001bb608 <+0>: ; <UNDEFINED> instruction: 0xf7bdef7b
0x001bb60c <+4>: ldrdeq r0, [r0], -lr
End of assembler dump.
I figured out how to keep the binary/so files around. Would a copy of those
be helpful?
runtime.finalizer1 is generated by our compiler, and it's using armv6 ldrd.
I'd check the SSA rule for that instruction. And what's the first
instruction at 0x1bb608.
|
@randall77, got any time to check this out? Looks like it might be the Go compiler's fault and not cmd/go failing to pass down flags to gcc. |
runtime.finalizer1 isn't code, it is data. Disassembling it is sure to give you junk. |
|
@zeebo, a few questions:
|
I took the liberty of running a disas on the contents of lr there: https://gist.github.com/zeebo/52f7436e9392cc821b6dfe3af05b5963 |
Also, if you would like, I can probably arrange getting you a shell into the hardware that is having the issues. Does that sound like a good idea? |
@zeebo, yes, that'd probably be helpful, but @ianlancetaylor is on vacation now. If I ever finish gomote ssh support we'd have access to your existing builders. :-/ |
Thanks. Turns out
Looks like it's not doing anything funny with LR here. It was pushed on the stack in the prologue and popped from the stack in the epilogue. This suggests either the stack slot got corrupted or the SP itself did.
Thanks. This is exactly what I would expect it to be, so the LR is sane on entry. @zeebo, a shell would be useful. Alternatively, could I get you to break on entry to
It should dump a few thousand lines giving the register state and top-of-stack at every instruction in |
I ran the python gdb snippet and captured the output here: https://gist.github.com/zeebo/c5b3bd0ff0658132e8794ee39bf3df4a I've also set you up with a user account on the builder. I used your ssh keys from github, so you should be able to ssh in with
There is a checkout of the go repository in Let me know if you run into any issues. I'm also on the gophers slack as zeebo if you want a less asynchronous communication environment. |
Perfect. Curiously, it looks like this traced into
Awesome. Though it's giving me "ssh_exchange_identification: read: Connection reset by peer". If I try connecting directly, I don't get the OpenSSH server identification (tried from two different hosts on very different networks). |
Ok, sorry about the relay issues. Maybe try this one |
$ ssh -p 14975 aclements@relay005.spacemonkey.com
ssh: connect to host relay005.spacemonkey.com port 14975: Connection refused :( |
Alright, apparently the binary I'm using to forward connections silently decides to stop working after some point in time. Instead, I have set up a wacky system of ssh reverse tunnels and socat, but maybe it's more reliable. The address is now |
Perfect. That's working for me. So far all I've managed to figure out is that SP goes bonkers at some point during I haven't been able to figure out where it goes bonkers. However, I suspect this is why my Python snippet fails to |
Found where it goes bonkers as soon as I sent that. Entering the function epilogue, SP still good:
Execute
Execute
Next step is to pop the saved registers, so we're supposed to be back at a sane SP now, but aren't. (I have no idea why it did the intermediate It looks to me like r11 is a frame pointer in this function. It uses a variable-length array, so the stack frame is dynamically adjusted at runtime. And the epilogue sequence that restores SP from r11 mirrors the prologue. r11 got clobbered by the |
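The failure mode described above can be illustrated with a toy model. This is only a sketch: the `regs` type, the `runFrame` function, and the addresses are invented for illustration and do not come from the real binary. It models a function that saves SP in r11, adjusts SP for a variable-length array, calls something, and then restores SP from r11 in its epilogue.

```go
package main

import "fmt"

// regs is a toy stand-in for the two registers discussed in the thread.
// It is not real ARM machine state.
type regs struct {
	sp  uint32
	r11 uint32
}

// runFrame models a C function that uses r11 as a frame pointer because
// its stack frame holds a variable-length array. callee stands for
// whatever code the function body calls.
func runFrame(r *regs, vlaSize uint32, callee func(*regs)) {
	r.r11 = r.sp    // prologue: remember the entry SP in the frame pointer
	r.sp -= vlaSize // body: grow the frame for the variable-length array
	callee(r)       // body: call out; the callee is expected to preserve r11
	r.sp = r.r11    // epilogue: restore SP from the frame pointer
}

func main() {
	const entrySP = 0x1000

	// A callee that preserves r11 leaves SP sane after the epilogue.
	good := &regs{sp: entrySP}
	runFrame(good, 64, func(*regs) {})
	fmt.Printf("r11 preserved: sp=%#x sane=%v\n", good.sp, good.sp == entrySP)

	// A callee that clobbers r11 makes the epilogue load a bogus SP.
	bad := &regs{sp: entrySP}
	runFrame(bad, 64, func(r *regs) { r.r11 = 0xdeadbeef })
	fmt.Printf("r11 clobbered: sp=%#x sane=%v\n", bad.sp, bad.sp == entrySP)
}
```

The model leaves out the pushes and pops of saved registers; it only shows why a clobbered r11 turns into a wrong SP once the epilogue runs.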
CL https://golang.org/cl/47831 mentions this issue. |
I'm pretty sure I found it, but I managed to completely toast the Go tree you set up for me on the ARM host (oops). @zeebo, would you mind testing https://golang.org/cl/47831? |
No problem. I'll give it a shot right now. Edit: I forgot to mention it will take about 30-40 minutes because it requires a rebuild of the toolchain. These boards aren't the quickest :) |
It seems fixed. Do we want to revert 168eb9c and let the arm builders chew on it? |
CL https://golang.org/cl/47834 mentions this issue. |
This reverts commit 168eb9c. CL 47831 fixes the issue with plugins on ARMv5, so we can re-enable the test.

Updates #19674.

Change-Id: Idcb29f93ffb0460413f1fab5bb82fa2605795038
Reviewed-on: https://go-review.googlesource.com/47834
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
What version of Go are you using (go version)?
current-ish master. See the linux-arm-arm5spacemonkey column on https://build.golang.org/
What operating system and processor architecture are you using (go env)?
GOARCH="arm"
GOOS="linux"
GOARM=5
What did you do?
ran all.bash
What did you expect to see?
the tests to pass
What did you see instead?