net: special netFD mutex causes hang for cross-compiled linux/arm5 #6462

Closed
josharian opened this issue Sep 23, 2013 · 13 comments

@josharian
Contributor

What steps will reproduce the problem?

1. Cross-compile for linux/arm5.
2. Make a network request. My test harness for this bug was
http://play.golang.org/p/drwd1JwnTd
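
For readers who cannot open the playground link, the following is a rough reconstruction of what such a harness might look like, guessed from the stack trace in Comment 3 (a goroutine calling http.Get, a select in main, and the "Starting request" output). It is not the original source: the function name get, the timeout value, and the target URL are all assumptions.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// requestTimeout is an assumed value; the original timeout is not known.
const requestTimeout = 10 * time.Second

// get runs http.Get in its own goroutine and waits for the result or the
// timeout, whichever comes first.
func get(url string) error {
	done := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		resp, err := http.Get(url)
		if err == nil {
			resp.Body.Close()
		}
		done <- err
	}()
	select {
	case err := <-done:
		return err
	case <-time.After(requestTimeout):
		return fmt.Errorf("no response after %v", requestTimeout)
	}
}

func main() {
	fmt.Println("Starting request")
	// Hypothetical target; any reachable HTTP URL would do.
	if err := get("http://example.com/"); err != nil {
		fmt.Println("Failure:", err)
		return
	}
	fmt.Println("Success")
}
```

Building this with GOOS=linux GOARCH=arm GOARM=5 set in the environment would target linux/arm5. On a healthy system it should print either "Success" or "Failure"; the report here is that on the affected device it prints neither.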


What is the expected output?

Success or failure


What do you see instead?

Hang


Which compiler are you using (5g, 6g, 8g, gccgo)?

5g


Which operating system are you using?

OS X 10.8.4


Which version are you using?  (run 'go version')

go version devel +f31759c38a32 Mon Sep 23 13:19:08 2013 -0400 darwin/amd64

hg bisect shows that this bug was introduced by
https://code.google.com/p/go/source/detail?r=7afd81b7fe122e743f0db88f538eb3b4004be650


Please provide any additional information below.

I don't know whether cross-compilation or linux/arm5 is necessary; it is just how I
reproduced it here. There may also be something else unusual about my setup here.

Compiling for host (OS X) does not reproduce the problem, including with CGO disabled.

Let me know what other info I can provide.
@dvyukov
Member

dvyukov commented Sep 23, 2013

Comment 1:

Please run the program with GOTRACEBACK=2, and then do 'kill -6 PID', and post the stack
traces here.

@josharian
Contributor Author

Comment 3:

go version devel +7afd81b7fe12 Fri Aug 09 21:43:00 2013 +0400 darwin/amd64
$ GOTRACEBACK=2 ./test-network 
Starting request
SIGABRT: abort
PC=0x3c8dc
runtime.notetsleepg(0x3c32bc, 0x7e0e0570, 0x3)
    .../src/pkg/runtime/lock_futex.c:188 +0x40 fp=0x4005ef74
timerproc()
    .../src/pkg/runtime/time.goc:217 +0xfc fp=0x4005efcc
runtime.goexit()
    .../src/pkg/runtime/proc.c:1364 fp=0x4005efcc
created by addtimer
    .../src/pkg/runtime/time.goc:90
goroutine 1 [select]:
runtime.park(0x17b34, 0x1070d310, 0x3bf927)
    .../src/pkg/runtime/proc.c:1312 +0x48
selectgo(0x40050f24)
    .../src/pkg/runtime/chan.c:995 +0x798
runtime.selectgo(0x1070d310)
    .../src/pkg/runtime/chan.c:845 +0x10
main.main()
    test-network.go:20 +0x2f0
runtime.main()
    .../src/pkg/runtime/proc.c:200 +0xf4
runtime.goexit()
    .../src/pkg/runtime/proc.c:1364
goroutine 2 [syscall]:
runtime.notetsleepg(0x4005ffa0, 0xf8475800, 0xd)
    .../src/pkg/runtime/lock_futex.c:188 +0x40
runtime.MHeap_Scavenger()
    .../src/pkg/runtime/mheap.c:467 +0xe4
runtime.goexit()
    .../src/pkg/runtime/proc.c:1364
created by runtime.main
    .../src/pkg/runtime/proc.c:167
goroutine 4 [select]:
runtime.park(0x17b34, 0x1070d460, 0x3bf927)
    .../src/pkg/runtime/proc.c:1312 +0x48
selectgo(0x4005dd04)
    .../src/pkg/runtime/chan.c:995 +0x798
runtime.selectgo(0x1070d460)
    .../src/pkg/runtime/chan.c:845 +0x10
net/http.(*Transport).getConn(0x107460c0, 0x1071d800, 0x1071d800, 0x0, 0x0)
    .../src/pkg/net/http/transport.go:424 +0x228
net/http.(*Transport).RoundTrip(0x107460c0, 0x1070d380, 0x0, 0x0, 0x0)
    .../src/pkg/net/http/transport.go:182 +0x2bc
net/http.send(0x1070d380, 0x4000c270, 0x107460c0, 0x0, 0x0, ...)
    .../src/pkg/net/http/client.go:168 +0x328
net/http.(*Client).send(0x3c3260, 0x1070d380, 0x16, 0x0, 0x0)
    .../src/pkg/net/http/client.go:100 +0x10c
net/http.(*Client).doFollowingRedirects(0x3c3260, 0x1070d380, 0x227ab0, 0x0, 0x0, ...)
    .../src/pkg/net/http/client.go:294 +0x5bc
net/http.(*Client).Get(0x3c3260, 0x1fd948, 0x16, 0x0, 0x0, ...)
    .../src/pkg/net/http/client.go:248 +0xb0
net/http.Get(0x1fd948, 0x16, 0x0, 0x0, 0x0)
    .../src/pkg/net/http/client.go:225 +0x50
main.func·001()
    test-network.go:14 +0x44
runtime.goexit()
    .../src/pkg/runtime/proc.c:1364
created by main.main
    test-network.go:19 +0x148
goroutine 5 [running]:
    goroutine running on other thread; stack unavailable
created by net/http.(*Transport).getConn
    .../src/pkg/net/http/transport.go:421 +0x118
trap    0x0
error   0x0
oldmask 0x0
r0      0xfffffffc
r1      0x0
r2      0x0
r3      0x4005ef28
r4      0x0
r5      0x0
r6      0x0
r7      0xf0
r8      0x3b9aca00
r9      0x3c6290
r10     0x107016e0
fp      0x0
ip      0x1070d310
sp      0x4005ef0c
lr      0x259e4
pc      0x3c8dc
cpsr    0x20000010
fault   0x0

@dvyukov
Member

dvyukov commented Sep 23, 2013

Comment 4:

Does it consume 100% CPU when it hangs?

@dvyukov
Member

dvyukov commented Sep 23, 2013

Comment 5:

I suspect that it's infinitely looping inside of fdMutex. Is it possible to attach gdb
and see where exactly it loops? and also what is the state of fdMutex?
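
To illustrate the kind of loop being suspected here, below is a heavily simplified sketch of a compare-and-swap retry loop over a single 64-bit state word. It is not the net package's fdMutex code: the type refMutex, the bit layout, the injected cas parameter, and the maxSpins bound are all hypothetical, added only so the effect of a CAS that never succeeds can be observed without hanging.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const (
	closedBit = 1 << 0 // hypothetical layout: bit 0 marks the mutex closed
	refUnit   = 1 << 1 // hypothetical layout: references counted from bit 1 up
)

// refMutex is an illustrative stand-in whose whole state is one 64-bit word.
type refMutex struct {
	state uint64
}

// realCAS is the genuine 64-bit compare-and-swap from sync/atomic.
var realCAS = atomic.CompareAndSwapUint64

// brokenCAS models a compare-and-swap that never reports success.
func brokenCAS(*uint64, uint64, uint64) bool { return false }

// incref adds one reference unless the mutex is closed. It retries until the
// CAS succeeds or maxSpins attempts have been made; an unbounded version of
// this loop would spin forever if the CAS could never succeed.
func (m *refMutex) incref(cas func(addr *uint64, old, new uint64) bool, maxSpins int) (ok bool, spins int) {
	for spins = 1; spins <= maxSpins; spins++ {
		old := atomic.LoadUint64(&m.state)
		if old&closedBit != 0 {
			return false, spins
		}
		if cas(&m.state, old, old+refUnit) {
			return true, spins
		}
		// CAS failed: normally another goroutine changed state, so retry.
	}
	return false, maxSpins
}

func main() {
	var m refMutex
	fmt.Println(m.incref(realCAS, 1000))   // true 1
	fmt.Println(m.incref(brokenCAS, 1000)) // false 1000
}
```

With a working CAS the loop exits on the first attempt. With one that always fails, every attempt is burned, which without the artificial bound would look like a hang at 100% CPU.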

@minux
Member

minux commented Sep 23, 2013

Comment 6:

Perhaps related: issue #6440.

@minux
Member

minux commented Sep 23, 2013

Comment 7:

What's your kernel version?
Could you please cross-compile the test for sync/atomic and run it like this on your ARM
platform?
./atomic.test -test.v
Does it hang immediately without much output?
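
As a much smaller stand-in for the full sync/atomic test binary, a few 64-bit atomic operations can be exercised directly. This is only a sketch: the function name check64 is made up for illustration, and whether this particular program would hang on an affected kernel is an assumption, although Comment 9 reports that the real test suite does.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// check64 exercises a few 64-bit atomic operations and returns an error
// describing the first result that is wrong.
func check64() error {
	v := new(uint64) // heap-allocated so the word is 64-bit aligned on 32-bit ARM

	if got := atomic.AddUint64(v, 1<<32); got != 1<<32 {
		return fmt.Errorf("AddUint64: got %#x, want %#x", got, uint64(1<<32))
	}
	if !atomic.CompareAndSwapUint64(v, 1<<32, 1<<33) {
		return fmt.Errorf("CompareAndSwapUint64: swap did not happen")
	}
	if got := atomic.LoadUint64(v); got != 1<<33 {
		return fmt.Errorf("LoadUint64: got %#x, want %#x", got, uint64(1<<33))
	}
	return nil
}

func main() {
	fmt.Println("Checking 64-bit atomics")
	if err := check64(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("OK")
}
```

If the first line is printed and nothing follows while the CPU is pegged, that would point at the 64-bit atomics rather than at the net package.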

@josharian
Contributor Author

Comment 8:

> Does it consume 100% CPU when it hangs?
Yes.
> I suspect that it's infinitely looping inside of fdMutex. Is it possible to attach gdb
> and see where exactly it loops? and also what is the state of fdMutex?
I don't have gdb available in this environment. I can arrange it, but it'll take a
while. Let me know if it is worth doing.

@josharian
Contributor Author

Comment 9:

> What's your kernel version?
$ uname -a
Linux ppbeacon 2.6.35.3-670-g914558e #1 PREEMPT Thu Aug 15 09:55:00 PDT 2013 armv5tejl GNU/Linux
> Could you please cross-compile the test for sync/atomic and run it like this on your ARM
> platform?
> Does it hang immediately without much output?
Yes, it hangs immediately and pegs the CPU.

@minux
Member

minux commented Sep 23, 2013

Comment 10:

Thank you, it is indeed issue #6440.

Status changed to Duplicate.

Merged into issue #6440.

@dvyukov
Member

dvyukov commented Sep 23, 2013

Comment 11:

phew!

@josharian
Contributor Author

Comment 12:

:)

@minux
Member

minux commented Sep 23, 2013

Comment 13:

PS: one workaround is to upgrade the kernel to the 3.x series so that it will use the
kernel-provided 64-bit atomic operations.

@josharian
Contributor Author

Comment 14:

Good to know; thanks. Alas, that's not viable for me right now for other reasons, but
hopefully that'll help future issue spelunkers.

@golang golang locked and limited conversation to collaborators Jun 25, 2016
This issue was closed.