Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: go get hangs #16104

Closed
phemmer opened this issue Jun 18, 2016 · 17 comments
Closed

cmd/go: go get hangs #16104

phemmer opened this issue Jun 18, 2016 · 17 comments

Comments

@phemmer
Copy link

phemmer commented Jun 18, 2016

  1. What version of Go are you using (go version)?
# go version
go version go1.6 linux/amd64
  1. What operating system and processor architecture are you using (go env)?
# go env
GOARCH="amd64"
GOBIN="/home/phemmer/.go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/phemmer/.go:/tmp/go"
GORACE=""
GOROOT="/usr/lib/go"
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GO15VENDOREXPERIMENT="1"
CC="x86_64-pc-linux-gnu-gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
CXX="x86_64-pc-linux-gnu-g++"
CGO_ENABLED="1"
  1. What did you do?
go get -t -d -v github.com/influxdata/telegraf 
  1. What did you expect to see?

success

  1. What did you see instead?
github.com/influxdata/telegraf (download)

hanging indefinitely

Additional info:

This has been happening to me for years (obviously on older versions of go), but it's mostly just an annoyance as I can usually just retry the go get once or twice and it'll succeed. Meaning this doesn't happen every time.

When I send a SIGQUIT to the process, this is the output:

SIGQUIT: quit
PC=0x4e9df1 m=0

goroutine 0 [idle]:
runtime.futex(0xc8ff08, 0x0, 0x0, 0x0, 0x0, 0xc8f290, 0x0, 0x0, 0x49b294, 0xc8ff08, ...)
    /usr/lib/go/src/runtime/sys_linux_amd64.s:302 +0x21
runtime.futexsleep(0xc8ff08, 0x0, 0xffffffffffffffff)
    /usr/lib/go/src/runtime/os1_linux.go:40 +0x53
runtime.notesleep(0xc8ff08)
    /usr/lib/go/src/runtime/lock_futex.go:145 +0xa4
runtime.stopm()
    /usr/lib/go/src/runtime/proc.go:1535 +0x10b
runtime.findrunnable(0xc820020a00, 0x0)
    /usr/lib/go/src/runtime/proc.go:1973 +0x739
runtime.schedule()
    /usr/lib/go/src/runtime/proc.go:2072 +0x24f
runtime.park_m(0xc820000180)
    /usr/lib/go/src/runtime/proc.go:2137 +0x18b
runtime.mcall(0x7ffdd42511f0)
    /usr/lib/go/src/runtime/asm_amd64.s:233 +0x5b

goroutine 1 [chan receive]:
os/exec.(*Cmd).Wait(0xc820170640, 0x0, 0x0)
    /usr/lib/go/src/os/exec/exec.go:401 +0x2c9
os/exec.(*Cmd).Run(0xc820170640, 0x0, 0x0)
    /usr/lib/go/src/os/exec/exec.go:262 +0x64
main.(*vcsCmd).run1(0xc89340, 0x9df678, 0x1, 0xa3e0d0, 0x12, 0xc820113370, 0x4, 0x4, 0xc820161501, 0x0, ...)
    /usr/lib/go/src/cmd/go/vcs.go:365 +0xba0
main.(*vcsCmd).run(0xc89340, 0x9df678, 0x1, 0xa3e0d0, 0x12, 0xc820113370, 0x4, 0x4, 0x0, 0x0)
    /usr/lib/go/src/cmd/go/vcs.go:312 +0x97
main.(*vcsCmd).create(0xc89340, 0xc820165580, 0x34, 0xc8201613b0, 0x26, 0x0, 0x0)
    /usr/lib/go/src/cmd/go/vcs.go:389 +0x28a
main.downloadPackage(0xc82017ac00, 0x0, 0x0)
    /usr/lib/go/src/cmd/go/get.go:436 +0x1231
main.download(0x7ffdd4252479, 0x1e, 0x0, 0xc820113ba0, 0x2)
    /usr/lib/go/src/cmd/go/get.go:225 +0xf1a
main.runGet(0xc881a0, 0xc82000a1d0, 0x1, 0x1)
    /usr/lib/go/src/cmd/go/get.go:107 +0x1de
main.main()
    /usr/lib/go/src/cmd/go/main.go:181 +0x783

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /usr/lib/go/src/runtime/asm_amd64.s:1998 +0x1

goroutine 5 [syscall]:
os/signal.signal_recv(0x0)
    /usr/lib/go/src/runtime/sigqueue.go:116 +0x132
os/signal.loop()
    /usr/lib/go/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
    /usr/lib/go/src/os/signal/signal_unix.go:28 +0x37

goroutine 7 [syscall]:
syscall.Syscall(0x0, 0x4, 0xc82019c047, 0x5b9, 0x47, 0x200, 0x49cea9)
    /usr/lib/go/src/syscall/asm_linux_amd64.s:18 +0x5
syscall.read(0x4, 0xc82019c047, 0x5b9, 0x5b9, 0x47, 0x0, 0x0)
    /usr/lib/go/src/syscall/zsyscall_linux_amd64.go:783 +0x5f
syscall.Read(0x4, 0xc82019c047, 0x5b9, 0x5b9, 0x0, 0x0, 0x0)
    /usr/lib/go/src/syscall/syscall_unix.go:161 +0x4d
os.(*File).read(0xc8200321f0, 0xc82019c047, 0x5b9, 0x5b9, 0xc82019c000, 0x0, 0x0)
    /usr/lib/go/src/os/file_unix.go:228 +0x53
os.(*File).Read(0xc8200321f0, 0xc82019c047, 0x5b9, 0x5b9, 0x47, 0x0, 0x0)
    /usr/lib/go/src/os/file.go:95 +0x8a
bytes.(*Buffer).ReadFrom(0xc820182a10, 0x7fea392042d0, 0xc8200321f0, 0x47, 0x0, 0x0)
    /usr/lib/go/src/bytes/buffer.go:176 +0x23f
io.copyBuffer(0x7fea39208da0, 0xc820182a10, 0x7fea392042d0, 0xc8200321f0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/lib/go/src/io/io.go:374 +0x180
io.Copy(0x7fea39208da0, 0xc820182a10, 0x7fea392042d0, 0xc8200321f0, 0x0, 0x0, 0x0)
    /usr/lib/go/src/io/io.go:350 +0x64
os/exec.(*Cmd).writerDescriptor.func1(0x0, 0x0)
    /usr/lib/go/src/os/exec/exec.go:236 +0x8b
os/exec.(*Cmd).Start.func1(0xc820170640, 0xc82016d000)
    /usr/lib/go/src/os/exec/exec.go:344 +0x1d
created by os/exec.(*Cmd).Start
    /usr/lib/go/src/os/exec/exec.go:345 +0x967

rax    0xca
rbx    0x0
rcx    0x4e9df3
rdx    0x0
rdi    0xc8ff08
rsi    0x0
rbp    0x1
rsp    0x7ffdd4251050
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0x3
r13    0xad30bb
r14    0x4
r15    0x8
rip    0x4e9df1
rflags 0x286
cs     0x33
fs     0x0
gs     0x0

I also have an strace available.

@davecheney
Copy link
Contributor

davecheney commented Jun 19, 2016

Can you run git clone https://github.com/influxdata/telegraf directly? How long does it take? Does your environment mandate a proxy, specially one requiring authentication? Do you have any entries in your .gitconfig to redirect https urls to ssh which may be interfering ?

@phemmer
Copy link
Author

phemmer commented Jun 19, 2016

Can you run git clone https://github.com/influxdata/telegraf directly? How long does it take?

# time git clone https://github.com/influxdata/telegraf 
Cloning into 'telegraf'...
remote: Counting objects: 13420, done.
remote: Compressing objects: 100% (28/28), done.
remote: Total 13420 (delta 20), reused 13 (delta 13), pack-reused 13378
Receiving objects: 100% (13420/13420), 10.82 MiB | 3.61 MiB/s, done.
Resolving deltas: 100% (6901/6901), done.
Checking connectivity... done.
git clone https://github.com/influxdata/telegraf  3.88s user 0.88s system 63% cpu 7.525 total

Does your environment mandate a proxy, specially one requiring authentication?

No

Do you have any entries in your .gitconfig to redirect https urls to ssh which may be interfering ?

Yes

[url "git@github.com:"]
    insteadOf = https://github.com/

But this should be transparent to go get. And why would it only fail sometimes? It works 100% of the time when I run git clone by hand, where as I have about a 30% failure rate with go get.

@davecheney
Copy link
Contributor

go get calls git clone.

I don't know why it works some times and not others, it's probably related to whatever is on disk at the time.

If you remove the .gitconfig redirect, does this fix the issue ?

@davecheney
Copy link
Contributor

If I had to guess, because you've redirected https to ssh, something is hanging with the usual ssh prompt to accept an ssh fingerprint. Just a guess.

@phemmer
Copy link
Author

phemmer commented Jun 19, 2016

I don't know why it works some times and not others, it's probably related to whatever is on disk at the time.

I can wipe out the $GOPATH/src/github.com/influxdata/telegraf directory and still get the inconsistent behavior.

If you remove the .gitconfig redirect, does this fix the issue ?

Yes

If I had to guess, because you've redirected https to ssh, something is hanging with the usual ssh prompt to accept an ssh fingerprint. Just a guess.

Not the case.

  1. When go get is hung, ps shows it has no child processes.
  2. I have the ssh fingerprint for github.
  3. It hangs after the repo is cloned as $GOPATH/src/github.com/influxdata/telegraf exists, with all the source files.

(and just for clarification, this happens with all repos, not just this one)

@davecheney
Copy link
Contributor

Can you please try again with the -x flag and post the complete output.
Thanks

On Sun, Jun 19, 2016 at 3:34 PM, Patrick Hemmer notifications@github.com
wrote:

I don't know why it works some times and not others, it's probably related
to whatever is on disk at the time.

I can wipe out the $GOPATH/src/github.com/influxdata/telegraf directory
and still get the inconsistent behavior.

If you remove the .gitconfig redirect, does this fix the issue ?

Yes

If I had to guess, because you've redirected https to ssh, something is
hanging with the usual ssh prompt to accept an ssh fingerprint. Just a
guess.

Not the case.

  1. When go get is hung, ps shows it has no child processes.
  2. I have the ssh fingerprint for github.
  3. It hangs after the repo is cloned as $GOPATH/src/
    github.com/influxdata/telegraf exists, with all the source files.


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#16104 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAAcAwHpgF1Rkdv5cM1ekhtfiqvZc-Pxks5qNNTJgaJpZM4I42Do
.

@phemmer
Copy link
Author

phemmer commented Jun 19, 2016

# go get -t -d -v -x github.com/influxdata/telegraf
github.com/influxdata/telegraf (download)
cd .
git clone https://github.com/influxdata/telegraf /home/phemmer/.go/src/github.com/influxdata/telegraf

Just sits there after that last line.

@davecheney
Copy link
Contributor

Maybe this is a git bug, if git doesn't return then there is nothing the go
tool can do.

On Sun, 19 Jun 2016, 15:38 Patrick Hemmer notifications@github.com wrote:

go get -t -d -v -x github.com/influxdata/telegrafgithub.com/influxdata/telegraf (download)

cd .
git clone https://github.com/influxdata/telegraf /home/phemmer/.go/src/github.com/influxdata/telegraf

Just sits there after that last line.


You are receiving this because you commented.

Reply to this email directly, view it on GitHub
#16104 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AAAcA_eqVqAZ1OxzJtt9zhtvUSwmyVoPks5qNNXXgaJpZM4I42Do
.

@phemmer
Copy link
Author

phemmer commented Jun 19, 2016

But it does return. As mentioned earlier, go get has no child processes. There is no git process running. And if it was a git issue, I would expect to experience this when git clone is run manually as well, but also as mentioned, it works fine 100% of the time.

# ps -ef | grep 'go get'
phemmer  23977  8359  0 01:57 pts/16   00:00:00 /usr/bin/go get -t -d -v -x github.com/influxdata/telegraf

# ps --ppid 23977 
USER       PID PRI  NI %CPU %MEM STAT     ELAPSED COMMAND

# ps -ef|grep git
phemmer  23977  8359  0 01:57 pts/16   00:00:00 /usr/bin/go get -t -d -v -x github.com/influxdata/telegraf
phemmer  23991     1  0 01:57 ?        00:00:00 ssh: /tmp/.ssh_control-github.com-22-git [mux]

Though that last line did lead me down another route of investigation. Killing the ssh control process (PID 23991) does make go get resume. But as you can see from the ps -ef output (3rd column), the PPID is 1, meaning it's a background process. Why is go get waiting for it? How is it even waiting for it? The wait system call only lets you listen to processes which are your child, so how is this process which is not a child of go get blocking go get?

@ianlancetaylor ianlancetaylor added this to the Go1.7Maybe milestone Jun 20, 2016
@ianlancetaylor
Copy link
Contributor

The go get command is hanging because it is waiting for the child process standard output or standard error to be closed. In your original stack trace you can see that goroutine 7 is hanging in the read system call.

I don't know why ssh is hanging for you, but it's not clear to me that there is anything that the go command can do here.

@ianlancetaylor ianlancetaylor modified the milestones: Unplanned, Go1.7Maybe Jun 20, 2016
@phemmer
Copy link
Author

phemmer commented Jun 20, 2016

ssh isn't hanging, this is how it behaves when using the ControlMaster auto setting. It forks off a background process to maintain a persistent connection.
The control process detaches stdin & stdout, but leaves stderr open. Is there any reason why go is grabbing stderr? Shouldn't it leave it pointing to the TTY?

@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Jun 20, 2016

Is there some way that go get can tell git to not use ssh in this mode?

In general go get collects standard output and standard error from the child process. That seems reasonable to me when go is executing a subprocess. I don't see why it's obviously correct for go get to leave standard error connected to the go command's standard error.

@phemmer
Copy link
Author

phemmer commented Jun 20, 2016

Unless go is going to suppress mesages that come across stderr, I don't see any reason for capturing it. This is the way most other tools, including git work.

However you could probably set the environment variable GIT_SSH_COMMAND='ssh -o ControlMaster=no'.
But this seems like it might be a bit too specific of a fix. It's possible the user might have a git hook, or something else which also launches a background process with stderr still open. Though maybe this is rare enough that we just ignore the issue.

@0xmohit
Copy link
Contributor

0xmohit commented Jun 20, 2016

Also see #13453.

@phemmer
Copy link
Author

phemmer commented Jun 20, 2016

Ahha! this can probably be closed as a duplicate then.

@ianlancetaylor
Copy link
Contributor

It does look like the same underlying issue, so closing as dup.

go get does in fact currently suppress standard error output from git clone.

@gopherbot
Copy link

CL https://golang.org/cl/31353 mentions this issue.

gopherbot pushed a commit that referenced this issue Oct 19, 2016
Fixes #13453.
Fixes #16104.

Change-Id: I4e94f606df786af8143f8649c9afde570f346301
Reviewed-on: https://go-review.googlesource.com/31353
Reviewed-by: Quentin Smith <quentin@golang.org>
@golang golang locked and limited conversation to collaborators Oct 18, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants