runtime: go1.8 build fails when run as a Jenkins job (was #18551) #19203

Closed
hartzell opened this issue Feb 20, 2017 · 17 comments
Labels: FrozenDueToAge, NeedsInvestigation, WaitingForInfo

@hartzell

This is a continuation of #18551. I can now reproduce the problem on a smaller system and w/out involving Spack.

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

Using go-1.4 (built via Spack's go-bootstrap package) to build v1.8.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH=""
GORACE=""
GOROOT="/isilon/Analysis/scratch/hartzelg/linuxbrew/Cellar/go/1.7.5/libexec"
GOTOOLDIR="/isilon/Analysis/scratch/hartzelg/linuxbrew/Cellar/go/1.7.5/libexec/pkg/tool/linux_amd64"
CC="/usr/bin/cc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build542396042=/tmp/go-build -gno-record-gcc-switches"
CXX="/usr/bin/c++"
CGO_ENABLED="1"

What did you do?

I had been using Spack's go package to build go and seeing the crashes reported in #18551. I initially attributed it to the size of the machine and considered Spack a second candidate culprit.

I have replaced Spack with this shell script (stop giggling...):

#!/bin/bash

set -x
set -e
set -u

cd /path/to/tmp
rm -rf poodle
mkdir poodle
cd poodle
wget https://storage.googleapis.com/golang/go1.8.src.tar.gz
tar -xzf go1.8.src.tar.gz
cd go/src
export GOROOT_BOOTSTRAP=/path/to/go-bootstrap-1.4
./all.bash

I can run this command from the command line successfully.

I have a Jenkins job set up with this snippet of Jobs DSL

job('go-shell-script') {
  concurrentBuild(true)
  label('themachine')
  steps {
    shell '''#!/bin/bash
~me/build-go.sh
'''
  }
  publishers {
    mailer('me@work.com', false, true)
  }
}

The Jenkins master->slave connection is via SSH and it uses my account.

When I build it, I see the same symptoms I've been seeing in #18551. The output is in this gist.

What did you expect to see?

All tests passing in both command line and Jenkins build.

What did you see instead?

See the gist referenced above.

Conclusions

Running the build remotely via Jenkins is now my primary suspect.

@ALTree
Member

ALTree commented Feb 20, 2017

Another runtime test timeout: #18442 (this one has more details regarding the probable cause of the issue)
Another runtime test timeout: #19196 (also jenkins)

We probably should keep the first one and close the other (and this one)

@hartzell
Author

@ALTree -- what convinces you that the Gentoo sandboxes have the same root problem as running via an ssh connection from Jenkins?

The Common Pitfalls section of the SSH Slaves plugin page says that in the end the Jenkins job script ends up being run like:

[...] but OpenSSH runs this with "bash -c command ..." (or whatever your login shell is.)

#18442 seems to be fixated on the sandboxing restrictions and/but I don't think the shell invocation above is sandboxing anything, though I'm still trying to understand what is happening...
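One way to see what actually differs between the interactive run and the `bash -c` invocation quoted above is to capture the same snapshot in both contexts and diff the results. The following is only a sketch under stated assumptions: it assumes Linux (it reads `/proc`), and the function name `snapshot_env` and the default output path are hypothetical, not anything from Jenkins or Go. Note that the signal lines describe bash's own state, which may not exactly match what it inherited from its parent.

```shell
#!/bin/bash
# Hypothetical helper: record parts of the process environment that can differ
# between an interactive login shell and a shell launched by Jenkins over SSH.
# Assumes Linux; the function name and default output path are illustrative.

snapshot_env() {
  local out="$1"
  {
    echo "== tty =="
    if [ -t 0 ]; then
      echo "stdin is a terminal"
    else
      echo "stdin is not a terminal"
    fi
    echo "== ulimit -a =="
    ulimit -a
    echo "== signal state of this shell =="
    # SigBlk/SigIgn/SigCgt are hex bitmasks of blocked, ignored and caught signals.
    grep -E '^Sig(Blk|Ign|Cgt):' "/proc/$$/status" || echo "no /proc status available"
    echo "== environment =="
    env | sort
  } > "$out" 2>&1
}

snapshot_env "${1:-/tmp/env-snapshot.txt}"
```

Running it once from a terminal and once as the Jenkins job's shell step, each with a different output file, gives two files that can be compared with `diff`.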

@ALTree
Member

ALTree commented Feb 20, 2017

shell invocation above is sandboxing anything

My suspicion was that Jenkins was transparently sandboxing executions in a similar manner, but I'm absolutely not a Jenkins expert, so please disregard my comment if you're convinced it's not useful.

@ALTree ALTree changed the title from "go v1.8 build fails when run as a Jenkins job (was #18551)" to "runtime: go1.8 build fails when run as a Jenkins job (was #18551)" Feb 20, 2017
@ALTree ALTree added the NeedsInvestigation label Feb 20, 2017
@ALTree ALTree added this to the Go1.9 milestone Feb 20, 2017
@hartzell
Author

I'm not convinced either way, just worried. I'd be surprised if Jenkins SSH slaves were intentionally sandboxing things (you'd think they'd brag about it).

@ALTree
Member

ALTree commented Feb 20, 2017

Can you add a t.Skip() to TestCrashDumpsAllThreads in go/src/runtime/crash_unix_test.go and try to build?

@ALTree
Member

ALTree commented Feb 20, 2017

Yeah this is a dup of #19196. Closing since that one is older.

@ALTree ALTree closed this as completed Feb 20, 2017
@hartzell
Author

@ALTree -- Testing as you requested.

The stack traces that I get don't always mention TestCrashDumpsAllThreads, e.g. this one (from #18551):

https://gist.github.com/hartzell/232cf1a1624be49e278ea4f641eea01f

@ALTree ALTree reopened this Feb 20, 2017
@hartzell
Author

Skipping that test gave me two successful builds in a row from Jenkins.

@ALTree
Member

ALTree commented Feb 20, 2017

OK, leaving this open until we're sure that the underlying issue is the same as in the other threads that I linked here.

@hartzell
Author

The gist that does not mention TestCrashDumpsAllThreads does mention TestGdbBacktrace, which seems to be involved in #18442.

@aclements
Member

@hartzell, are you still having this issue? If so, I'd like to figure out if the test is actually hanging or just being slow. Could you set the environment variable GO_TEST_TIMEOUT_SCALE=10 and see if you can still reproduce?

The gist that does not mention TestCrashDumpsAllThreads does mention TestGdbBacktrace, which seems to be involved in #18442.

It mentions TestGdbBacktrace, but TestGdbBacktrace isn't actually the running test. It's just blocked on testing-internal mechanisms waiting to be skipped.
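The GO_TEST_TIMEOUT_SCALE=10 suggestion above could be tried without editing the build script by wrapping the build command. This is a minimal sketch, not a prescribed procedure: `run_scaled` is a hypothetical helper name, and the only thing taken from the thread is the variable name and the value 10.

```shell
#!/bin/bash
# Sketch: run a command with the Go test timeout scale raised.
# run_scaled is a hypothetical helper; GO_TEST_TIMEOUT_SCALE is the variable
# named in the thread.

run_scaled() {
  local scale="$1"
  shift
  # The assignment prefix sets the variable only for the command that follows.
  GO_TEST_TIMEOUT_SCALE="$scale" "$@"
}

# Example, from go/src of the unpacked source tree:
#   run_scaled 10 ./all.bash
```

If the Jenkins build then passes, that would point toward the test being slow rather than hung; if it still times out, a hang looks more likely.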

@aclements aclements added the WaitingForInfo label Jun 8, 2017
@aclements aclements self-assigned this Jun 8, 2017
@hartzell
Author

hartzell commented Jun 9, 2017

@aclements -- Repeating my comment from another thread: I'm on vacation and miles/worlds away from work and access to this. I'll see what I can do when I return in a couple of Mondays.

@bradfitz bradfitz changed the milestone from Go1.9 to Go1.10 Jun 9, 2017
@bradfitz
Contributor

bradfitz commented Jun 9, 2017

Punting to Go 1.10. This isn't a regression from Go 1.8 to Go 1.9 anyway.

@aclements
Member

@hartzell, no worries. Thanks in advance!

@hartzell
Author

@aclements -- I have access to this large machine again, but at the moment it is involved in a bug hunt. I'll keep looking for an opportunity to recreate the symptoms and follow up.

@chlunde
Contributor

chlunde commented Jul 16, 2017

I think this can be closed as a duplicate of #19196, as this is also when spawning from a Java process.

@chlunde chlunde marked this as a duplicate of #19196 Jul 16, 2017
@hartzell
Author

Closing it seems reasonable to me too.

@chlunde has a more amenable test case too.

@golang golang locked and limited conversation to collaborators Jul 16, 2018