runtime: syscall on macOS hangs and causes 100% CPU #58814

Closed
myitcv opened this issue Mar 1, 2023 · 15 comments
Labels: compiler/runtime, FrozenDueToAge, NeedsInvestigation

Comments

Member

myitcv commented Mar 1, 2023

What version of Go are you using (go version)?

$ go version
go1.19.5

The problem occurs with a pre-built version of Hugo.

Does this issue reproduce with the latest release?

Not yet tested (the prebuilt releases of Hugo use go1.19.5).

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/pauljolly/Library/Caches/go-build"
GOENV="/Users/pauljolly/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/pauljolly/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/pauljolly/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.20.1/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.20.1/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.20.1"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/Users/pauljolly/tmp/cuelang.org/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/gs/_w_ys5tx43b255wkxwf67qy40000gp/T/go-build568172395=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

Installed Hugo v0.108.0 extended from https://github.com/gohugoio/hugo/releases/download/v0.108.0/hugo_extended_0.108.0_darwin-universal.tar.gz.

Ran hugo to build https://github.com/cue-lang/cuelang.org/tree/alpha.

What did you expect to see?

Hugo outputting to stdout that it is serving on localhost.

What did you see instead?

Process hanging, stuck at 100% CPU.

Here is a trace of the hugo process:

https://gist.github.com/myitcv/3e5ad9823c202082949f8c72c230122c

Next steps on our side are to try and rebuild Hugo extended using Go tip, but any pointers/ideas in the meantime would be much appreciated.

Member Author

myitcv commented Mar 1, 2023

cc @prattmic based on other compiler/runtime issues.

Member

prattmic commented Mar 1, 2023

Could you collect a core file (with GOTRACEBACK=crash) and then open it in gdb and run thread apply all bt to see where all of the system threads are? (edit: or the equivalent in lldb)

Does this application have CPU profiling enabled? If so, please test using a toolchain built from master or release-branch.go1.20, which includes a fix for hung traceback (#58513).
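
Since the Hugo binary here is prebuilt, GOTRACEBACK has to be supplied through the environment of the process. A minimal sketch of launching a binary that way from Go follows; the helper names `crashEnv` and `crashCommand` are hypothetical, and having a `hugo` binary on PATH is an assumption. With GOTRACEBACK=crash the Go runtime aborts in an OS-specific way on a fatal error (on Unix it raises SIGABRT), so a core file is only written if the system's core dump settings allow it.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// crashEnv returns a copy of env with GOTRACEBACK forced to "crash".
// Any existing GOTRACEBACK entry is dropped so the new value wins.
// (Hypothetical helper, not part of Hugo or the Go toolchain.)
func crashEnv(env []string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if !strings.HasPrefix(kv, "GOTRACEBACK=") {
			out = append(out, kv)
		}
	}
	return append(out, "GOTRACEBACK=crash")
}

// crashCommand prepares, but does not start, a command that inherits the
// current environment plus GOTRACEBACK=crash.
func crashCommand(name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.Env = crashEnv(os.Environ())
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	// "hugo" being on PATH is an assumption; call cmd.Run() to actually start it.
	cmd := crashCommand("hugo")
	fmt.Println("would run", cmd.Path, "with", cmd.Env[len(cmd.Env)-1])
}
```

The same effect is available from a shell by prefixing the command with `GOTRACEBACK=crash`; the Go form is only convenient when the reproduction is driven from a test harness.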

Member

prattmic commented Mar 1, 2023

cc @golang/runtime

prattmic added the NeedsInvestigation label Mar 1, 2023

Member Author

myitcv commented Mar 1, 2023

> Could you collect a core file (with GOTRACEBACK=crash) and then open it in gdb and run thread apply all bt to see where all of the system threads are? (edit: or the equivalent in lldb)

Is gdb available on macOS arm?

I couldn't find anything suggesting it was, only this thread from last year saying it wasn't available (yet): https://inbox.sourceware.org/gdb/3185c3b8-8a91-4beb-a5d5-9db6afb93713@Spark/.

Apologies if I'm missing something obvious. I actually develop on Linux - I'm just reproducing and tracking this down for others who develop on mac.

> Does this application have CPU profiling enabled?

Please can you let me know how I determine this?

Member

prattmic commented Mar 1, 2023

> > Could you collect a core file (with GOTRACEBACK=crash) and then open it in gdb and run thread apply all bt to see where all of the system threads are? (edit: or the equivalent in lldb)
>
> Is gdb available on macOS arm?

I'm not sure, lldb should work fine too. bt all is the equivalent command in lldb.

> > Does this application have CPU profiling enabled?
>
> Please can you let me know how I determine this?

If you aren't sure, but can readily reproduce this issue, it is probably simplest to just test with a Go toolchain built from release-branch.go1.20. To build one:

$ git clone https://go.googlesource.com/go
$ cd go/src
$ git checkout release-branch.go1.20
$ ./make.bash
$ # Now build hugo with `../bin/go`.
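
On the question of how to tell whether CPU profiling is enabled: in a Go program it is turned on by a call to `runtime/pprof.StartCPUProfile`, either directly or through something that wraps it (for example the `net/http/pprof` profile endpoint or `go test -cpuprofile`). Searching the application source for those is one way to check. The sketch below only illustrates what "CPU profiling enabled" looks like in code; `profileWork` and `busy` are hypothetical names, not taken from Hugo.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// profileWork runs work with CPU profiling enabled and returns the size of
// the collected profile in bytes. While the profile is active the runtime
// receives periodic profiling signals and records a stack sample for each.
func profileWork(work func()) (int, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return 0, err // for example, a profile is already running
	}
	work()
	pprof.StopCPUProfile() // flushes the profile into buf before returning
	return buf.Len(), nil
}

// busy burns some CPU so that the profiler has something to sample.
func busy() {
	sum := 0
	for i := 0; i < 50_000_000; i++ {
		sum += i % 7
	}
	_ = sum
}

func main() {
	n, err := profileWork(busy)
	if err != nil {
		fmt.Println("could not start CPU profile:", err)
		return
	}
	fmt.Printf("collected %d bytes of CPU profile\n", n)
}
```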

Member

prattmic commented Mar 1, 2023

Oops, I forgot to add git checkout of the release branch to the instructions above. I've edited to add it.

Member Author

myitcv commented Mar 1, 2023

Thanks.

> I'm not sure, lldb should work fine too. bt all is the equivalent command in lldb.

A fresh stack trace and lldb output here:

https://gist.github.com/myitcv/32f362f6c12e829c348383b2617815c5

> If you aren't sure, but can readily reproduce this issue, it is probably simplest to just test with a Go toolchain built from release-branch.go1.20

Doing so now.

Member Author

myitcv commented Mar 1, 2023

> Oops, I forgot to add git checkout of the release branch to the instructions above. I've edited to add it.

No problems. I actually have go version go1.20.1 darwin/arm64 available to perform a build.

Is that good enough?

Member

prattmic commented Mar 1, 2023

> > Oops, I forgot to add git checkout of the release branch to the instructions above. I've edited to add it.
>
> No problems. I actually have go version go1.20.1 darwin/arm64 available to perform a build.
>
> Is that good enough?

No, the CL is merged to release-branch.go1.20 for 1.20.2, which hasn't been released yet.

Member

prattmic commented Mar 1, 2023

* thread #8, stop reason = ESR_EC_DABORT_EL0 (fault address: 0x16c107b50)
  * frame #0: 0x000000010411f8dc hugo`runtime.(*sigctxt).preparePanic + 76
    frame #1: 0x0000000104120a7c hugo`runtime.sighandler + 556
    frame #2: 0x000000010412051c hugo`runtime.sigtrampgo + 524
    frame #3: 0x000000010413f7bc hugo`runtime.sigtrampgo.abi0 + 28
    frame #4: 0x000000010413ec3c hugo`runtime.sigtramp.abi0 + 76
    frame #5: 0x00000001afbcc2a4 libsystem_platform.dylib`_sigtramp + 56
    frame #6: 0x00000001afbcc2a4 libsystem_platform.dylib`_sigtramp + 56
    frame #7: 0x000000010571d79c hugo`Sass::Eval::operator()(Sass::Binary_Expression*) + 4020
    frame #8: 0x000000010571d79c hugo`Sass::Eval::operator()(Sass::Binary_Expression*) + 4020
    frame #9: 0x0000000105728fa4 hugo`Sass::Eval::operator()(Sass::Argument*) + 72
    frame #10: 0x00000001057294e0 hugo`Sass::Eval::operator()(Sass::Arguments*) + 232
    frame #11: 0x0000000105722440 hugo`Sass::Eval::operator()(Sass::Function_Call*) + 1524
    frame #12: 0x0000000105728fa4 hugo`Sass::Eval::operator()(Sass::Argument*) + 72
    ... lots more frames (798 total) ...

This thread is panicking. I'm actually not 100% sure if this is just from the signal you sent to generate the core file, or it was actually stuck here before the core dump.

If the latter, it might be #58513, as that can get stuck in panic. OTOH, typically that bug should appear in traceback code in the panic, not in preparePanic.

That makes me think it is the former, in which case it looks like the C++ library was just stuck executing forever itself.

I guess another way to look at this would be to execute the program directly under lldb. Let it run until it seems stuck, then stop it and check bt all to see if it is in a panic, or just in the C++ library.

Member Author

myitcv commented Mar 1, 2023

Thanks for this.

> If the latter, it might be #58513, as that can get stuck in panic. OTOH, typically that bug should appear in traceback code in the panic, not in preparePanic.

In which case I will continue with my efforts to try and repro this building hugo with a later Go version, to conclusively rule that in/out.

Member Author

myitcv commented Mar 1, 2023

> No, the CL is merged to release-branch.go1.20 for 1.20.2, which hasn't been released yet.

Noted, thanks.

Member Author

myitcv commented Mar 1, 2023

Ok, I was unable to reproduce this with the latest commit on the go1.20 branch. Which led me to bisect to find the fix.

Turns out the scenario I was running into was fixed by 7d3a5a5.

Member Author

myitcv commented Mar 1, 2023

@prattmic - if you're happy 7d3a5a5 can explain what we've seen here, please feel free to close this issue.

Member

prattmic commented Mar 2, 2023

Huh, well you certainly had a deep cgo stack, so that makes sense. I don't fully understand why the program was hanging rather than crashing with a SIGSEGV on stack overflow, but that is OK.
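
To illustrate what a "deep stack" means for a recursive evaluator like the `Sass::Eval` frames in the lldb output, here is a pure-Go sketch. It is not libsass code and not a reproduction of the bug: the `expr` type and the `nest` and `eval` functions are hypothetical. It shows only that recursion depth grows with how deeply binary expressions are nested. Goroutine stacks grow on demand, so this runs fine in Go, whereas C and C++ code called through cgo runs on a fixed-size thread stack and can overflow it.

```go
package main

import "fmt"

// expr is a hypothetical expression node. A node with no children is a
// leaf holding value; otherwise it represents left + right.
type expr struct {
	left, right *expr
	value       int
}

// nest builds a left-leaning chain of n additions: (((1+1)+1)+ ... ).
func nest(n int) *expr {
	e := &expr{value: 1}
	for i := 0; i < n; i++ {
		e = &expr{left: e, right: &expr{value: 1}}
	}
	return e
}

// eval evaluates e recursively. depth is the recursion level of this call;
// the second result is the deepest level reached anywhere below it.
func eval(e *expr, depth int) (sum, maxDepth int) {
	if e.left == nil {
		return e.value, depth
	}
	l, ld := eval(e.left, depth+1)
	r, rd := eval(e.right, depth+1)
	if rd > ld {
		ld = rd
	}
	return l + r, ld
}

func main() {
	// One recursive call per nesting level: depth scales with the input.
	sum, depth := eval(nest(100000), 1)
	fmt.Println("sum:", sum, "max recursion depth:", depth)
}
```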

prattmic closed this as completed Mar 2, 2023
golang locked and limited conversation to collaborators Mar 1, 2024

3 participants