runtime: random crashes on macOS 13 Ventura Public Beta #53800

Closed
ismail opened this issue Jul 11, 2022 · 12 comments
Labels: compiler/runtime, FrozenDueToAge, help wanted, NeedsInvestigation, OS-Darwin, WaitingForInfo
@ismail

ismail commented Jul 11, 2022

What version of Go are you using (go version)?

❯ go version
go version go1.18.3 darwin/arm64

What operating system and processor architecture are you using (go env)?

go env Output
❯ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/ismail/Library/Caches/go-build"
GOENV="/Users/ismail/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/ismail/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/ismail/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/opt/homebrew/Cellar/go/1.18.3/libexec"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/opt/homebrew/Cellar/go/1.18.3/libexec/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.18.3"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/68/1pjj14514sl3vj7lbt69lst40000gn/T/go-build2074291119=/tmp/go-build -gno-record-gcc-switches -fno-common"

Build failure

❯ gotip download
Updating the go development tree...
remote: Finding sources: 100% (79/79)
remote: Total 79 (delta 33), reused 77 (delta 33)
Receiving objects: 100% (79/79), 426.63 KiB | 6.56 MiB/s, done.
Resolving deltas: 100% (33/33), completed with 20 local objects.
From https://go.googlesource.com/go
 * branch              master     -> FETCH_HEAD
   59ab6f35..123a6328  master     -> origin/master
Previous HEAD position was 59ab6f35 net/http: remove Content-Encoding in writeNotModified
HEAD is now at 123a6328 internal/trace: don't report regions on system goroutines
Building Go cmd/dist using /opt/homebrew/Cellar/go/1.18.3/libexec. (go1.18.3 darwin/arm64)
Building Go toolchain1 using /opt/homebrew/Cellar/go/1.18.3/libexec.
Building Go bootstrap cmd/go (go_bootstrap) using Go toolchain1.
Building Go toolchain2 using go_bootstrap and Go toolchain1.
Building Go toolchain3 using go_bootstrap and Go toolchain2.
Building packages and commands for darwin/arm64.
# html/template
/Users/ismail/sdk/gotip/src/html/template/escape.go:315:36: internal compiler error: missed typecheck:
.   NE tc(2) # escape.go:362:36,escape.go:315:36
.   .   LEN int tc(1) # escape.go:362:36,escape.go:315:36
.   .   .   NAME-template.norm esc(no) Class:PAUTO Offset:0 InlLocal OnStack Used string tc(1) # escape.go:362:5,escape.go:315:36
.   .   LITERAL-0 untyped int tc(1) # escape.go:362:36,escape.go:315:36

goroutine 1 [running]:
runtime/debug.Stack()
	/Users/ismail/sdk/gotip/src/runtime/debug/stack.go:24 +0x64
cmd/compile/internal/base.FatalfAt({0xf011a8?, 0x1?}, {0x100cccffb, 0x15}, {0x14000cd2028, 0x1, 0x1})
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/base/print.go:227 +0x224
cmd/compile/internal/base.Fatalf(...)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/base/print.go:196
cmd/compile/internal/walk.walkExpr({0x100f011a8, 0x14000572780}, 0x14000f74be0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/expr.go:49 +0x2c0
cmd/compile/internal/walk.walkConv(0x14000575a90, 0x1006a5288?)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/convert.go:23 +0x3c
cmd/compile/internal/walk.walkExpr1({0x100f01820, 0x14000575a90}, 0x14000575a90?)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/expr.go:220 +0x450
cmd/compile/internal/walk.walkExpr({0x100f01820, 0x14000575a90}, 0x14000f74be0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/expr.go:56 +0x370
cmd/compile/internal/walk.finishCompare(0x14000f77920, {0x100f011a8?, 0x14000572780?}, 0x14000cd2c10?)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/compare.go:450 +0x4c
cmd/compile/internal/walk.walkCompareString(0x14000f77920, 0x14000f74be0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/compare.go:415 +0x1000
cmd/compile/internal/walk.walkCompare(0x14000f77920, 0x14000f74be0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/compare.go:48 +0xf9c
cmd/compile/internal/walk.walkExpr1({0x100f011a8, 0x14000f77920}, 0x14000f77920?)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/expr.go:156 +0x5cc
cmd/compile/internal/walk.walkExpr({0x100f011a8, 0x14000f77920}, 0x14000f74be0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/expr.go:56 +0x370
cmd/compile/internal/walk.walkIf(0x14000f74bd0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:229 +0x30
cmd/compile/internal/walk.walkStmt({0x100f01de0, 0x14000f74bd0?})
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:133 +0x344
cmd/compile/internal/walk.walkStmtList({0x1400119a000, 0xb, 0x14000817b30?})
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:177 +0x68
cmd/compile/internal/walk.walkExpr({0x100f032b8, 0x14000817b30}, 0x14000818d30)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/expr.go:38 +0xc4
cmd/compile/internal/walk.walkIf(0x14000818d20)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:229 +0x30
cmd/compile/internal/walk.walkStmt({0x100f01de0, 0x14000818d20?})
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:133 +0x344
cmd/compile/internal/walk.walkStmtList(...)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:177
cmd/compile/internal/walk.walkFor(0x14001200240)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:194 +0x250
cmd/compile/internal/walk.walkStmt({0x100f01b00, 0x14001200240?})
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:129 +0x300
cmd/compile/internal/walk.walkStmtList(...)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:177
cmd/compile/internal/walk.walkIf(0x14001206bd0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:230 +0xc4
cmd/compile/internal/walk.walkStmt({0x100f01de0, 0x14001206bd0?})
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:133 +0x344
cmd/compile/internal/walk.walkRange(0x14000a65440)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/range.go:311 +0x3d10
cmd/compile/internal/walk.walkStmt({0x100f027f0, 0x14000a65440?})
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:167 +0x3a0
cmd/compile/internal/walk.walkStmtList(...)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/stmt.go:177
cmd/compile/internal/walk.Walk(0x14000cb23c0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/walk/walk.go:43 +0x194
cmd/compile/internal/gc.prepareFunc(0x14000cb23c0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/gc/compile.go:92 +0x80
cmd/compile/internal/gc.enqueueFunc(0x14000cb23c0)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/gc/compile.go:66 +0x29c
cmd/compile/internal/gc.Main(0x100ef6d80)
	/Users/ismail/sdk/gotip/src/cmd/compile/internal/gc/main.go:300 +0x10f8
main.main()
	/Users/ismail/sdk/gotip/src/cmd/compile/main.go:57 +0xf4

go tool dist: FAILED: /Users/ismail/sdk/gotip/pkg/tool/darwin_arm64/go_bootstrap install std cmd: exit status 2
gotip: failed to build go: exit status 2

@ismail changed the title from "gotip build fails on macOS Ventura Public Veta" to "gotip build fails on macOS Ventura Public Beta" on Jul 11, 2022
@mknyszek changed the title from "gotip build fails on macOS Ventura Public Beta" to "cmd/compile: ICE on macOS Ventura Public Beta" on Jul 11, 2022
@mknyszek added the compiler/runtime and NeedsInvestigation labels on Jul 11, 2022
@mknyszek added this to the Go1.20 milestone on Jul 11, 2022
@mknyszek
Contributor

Putting it in the Go 1.20 milestone for now.

CC @griesemer @mdempsky

@mdempsky
Member

I failed to reproduce this. I checked out 123a632 on my linux/amd64 laptop, ran make.bash, and then ran GOOS=darwin GOARCH=arm64 go build html/template, and it exited with success, not an ICE.

Is anyone else able to reproduce the failure?

@mdempsky added the help wanted and WaitingForInfo labels on Jul 11, 2022
@cherrymui
Member

cherrymui commented Jul 12, 2022

I installed macOS 13 beta (22A5295i) on an M1 machine, and I got random crashes, seg faults, GC errors, and weird errors in the compiler, linker, go command, etc. They look different each time. It looks like memory corruption. I'll keep looking into it.

Given that Go has been working well on the M1 since macOS 11, and it seems to work well in a previous version of macOS 13 beta, there is probably something unexpected in the new beta of the OS.

@cherrymui changed the title from "cmd/compile: ICE on macOS Ventura Public Beta" to "runtime: random crashes on macOS 13 Ventura Public Beta" on Jul 12, 2022
@cherrymui
Member

I found that running with low parallelism (GOMAXPROCS=1) seems to work.

Further, running multiple Go processes in parallel is okay, as long as each Go process uses only 1 thread. GOMAXPROCS=1 go test -p=20 -short std cmd passes multiple times in a row.

On the other hand, running a single process with a high thread count seems to make it fairly likely to fail. GOMAXPROCS=20 go test -p=1 -short std cmd fails very easily.

So it seems the issue is thread switch and/or synchronization within a process.
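
For readers who want a small standalone program to run under different GOMAXPROCS settings, here is a minimal sketch. It is not the test used above (that was go test -short std cmd), and it is not known whether it would have triggered the bug on the affected beta. The function name stress, the worker and round counts, and the buffer size are all hypothetical choices for illustration. Each goroutine fills a buffer with its own byte pattern, yields, and then checks that the contents are unchanged.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stress starts `workers` goroutines. Each one repeatedly allocates a
// buffer, fills it with a worker-specific byte, yields the processor, and
// then verifies the buffer. It returns the number of mismatched bytes seen.
// (Hypothetical helper, not taken from the Go tree.)
func stress(workers, rounds int) int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		bad int
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			pattern := byte(id%251 + 1) // never zero, so a cleared byte is detectable
			for r := 0; r < rounds; r++ {
				buf := make([]byte, 4096)
				for i := range buf {
					buf[i] = pattern
				}
				runtime.Gosched() // give the scheduler a chance to switch goroutines
				for _, b := range buf {
					if b != pattern {
						mu.Lock()
						bad++
						mu.Unlock()
					}
				}
			}
		}(w)
	}
	wg.Wait()
	return bad
}

func main() {
	// GOMAXPROCS(0) only reads the current setting; it does not change it.
	fmt.Printf("GOMAXPROCS=%d corrupted bytes=%d\n",
		runtime.GOMAXPROCS(0), stress(64, 2000))
}
```

Saved as, say, stress.go (a hypothetical file name), it can be run as GOMAXPROCS=1 go run stress.go and again as GOMAXPROCS=20 go run stress.go to compare the two settings. On a healthy system it should report zero corrupted bytes in both cases.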

As one experiment on the synchronization side, I tried adding a memory barrier instruction (DMB $0xe) around all our atomic operations, but it doesn't help. This is probably expected. It is an OS update, not a hardware update, so the behavior of hardware synchronization instructions shouldn't change.

Maybe something changed for thread switch and/or synchronization in the kernel, or libSystem?
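
To illustrate the kind of cross-thread ordering that atomic operations are expected to provide, here is a minimal sketch at the user level, using sync/atomic rather than the runtime's internal atomics. The function name publishAndRead is hypothetical. One goroutine writes a plain variable and then sets a flag with an atomic store; the other waits for the flag with atomic loads and then reads the variable. Under the Go memory model the reader must observe the written value.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// publishAndRead writes `value` to a plain variable, publishes it with an
// atomic store, and returns what a second goroutine reads after observing
// the flag with an atomic load. (Hypothetical helper for illustration.)
func publishAndRead(value int) int {
	var data int
	var ready uint32
	done := make(chan int)

	go func() {
		// Spin until the writer signals that data is ready.
		for atomic.LoadUint32(&ready) == 0 {
			runtime.Gosched()
		}
		done <- data // ordered after the writer's store to data
	}()

	data = value                  // plain write
	atomic.StoreUint32(&ready, 1) // publish
	return <-done
}

func main() {
	fmt.Println(publishAndRead(42))
}
```

If a correct program like this one misbehaved, the fault would lie below the program (runtime, OS, or hardware), which is the kind of possibility the barrier experiment above was probing.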

@ismail
Author

ismail commented Jul 13, 2022

Reported to Apple via Feedback tool, got assigned FB10667468.

@cherrymui
Member

I found that with memory reuse disabled (GOGC=off to turn off the GC, plus the following patch to stop recycling stack memory), it no longer crashes.

diff --git a/src/runtime/stack.go b/src/runtime/stack.go
index 2a7f0bd1c3..9df845c6cc 100644
--- a/src/runtime/stack.go
+++ b/src/runtime/stack.go
@@ -445,6 +445,10 @@ func stackalloc(n uint32) stack {
 //
 //go:systemstack
 func stackfree(stk stack) {
+	if gcController.gcPercent.Load() < 0 {
+		return // XXX don't recycle stack memory if GOGC=off
+	}
+
 	gp := getg()
 	v := unsafe.Pointer(stk.lo)
 	n := stk.hi - stk.lo

It is possible that we miss some sort of barrier when reusing memory. But as it works fine on a variety of weak-memory-model architectures, as well as previous versions of macOS, it is still weird.

I also tried allowing Go GC but disabling returning memory to the OS, i.e. comment out all the code in sysUnusedOS, sysUsedOS, and sysFreeOS in runtime/mem_darwin.go, but it doesn't help.

@kaorihinata

kaorihinata commented Jul 22, 2022

Just to add more logs to the pile, as I've been experiencing this since Developer Beta 2. On Developer Beta 1 you could still use Xcode 13, as Xcode 14 Beta 1 was broken in a different way. I didn't start getting these random crashes until Developer Beta 2 onward. I've also had other things crashing with error messages about unaligned memory access, if that gives any hints.

I can confirm that dropping down concurrency to 1 does work pretty consistently for me though (at least with the Go build process.)

I submitted this to Apple on the 6th of July. ID is FB10564832. I updated it to mention that it's probably the same issue discussed here.

error_1.log
error_2.log
error_3.log
error_4.log
error_5.log

Edit: Also, yes. It was on arm64. I've not tested the build process with x86_64 as the arch yet.

@meta-github

Does this bug only affect the golang compiler or does it also affect applications created by the golang compiler?

@kaorihinata

While I would need someone more familiar with Go errors/backtraces to weigh in, I'm tempted to say applications produced by the Go compiler are also affected. I started receiving random crashes relating to unexpected values from Terraform (written in Go) around the same time that I started receiving the above compiler crashes. In the case of the Go compiler, I would assume that it just so happens to create the ideal conditions to trigger this issue. Backtrace of an example Terraform crash attached, reproduced today.

error_6.log

@cherrymui
Member

It is not limited to the compiler. It also affects programs generated by the Go compiler, and as an OS bug it may affect non-Go programs as well. Apple is working on a fix. Thanks.

@kaorihinata

As of macOS 13 Beta 6 I am no longer able to replicate this issue, either by building go or by building terraform-provider-aws (which was one of the hardest to get to build in Beta 5) repeatedly. It appears that the fix we were waiting for is in Beta 6. I may repeat the builds a few more times, but I've already built both 3 or 4 times without issue.

@cherrymui
Member

Yeah, I think beta 6 should fix this, although I haven't been able to test myself. Thanks for confirming.

Closing for now. We can reopen if we see it again.
