Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: fatal error: acquireSudog: found s.elem != nil in cache #48188

Closed
AdamKorcz opened this issue Sep 4, 2021 · 8 comments
Closed

runtime: fatal error: acquireSudog: found s.elem != nil in cache #48188

AdamKorcz opened this issue Sep 4, 2021 · 8 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@AdamKorcz
Copy link

What version of Go are you using (go version)?

$ go version go1.17 linux/amd64

Does this issue reproduce with the latest release?

Reproducer is not yet created.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/root/.cache/go-build"
GOENV="/root/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/root/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/root/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.17"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build3562792503=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I encountered this bug after running a go-fuzz fuzzer for about 1 week. The fuzzer is currently not publicly available.

What did you expect to see?

I expected not to see this crash.

What did you see instead?

fatal error: acquireSudog: found s.elem != nil in cache

goroutine 1355056023 [running]:
runtime.throw(0x2319a0b, 0x2a)
	runtime/panic.go:1117 +0x74 fp=0x10c0009b4468 sp=0x10c0009b4438 pc=0x58e7b4
runtime.acquireSudog(0x10c000a91bc0)
	runtime/proc.go:383 +0x376 fp=0x10c0009b44d8 sp=0x10c0009b4468 pc=0x591af6
runtime.selectgo(0x10c0009b4768, 0x10c0009b46c0, 0x0, 0x0, 0x4, 0x10c0009b4701, 0xf06665, 0x10c0001581c0)
	runtime/select.go:298 +0xbf7 fp=0x10c0009b4610 sp=0x10c0009b44d8 pc=0x5a2957
google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0x10c0006b9380)
	google.golang.org/grpc@v1.38.0/internal/transport/http2_server.go:993 +0x2c5 fp=0x10c0009b47d8 sp=0x10c0009b4610 pc=0xe46d45
runtime.goexit()
	runtime/asm_amd64.s:1371 +0x1 fp=0x10c0009b47e0 sp=0x10c0009b47d8 pc=0x5ca1a1
created by google.golang.org/grpc/internal/transport.newHTTP2Server
	google.golang.org/grpc@v1.38.0/internal/transport/http2_server.go:300 +0x1545

goroutine 17 [select, locked to thread]:
google.golang.org/grpc/internal/transport.(*Stream).waitOnHeader(0x10c000778120)
	google.golang.org/grpc@v1.38.0/internal/transport/transport.go:322 +0xbf
google.golang.org/grpc/internal/transport.(*Stream).RecvCompress(...)
	google.golang.org/grpc@v1.38.0/internal/transport/transport.go:337
google.golang.org/grpc.(*csAttempt).recvMsg(0x10c0007b9c00, 0x2a66de0, 0x10c0006f6c80, 0x0, 0x0, 0x0)
	google.golang.org/grpc@v1.38.0/stream.go:943 +0x911
google.golang.org/grpc.(*clientStream).RecvMsg.func1(0x10c0007b9c00, 0x4124330, 0x0)
	google.golang.org/grpc@v1.38.0/stream.go:808 +0x57
google.golang.org/grpc.(*clientStream).withRetry(0x10c000133e60, 0x10c000759830, 0x10c000759800, 0x10c000985970, 0xf0a750)
	google.golang.org/grpc@v1.38.0/stream.go:666 +0x130
google.golang.org/grpc.(*clientStream).RecvMsg(0x10c000133e60, 0x2a66de0, 0x10c0006f6c80, 0x0, 0x0)
	google.golang.org/grpc@v1.38.0/stream.go:807 +0x145
google.golang.org/grpc.invoke(0x2b99920, 0x10c00032de60, 0x2319327, 0x2a, 0x2a66ca0, 0x10c0006f6c00, 0x2a66de0, 0x10c0006f6c80, 0x10c0004b0700, 0x10c00089f540, ...)
	google.golang.org/grpc@v1.38.0/call.go:73 +0x17f
google.golang.org/grpc.(*ClientConn).Invoke(0x10c0004b0700, 0x2b99920, 0x10c00032de60, 0x2319327, 0x2a, 0x2a66ca0, 0x10c0006f6c00, 0x2a66de0, 0x10c0006f6c80, 0x0, ...)
	google.golang.org/grpc@v1.38.0/call.go:37 +0x22a
github.com/containerd/containerd/api/services/images/v1.(*imagesClient).List(0x10c0001b65d0, 0x2b99920, 0x10c00032de60, 0x10c0006f6c00, 0x0, 0x0, 0x0, 0x10c0001b65d0, 0x10c0008b2878, 0x2b36230)
	github.com/containerd/containerd/api@v0.0.0/services/images/v1/images.pb.go:573 +0xe5
github.com/containerd/containerd.(*remoteImages).List(0x10c00009eba0, 0x2b99920, 0x10c00032de60, 0x0, 0x0, 0x0, 0x30, 0x10c00032de60, 0x10c000064000, 0x10c000064000, ...)
	github.com/containerd/containerd/image_store.go:53 +0xb7
github.com/containerd/containerd.(*Client).ListImages(0x10c0008b27e0, 0x2b99920, 0x10c00032de60, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	github.com/containerd/containerd/client.go:490 +0xa9
github.com/containerd/containerd/contrib/fuzz.FuzzContainerdImport(0x6020000080f0, 0x1, 0x1, 0x0)
	github.com/containerd/containerd/contrib/fuzz/containerd_import_fuzzer.go:188 +0x553
main.LLVMFuzzerTestOneInput(...)
	command-line-arguments/main.038934106.go:21

goroutine 20 [chan receive]:
k8s.io/klog/v2.(*loggingT).flushDaemon(0x40f3b20)
	k8s.io/klog/v2@v2.9.0/klog.go:1169 +0xab
created by k8s.io/klog/v2.init.0
	k8s.io/klog/v2@v2.9.0/klog.go:420 +0xff

goroutine 62 [chan receive, 13388 minutes]:
github.com/containerd/containerd/cmd/containerd/command.App.func1(0x10c0004c8000, 0x0, 0x0)
	github.com/containerd/containerd/cmd/containerd/command/main.go:253 +0xeec
github.com/urfave/cli.HandleAction(0x287ff80, 0x2b318c8, 0x10c0004c8000, 0x0, 0x0)
	github.com/urfave/cli@v1.22.2/app.go:523 +0x1ab
github.com/urfave/cli.(*App).Run(0x10c000158000, 0x10c000051a40, 0x2, 0x2, 0x0, 0x0)
	github.com/urfave/cli@v1.22.2/app.go:285 +0x7a6
github.com/containerd/containerd/cmd/containerd/command.StartDaemonForFuzzing(0x10c000051a40, 0x2, 0x2)
	github.com/containerd/containerd/cmd/containerd/command/main.go:375 +0x56
github.com/containerd/containerd/contrib/fuzz.init.0.func1(0x10c000051a40, 0x2, 0x2)
	github.com/containerd/containerd/contrib/fuzz/containerd_import_fuzzer.go:48 +0x50
created by github.com/containerd/containerd/contrib/fuzz.init.0
	github.com/containerd/containerd/contrib/fuzz/containerd_import_fuzzer.go:43 +0x9e

goroutine 7 [select]:
github.com/containerd/containerd/cmd/containerd/command.handleSignals.func1(0x10c0005a4420, 0x10c0005a43c0, 0x2b99920, 0x10c0004642d0, 0x10c0000b6000)
	github.com/containerd/containerd/cmd/containerd/command/main_unix.go:43 +0xee
created by github.com/containerd/containerd/cmd/containerd/command.handleSignals
	github.com/containerd/containerd/cmd/containerd/command/main_unix.go:40 +0x9a

goroutine 41 [syscall]:
os/signal.signal_recv(0x2b761e8)
	runtime/sigqueue.go:168 +0xa5
os/signal.loop()
	os/signal/signal_unix.go:23 +0x45
created by os/signal.Notify.func1.1
	os/signal/signal.go:151 +0x78

goroutine 203 [sleep]:
time.Sleep(0x2540be400)
	runtime/time.go:193 +0xd8
github.com/containerd/containerd/runtime/restart/monitor.(*monitor).run(0x10c0005881d0, 0x2540be400)
	github.com/containerd/containerd/runtime/restart/monitor/monitor.go:166 +0x85
created by github.com/containerd/containerd/runtime/restart/monitor.init.0.func1
	github.com/containerd/containerd/runtime/restart/monitor/monitor.go:87 +0x167

goroutine 201 [syscall, 13388 minutes]:
syscall.Syscall6(0xe8, 0x7, 0x10c0006419a8, 0x80, 0xffffffffffffffff, 0x0, 0x0, 0x6c80d8, 0x10c00025e247, 0x1)
	syscall/asm_linux_amd64.s:43 +0x5
golang.org/x/sys/unix.EpollWait(0x7, 0x10c0006419a8, 0x80, 0x80, 0xffffffffffffffff, 0x10c000044420, 0x2, 0x0)
	golang.org/x/sys@v0.0.0-20210630005230-0f9fa26af87c/unix/zsyscall_linux_amd64.go:77 +0xaf
github.com/containerd/containerd/metrics/cgroups/v1.(*oomCollector).start(0x10c0000feb80)
	github.com/containerd/containerd/metrics/cgroups/v1/oom.go:114 +0x9b
created by github.com/containerd/containerd/metrics/cgroups/v1.newOOMCollector
	github.com/containerd/containerd/metrics/cgroups/v1/oom.go:50 +0x11e

goroutine 225 [IO wait]:
internal/poll.runtime_pollWait(0x7f4fc2ff0da0, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000240998, 0x72, 0x0, 0x0, 0x22d349e)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0x10c000240980, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:401 +0x389
net.(*netFD).accept(0x10c000240980, 0x2, 0x10c000116af0, 0x5d46cc)
	net/fd_unix.go:172 +0x53
net.(*UnixListener).accept(0x10c0001e23c0, 0x10c000116b08, 0x10c000116b10, 0x28)
	net/unixsock_posix.go:162 +0x45
net.(*UnixListener).Accept(0x10c0001e23c0, 0x2b339c0, 0x10c0001581c0, 0x2b92370, 0x10c0001e23c0)
	net/unixsock.go:260 +0xa5
google.golang.org/grpc.(*Server).Serve(0x10c0001581c0, 0x2b92370, 0x10c0001e23c0, 0x0, 0x0)
	google.golang.org/grpc@v1.38.0/server.go:786 +0x2bd
github.com/containerd/containerd/services/server.(*Server).ServeGRPC(0x10c0003d2a00, 0x2b92370, 0x10c0001e23c0, 0x18, 0x22dfa16)
	github.com/containerd/containerd/services/server/server.go:272 +0x94
github.com/containerd/containerd/cmd/containerd/command.serve.func1(0x2b92370, 0x10c0001e23c0, 0x10c0005d93e0, 0x2b99920, 0x10c0001e2450, 0x10c0005dc4c0, 0x1f)
	github.com/containerd/containerd/cmd/containerd/command/main.go:267 +0x94
created by github.com/containerd/containerd/cmd/containerd/command.serve
	github.com/containerd/containerd/cmd/containerd/command/main.go:265 +0x32e

goroutine 200 [select]:
github.com/containerd/containerd/gc/scheduler.(*gcScheduler).run(0x10c00008acc0, 0x2b99920, 0x10c0004642d0)
	github.com/containerd/containerd/gc/scheduler/scheduler.go:268 +0x1d1
created by github.com/containerd/containerd/gc/scheduler.init.0.func1
	github.com/containerd/containerd/gc/scheduler/scheduler.go:132 +0x485

goroutine 222 [syscall, 13388 minutes]:
syscall.Syscall6(0xe8, 0xa, 0x10c00072bbe4, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
	syscall/asm_linux_amd64.s:43 +0x5
golang.org/x/sys/unix.EpollWait(0xa, 0x10c00072bbe4, 0x7, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0)
	golang.org/x/sys@v0.0.0-20210630005230-0f9fa26af87c/unix/zsyscall_linux_amd64.go:77 +0xaf
github.com/fsnotify/fsnotify.(*fdPoller).wait(0x10c0005dc3e0, 0x0, 0x0, 0x0)
	github.com/fsnotify/fsnotify@v1.4.9/inotify_poller.go:86 +0xbc
github.com/fsnotify/fsnotify.(*Watcher).readEvents(0x10c0001c8b40)
	github.com/fsnotify/fsnotify@v1.4.9/inotify.go:192 +0x229
created by github.com/fsnotify/fsnotify.NewWatcher
	github.com/fsnotify/fsnotify@v1.4.9/inotify.go:59 +0x20d

goroutine 224 [IO wait, 13388 minutes]:
internal/poll.runtime_pollWait(0x7f4fc2ff0e88, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000240818, 0x72, 0x0, 0x0, 0x22d349e)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0x10c000240800, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:401 +0x389
net.(*netFD).accept(0x10c000240800, 0x10c000118b00, 0x10d7be5, 0x10c0005ca160)
	net/fd_unix.go:172 +0x53
net.(*UnixListener).accept(0x10c0001e2300, 0x34320000, 0x2b92370, 0x10c0001e2300)
	net/unixsock_posix.go:162 +0x45
net.(*UnixListener).Accept(0x10c0001e2300, 0x2b92370, 0x10c0001e2300, 0xa, 0x10c000118be8)
	net/unixsock.go:260 +0xa5
github.com/containerd/ttrpc.(*Server).Serve(0x10c0005ca150, 0x2b998b0, 0x10c0001b0000, 0x2b92370, 0x10c0001e2300, 0x0, 0x0)
	github.com/containerd/ttrpc@v1.0.2/server.go:87 +0x13b
github.com/containerd/containerd/services/server.(*Server).ServeTTRPC(0x10c0003d2a00, 0x2b92370, 0x10c0001e2300, 0x19, 0x22cf35c)
	github.com/containerd/containerd/services/server/server.go:277 +0x70
github.com/containerd/containerd/cmd/containerd/command.serve.func1(0x2b92370, 0x10c0001e2300, 0x10c0005d9350, 0x2b99920, 0x10c0001e2390, 0x10c0005d0720, 0x25)
	github.com/containerd/containerd/cmd/containerd/command/main.go:267 +0x94
created by github.com/containerd/containerd/cmd/containerd/command.serve
	github.com/containerd/containerd/cmd/containerd/command/main.go:265 +0x32e

goroutine 259 [chan receive]:
github.com/containerd/containerd/pkg/cri/server.(*snapshotsSyncer).start.func1(0x10c0001c94f0, 0x10c0004d1b20)
	github.com/containerd/containerd/pkg/cri/server/snapshots.go:65 +0x76
created by github.com/containerd/containerd/pkg/cri/server.(*snapshotsSyncer).start
	github.com/containerd/containerd/pkg/cri/server/snapshots.go:57 +0x71

goroutine 261 [IO wait, 13388 minutes]:
internal/poll.runtime_pollWait(0x7f4fc2ff0cb8, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000240d18, 0x72, 0x0, 0x0, 0x22d349e)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0x10c000240d00, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:401 +0x389
net.(*netFD).accept(0x10c000240d00, 0x10c0001c9810, 0x7f4fc1c8d100, 0x30)
	net/fd_unix.go:172 +0x53
net.(*TCPListener).accept(0x10c00027e8b8, 0x10c00011dd70, 0x56447a, 0x30)
	net/tcpsock_posix.go:139 +0x45
net.(*TCPListener).Accept(0x10c00027e8b8, 0x2a16ba0, 0x10c000287d70, 0x28c9280, 0x36f59b0)
	net/tcpsock.go:261 +0xa5
net/http.(*Server).Serve(0x10c0005ae380, 0x2b92340, 0x10c00027e8b8, 0x0, 0x0)
	net/http/server.go:2981 +0x31e
github.com/containerd/containerd/pkg/cri/streaming.(*server).Start(0x10c000310240, 0x1, 0x0, 0x0)
	github.com/containerd/containerd/pkg/cri/streaming/server.go:265 +0x1a5
github.com/containerd/containerd/pkg/cri/server.(*criService).Run.func2(0x10c0005cf440, 0x10c00032e000)
	github.com/containerd/containerd/pkg/cri/server/service.go:216 +0x90
created by github.com/containerd/containerd/pkg/cri/server.(*criService).Run
	github.com/containerd/containerd/pkg/cri/server/service.go:214 +0x538

goroutine 226 [select]:
github.com/containerd/containerd/pkg/cri/server.(*eventMonitor).start.func1(0x10c0005cf3e0, 0x10c000552b00, 0x10c000279800)
	github.com/containerd/containerd/pkg/cri/server/events.go:252 +0x136
created by github.com/containerd/containerd/pkg/cri/server.(*eventMonitor).start
	github.com/containerd/containerd/pkg/cri/server/events.go:249 +0xd3

goroutine 260 [select, 13388 minutes]:
github.com/containerd/containerd/pkg/cri/server.(*cniNetConfSyncer).syncLoop(0x10c000274e70, 0x0, 0x0)
	github.com/containerd/containerd/pkg/cri/server/cni_conf_syncer.go:75 +0xf5
github.com/containerd/containerd/pkg/cri/server.(*criService).Run.func1(0x10c0002a6300, 0x10c00032e000)
	github.com/containerd/containerd/pkg/cri/server/service.go:208 +0x6f
created by github.com/containerd/containerd/pkg/cri/server.(*criService).Run
	github.com/containerd/containerd/pkg/cri/server/service.go:206 +0x490

goroutine 199 [select, 13388 minutes]:
github.com/docker/go-events.(*Broadcaster).run(0x10c0001468c0)
	github.com/docker/go-events@v0.0.0-20190806004212-e31b211e4f1c/broadcast.go:117 +0x19a
created by github.com/docker/go-events.NewBroadcaster
	github.com/docker/go-events@v0.0.0-20190806004212-e31b211e4f1c/broadcast.go:39 +0x1d5

goroutine 246 [sync.Cond.Wait, 13388 minutes]:
sync.runtime_notifyListWait(0x10c00003d490, 0x10c000000000)
	runtime/sema.go:513 +0xfc
sync.(*Cond).Wait(0x10c00003d480)
	sync/cond.go:56 +0xf2
github.com/docker/go-events.(*Queue).next(0x10c00057f9b0, 0x0, 0x0)
	github.com/docker/go-events@v0.0.0-20190806004212-e31b211e4f1c/queue.go:103 +0xba
github.com/docker/go-events.(*Queue).run(0x10c00057f9b0)
	github.com/docker/go-events@v0.0.0-20190806004212-e31b211e4f1c/queue.go:68 +0x65
created by github.com/docker/go-events.NewQueue
	github.com/docker/go-events@v0.0.0-20190806004212-e31b211e4f1c/queue.go:29 +0x11e

goroutine 223 [select, 13388 minutes]:
github.com/containerd/containerd/pkg/cri/server.(*criService).Run(0x10c00032e000, 0x2b998b0, 0x10c0001b0000)
	github.com/containerd/containerd/pkg/cri/server/service.go:227 +0x616
github.com/containerd/containerd/pkg/cri.initCRIService.func1(0x2bc15a8, 0x10c00032e000, 0x2b99920, 0x10c0004642d0)
	github.com/containerd/containerd/pkg/cri/cri.go:107 +0x62
created by github.com/containerd/containerd/pkg/cri.initCRIService
	github.com/containerd/containerd/pkg/cri/cri.go:106 +0x797

goroutine 247 [select, 13388 minutes]:
github.com/containerd/containerd/events/exchange.(*Exchange).Subscribe.func3(0x10c00057fa10, 0x10c0001d9ce0, 0x10c000047c80, 0x2b99878, 0x10c000552ac0, 0x10c0004ffd40)
	github.com/containerd/containerd/events/exchange/exchange.go:171 +0x130
created by github.com/containerd/containerd/events/exchange.(*Exchange).Subscribe
	github.com/containerd/containerd/events/exchange/exchange.go:165 +0x2cc

goroutine 1355056036 [IO wait]:
internal/poll.runtime_pollWait(0x7f4fc2ff0748, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000865718, 0x72, 0x8000, 0x8000, 0xffffffffffffffff)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x10c000865700, 0x10c00067c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:166 +0x313
net.(*netFD).Read(0x10c000865700, 0x10c00067c000, 0x8000, 0x8000, 0x9, 0x8, 0x0)
	net/fd_posix.go:55 +0x65
net.(*conn).Read(0x10c000134508, 0x10c00067c000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	net/net.go:183 +0xd0
bufio.(*Reader).Read(0x10c000a320c0, 0x10c0008b28f8, 0x9, 0x9, 0x0, 0x0, 0x10c0009c7c68)
	bufio/bufio.go:227 +0x413
io.ReadAtLeast(0x2b5a548, 0x10c000a320c0, 0x10c0008b28f8, 0x9, 0x9, 0x9, 0x100000000000, 0x561e05, 0x46ff60)
	io/io.go:328 +0xc8
io.ReadFull(...)
	io/io.go:347
golang.org/x/net/http2.readFrameHeader(0x10c0008b28f8, 0x9, 0x9, 0x2b5a548, 0x10c000a320c0, 0x0, 0x0, 0x10c0008a88d0, 0x100000000000000)
	golang.org/x/net@v0.0.0-20210520170846-37e1c6afe023/http2/frame.go:237 +0x98
golang.org/x/net/http2.(*Framer).ReadFrame(0x10c0008b28c0, 0x10c0008a88d0, 0x10c0008a8800, 0x0, 0x0)
	golang.org/x/net@v0.0.0-20210520170846-37e1c6afe023/http2/frame.go:492 +0xc5
google.golang.org/grpc/internal/transport.(*http2Client).reader(0x10c00096c1e0)
	google.golang.org/grpc@v1.38.0/internal/transport/http2_client.go:1347 +0x1f5
created by google.golang.org/grpc/internal/transport.newHTTP2Client
	google.golang.org/grpc@v1.38.0/internal/transport/http2_client.go:346 +0xfdd

goroutine 1355055969 [select]:
google.golang.org/grpc.(*ccBalancerWrapper).watcher(0x10c000552940)
	google.golang.org/grpc@v1.38.0/balancer_conn_wrappers.go:69 +0xe5
created by google.golang.org/grpc.newCCBalancerWrapper
	google.golang.org/grpc@v1.38.0/balancer_conn_wrappers.go:60 +0x17e

goroutine 1355056024 [IO wait]:
internal/poll.runtime_pollWait(0x7f4fc2ff0830, 0x72, 0x46fcd0)
	runtime/netpoll.go:222 +0x65
internal/poll.(*pollDesc).wait(0x10c000997c18, 0x72, 0x8000, 0x8000, 0xffffffffffffffff)
	internal/poll/fd_poll_runtime.go:87 +0x89
internal/poll.(*pollDesc).waitRead(...)
	internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0x10c000997c00, 0x10c000696000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	internal/poll/fd_unix.go:166 +0x313
net.(*netFD).Read(0x10c000997c00, 0x10c000696000, 0x8000, 0x8000, 0x9, 0x8, 0x0)
	net/fd_posix.go:55 +0x65
net.(*conn).Read(0x10c000588428, 0x10c000696000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	net/net.go:183 +0xd0
bufio.(*Reader).Read(0x10c0000e4600, 0x10c0004ab378, 0x9, 0x9, 0x0, 0x0, 0x10c000655cc8)
	bufio/bufio.go:227 +0x413
io.ReadAtLeast(0x2b5a548, 0x10c0000e4600, 0x10c0004ab378, 0x9, 0x9, 0x9, 0x100000000000, 0x561f85, 0x46ffe0)
	io/io.go:328 +0xc8
io.ReadFull(...)
	io/io.go:347
golang.org/x/net/http2.readFrameHeader(0x10c0004ab378, 0x9, 0x9, 0x2b5a548, 0x10c0000e4600, 0x0, 0x0, 0x0, 0x0)
	golang.org/x/net@v0.0.0-20210520170846-37e1c6afe023/http2/frame.go:237 +0x98
golang.org/x/net/http2.(*Framer).ReadFrame(0x10c0004ab340, 0x10c0008a88a0, 0x40f3760, 0x0, 0x0)
	golang.org/x/net@v0.0.0-20210520170846-37e1c6afe023/http2/frame.go:492 +0xc5
google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0x10c0006b9380, 0x10c0005906f0, 0x2b33a00)
	google.golang.org/grpc@v1.38.0/internal/transport/http2_server.go:473 +0xc5
google.golang.org/grpc.(*Server).serveStreams(0x10c0001581c0, 0x2bb0078, 0x10c0006b9380)
	google.golang.org/grpc@v1.38.0/server.go:918 +0x11c
google.golang.org/grpc.(*Server).handleRawConn.func1(0x10c0001581c0, 0x2bb0078, 0x10c0006b9380, 0x10c0005dc4c0, 0x1f)
	google.golang.org/grpc@v1.38.0/server.go:868 +0x50
created by google.golang.org/grpc.(*Server).handleRawConn
	google.golang.org/grpc@v1.38.0/server.go:867 +0x692

goroutine 1355056022 [running]:
	goroutine running on other thread; stack unavailable
created by google.golang.org/grpc/internal/transport.newHTTP2Server
	google.golang.org/grpc@v1.38.0/internal/transport/http2_server.go:289 +0x151f

goroutine 1355055970 [chan receive]:
google.golang.org/grpc.(*addrConn).resetTransport(0x10c0004038c0)
	google.golang.org/grpc@v1.38.0/clientconn.go:1214 +0xb71
created by google.golang.org/grpc.(*addrConn).connect
	google.golang.org/grpc@v1.38.0/clientconn.go:844 +0x2c5

goroutine 1355056037 [runnable]:
google.golang.org/grpc/internal/transport.(*controlBuffer).get(0x10c00008db80, 0x1, 0x0, 0x0, 0x0, 0x0)
	google.golang.org/grpc@v1.38.0/internal/transport/controlbuf.go:407 +0x1cc
google.golang.org/grpc/internal/transport.(*loopyWriter).run(0x10c000a32180, 0x0, 0x0)
	google.golang.org/grpc@v1.38.0/internal/transport/controlbuf.go:527 +0x2fc
google.golang.org/grpc/internal/transport.newHTTP2Client.func3(0x10c00096c1e0)
	google.golang.org/grpc@v1.38.0/internal/transport/http2_client.go:396 +0x8c
created by google.golang.org/grpc/internal/transport.newHTTP2Client
	google.golang.org/grpc@v1.38.0/internal/transport/http2_client.go:394 +0x15f5
@ianlancetaylor
Copy link
Contributor

This may be tough to solve without a reproduction case.

CC @aclements @mknyszek @prattmic

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 5, 2021
@ianlancetaylor ianlancetaylor added this to the Backlog milestone Sep 5, 2021
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@kuabhish
Copy link

kuabhish commented Aug 8, 2022

getting the same issue.. !!!

@vignesh-codes
Copy link

@ianlancetaylor @AdamKorcz Did you find any solutions to prevent this from happening?
I am facing the same issue in production for my chat service which handles alot of requests per second and fetches data from redis sources and does some processing to the list struct which i created on the fly.

@prattmic
Copy link
Member

prattmic commented Aug 8, 2022

@kuabhish @vignesh-codes Do either of you have a reproducer you can share?

@prattmic prattmic added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Aug 8, 2022
@vignesh-codes
Copy link

vignesh-codes commented Aug 26, 2022

@prattmic
Hey. Apologies for the delay.
We actually did some changes to the function which had for loop that spawned go routine at each iterations.
Each go routine fetches some data from redis and appends to a global var(not read by others).

We removed and it worked somehow.

I have given a pseudocode of the function here which caused the issue. No more crashes in our service

response := []map[string]interface{}{}
for _, j := range entries {
    wg.Add(1)
    go func() {
        defer wg.Done()
        data_bytes, err := fetchFromRedis(j)
        if err != nil {
            return
        }
        var data_mapped map[string]interface{}
        err = json.Unmarshal(data_bytes, data_mapped)
        if err != nil  {
            return
        }
        response.append(response, data_mapped)
    }
    wg.Wait()
    return response
}

I am not exactly sure why this issue was happening. We made use of redis pipe for optimisation here. Our production service has lots of traffic per minute. I hope it can help you debug the issue.

@randall77
Copy link
Contributor

Appending to a global variable without synchronization is a data race. Have you tried running with the race detector?

@vignesh-codes
Copy link

hmmm yeah might be coz of that.. have nt checked with race detector. I did not add mutex there since there were no read happening concurrently and was just appending

@gopherbot
Copy link

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@golang golang locked and limited conversation to collaborators Sep 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

7 participants