Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: execution halts with goroutines stuck in runtime.gopark and runtime.futex #58798

Closed
patrick-ogrady opened this issue Mar 1, 2023 · 17 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@patrick-ogrady
Copy link

patrick-ogrady commented Mar 1, 2023

What version of Go are you using (go version)?

go version go1.20.1 linux/amd64

Does this issue reproduce with the latest release?

Yes, with 1.20.1

What operating system and processor architecture are you using (go env)?

Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.4 LTS
Release:	20.04
Codename:	focal

What did you do?

After upgrading avalanchego to use go1.20.1 instead of 1.19.6, the binary now randomly halts (all goroutines stop and the program becomes unresponsive) after about 20-30 minutes of runtime on go1.20.1 linux/amd64.

Notably, the binary does not panic or exit at this point (as you would expect if it was deadlocked). It just stays pegged to ~30% CPU and spins indefinitely on something (but all async processes halt):
image

The output of grs -r on dlv is attached from the point of stall:
goroutines.txt

What did you expect to see?

The program to work like it did on v1.19.6 (not halt without warning or panic). To run this test (and hopefully avoid wasting your time), we compiled the exact same code in both 1.19.6 and 1.20.1 (this only occurred in 1.20.1).

What did you see instead?

The program halted without emitting a panic or any other info that could be used to diagnose the issue. All metrics-related goroutines that would normally inform of system level processes also stopped running (I.e. global program halt).

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Mar 1, 2023
@patrick-ogrady patrick-ogrady changed the title runtime: execution halts with all goroutines stuck in runtime.gopark and runtime.futex runtime: execution halts with goroutines stuck in runtime.gopark and runtime.futex Mar 1, 2023
@patrick-ogrady
Copy link
Author

patrick-ogrady commented Mar 1, 2023

If there is any other info you need, please let us know. We can reproduce this with our application reliably and have a dlv core dump from 20 minutes after the start of the halt.

@patrick-ogrady
Copy link
Author

patrick-ogrady commented Mar 1, 2023

Upon further examination, this appears where at least 1 goroutine gets stuck (waiting for a new object alloction that triggers gcStart):

* Goroutine 4921 - User: /home/runner/work/avalanchego/avalanchego/message/inbound_msg_builder.go:313 github.com/ava-labs/avalanchego/message.(*outMsgBuilder).Chits (0x92477b) (thread 3356575)
	 0  0x00000000004797e3 in runtime.futex
	     at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
	 1  0x00000000004756c0 in runtime.systemstack_switch
	     at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:463
	 2  0x0000000000422c7c in runtime.gcStart
	     at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/mgc.go:665
	 3  0x0000000000415297 in runtime.mallocgc
	     at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/malloc.go:1172
	 4  0x0000000000415507 in runtime.newobject
	     at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/malloc.go:1254
	 5  0x000000000092477b in github.com/ava-labs/avalanchego/message.encodeIDs
	     at /home/runner/work/avalanchego/avalanchego/message/inbound_msg_builder.go:313
	 6  0x000000000092477b in github.com/ava-labs/avalanchego/message.(*outMsgBuilder).Chits
	     at /home/runner/work/avalanchego/avalanchego/message/outbound_msg_builder.go:618
	 7  0x000000000092773c in github.com/ava-labs/avalanchego/message.(*creator).Chits
	     at <autogenerated>:1
	 8  0x0000000000c1d5cb in github.com/ava-labs/avalanchego/snow/networking/sender.(*sender).SendChits
	     at /home/runner/work/avalanchego/avalanchego/snow/networking/sender/sender.go:1160
	 9  0x0000000000cd3027 in github.com/ava-labs/avalanchego/snow/engine/snowman.(*Transitive).sendChits
	     at /home/runner/work/avalanchego/avalanchego/snow/engine/snowman/transitive.go:461
	10  0x0000000000cd0445 in github.com/ava-labs/avalanchego/snow/engine/snowman.(*Transitive).PullQuery
	     at /home/runner/work/avalanchego/avalanchego/snow/engine/snowman/transitive.go:198
	(truncated)

@patrick-ogrady
Copy link
Author

patrick-ogrady commented Mar 1, 2023

one theory from a cursory look (I'm not a contributor so pardon my ignorance) is that gcStart is basically blocked on acquiring the stopTheWorldWithSema:

systemstack(stopTheWorldWithSema)

And the total "halt" we see is because most other locks have already returned but gc can't acquire some non-zero number of them. Alternatively, there could be some off-by-one issue within the world lock sema function that causes it to iterate forever in a for loop instead of returning to continue GC (which would explain the CPU spike).

@prattmic
Copy link
Member

prattmic commented Mar 1, 2023

If you have a core dump, could you open it with gdb rather than dlv, and run “thread apply all bt”?

This should show what the system threads are doing rather than the goroutines.

@patrick-ogrady
Copy link
Author

I'm spinning up a linux box to view the dump in gdb right now (gdb is not supported on my Mac M1).

In the meantime, I used dlv to at least pull the threads from it:

  Thread 3354731 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354732 at 0x4791dd /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:135 runtime.usleep
  Thread 3354733 at 0x409f8e /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/internal/syscall/asm_linux_amd64.s:36 runtime/internal/syscall.Syscall6
  Thread 3354734 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354735 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354736 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354738 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354743 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354812 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354813 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
  Thread 3354861 at 0x45ffdb /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/stubs.go:18 runtime.pcdatavalue
  Thread 3354862 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
* Thread 3356575 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex

@patrick-ogrady
Copy link
Author

patrick-ogrady commented Mar 1, 2023

Here you go:
threads.log


Thread 13 (Thread 0x7fdae66e7700 (LWP 3354862)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0xc00b81a548) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x00000000004471cc in runtime.mPark () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#4  runtime.stopm () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#5  0x0000000000448a1c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
#6  0x0000000000449851 in runtime.schedule () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
#7  0x0000000000449d6d in runtime.park_m (gp=0xc00666f520) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
#8  0x00000000004756a3 in runtime.mcall () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452
#9  0x00007fdb083339ff in ?? ()
#10 0x0100000000475620 in ?? ()
#11 0x0000000000800000 in github.com/ava-labs/avalanchego/ids.ShortFromPrefixedString (idStr=..., prefix=..., ~r0=..., ~r1=...) at /home/runner/work/avalanchego/avalanchego/ids/short.go:45
#12 0x0000000000000000 in ?? ()

Thread 12 (Thread 0x7fdae6ee8700 (LWP 3354861)):
#0  runtime.add (p=0x23cfb6b, x=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/stubs.go:18
#1  runtime.pcdatastart (table=2, f=...) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/symtab.go:1102
#2  runtime.pcdatavalue (f=..., table=2, targetpc=16160960, cache=0xc00b362d98, ~r0=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/symtab.go:1109
#3  0x0000000000468085 in runtime.gentraceback (pc0=16160966, sp0=824633974008, lr0=<optimized out>, gp=0xc00f4f2b60, skip=0, pcbuf=0xc00b362ff0, max=64, callback={void (runtime.stkframe *, void *, bool *)} 0xc00b362fd0, v=0x0, flags=6, ~r0=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/traceback.go:362
#4  0x000000000044ca0e in runtime.sigprof (pc=16160966, sp=<optimized out>, lr=<optimized out>, gp=0xc00f4f2b60, mp=0xc00afd2c00) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:4731
#5  0x0000000000456f39 in runtime.sighandler (sig=<optimized out>, info=<optimized out>, ctxt=<optimized out>, gp=0xf698c0 <github.com/ava-labs/coreth/core.NewHeaderChain.func1>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/signal_unix.go:631
#6  0x0000000000456494 in runtime.sigtrampgo (sig=27, info=0xc00b3634b0, ctx=0xc00b363380) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/signal_unix.go:479
#7  0x0000000000479546 in runtime.sigtramp () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:354
#8  <signal handler called>
#9  github.com/ava-labs/coreth/core.NewFIFOCache[go.shape.uint64,go.shape.*uint8] (limit=<optimized out>) at /home/runner/go/pkg/mod/github.com/ava-labs/coreth@v0.11.8-rc.1/core/fifo_cache.go:32
#10 github.com/ava-labs/coreth/core.NewHeaderChain.func1 () at /home/runner/go/pkg/mod/github.com/ava-labs/coreth@v0.11.8-rc.1/core/headerchain.go:88
#11 0x0000000000f8567e in github.com/ava-labs/coreth/core.(*BoundedBuffer[go.shape.uint64]).Insert (b=0xc00bbea4c0, h=19386922) at /home/runner/go/pkg/mod/github.com/ava-labs/coreth@v0.11.8-rc.1/core/bounded_buffer.go:42
#12 0x0000000000f85566 in github.com/ava-labs/coreth/core.(*BufferFIFOCache[go.shape.uint64,go.shape.*uint8]).Put (f=0xc00921b260, key=19386922, val=0xc00794a500 "\211\244`\371\374\212@\237\177\322\304X9B\227\062j=\200\226\070G\266\261\005\v\003G\217h") at /home/runner/go/pkg/mod/github.com/ava-labs/coreth@v0.11.8-rc.1/core/fifo_cache.go:47
#13 0x0000000000f857ef in github.com/ava-labs/coreth/core.(*BufferFIFOCache[uint64,*github.com/ava-labs/coreth/core/types.Header]).Put (f=0x127d20a <github.com/ava-labs/avalanchego/proto/pb/http.file_http_http_proto_init+106>, key=<optimized out>, val=<optimized out>) at /home/runner/go/pkg/mod/github.com/ava-labs/coreth@v0.11.8-rc.1/core/fifo_cache.go:43
#14 0x0000000000f4c8b2 in github.com/ava-labs/coreth/core.(*BlockChain).startAcceptor (bc=0xc00b972000) at /home/runner/go/pkg/mod/github.com/ava-labs/coreth@v0.11.8-rc.1/core/blockchain.go:567
#15 0x0000000000f4b226 in github.com/ava-labs/coreth/core.NewBlockChain.func4 () at /home/runner/go/pkg/mod/github.com/ava-labs/coreth@v0.11.8-rc.1/core/blockchain.go:384
#16 0x00000000004778e1 in runtime.goexit () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:1598
#17 0x7d4800007cf50000 in ?? ()
#18 0x7dee00007d9b0000 in ?? ()
#19 0x7e9400007e410000 in ?? ()
#20 0x7f3a00007ee70000 in ?? ()
#21 0x00000000017805e6 in go:string.* ()
#22 0x0000000000000006 in ?? ()
#23 0x0000000000000019 in ?? ()
#24 0x0000000000000000 in ?? ()

Thread 11 (Thread 0x7fdae7fff700 (LWP 3354812)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0xc00118fd48) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x00000000004471cc in runtime.mPark () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#4  runtime.stopm () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#5  0x0000000000448a1c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
#6  0x0000000000449851 in runtime.schedule () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
#7  0x0000000000449d6d in runtime.park_m (gp=0xc006af0b60) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
#8  0x00000000004756a3 in runtime.mcall () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452
#9  0x00007fdb0582ea9f in ?? ()
#10 0x0100000000475620 in ?? ()
#11 0x0000000000800000 in github.com/ava-labs/avalanchego/ids.ShortFromPrefixedString (idStr=..., prefix=..., ~r0=..., ~r1=...) at /home/runner/work/avalanchego/avalanchego/ids/short.go:45
#12 0x0000000000000000 in ?? ()

Thread 10 (Thread 0x7fdb062f0700 (LWP 3354736)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0x338a0b8 <runtime.newmHandoff+24>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x00000000004470b1 in runtime.templateThread () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2312
#4  0x0000000000445bf3 in runtime.mstart1 () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1498
#5  0x0000000000445b3a in runtime.mstart0 () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1455
#6  0x0000000000475625 in runtime.mstart () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:394
#7  0x00000000013b9a7c in crosscall_amd64 ()
#8  0x00007fdb062effc0 in ?? ()
#9  0x00007ffca558f9c0 in ?? ()
#10 0x00007ffca558f7cf in ?? ()
#11 0x00007ffca558f7ce in ?? ()
#12 0x000000c000007860 in ?? ()
#13 0x0000000000475620 in ?? ()
#14 0x00000000013b9489 in threadentry ()
#15 0x00007fdb2f8c7609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#16 0x00007fdb2f7ec133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 9 (Thread 0x7fdb07b33700 (LWP 3354733)):
#0  runtime/internal/syscall.Syscall6 () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/internal/syscall/asm_linux_amd64.s:36
#1  0x0000000000409f73 in syscall.RawSyscall6 (num=18446744073709551612, a1=620, a2=4235150, a3=620, a4=824856670068, a5=0, a6=0, r1=<optimized out>, r2=<optimized out>, errno=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/internal/syscall/syscall_linux.go:38
#2  0x00000000004d11be in syscall.Syscall6 (trap=232, a1=620, a2=824856670068, a3=1, a4=18446744073709551615, a5=0, a6=0, r1=<optimized out>, r2=<optimized out>, err=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/syscall/syscall_linux.go:92
#3  0x00000000004d1be5 in syscall.Syscall6 (trap=232, a1=620, a2=824856670068, a3=1, a4=18446744073709551615, a5=0, a6=0, r1=<optimized out>, r2=<optimized out>, err=<optimized out>) at <autogenerated>:1
#4  0x00000000005a67d8 in golang.org/x/sys/unix.EpollWait (epfd=-4, events=..., msec=824856670068, n=<optimized out>, err=...) at /home/runner/go/pkg/mod/golang.org/x/sys@v0.5.0/unix/zsyscall_linux_amd64.go:56
#5  0x0000000000e7059d in github.com/rjeczalik/notify.(*inotify).loop (i=0xc000f00000, esch=0xc00cf16480) at /home/runner/go/pkg/mod/github.com/rjeczalik/notify@v0.9.3/watcher_inotify.go:189
#6  0x0000000000e701ca in github.com/rjeczalik/notify.(*inotify).lazyinit.func2 () at /home/runner/go/pkg/mod/github.com/rjeczalik/notify@v0.9.3/watcher_inotify.go:129
#7  0x00000000004778e1 in runtime.goexit () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:1598
#8  0x0000000000000068 in ?? ()
#9  0x000000c000376690 in ?? ()
#10 0x0000000000000068 in ?? ()
#11 0x0000000000000068 in ?? ()
#12 0x000000c0031dc000 in ?? ()
#13 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x7fdae77fe700 (LWP 3354813)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0xc000067948) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x00000000004471cc in runtime.mPark () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#4  runtime.stopm () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#5  0x0000000000448a1c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
#6  0x0000000000449851 in runtime.schedule () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
#7  0x0000000000449d6d in runtime.park_m (gp=0xc008ef2680) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
#8  0x00000000004756a3 in runtime.mcall () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452
#9  0x00007fdb062efb3f in ?? ()
#10 0x0100000000475620 in ?? ()
#11 0x0000000000800000 in github.com/ava-labs/avalanchego/ids.ShortFromPrefixedString (idStr=..., prefix=..., ~r0=..., ~r1=...) at /home/runner/work/avalanchego/avalanchego/ids/short.go:45
#12 0x0000000000000000 in ?? ()

Thread 7 (Thread 0x7fdb04f2d700 (LWP 3354743)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0xc000091148) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x00000000004471cc in runtime.mPark () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#4  runtime.stopm () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#5  0x0000000000448a1c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
#6  0x0000000000449851 in runtime.schedule () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
#7  0x0000000000449d6d in runtime.park_m (gp=0xc00690e820) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
#8  0x00000000004756a3 in runtime.mcall () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452
#9  0x00007fdb07b32a5f in ?? ()
#10 0x0100000000475620 in ?? ()
#11 0x0000000000800000 in github.com/ava-labs/avalanchego/ids.ShortFromPrefixedString (idStr=..., prefix=..., ~r0=..., ~r1=...) at /home/runner/work/avalanchego/avalanchego/ids/short.go:45
#12 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x7fdb0582f700 (LWP 3354738)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413b05 in runtime.notetsleep_internal (n=0x338a200 <runtime.sig>, ns=-1, ~r0=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:183
#3  0x0000000000413c25 in runtime.notetsleepg (n=0x338a200 <runtime.sig>, ns=-1, ~r0=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:237
#4  0x0000000000473acf in os/signal.signal_recv (~r0=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sigqueue.go:152
#5  0x00000000005a7a79 in os/signal.loop () at /opt/hostedtoolcache/go/1.20.1/x64/src/os/signal/signal_unix.go:23
#6  0x00000000004778e1 in runtime.goexit () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:1598
#7  0x0000000000000000 in ?? ()

Thread 5 (Thread 0x7fdb06af1700 (LWP 3354735)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0xc000090148) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x0000000000445ce5 in runtime.mPark () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1529
#4  0x0000000000447a25 in runtime.stoplockedm () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2561
#5  0x00000000004497dd in runtime.schedule () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3339
#6  0x0000000000449d6d in runtime.park_m (gp=0xc0030efba0) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
#7  0x00000000004756a3 in runtime.mcall () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452
#8  0x00007fdb07b32a4f in ?? ()
#9  0x0100000000475620 in ?? ()
#10 0x0000000000800000 in github.com/ava-labs/avalanchego/ids.ShortFromPrefixedString (idStr=..., prefix=..., ~r0=..., ~r1=...) at /home/runner/work/avalanchego/avalanchego/ids/short.go:45
#11 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7fdb072f2700 (LWP 3354734)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0xc000066d48) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x00000000004471cc in runtime.mPark () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#4  runtime.stopm () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#5  0x0000000000448a1c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
#6  0x0000000000449851 in runtime.schedule () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
#7  0x0000000000449d6d in runtime.park_m (gp=0xc0099c84e0) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
#8  0x00000000004756a3 in runtime.mcall () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452
#9  0x00007ffca558f61f in ?? ()
#10 0x0100000000475620 in ?? ()
#11 0x0000000000800000 in github.com/ava-labs/avalanchego/ids.ShortFromPrefixedString (idStr=..., prefix=..., ~r0=..., ~r1=...) at /home/runner/work/avalanchego/avalanchego/ids/short.go:45
#12 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7fdb08334700 (LWP 3354732)):
#0  runtime.usleep () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:135
#1  0x000000000044e425 in runtime.sysmon () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:5297
#2  0x0000000000445bf3 in runtime.mstart1 () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1498
#3  0x0000000000445b3a in runtime.mstart0 () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1455
#4  0x0000000000475625 in runtime.mstart () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:394
#5  0x00000000013b9a7c in crosscall_amd64 ()
#6  0x00007fdb08333fc0 in ?? ()
#7  0x00007ffca558f920 in ?? ()
#8  0x00007ffca558f72f in ?? ()
#9  0x00007ffca558f72e in ?? ()
#10 0x000000c0000069c0 in ?? ()
#11 0x0000000000475620 in ?? ()
#12 0x00000000013b9489 in threadentry ()
#13 0x00007fdb2f8c7609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#14 0x00007fdb2f7ec133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 2 (Thread 0x7fdb2f6ca740 (LWP 3354731)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c6b6 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4691939) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
#2  0x0000000000413907 in runtime.notesleep (n=0x33547e8 <runtime.m0+328>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
#3  0x00000000004471cc in runtime.mPark () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#4  runtime.stopm () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
#5  0x0000000000448a1c in runtime.findRunnable (gp=<optimized out>, inheritTime=<optimized out>, tryWakeP=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
#6  0x0000000000449851 in runtime.schedule () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
#7  0x0000000000449d6d in runtime.park_m (gp=0xc00690a4e0) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
#8  0x00000000004756a3 in runtime.mcall () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452
#9  0x000000000047a3c5 in runtime.newproc (fn=0x0) at <autogenerated>:1
#10 0x0000000003353ee0 in crypto/tls.emptyConfig ()
#11 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fdae57b5700 (LWP 3356575)):
#0  runtime.futex () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
#1  0x000000000043c72f in runtime.futexsleep (addr=<optimized out>, val=<optimized out>, ns=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:75
#2  0x0000000000413a45 in runtime.notetsleep_internal (n=0x3354b90 <runtime.sched+240>, ns=100000, ~r0=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:202
#3  0x0000000000413b94 in runtime.notetsleep (n=0xfffffffffffffdfc, ns=0, ~r0=<optimized out>) at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:225
#4  0x00000000004457d6 in runtime.stopTheWorldWithSema () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1297
#5  0x0000000000475729 in runtime.systemstack () at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:496
#6  0x00007fdb04f2ca5f in ?? ()
#7  0x0100000000475620 in ?? ()
#8  0x0000000000800000 in github.com/ava-labs/avalanchego/ids.ShortFromPrefixedString (idStr=..., prefix=..., ~r0=..., ~r1=...) at /home/runner/work/avalanchego/avalanchego/ids/short.go:45
#9  0x0000000000000000 in ?? ()

@patrick-ogrady
Copy link
Author

Found a detailed thread view in dlv as well:
threads_dlv.log

Thread 3356575 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
 0  0x00000000004797e3 in runtime.futex
    at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
 1  0x00000000004756c0 in runtime.systemstack_switch
    at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:463
 2  0x0000000000422c7c in runtime.gcStart
    at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/mgc.go:665
 3  0x0000000000415297 in runtime.mallocgc
    at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/malloc.go:1172
 4  0x0000000000415507 in runtime.newobject
    at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/malloc.go:1254
 5  0x000000000092477b in github.com/ava-labs/avalanchego/message.encodeIDs
    at /home/runner/work/avalanchego-internal/avalanchego-internal/message/inbound_msg_builder.go:313
 6  0x000000000092477b in github.com/ava-labs/avalanchego/message.(*outMsgBuilder).Chits
    at /home/runner/work/avalanchego-internal/avalanchego-internal/message/outbound_msg_builder.go:618
 7  0x000000000092773c in github.com/ava-labs/avalanchego/message.(*creator).Chits
    at <autogenerated>:1
 8  0x0000000000c1d5cb in github.com/ava-labs/avalanchego/snow/networking/sender.(*sender).SendChits
    at /home/runner/work/avalanchego-internal/avalanchego-internal/snow/networking/sender/sender.go:1160
 9  0x0000000000cd3027 in github.com/ava-labs/avalanchego/snow/engine/snowman.(*Transitive).sendChits
    at /home/runner/work/avalanchego-internal/avalanchego-internal/snow/engine/snowman/transitive.go:461
10  0x0000000000cd0445 in github.com/ava-labs/avalanchego/snow/engine/snowman.(*Transitive).PullQuery
    at /home/runner/work/avalanchego-internal/avalanchego-internal/snow/engine/snowman/transitive.go:198
11  0x0000000000bbdbad in github.com/ava-labs/avalanchego/snow/networking/handler.(*handler).handleSyncMsg
    at /home/runner/work/avalanchego-internal/avalanchego-internal/snow/networking/handler/handler.go:637
12  0x0000000000bb8c28 in github.com/ava-labs/avalanchego/snow/networking/handler.(*handler).dispatchSync
    at /home/runner/work/avalanchego-internal/avalanchego-internal/snow/networking/handler/handler.go:355
13  0x0000000000bb80e5 in github.com/ava-labs/avalanchego/snow/networking/handler.(*handler).Start.func1
    at /home/runner/work/avalanchego-internal/avalanchego-internal/snow/networking/handler/handler.go:238
14  0x0000000000809ab3 in github.com/ava-labs/avalanchego/utils/logging.(*log).RecoverAndPanic
    at /home/runner/work/avalanchego-internal/avalanchego-internal/utils/logging/log.go:125
15  0x0000000000bb7dcf in github.com/ava-labs/avalanchego/snow/networking/handler.(*handler).Start.func11
15  0x0000000000bb7dcf in github.com/ava-labs/avalanchego/snow/networking/handler.(*handler).Start.func11
    at /home/runner/work/avalanchego-internal/avalanchego-internal/snow/networking/handler/handler.go:257
16  0x00000000004778e1 in runtime.goexit
    at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:1598

Thread 3354862 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x00000000004471cc in runtime.mPark
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1530
4  0x00000000004471cc in runtime.stopm
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
5  0x0000000000448a1c in runtime.findRunnable
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
6  0x0000000000449851 in runtime.schedule
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
7  0x0000000000449d6d in runtime.park_m
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
8  0x00000000004756a3 in runtime.mcall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452

Thread 3354861 at 0x45ffdb /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/stubs.go:18 runtime.pcdatavalue
0  0x000000000045ffdb in runtime.add
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/stubs.go:18
1  0x000000000045ffdb in runtime.pcdatastart
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/symtab.go:1102
2  0x000000000045ffdb in runtime.pcdatavalue
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/symtab.go:1109
3  0x00000000004d1107 in syscall.Syscall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/syscall/syscall_linux.go:69
4  0x000000c015f38ae9 in ???
   at ?:-1
5  0x0000000000415128 in runtime.mallocgc
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/malloc.go:1094

Thread 3354813 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x00000000004471cc in runtime.mPark
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1530
4  0x00000000004471cc in runtime.stopm
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
5  0x0000000000448a1c in runtime.findRunnable
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
6  0x0000000000449851 in runtime.schedule
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
7  0x0000000000449d6d in runtime.park_m
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
8  0x00000000004756a3 in runtime.mcall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452

Thread 3354812 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x00000000004471cc in runtime.mPark
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1530
4  0x00000000004471cc in runtime.stopm
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
5  0x0000000000448a1c in runtime.findRunnable
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
6  0x0000000000449851 in runtime.schedule
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
7  0x0000000000449d6d in runtime.park_m
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
8  0x00000000004756a3 in runtime.mcall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452

Thread 3354743 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x00000000004471cc in runtime.mPark
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1530
4  0x00000000004471cc in runtime.stopm
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
5  0x0000000000448a1c in runtime.findRunnable
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
6  0x0000000000449851 in runtime.schedule
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
7  0x0000000000449d6d in runtime.park_m
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
8  0x00000000004756a3 in runtime.mcall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452

Thread 3354738 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413b05 in runtime.notetsleep_internal
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:183
3  0x0000000000413c25 in runtime.notetsleepg
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:237
4  0x0000000000473acf in os/signal.signal_recv
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sigqueue.go:152
5  0x00000000005a7a79 in os/signal.loop
   at /opt/hostedtoolcache/go/1.20.1/x64/src/os/signal/signal_unix.go:23
6  0x00000000004778e1 in runtime.goexit
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:1598

Thread 3354736 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x00000000004470b1 in runtime.templateThread
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2312
4  0x0000000000445bf3 in runtime.mstart1
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1498
5  0x0000000000445b3a in runtime.mstart0
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1455
6  0x0000000000475625 in runtime.mstart
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:395
7  0x00000000013b9a7c in ???
   at ?:-1
   error: error while reading spliced memory at 0x7fdb05af0288: EOF

Thread 3354735 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x0000000000445ce5 in runtime.mPark
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1530
4  0x0000000000447a25 in runtime.stoplockedm
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2561
5  0x00000000004497dd in runtime.schedule
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3339
6  0x0000000000449d6d in runtime.park_m
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
7  0x00000000004756a3 in runtime.mcall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452

Thread 3354734 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x00000000004471cc in runtime.mPark
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1530
4  0x00000000004471cc in runtime.stopm
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
5  0x0000000000448a1c in runtime.findRunnable
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
6  0x0000000000449851 in runtime.schedule
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
7  0x0000000000449d6d in runtime.park_m
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
8  0x00000000004756a3 in runtime.mcall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452

Thread 3354733 at 0x409f8e /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/internal/syscall/asm_linux_amd64.s:36 runtime/internal/syscall.Syscall6
0  0x0000000000409f8e in runtime/internal/syscall.Syscall6
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/internal/syscall/asm_linux_amd64.s:36
1  0x0000000000409f73 in syscall.RawSyscall6
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/internal/syscall/syscall_linux.go:38
2  0x00000000004d11be in syscall.Syscall6
   at /opt/hostedtoolcache/go/1.20.1/x64/src/syscall/syscall_linux.go:92
3  0x00000000004d1be5 in syscall.Syscall6
   at :0
4  0x00000000005a67d8 in golang.org/x/sys/unix.EpollWait
   at /home/runner/go/pkg/mod/golang.org/x/sys@v0.5.0/unix/zsyscall_linux_amd64.go:56
5  0x0000000000e7059d in github.com/rjeczalik/notify.(*inotify).loop
   at /home/runner/go/pkg/mod/github.com/rjeczalik/notify@v0.9.3/watcher_inotify.go:189
6  0x0000000000e701ca in github.com/rjeczalik/notify.(*inotify).lazyinit.func2
   at /home/runner/go/pkg/mod/github.com/rjeczalik/notify@v0.9.3/watcher_inotify.go:129
7  0x00000000004778e1 in runtime.goexit
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:1598

Thread 3354732 at 0x4791dd /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:135 runtime.usleep
0  0x00000000004791dd in runtime.usleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:135
1  0x000000000044e425 in runtime.sysmon
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:5297
2  0x0000000000445bf3 in runtime.mstart1
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1498
3  0x0000000000445b3a in runtime.mstart0
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1455
4  0x0000000000475625 in runtime.mstart
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:395
5  0x00000000013b9a7c in ???
   at ?:-1
   error: error while reading spliced memory at 0x7fdb07b34288: EOF

Thread 3354731 at 0x4797e3 /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555 runtime.futex
0  0x00000000004797e3 in runtime.futex
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/sys_linux_amd64.s:555
1  0x000000000043c6b6 in runtime.futexsleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/os_linux.go:69
2  0x0000000000413907 in runtime.notesleep
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/lock_futex.go:160
3  0x00000000004471cc in runtime.mPark
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:1530
4  0x00000000004471cc in runtime.stopm
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:2334
5  0x0000000000448a1c in runtime.findRunnable
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3007
6  0x0000000000449851 in runtime.schedule
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3360
7  0x0000000000449d6d in runtime.park_m
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/proc.go:3511
8  0x00000000004756a3 in runtime.mcall
   at /opt/hostedtoolcache/go/1.20.1/x64/src/runtime/asm_amd64.s:452

@prattmic
Copy link
Member

prattmic commented Mar 1, 2023

Thread 12 is stuck in traceback. This looks like #58513.

The fix for that bug is already committed to release-branch.go1.20 for the next minor release. Could you try building a toolchain from that branch and seeing if the issue goes away?

cc @mdempsky

@patrick-ogrady
Copy link
Author

Are there any 1.20.2 RCs that have this fix in them yet we could use?

If not, it'll just take a bit more leg work to update. Figured I'd ask before embarking on that.

@qiulaidongfeng
Copy link
Contributor

@patrick-ogrady go1.20.2Not yet released
according to https://github.com/golang/go/issues/58513#issuecomment -1430375549, the master branch contains patches, which can be accessed through
go install golang.org/dl/ gotip@latest
gotip download
Then use the latest tool chain through gotip

@patrick-ogrady
Copy link
Author

patrick-ogrady commented Mar 1, 2023

Ended up just implementing a pipeline to build any branch from this repo from source and use that for compilation. We'll run a test tomorrow and report back.

Thanks for the help!

@mknyszek mknyszek added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Mar 1, 2023
@mknyszek mknyszek added this to the Backlog milestone Mar 1, 2023
@patrick-ogrady
Copy link
Author

So far so good with release-branch.go1.20 (bdd86bda09050bdeaf4e812eef69ba13ff57dfee). I have not been able to reproduce the issue within 20-30 minutes like I was able to do so with 1.20.1.

We'll continue running tests for the next day and will report back tomorrow if it continues to stay stable.

Notably, the issue you linked above seems to be related specifically to apply to a panic case, however, I'm assuming the fix you implemented handled a more general issue (as our code does not hit a panic it was blocked on or anything with the fix, it just doesn't get stuck).

@prattmic
Copy link
Member

prattmic commented Mar 1, 2023

That's great!

The issue is with traceback generally. Traceback is used in panic processing, but also in other places. In your case, it is stuck in traceback during CPU profiling.

I'm going to close this as a duplicate of #58513. Please reopen if the issue does reproduce after all.

@prattmic prattmic closed this as completed Mar 1, 2023
@patrick-ogrady
Copy link
Author

patrick-ogrady commented Mar 1, 2023

Thanks for the investigation and explanation. Will reply if I see anything out of the ordinary on that branch.

@patrick-ogrady
Copy link
Author

I ran this change for about ~20 hours without any issue. Based on how frequently it occurred before, I believe this is resolved.

@manav2401
Copy link

Ended up just implementing a pipeline to build any branch from this repo from source and use that for compilation. We'll run a test tomorrow and report back.

Hi @patrick-ogrady, regarding this comment, do you mind sharing any code snippets of the pipeline or ways to build and use an unreleased go version for compilation? Any help would be appreciated. Thanks in advance!

@prattmic
Copy link
Member

prattmic commented Mar 3, 2023

@manav2401 There are a few options:

Build directly:

$ git clone https://go.googlesource.com/go
$ cd go/src
$ git checkout release-branch.go1.20
$ ./make.bash
$ # Now build with `../bin/go build`.

Use gotip:

$ go install golang.org/dl/gotip@latest
$ gotip download release-branch.go1.20
$ # Now build with `gotip build ...`

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

6 participants