Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: missing DWARF location list for function arg; stack spill not used for loclist #61703

Open
andreimatei opened this issue Aug 1, 2023 · 0 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@andreimatei
Copy link
Contributor

What version of Go are you using (go version)?

$ go version
go version go1.21rc3 linux/amd64

Does this issue reproduce with the latest release?

Yes.
It also reproduces with 1.20, 1.19, and maybe older versions too.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/andrei/.cache/go-build'
GOENV='/home/andrei/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/andrei/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/andrei/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/andrei/sdk/go1.21rc3'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/andrei/sdk/go1.21rc3/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21rc3'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1118470848=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Meta-comment: I understand that the production of DWARF location lists for optimized code is not perfect. I'd like to generally form an understanding about which problems are hard and which are tractable. I have found one case where I'm hoping the debug info can be improved, stemming from a real case where a variable that I've tried to read is unavailable where I think it should be available.

With the (optimized) CockroachDB binary, I'm frequently trying to read context.Context function arguments with a debugger, and more often than not they are not available. I have failed to create a small repro (my experimentation is also hindered by #61700), but the problems seem widespread in the CockroachDB binary. For this report, I have selected a Cockroach case that seems the simplest, out of multiple that I've investigated by hand.

In this example, what seems to happen is that a function argument that starts off in registers becomes unavailable once the registers are clobbered by a function call, even though the argument was spilled to the stack (and not just spilled anywhere, but spilled in its designated regabi spill slot, FWIW). In other cases, I see the compiler able to use the spill locations and switch the location lists from registers to the stack; I'd like to understand if there's a good reason why this doesn't happen in this example.

Consider the following method (Github link):

// Init is part of the colexecop.Operator interface.
func (bic *batchInfoCollector) Init(ctx context.Context) {
	bic.ctx = ctx
	bic.stopwatch.Start()
	// Wrap the call to Init() with a panic catcher in order to get the correct
	// execution time (e.g. in the statement bundle).
	err := colexecerror.CatchVectorizedRuntimeError(bic.init)
	bic.stopwatch.Stop()
	if err != nil {
		colexecerror.InternalError(err)
	}
	// Unset the context so that it's not used outside of the init() function.
	bic.ctx = nil
	bic.mu.Lock()
	// If we got here, then Init above succeeded, so the wrapped operator has
	// been properly initialized.
	bic.mu.initialized = true
	bic.mu.Unlock()
}

The location lists for the ctx argument are quite incomplete:

0x02be341f:   DW_TAG_formal_parameter
                DW_AT_name      ("ctx")
                DW_AT_variable_parameter        (0x00)
                DW_AT_decl_line (98)
                DW_AT_type      (0x00000000000ac025 "context.Context")
                DW_AT_location  (0x03ea7218: 
                   [0x0000000002fb8f80, 0x0000000002fb8fc9): DW_OP_reg3 RBX, DW_OP_piece 0x8, DW_OP_reg2 RCX, DW_OP_piece 0x8
                   [0x0000000002fb8fc9, 0x0000000002fb8fd1): DW_OP_reg3 RBX, DW_OP_piece 0x8, DW_OP_piece 0x8)

The 128-bit argument starts off in RBX, RCX. Then, the PC location 2fb8fd1 corresponds to the first function call -> bic.stopwatch.Start(). That's where the location lists stop, so, for the majority of the function's code, ctx is not available.
The 2nd location list (2fb8fc9 - 2fb8fd1) corresponds to only two instructions where RBX continues to be available after RCX was just clobbered; I believe this is not very interesting.

Now, I claim that the compiler should be able to give me a location list for all of this function, based on the fact that ctx is spilled to the stack in the beginning of the function. Again, in other random instances I've looked at, the compiler does sometimes seem to be able to make use of the stack spill.

Here is the function's disassembly with minor commentary: https://gist.github.com/andreimatei/fa1575dbfa8e04fb0271d02587a0bdf6
I'll paste the beginning here:

0000000002fb8f80 <github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init>:
; github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init():
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:98
 2fb8f80: 49 3b 66 10                  	cmpq	0x10(%r14), %rsp
 2fb8f84: 0f 86 38 01 00 00            	jbe	0x2fb90c2 <github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init+0x142>
 2fb8f8a: 55                           	pushq	%rbp
 2fb8f8b: 48 89 e5                     	movq	%rsp, %rbp
 2fb8f8e: 48 83 ec 38                  	subq	$0x38, %rsp

============== ctx is spilled to the stack over the next two instructions ==============

 2fb8f92: 48 89 5c 24 50               	movq	%rbx, 0x50(%rsp)
 2fb8f97: 48 89 4c 24 58               	movq	%rcx, 0x58(%rsp)


; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:99
 2fb8f9c: 48 89 58 60                  	movq	%rbx, 0x60(%rax)
 2fb8fa0: 83 3d 59 41 b8 08 00         	cmpl	$0x0, 0x8b84159(%rip)   # 0xbb3d100 <runtime.writeBarrier>
 2fb8fa7: 74 10                        	je	0x2fb8fb9 <github.com/cockroachdb/cockroach/pkg/sql/colflow.(*batchInfoCollector).Init+0x39>
 2fb8fa9: e8 12 ee 4c fd               	callq	0x487dc0 <runtime.gcWriteBarrier2>
 2fb8fae: 49 89 0b                     	movq	%rcx, (%r11)
 2fb8fb1: 48 8b 50 68                  	movq	0x68(%rax), %rdx
 2fb8fb5: 49 89 53 08                  	movq	%rdx, 0x8(%r11)
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:98
 2fb8fb9: 48 89 44 24 48               	movq	%rax, 0x48(%rsp)
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:99
 2fb8fbe: 48 89 48 68                  	movq	%rcx, 0x68(%rax)
; /home/andrei/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:100
 2fb8fc2: 48 8b 88 80 00 00 00         	movq	0x80(%rax), %rcx

======= beginning of the second location list; RCX became unavailable just above ==========================

 2fb8fc9: 48 89 c8                     	movq	%rcx, %rax
 2fb8fcc: e8 8f 2d 93 fd               	callq	0x8ebd60 <github.com/cockroachdb/cockroach/pkg/util/timeutil.(*StopWatch).Start>

======= end of the 2nd location list; RBX also became unavailable here =====================
 2fb8fd1: 44 0f 11 7c 24 28            	movups	%xmm15, 0x28(%rsp)
...

Notice that RBX and RCX are spilled in the beginning. And note how the call StopWatch.Start marks the end of the 2nd location list, and there's no further location list.

What did you expect to see?

I expected a location list pointing to the ctx value on the stack.

If this proves to be a tractable problem, myself or one of my colleagues might be interested in putting some effort in improving it.

cc @dr2chase @randall77 @thanm

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 1, 2023
@mknyszek mknyszek added this to the Backlog milestone Aug 2, 2023
@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests

4 participants