Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: regression in DWARF location lists caused by "issue VarDef only for pointer-ful types" #60479

Open
andreimatei opened this issue May 28, 2023 · 8 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@andreimatei
Copy link
Contributor

andreimatei commented May 28, 2023

What version of Go are you using (go version)?

go1.20. Also tip. Also go1.20-a74e5f584e.

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/andrei/.cache/go-build"
GOENV="/home/andrei/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/andrei/work/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/andrei/work"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/andrei/sdk/go1.19.9"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/andrei/sdk/go1.19.9/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.19.9"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build802006892=/tmp/go-build -gno-record-gcc-switches"

What did you do?

There appears to have been a regression in the quality of debug information between 1.19 and 1.20. I have bisected it to
a74e5f5 (cmd/compile: issue VarDef only for pointer-ful types), which is https://go-review.googlesource.com/c/go/+/419320.

It seems that this patch caused some local variables to no longer get their location lists emitted (i.e. the information telling debuggers the memory location of the variable in relation to different code locations).

Please consider the following program, which is the simplest repro I could produce:
https://go.dev/play/p/u9iYmn8GXVz

package main

type T struct {
	// The size of the array matters; only 1 element is not enough for the demo.
	x [2]byte
	// Adding a pointer makes the problem go away, which seems to track with a74e5f584e.
	// p *int
}

//go:noinline
func foo() T {
	// t1 has a loclist before a74e5f584e
	var t1, t2 T
	_ = t2 // t2 is necessary for the demo; without it t1 is completely optimized out?
	return t1
}

func main() {
	foo()
}

Before a74e5f5, the variable t1 has the location information in debug info (pointing to the stack):

llvm-dwarfdump-16 --debug-info --color  --name=t1 --show-children <binary>
0x00015395: DW_TAG_variable
              DW_AT_name        ("t1")
              DW_AT_decl_line   (13)
              DW_AT_type        (0x0000000000001bf4 "main.T")
              DW_AT_location    (0x00000f61: 
                 [0x000000000045747c, 0x0000000000457490): DW_OP_fbreg -18)

After the patch though, it's no longer there:

0x00015395: DW_TAG_variable
              DW_AT_name        ("t1")
              DW_AT_decl_line   (13)
              DW_AT_type        (0x0000000000001bf4 "main.T")
              DW_AT_location    (<empty>)

Luckily, the generated machine code for main.foo is not changed by a74e5f5 as far as I can tell, so I am hoping that the code-generation benefits of that patch are not at odds with the quality of the debug info.

The problem does not affect non-optimized builds (`-gcflags="all=-N").

What did you expect to see?

I would like to have the location list for the t1 variable.

What did you see instead?

Location list is missing.


cc @randall77 @dr2chase - the author and reviewer of a74e5f5.
Also kindly cc @aarzilli @thanm @neild - folks involved with https://go-review.googlesource.com/c/go/+/433479 which fixed other debug-info problems apparently caused by a74e5f5.

I would like to timidly volunteer myself to work on a fix if that would be useful, provided that someone knowledgeable would enjoy holding my hand a bit and showing me the ropes in the relevant parts of the compiler. I would like to personally take a longer-term interest in Go's debug info.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 28, 2023
@dr2chase
Copy link
Contributor

The place that was probably expecting the VarDefs is in cmd/compile/internal/ssa/debug.go.

Do you know about using GOSSAFUNC (e.g GOSSAFUNC=foo go build main.go) to generate ssa.html files so you can see what the code looks like?

If you decide that code looks unpleasant, apologies, I touched it last, but I would love to have more people able to work with Go's debug info.

@andreimatei
Copy link
Contributor Author

Do you know about using GOSSAFUNC (e.g GOSSAFUNC=foo go build main.go) to generate ssa.html files so you can see what the code looks like?

I didn't know about GOSSAFUNC. Groovy tool!
As expected, I believe it tells us that some VarDefs are gone. In particular, (13) = VarDef <mem> {t1} v6 is gone (in the "before" case, that declaration survives until the end of the compiler passes).

before 7ad92e9
before insert phis
b1:-
v1 (?) = InitMem <mem>
v2 (?) = SP <uintptr>
v3 (?) = SB <uintptr>
v4 (?) = LocalAddr <*T> {~r0} v2 v1
v5 (11) = VarDef <mem> {~r0} v1
v6 (11) = Zero <mem> {T} [2] v4 v5
v7 (13) = VarDef <mem> {t1} v6
v8 (13) = LocalAddr <*T> {t1} v2 v7
v9 (13) = Zero <mem> {T} [2] v8 v7
v10 (13) = VarDef <mem> {t2} v9
v11 (13) = LocalAddr <*T> {t2} v2 v10
v12 (13) = Zero <mem> {T} [2] v11 v10
v13 (14) = LocalAddr <*T> {t2} v2 v12
v14 (14) = Load <T> v13 v12
v15 (15) = LocalAddr <*T> {t1} v2 v12
v16 (15) = VarDef <mem> {~r0} v12
v17 (15) = LocalAddr <*T> {~r0} v2 v16
v18 (15) = Move <mem> {T} [2] v17 v15 v16
v20 (15) = LocalAddr <*T> {~r0} v2 v18
v21 (15) = Dereference <T> v20 v18
v19 (15) = MakeResult <T,mem> v21 v18
Ret v19 (+15)
after 7ad92e9
before insert phis
b1:-
v1 (?) = InitMem <mem>
v2 (?) = SP <uintptr>
v3 (?) = SB <uintptr>
v4 (?) = LocalAddr <*T> {~r0} v2 v1
v5 (11) = Zero <mem> {T} [2] v4 v1
v6 (13) = LocalAddr <*T> {t1} v2 v5
v7 (13) = Zero <mem> {T} [2] v6 v5
v8 (13) = LocalAddr <*T> {t2} v2 v7
v9 (13) = Zero <mem> {T} [2] v8 v7
v10 (14) = LocalAddr <*T> {t2} v2 v9
v11 (14) = Load <T> v10 v9
v12 (15) = LocalAddr <*T> {t1} v2 v9
v13 (15) = LocalAddr <*T> {~r0} v2 v9
v14 (15) = Move <mem> {T} [2] v13 v12 v9
v16 (15) = LocalAddr <*T> {~r0} v2 v14
v17 (15) = Dereference <T> v16 v14
v15 (15) = MakeResult <T,mem> v17 v14
Ret v15 (+15)

Like you say, cmd/compile/internal/ssa/debug.go seems to look for VarDef.

7ad92e9 stated that:

VarDef markers are only used for liveness analysis, and that only runs on pointer-ful variables.

It seems to me that that assessment was perhaps not entirely right?

I gather that 7ad92e9 did have some beneficial code generation effects in some cases (right?). My understanding/guess based on @randall77's #53810 (comment) is that that inhibiting the generation of VarDefs was simply an expedient way to allow some optimization pass to kick in.

Do you reckon that we could have both the desired optimization and the VarDefs? Or, otherwise, any thoughts on how to get the VarDefs back?

@mknyszek
Copy link
Contributor

CC @golang/compiler

@mknyszek mknyszek added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 30, 2023
@mknyszek mknyszek added this to the Backlog milestone May 30, 2023
@randall77
Copy link
Contributor

I can reproduce. Here's a simpler repro:

package main

type T struct {
	a [2]byte
}

//go:noinline
func f(p *T) {
}

func main() {
	var t1 T
	f(&t1)
}

I'm a little bit unclear about what we do want to appear in these lists. At least, when -N is not specified.
The repro requires [2]byte to demonstrate that we got a location list with 1.19 but not with 1.20 or tip.
If we use [1]byte we don't get location lists in any of those versions.

So although I agree that we're missing some debug info that we used to have, it doesn't seem qualitatively different. Location lists are still missing for lots of stuff in either case. (It's only variables with non-SSAable, pointerless types that have changed?)

If we think that location lists are important for optimized code, we should probably figure out a way to make them available generally. At least, for things that don't get totally optimized away.

@randall77
Copy link
Contributor

I take some of that back, for my repro using [1]byte 1.19.4 still generates a location list for t1.

@randall77
Copy link
Contributor

(But the original repro does not.)

@andreimatei
Copy link
Contributor Author

So although I agree that we're missing some debug info that we used to have, it doesn't seem qualitatively different. Location lists are still missing for lots of stuff in either case. (It's only variables with non-SSAable, pointerless types that have changed?)

If we think that location lists are important for optimized code, we should probably figure out a way to make them available generally. At least, for things that don't get totally optimized away.

Debug info for optimized code in general, and variable location lists in particular, are quite important for debugging optimized code in some of the trickiest, production-only investigations. At least that's my conclusion after the past years of working on CockroachDB. As of recently, I'm working on tooling for making better use of this debug info, and hopefully bringing production debugging to more people. So, I personally am very interested in the quality of this info. I'd like to personally contribute to it on the compiler side, if opportunity presents itself. So if you have thoughts on other places where the compiler doesn't maintain the right info that could be improved, and if a newcomer like me can help, I'll always be interested.

It's only variables with non-SSAable, pointerless types that have changed?

This particular regression seems significant to me; I've immediately gotten bitten by it when upgrading to go1.20. I think maybe every local var of struct type that doesn't contain a pointer is affected?

@randall77
Copy link
Contributor

I think maybe every local var of struct type that doesn't contain a pointer is affected?

I think it would only be ones that are not SSAable for some reason. They are large, contain somewhere an array of length > 1, or have their address taken. Otherwise such structs would be broken up into pieces and put in registers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Development

No branches or pull requests

5 participants