Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: bad traceback arguments #45728

Closed
randall77 opened this issue Apr 23, 2021 · 12 comments
Closed

cmd/compile: bad traceback arguments #45728

randall77 opened this issue Apr 23, 2021 · 12 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@randall77
Copy link
Contributor

package main

type T struct {
	x int
}

//go:noinline
func (t *T) Foo() int {
	return t.x
}

func main() {
	(*T)(nil).Foo()
}

When run in go 1.16, we get:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x105e185]

goroutine 1 [running]:
main.(*T).Foo(0x0, 0xc000074058)
	/Users/khr/gowork/tmp4.go:9 +0x5
main.main()
	/Users/khr/gowork/tmp4.go:13 +0x2a
exit status 2

When run at tip, we get:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x10548c0]

goroutine 1 [running]:
main.(*T).Foo(0xc000000180)
	/Users/khr/gowork/tmp4.go:9
main.main()
	/Users/khr/gowork/tmp4.go:13 +0x1b
exit status 2

Note that the arguments printed for Foo are not correct. There's only one instead of two arguments. In particular, the thing we got a nil pointer exception on is not listed.

Maybe we're now not printing results, and that's why there's only one? But if that's the case, we're printing the wrong one.

Same problem with

package main

type T struct {
	x int
}

//go:noinline
func Foo(t *T) int {
	return t.x
}

func main() {
	Foo((*T)(nil))
}

So it looks like it isn't specifically due to receivers.

@cherrymui @dr2chase @thanm

@randall77 randall77 added this to the Go1.17 milestone Apr 23, 2021
@cherrymui
Copy link
Member

Yes, we are not printing results now.

Yes, the argument is wrong if it is never spilled to memory. Currently we are just printing the memory content. There are pending works that tracks whether the memory slot is accurate. For the moment I don't think we will print register values. At least, panic message says addr=0x0.

@cherrymui cherrymui added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Apr 26, 2021
@dmitshur
Copy link
Contributor

This is in NeedsDecision state and looks like a regression specific to Go 1.17. Adding release-blocker to force a decision. If this doesn't need to be resolved in time for Go 1.17 (or Go 1.17 beta 1), let's move it to Backlog milestone instead.

@cherrymui
Copy link
Member

I don't think this needs to be resolved in 1.17. It is unfortunate that with arguments in registers some traceback args are bad. To accurately track which arg live in which register/stack slots it needs a fair amount of complexity and binary size (for the metadata). I'm not sure this is worth that. Move to Backlog for now.

@tobiaskohlbau
Copy link

If this was moved out of scope for 1.17, could we ask for a statement in the release notes that under some circumstances, values printed in a stack trace, could be wrong? Otherwise it could confuse people like me in #46740. Especially if the stack trace printed the correct value in 1.16 and even does in 1.17 for darwin/arm64.

package main

//go:noinline
func crash(x int) {
	panic(x)
}

func main() {
	crash(5)
}

darwin/arm64:

panic: 5

goroutine 1 [running]:
main.crash(0x5)
	/Users/tobias/src/github.com/tobiaskohlbau/gotest/main.go:5 +0x38
main.main()
	/Users/tobias/src/github.com/tobiaskohlbau/gotest/main.go:9 +0x28
exit status 2

linux/amd64:

panic: 5

goroutine 1 [running]:
main.crash(0xc0000001a0)
	/Users/tobias/src/github.com/tobiaskohlbau/gotest/main.go:5 +0x28
main.main()
	/Users/tobias/src/github.com/tobiaskohlbau/gotest/main.go:9 +0x1e

@cherrymui
Copy link
Member

The third paragraph in https://tip.golang.org/doc/go1.17#compiler mentions that. Or you suggest changing the word inaccurate to wrong? Thanks.

@tobiaskohlbau
Copy link

tobiaskohlbau commented Jun 14, 2021

In fact it is there. With the hint from you I'm able to read exactly that into the sentence. Beforehand I've never connected the sentence with that issue. Maybe because I connected the sentence with the following:

Results (which were usually inaccurate) are no longer printed.

I thought that the bad values from:

A caveat is that the value of an argument that only lives in a register and is not stored to memory may be inaccurate.

where meant to be excluded (not printed). Maybe it's an idea to name the result values as return values to avoid confusion?

Like:
Return values (which were usually inaccurate) are no longer printed.

I don't know why but I thought the compiler tracks potential inaccurate values and skips printing these.

@cherrymui
Copy link
Member

cherrymui commented Jun 14, 2021

Thanks. I thought we mostly call them results but the spec does use both results and return values. Will do.

I don't know why but I thought the compiler tracks potential inaccurate values and skips printing these.

The compiler could do that. I have a half-baked local change that does that. But it does add extra complexity and an increase of binary size, so I'm not sure we want to do it or not.

@tobiaskohlbau
Copy link

I'm not the right person to ask if it's worth it. From a user perspective I would argue: ABIINTERNAL did decrease the binary size by a few percent. If the complexity is manageable and the increase of the binary size is not overhaul increasing from 1.16 to 1.17 I'm fine with it.

@cpuguy83
Copy link

Given this program:

package main

import (
	"context"
	"fmt"
	"runtime"
)

type foo struct {
	A string
}

func main() {
	ch := make(chan struct{})
	h := "hello"
	go func() {
		getStack(context.Background(), h, 1, foo{}, &foo{})
		close(ch)
	}()
	<-ch
}

func getStack(ctx context.Context, _ string, _ int, _ foo, _ *foo) {
	buf := make([]byte, 32*1024*1024)
	runtime.Stack(buf, true)

	fmt.Println(string(buf))
}

I get this stack trace:

goroutine 18 [running]:
main.getStack({0x0, 0x0}, {0x0, 0x0}, 0x0, {{0x0, 0x0}}, 0x0)
	/Users/cpuguy83/dev/test/stack.go:25 +0x45
main.main.func1()
	/Users/cpuguy83/dev/test/stack.go:17 +0x54
created by main.main
	/Users/cpuguy83/dev/test/stack.go:16 +0x85

goroutine 1 [chan receive]:
main.main()
	/Users/cpuguy83/dev/test/stack.go:20 +0x91

As you can see all the function parameters come up as 0x0 (or {0x0...} depending on type).
This happens when the run happens in a goroutine. If I run this without the goroutine it gives the expected(-ish) values:

goroutine 1 [running]:
main.getStack({0x0, 0xc0000001a0}, {0xc000092f70, 0x1004b79}, 0x60, {{0x0, 0x0}}, 0x1015be5)
	/Users/cpuguy83/dev/test/stack.go:19 +0x45
 16 package main
main.main()
	/Users/cpuguy83/dev/test/stack.go:14 +0x4f

What worries me is I've often used these stack traces to track down dead locks in my code where, as an example, I can trace an argument being passed down the stack to find "Request A is blocked on a mutex, and request B is blocked on "system doing things" (maybe blocked in a syscall, an RPC request, etc...) because the argument value is consistent.

With go1.17 the information is useless because it is all zero'd out.

@cherrymui
Copy link
Member

If you add some real uses of the arguments of getStack after the runtime.Stack call (e.g. return it, even the return value is not used in the caller), it will likely show some useful values.

maxatome added a commit to maxatome/go-testdeep that referenced this issue Jul 20, 2021
Since golang/go#45728

Signed-off-by: Maxime Soulé <btik-git@scoubidou.com>
@gopherbot
Copy link

Change https://golang.org/cl/352057 mentions this issue: cmd/compile, runtime: track argument stack slot liveness

@gopherbot
Copy link

Change https://golang.org/cl/359196 mentions this issue: runtime: fix noopt builder

@dmitshur dmitshur modified the milestones: Backlog, Go1.18 Oct 28, 2021
gopherbot pushed a commit that referenced this issue Oct 28, 2021
CL 352057 added track argument stack slot liveness, and updated
TestTracebackArgs for argument liveness. But when optimization is
disabled, all arguments are considered lived. The abiSel does not
consider this case and return wrong expected result.

To fix this, checking if we are running in a noopt builder and return
the correct expected result. Also, skipping TestTracebackArgs in quick
mode, since when quick mode run the test without optimization disable.

Updates #45728

Change-Id: I3737a1b1a5fa0c711fbb3218205f2f6e34f36260
Reviewed-on: https://go-review.googlesource.com/c/go/+/359196
Trust: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Run-TryBot: Cuong Manh Le <cuong.manhle.vn@gmail.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
@golang golang locked and limited conversation to collaborators Oct 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

6 participants