Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: bad pointer in frame #32288

Closed
AlekSi opened this issue May 28, 2019 · 14 comments
Closed

runtime: bad pointer in frame #32288

AlekSi opened this issue May 28, 2019 · 14 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Milestone

Comments

@AlekSi
Copy link
Contributor

AlekSi commented May 28, 2019

What did you do?

See https://github.com/AlekSi/go-bug.

git clone https://github.com/AlekSi/go-bug
cd go-bug
go test

What did you expect to see?

Normal panic due to bug in code:

--- FAIL: TestGoBug (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
…

What did you see instead?

Crash:

runtime: bad pointer in frame _/Users/aleksi/Code/My/go-bug.f at 0xc00007ced0: 0x3b
fatal error: invalid pointer found on stack
Full output
runtime: bad pointer in frame _/Users/aleksi/Code/My/go-bug.f at 0xc00007ced0: 0x3b
fatal error: invalid pointer found on stack

runtime stack:
runtime.throw(0x113e45b, 0x1e)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/panic.go:617 +0x72 fp=0x7ffeefbfeff0 sp=0x7ffeefbfefc0 pc=0x102aa32
runtime.adjustpointers(0xc00007cec8, 0x7ffeefbff0f0, 0x7ffeefbff480, 0x12176f0, 0x1221c40)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/stack.go:591 +0x223 fp=0x7ffeefbff050 sp=0x7ffeefbfeff0 pc=0x1040303
runtime.adjustframe(0x7ffeefbff390, 0x7ffeefbff480, 0x1221c40)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/stack.go:633 +0x351 fp=0x7ffeefbff120 sp=0x7ffeefbff050 pc=0x1040661
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc0000c0000, 0x0, 0x0, 0x7fffffff, 0x1142318, 0x7ffeefbff480, 0x0, ...)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/traceback.go:341 +0x139c fp=0x7ffeefbff3f8 sp=0x7ffeefbff120 pc=0x104cd2c
runtime.copystack(0xc0000c0000, 0x1000, 0x1)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/stack.go:881 +0x25b fp=0x7ffeefbff5b0 sp=0x7ffeefbff3f8 pc=0x1040cbb
runtime.newstack()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/stack.go:1050 +0x2fd fp=0x7ffeefbff740 sp=0x7ffeefbff5b0 pc=0x10410bd
runtime.morestack()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/asm_amd64.s:429 +0x8f fp=0x7ffeefbff748 sp=0x7ffeefbff740 pc=0x10552bf

goroutine 17 [copystack]:
sync.(*Pool).pin(0x122c240, 0x0)
	/usr/local/Cellar/go/1.12.5/libexec/src/sync/pool.go:180 +0x70 fp=0xc00007cb50 sp=0xc00007cb48 pc=0x105f9b0
sync.(*Pool).Get(0x122c240, 0x1233dc0, 0x12aad98)
	/usr/local/Cellar/go/1.12.5/libexec/src/sync/pool.go:128 +0x2f fp=0xc00007cb98 sp=0xc00007cb50 pc=0x105f6cf
fmt.newPrinter(0x3)
	/usr/local/Cellar/go/1.12.5/libexec/src/fmt/print.go:134 +0x31 fp=0xc00007cbc0 sp=0xc00007cb98 pc=0x109dad1
fmt.Sprintf(0x113968d, 0x5, 0xc00007cc50, 0x1, 0x1, 0x10b51c3, 0xc000044490)
	/usr/local/Cellar/go/1.12.5/libexec/src/fmt/print.go:213 +0x26 fp=0xc00007cc18 sp=0xc00007cbc0 pc=0x109e056
testing.fmtDuration(0x318a, 0x11427b8, 0xbc100)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:494 +0xc9 fp=0xc00007cc70 sp=0xc00007cc18 pc=0x10b5049
testing.(*T).report(0xc0000bc100)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:1091 +0x47 fp=0xc00007ccf8 sp=0xc00007cc70 pc=0x10b6827
testing.tRunner.func1(0xc0000bc100)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:829 +0x37a fp=0xc00007cd78 sp=0xc00007ccf8 pc=0x10b9b7a
runtime.call32(0x0, 0x11428d0, 0xc000076670, 0x800000008)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/asm_amd64.s:519 +0x3b fp=0xc00007cda8 sp=0xc00007cd78 pc=0x105552b
panic(0x11138e0, 0x122c0a0)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/panic.go:522 +0x1b5 fp=0xc00007ce38 sp=0xc00007cda8 pc=0x102a565
runtime.panicmem(...)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/panic.go:82
runtime.sigpanic()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/signal_unix.go:390 +0x411 fp=0xc00007ce68 sp=0xc00007ce38 pc=0x103e0d1
runtime.duffcopy()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/duff_amd64.s:382 +0x302 fp=0xc00007ce70 sp=0xc00007ce68 pc=0x1057e62
_/Users/aleksi/Code/My/go-bug.f(0xc0000bc100, 0x12302a0, 0x5ced663f, 0x47aed48, 0x27465f126141)
	/Users/aleksi/Code/My/go-bug/gobug_test.go:32 +0x63 fp=0xc00007cf70 sp=0xc00007ce70 pc=0x10f0cb3
_/Users/aleksi/Code/My/go-bug.TestGoBug(0xc0000bc100)
	/Users/aleksi/Code/My/go-bug/gobug_test.go:37 +0x37 fp=0xc00007cfa8 sp=0xc00007cf70 pc=0x10f0dc7
testing.tRunner(0xc0000bc100, 0x1142100)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:865 +0xc0 fp=0xc00007cfd0 sp=0xc00007cfa8 pc=0x10b5ab0
runtime.goexit()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc00007cfd8 sp=0xc00007cfd0 pc=0x10570a1
created by testing.(*T).Run
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:916 +0x35a

goroutine 1 [chan receive]:
runtime.gopark(0x1142598, 0xc00009e238, 0x170d, 0x3)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:301 +0xef fp=0xc0000afbb8 sp=0xc0000afb98 pc=0x102c77f
runtime.goparkunlock(...)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:307
runtime.chanrecv(0xc00009e1e0, 0xc0000afccf, 0xc000000101, 0x10b5e6a)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/chan.go:524 +0x2ea fp=0xc0000afc48 sp=0xc0000afbb8 pc=0x100591a
runtime.chanrecv1(0xc00009e1e0, 0xc0000afccf)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/chan.go:406 +0x2b fp=0xc0000afc78 sp=0xc0000afc48 pc=0x100561b
testing.(*T).Run(0xc0000bc100, 0x1139fdf, 0x9, 0x1142100, 0x1069b76)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:917 +0x381 fp=0xc0000afd28 sp=0xc0000afc78 pc=0x10b5e91
testing.runTests.func1(0xc0000bc000)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:1157 +0x78 fp=0xc0000afd78 sp=0xc0000afd28 pc=0x10b9c68
testing.tRunner(0xc0000bc000, 0xc0000afe30)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:865 +0xc0 fp=0xc0000afda0 sp=0xc0000afd78 pc=0x10b5ab0
testing.runTests(0xc00009a020, 0x122c2a0, 0x1, 0x1, 0x0)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:1155 +0x2a9 fp=0xc0000afe60 sp=0xc0000afda0 pc=0x10b72f9
testing.(*M).Run(0xc0000ba000, 0x0)
	/usr/local/Cellar/go/1.12.5/libexec/src/testing/testing.go:1072 +0x162 fp=0xc0000aff30 sp=0xc0000afe60 pc=0x10b6272
main.main()
	_testmain.go:42 +0x13e fp=0xc0000aff98 sp=0xc0000aff30 pc=0x10f0fee
runtime.main()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:200 +0x20c fp=0xc0000affe0 sp=0xc0000aff98 pc=0x102c38c
runtime.goexit()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000affe8 sp=0xc0000affe0 pc=0x10570a1

goroutine 2 [force gc (idle)]:
runtime.gopark(0x1142598, 0x1233270, 0x1410, 0x1)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:301 +0xef fp=0xc000044fb0 sp=0xc000044f90 pc=0x102c77f
runtime.goparkunlock(...)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:307
runtime.forcegchelper()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:250 +0xb7 fp=0xc000044fe0 sp=0xc000044fb0 pc=0x102c627
runtime.goexit()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000044fe8 sp=0xc000044fe0 pc=0x10570a1
created by runtime.init.5
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:239 +0x35

goroutine 3 [GC sweep wait]:
runtime.gopark(0x1142598, 0x12333e0, 0x140c, 0x1)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:301 +0xef fp=0xc0000457a8 sp=0xc000045788 pc=0x102c77f
runtime.goparkunlock(...)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:307
runtime.bgsweep(0xc00001a070)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/mgcsweep.go:70 +0x9c fp=0xc0000457d8 sp=0xc0000457a8 pc=0x101f2cc
runtime.goexit()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc0000457e0 sp=0xc0000457d8 pc=0x10570a1
created by runtime.gcenable
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/mgc.go:208 +0x58

goroutine 4 [finalizer wait]:
runtime.gopark(0x1142598, 0x124e860, 0x140f, 0x1)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:301 +0xef fp=0xc000045f58 sp=0xc000045f38 pc=0x102c77f
runtime.goparkunlock(...)
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/proc.go:307
runtime.runfinq()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/mfinal.go:175 +0xa6 fp=0xc000045fe0 sp=0xc000045f58 pc=0x1015f96
runtime.goexit()
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/asm_amd64.s:1337 +0x1 fp=0xc000045fe8 sp=0xc000045fe0 pc=0x10570a1
created by runtime.createfing
	/usr/local/Cellar/go/1.12.5/libexec/src/runtime/mfinal.go:156 +0x61
exit status 2
FAIL	_/Users/aleksi/Code/My/go-bug	0.008s

System details

go version go1.12.5 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/aleksi/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/aleksi/Soft/GOPATH"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/Cellar/go/1.12.5/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.12.5/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
GOROOT/bin/go version: go version go1.12.5 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.12.5
uname -v: Darwin Kernel Version 18.6.0: Thu Apr 25 23:16:27 PDT 2019; root:xnu-4903.261.4~2/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.14.5
BuildVersion:	18F132
lldb --version: lldb-1001.0.13.3
  Swift-5.0
@mark-rushakoff
Copy link
Contributor

It looks like this is fixed on tip (go version devel +0f897f916a Tue May 28 02:52:39 2019 +0000 darwin/amd64):

--- FAIL: TestGoBug (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1059e22]

goroutine 34 [running]:
testing.tRunner.func1(0xc0000b6100)
	/Users/mr/gotip/src/github.com/golang/go/src/testing/testing.go:874 +0x3a3
panic(0x111be80, 0x1231f40)
	/Users/mr/gotip/src/github.com/golang/go/src/runtime/panic.go:619 +0x1b2
_/tmp/xfsa/go-bug.f(0xc0000b6100, 0x12363a0, 0x5ced681d, 0x110d7a80, 0x5fdf7bd0d1eee)
	/tmp/xfsa/go-bug/gobug_test.go:32 +0x55
_/tmp/xfsa/go-bug.TestGoBug(0xc0000b6100)
	/tmp/xfsa/go-bug/gobug_test.go:37 +0x37
testing.tRunner(0xc0000b6100, 0x114d198)
	/Users/mr/gotip/src/github.com/golang/go/src/testing/testing.go:909 +0xbf
created by testing.(*T).Run
	/Users/mr/gotip/src/github.com/golang/go/src/testing/testing.go:960 +0x350
exit status 2
FAIL	_/tmp/xfsa/go-bug	0.009s

The nil pointer dereference is a correct error:

	e := (*Struct)(nil)
	if e != nil {
		return res
	}
	res = append(res, *e) // e is nil here, it can't be dereferenced.

@AlekSi
Copy link
Contributor Author

AlekSi commented May 28, 2019

The nil pointer dereference is a correct error

Yes, that's the real problem. Still, there should not be "fatal error".

I'm not sure it is fixed in the tip. I can "fix" this code for 1.12.5 by removing one field from Struct, but that's hardly a fix.

@josharian

This comment has been minimized.

@josharian josharian added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels May 28, 2019
@josharian josharian added this to the Go1.13 milestone May 28, 2019
@gopherbot

This comment has been minimized.

@josharian

This comment has been minimized.

@josharian
Copy link
Contributor

Never mind. My patch introduces a bug into the compiler. Sigh. It is different than @AlekSi's. I was distracted by the fact that they looked similar and arrived at the same time.

@josharian
Copy link
Contributor

As penance, I'll see about bisecting/investigating @AlekSi's.

@AlekSi
Copy link
Contributor Author

AlekSi commented May 28, 2019

I bisected it down to 69c2c56. /cc @randall77

@josharian
Copy link
Contributor

Slightly minimized version:

package gobug

import (
	"testing"
)

type T struct {
	s   [1]string
	pad [16]uintptr
}

func f(t *int, p *int) []T {
	var res []T
	for {
		var e *T
		res = append(res, *e)
	}
}

func TestGoBug(t *testing.T) {
	f(nil, nil)
}

Run with go test.

I've looked at this a bit but am out of time for now. Notes and impressions:

Dereferencing *e causes a panic. This is caught by testing.tRunner.func1. That calls t.report, which calls fmtDuration, which calls into package fmt, which eventually calls sync.Pool.pin. That call causes stack growth; the crash occurs while adjusting f's pointers.

The data on the stack for the temp for *e appears to be junk. It looks like that is populated by a DUFFCOPY. I speculate that the call to DUFFCOPY fails with a SIGSEGV early on, leaving *e containing junk, but that during the recover, we expect to be able to adjust pointers within it. The sequence of calls leading up to this happen to ensure that we have a 0x2a in the string's pointer slot on the stack and that stack growth occurs during the recover.

@randall77
Copy link
Contributor

Tag I'm it :)

I agree with your analysis.
I think the problem is here:

v13 (16) = VarDef <mem> {.autotmp_5} v58
v50 (16) = LEAQ <*T> {.autotmp_5} v2 : DI
v18 (16) = MOVQconst <*T> [0] : SI
v15 (+16) = DUFFCOPY <mem> [770] v50 v18 v13

After the VarDef, the autotmp is alive. But we load nil in to the source register, then call duffcopy. The duffcopy segfaults trying to copy from nil, without initializing the target. That leaves .autotmp_5 uninitialized, but live.

I think we need to ensure that all instructions between a VarDef for a variable, and that variable being fully initialized, are not faulting instructions (or GC safepoints, for that matter). In this particular case, it looks like we got rid of the nil check in the late nilcheck elim pass. Before that pass, we have:

v12 (+16) = LoweredNilCheck <void> v10 v58
v13 (16) = VarDef <mem> {.autotmp_5} v58
v14 (16) = LEAQ <*T> {.autotmp_5} v2
v15 (16) = DUFFCOPY <mem> [770] v14 v10 v13

(v10 = MOVQconst [0].) After late nilcheck elim, v12 is gone.

We might just need to treat VarDef as clearing the unnecessary list in nilcheckelim2.

@josharian
Copy link
Contributor

@randall77 sounds reasonable. But is it merely an accident that your mid-stack inlining CL is when this begins? When I looked, the generated code before/after was identical.

@randall77
Copy link
Contributor

I'm not sure about that part.

@gopherbot
Copy link

Change https://golang.org/cl/179239 mentions this issue: cmd/compile: don't move nil checks across a VarDef

@randall77
Copy link
Contributor

I don't know why my CL was the culprit. But the test included in the CL I just mailed also fails on 1.11, so it can't be directly caused by my culprit, which landed in 1.12. This bug is very sensitive to what's left over on the stack, so it can come and go with any number of otherwise unrelated CLs.

@golang golang locked and limited conversation to collaborators May 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Projects
None yet
Development

No branches or pull requests

5 participants