plugin: impossible type kind 0 #17709

Closed
aclements opened this issue Oct 31, 2016 · 15 comments
Labels: FrozenDueToAge, NeedsInvestigation

@aclements
Member

What version of Go are you using (go version)?

go version devel +f9e1adb Mon Oct 31 21:58:08 2016 +0000 linux/386

What operating system and processor architecture are you using (go env)?

GOARCH="386"
GOBIN=""
GOEXE=""
GOHOSTARCH="386"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/austin/r/go"
GORACE=""
GOROOT="/home/austin/go.dev"
GOTOOLDIR="/home/austin/go.dev/pkg/tool/linux_386"
GCCGO="gccgo"
GO386=""
CC="gcc"
GOGCCFLAGS="-fPIC -m32 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build397883128=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

git checkout f9e1adb
cd src && GOARCH=386 GOHOSTARCH=386 ./make.bash
cd ../misc/cgo/testplugin && ./test.bash

What did you expect to see?

PASS

What did you see instead?

runtime: impossible type kind 0
fatal error: runtime: impossible type kind

goroutine 1 [running]:
runtime.throw(0x815bfa2, 0x1d)
	/home/austin/go.dev/src/runtime/panic.go:596 +0x7c fp=0x18551a60 sp=0x18551a54
runtime.typesEqual(0xf51e1780, 0xf52bd580, 0xf52bd580)
	/home/austin/go.dev/src/runtime/type.go:653 +0x792 fp=0x18551ae8 sp=0x18551a60
runtime.typesEqual(0xf51e4820, 0xf52c2700, 0xe105ecb)
	/home/austin/go.dev/src/runtime/type.go:608 +0x604 fp=0x18551b70 sp=0x18551ae8
runtime.typelinksinit()
	/home/austin/go.dev/src/runtime/type.go:506 +0x2f5 fp=0x18551c84 sp=0x18551b70
plugin.lastmoduleinit(0xf74901f0, 0x1850db10, 0xf7491668)
	/home/austin/go.dev/src/runtime/plugin.go:46 +0x14a fp=0x18551ce8 sp=0x18551c84
plugin.open(0x185103c0, 0x36, 0x0, 0x0, 0x0)
	/home/austin/go.dev/src/plugin/plugin_dlopen.go:72 +0x1eb fp=0x18551e14 sp=0x18551ce8
plugin.Open(0x185103c0, 0x36, 0x185103c0, 0x36, 0x0)
	/home/austin/go.dev/src/plugin/plugin.go:30 +0x29 fp=0x18551e2c sp=0x18551e14
main.main()
	/home/austin/go.dev/misc/cgo/testplugin/src/host/host.go:59 +0x392 fp=0x18551fc8 sp=0x18551e2c
runtime.main()
	/home/austin/go.dev/src/runtime/proc.go:185 +0x1f2 fp=0x18551ff0 sp=0x18551fc8
runtime.goexit()
	/home/austin/go.dev/src/runtime/asm_386.s:1629 +0x1 fp=0x18551ff4 sp=0x18551ff0

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
	/home/austin/go.dev/src/runtime/asm_386.s:1629 +0x1

I noticed this failure on the dashboard, where linux-386, linux-386-clang, and linux-386-sid all failed at commit f9e1adb. I was also able to reproduce it locally. This seems to be completely deterministic at this commit, but cleared up in the very next commit. Neither commit has anything to do with plugins, and f9e1adb was basically a trivial refactoring of a runtime function, so I suspect this has something to do with particular load addresses or alignment.

aclements added the NeedsInvestigation label Oct 31, 2016
aclements added this to the Go1.8 milestone Oct 31, 2016
@aclements
Member Author

Looks like this started today:

$ greplogs -E "impossible type kind 0" -dashboard -md -l
2016-10-31T04:49:52-f4c7a12/linux-amd64-noopt
2016-10-31T16:39:55-53c004f/linux-amd64-noopt
2016-10-31T17:17:05-398e861/linux-amd64-noopt
2016-10-31T17:17:42-4b90b7a/linux-amd64-noopt
2016-10-31T17:17:46-92568bc/linux-amd64-noopt
2016-10-31T17:39:50-61ffec4/linux-amd64-noopt
2016-10-31T18:05:59-051cf38/linux-amd64-noopt
2016-10-31T19:13:52-854ae03/linux-amd64-noopt
2016-10-31T19:14:01-7b50bd8/linux-amd64-noopt
2016-10-31T19:18:47-49b2dd5/linux-amd64-noopt
2016-10-31T19:20:44-b5203be/linux-amd64-noopt
2016-10-31T19:26:51-4de3df8/linux-amd64-noopt
2016-10-31T19:26:59-35d342e/linux-amd64-noopt
2016-10-31T19:27:05-3f6070c/linux-amd64-noopt
2016-10-31T19:34:18-eec1e5d/linux-amd64-noopt
2016-10-31T21:58:08-f9e1adb/linux-386
2016-10-31T21:58:08-f9e1adb/linux-386-clang
2016-10-31T21:58:08-f9e1adb/linux-386-sid

$ greplogs -E "impossible type kind 0" -dashboard -l | findflakes -paths
First observed f4c7a12 31 Oct 04:49 2016 (24 commits ago)
Last observed  f9e1adb 31 Oct 21:58 2016 (2 commits ago)
10% chance failure is still happening
68% failure probability (16 of 23 commits)
Likely culprits:
   68% f4c7a12 runtime: make module typemaps visible to the GC
   22% 9da7058 cmd/link, plugin: use full plugin path for symbols
    7% 590fce4 compress/flate: tighten the BestSpeed max match offset bound.
No known past failures

@randall77
Contributor

CL 32313 might be tripping up on this same issue.

@crawshaw
Member

I believe this is fixed by https://golang.org/cl/29692. Someone want to try it out/review it?

@crawshaw
Member

crawshaw commented Nov 1, 2016

The good news is I have replicated this reliably, and it does not appear to be related to threads. Bad news is CL 29692 does not fix it.

@crawshaw
Member

crawshaw commented Nov 1, 2016

The problem appears to be overly aggressive dead code elimination. I can fix it by marking all itablinks in a plugin as reachable.

@gopherbot

CL https://golang.org/cl/32532 mentions this issue.

@crawshaw
Member

crawshaw commented Nov 1, 2016

OK that doesn't work. Here's what I know so far. With these extra diagnostics:

diff --git a/src/runtime/type.go b/src/runtime/type.go
index a3a19b9..4ed5e59 100644
--- a/src/runtime/type.go
+++ b/src/runtime/type.go
@@ -604,7 +604,14 @@ func typesEqual(t, v *_type) bool {
                        if tname.pkgPath() != vname.pkgPath() {
                                return false
                        }
-                       if !typesEqual(it.typ.typeOff(tm.ityp), iv.typ.typeOff(vm.ityp)) {
+                       tityp := it.typ.typeOff(tm.ityp)
+                       vityp := iv.typ.typeOff(vm.ityp)
+                       if tk, vk := (tityp.kind & kindMask), (vityp.kind & kindMask); tk == 0 || vk == 0 {
+                               println("typesEqual interface", t.string(), "has bad method itype:")
+                               println("\ttityp.kind=", tityp.kind, "tname=", tname.name())
+                               println("\tvityp.kind=", vityp.kind, "vname=", vname.name())
+                       }
+                       if !typesEqual(tityp, vityp) {
                                return false
                        }
                }

I get:

:~/go/misc/cgo/testplugin$ ./test.bash 
typesEqual interface error has bad method itype:
    tityp.kind= 0 tname= 
    vityp.kind= 0 vname= 
runtime: impossible type kind 0
fatal error: runtime: impossible type kind

goroutine 1 [running]:
runtime.throw(0x54b306, 0x1d)
    /usr/local/google/home/crawshaw/go/src/runtime/panic.go:596 +0x95 fp=0xc420077458 sp=0xc420077438
runtime.typesEqual(0x7f64f860c900, 0x7f64f8922980, 0x0)
    /usr/local/google/home/crawshaw/go/src/runtime/type.go:659 +0xb5b fp=0xc420077580 sp=0xc420077458
runtime.typesEqual(0x7f64f8610fc0, 0x7f64f8927000, 0xc40e105ecb)
    /usr/local/google/home/crawshaw/go/src/runtime/type.go:614 +0x964 fp=0xc4200776a8 sp=0xc420077580
runtime.typelinksinit()
    /usr/local/google/home/crawshaw/go/src/runtime/type.go:506 +0x43f fp=0xc4200778b8 sp=0xc4200776a8
plugin.lastmoduleinit(0x13db420, 0xc42000fb58, 0x13dcbd0)
    /usr/local/google/home/crawshaw/go/src/runtime/plugin.go:47 +0x1e7 fp=0xc420077990 sp=0xc4200778b8
plugin.open(0xc420014140, 0x45, 0x0, 0x0, 0x0)
    /usr/local/google/home/crawshaw/go/src/plugin/plugin_dlopen.go:72 +0x25d fp=0xc420077bb0 sp=0xc420077990
plugin.Open(0xc420014140, 0x45, 0xc420014140, 0x45, 0x0)
    /usr/local/google/home/crawshaw/go/src/plugin/plugin.go:30 +0x35 fp=0xc420077be8 sp=0xc420077bb0
main.main()
    /usr/local/google/home/crawshaw/go/misc/cgo/testplugin/src/host/host.go:59 +0x42c fp=0xc420077f88 sp=0xc420077be8
runtime.main()
    /usr/local/google/home/crawshaw/go/src/runtime/proc.go:185 +0x20a fp=0xc420077fe0 sp=0xc420077f88
runtime.goexit()
    /usr/local/google/home/crawshaw/go/src/runtime/asm_amd64.s:2184 +0x1 fp=0xc420077fe8 sp=0xc420077fe0

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /usr/local/google/home/crawshaw/go/src/runtime/asm_amd64.s:2184 +0x1

Instrumenting the dead code analysis in the linker shows the symbol "type.error" being processed, along with its method signature "Error() (type.string)". The two relevant symbols type..namedata.Error. and type.func() string are marked, and with nm I can see them in both plugins.

Right now the most interesting thing in this is that tname.name() is resolving to an empty string instead of "Error". Next up is working out what the real nameOff value should be and how far we are off. (Did a dynamic relocation move our base to the wrong module?)

@randall77
Contributor

randall77 commented Nov 1, 2016

When I looked at this I dumped the raw data that t was pointing to. It was landing in type data, but not at the start of a type. It was 32 bytes before the start of a real type descriptor.
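
The kind of check described here can be sketched roughly as follows. This is only an illustration: `distanceToNextStart` and the addresses are hypothetical, not taken from the runtime or from the actual dump. Given the sorted start addresses of known descriptors, it reports how far a suspect pointer is from the next real start.

```go
package main

import (
	"fmt"
	"sort"
)

// distanceToNextStart is a hypothetical helper. starts must be sorted in
// ascending order. It returns the number of bytes between addr and the
// first descriptor start that is >= addr. A gap of 0 means addr is itself
// a descriptor start. ok is false when addr lies past the last start.
func distanceToNextStart(starts []uintptr, addr uintptr) (gap uintptr, ok bool) {
	i := sort.Search(len(starts), func(i int) bool { return starts[i] >= addr })
	if i == len(starts) {
		return 0, false
	}
	return starts[i] - addr, true
}

func main() {
	// Made-up addresses, chosen only so that the suspect pointer
	// sits 32 bytes before a descriptor start.
	starts := []uintptr{0x1000, 0x1060, 0x10c0}
	gap, ok := distanceToNextStart(starts, 0x1040)
	fmt.Println(gap, ok) // prints "32 true"
}
```

A non-zero gap means the pointer lands inside type data rather than at the start of a descriptor, which is what one would expect from an offset applied to the wrong base.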

@crawshaw
Member

crawshaw commented Nov 1, 2016

Yeah I suspect an offset is being resolved against the wrong module. Ugh.

@crawshaw
Member

crawshaw commented Nov 1, 2016

I think I've got it. The underlying array in interfacetype.mhdr is dynamically relocated in the freshly loaded module to the equivalent array in an earlier module. Then typesEqual attempts to resolve offsets from inside that array against the _type in the outer interfacetype, which is in a different module.

The full fix is to finish the project of removing pointers from the type information. But that's a substantial undertaking that will have to wait for 1.9. For now I'm going to read through this code, find any potential typeOff/nameOff method calls matching against the wrong module and switch to using the slice array as the base pointer.
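
To make the failure mode concrete, here is a toy model, not the runtime's actual data structures. The `module`, `nameOff`, `method`, and `resolveName` definitions below are simplified stand-ins invented for illustration. The point is only that an offset is meaningful relative to the module that owns it, and that resolving it against another module's base silently yields garbage (here, an empty name, as in the diagnostics above).

```go
package main

import "fmt"

// module is a hypothetical stand-in for one loaded module's type data.
// Names are stored as one length byte followed by the name bytes.
type module struct {
	name string
	data []byte
}

// nameOff is an offset into one specific module's data.
type nameOff int

// resolveName reads the length-prefixed name at off within m.
func resolveName(m *module, off nameOff) string {
	n := int(m.data[off])
	return string(m.data[int(off)+1 : int(off)+1+n])
}

// method is a toy interface-method entry. It stores only an offset,
// so it must be resolved against the module that owns the entry.
type method struct {
	owner *module
	name  nameOff
}

func main() {
	// "Error" lives at offset 4 in the host and at offset 8 in the plugin.
	host := &module{name: "host", data: []byte("\x00\x00\x00\x00\x05Error")}
	plugin := &module{name: "plugin", data: []byte("\x00\x00\x00\x00\x00\x00\x00\x00\x05Error")}

	// The method array was relocated to the host's copy, so the offset
	// it carries is host-relative.
	m := method{owner: host, name: 4}

	fmt.Printf("wrong base: %q\n", resolveName(plugin, m.name))  // prints: wrong base: ""
	fmt.Printf("right base: %q\n", resolveName(m.owner, m.name)) // prints: right base: "Error"
}
```

Using the array's own module as the base corresponds to the interim fix described above.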

@gopherbot

CL https://golang.org/cl/32513 mentions this issue.

@crawshaw
Member

crawshaw commented Nov 1, 2016

cc @mwhudson, as I suspect this is a problem for -buildmode=shared on Go 1.7.

@mwhudson
Contributor

mwhudson commented Nov 1, 2016

On 2 November 2016 at 08:51, David Crawshaw <notifications@github.com> wrote:

> cc @mwhudson, as I suspect this is a problem for -buildmode=shared on Go 1.7.

How would I be able to tell? (I'll try applying some thinking too but I might as well ask you in parallel :-p)

@mwhudson
Contributor

mwhudson commented Nov 1, 2016

On 2 November 2016 at 07:55, David Crawshaw <notifications@github.com> wrote:

> I think I've got it. The underlying array in interfacetype.mhdr is dynamically relocated in the freshly loaded module to the equivalent array in an earlier module. Then typesEqual attempts to resolve offsets from inside that array against the _type in the outer interfacetype, which is in a different module.

I know you have a different fix, but my instinctive reaction to this problem statement is that some symbol or other should be local. Is that not the case here?

@crawshaw
Member

crawshaw commented Nov 2, 2016

> How would I be able to tell?

In CL 32513 I swap the typeOff method for resolveTypeOff. You can modify it to do both, compare the outputs, and if they're not equal, panic.
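
The shape of that check can be sketched with a toy model. Everything here (`module`, `resolve`, `checkedResolve`) is a hypothetical stand-in rather than the runtime's typeOff or resolveTypeOff. It returns an error on mismatch so the example is easy to run, where the suggestion above would panic instead.

```go
package main

import "fmt"

// module is a hypothetical stand-in for one loaded module's type data.
// The slice index plays the role of a type offset.
type module struct {
	name  string
	types []string
}

// resolve looks up off relative to base.
func resolve(base *module, off int) string { return base.types[off] }

// checkedResolve resolves off both ways: relative to the module of the
// outer type, and relative to the module that owns the method array.
// It reports an error if the two results disagree.
func checkedResolve(outer, arrayOwner *module, off int) (string, error) {
	a := resolve(outer, off)
	b := resolve(arrayOwner, off)
	if a != b {
		return "", fmt.Errorf("offset %d: %s gives %q, %s gives %q",
			off, outer.name, a, arrayOwner.name, b)
	}
	return b, nil
}

func main() {
	// Two modules that lay out the same types in a different order.
	host := &module{name: "host", types: []string{"func() string", "int"}}
	plugin := &module{name: "plugin", types: []string{"int", "func() string"}}

	if _, err := checkedResolve(plugin, host, 0); err != nil {
		fmt.Println("mismatch:", err)
	}
}
```

If a check of this kind never fires on a given build, the two bases agree for every offset exercised, and the build is not hitting this particular problem on those code paths.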

> some symbol or other should be local

I think you could fix this explicit problem with local symbols, yes. But it's a little tricky because you don't want the symbols in the type section that turn into *_type to be local in case there are still dynamic relocations hanging about. Overall this change seemed easier.

@golang golang locked and limited conversation to collaborators Nov 2, 2017