Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/link: loading c-shared into Go program crashes on Windows #22192

Open
mattn opened this issue Oct 10, 2017 · 44 comments
Open

cmd/link: loading c-shared into Go program crashes on Windows #22192

mattn opened this issue Oct 10, 2017 · 44 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Milestone

Comments

@mattn
Copy link
Member

mattn commented Oct 10, 2017

What version of Go are you using (go version)?

go version devel +bb0bfd002a Tue Oct 10 01:02:27 2017 +0000 windows/amd64

Does this issue reproduce with the latest release?

Only tip

What operating system and processor architecture are you using (go env)?

set GOARCH=amd64
set GOBIN=
set GOEXE=.exe
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=c:/dev/go
set GORACE=
set GOROOT=c:\go
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\mattn\AppData\Local\Temp\go-build049037059=/tmp/go-build -gno-record-gcc-switches
set CXX=g++
set CGO_ENABLED=1
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config

What did you do?

foo.go

package main

import "C"
import (
	"fmt"
)

//export Foo
func Foo() {
	fmt.Println("foo")
}

func main() {
}

creating foo.dll

go build -buildmode=c-shared -o foo.dll foo.go

call Foo in foo.dll

package main

import (
	"syscall"
)

func main() {
	syscall.NewLazyDLL("foo.dll").NewProc("Foo").Call()
}

What did you see instead?

output foo

What did you expect to see?

panic

fatal error: unexpected signal during runtime execution
[signal 0xc0000005 code=0x0 addr=0xfffffffac5feddfe pc=0x6ae92064]

goroutine 1 [running]:
runtime.throw(0x6af45d8d, 0x2a)
	c:/go/src/runtime/panic.go:616 +0x88 fp=0xc042033730 sp=0xc042033710 pc=0x6aea9398
runtime.sigpanic()
	c:/go/src/runtime/signal_windows.go:155 +0x170 fp=0xc042033760 sp=0xc042033730 pc=0x6aeb9da0
runtime.heapBitsSetType(0xc042034020, 0x10, 0x10, 0x6af1f520)
	c:/go/src/runtime/mbitmap.go:921 +0x544 fp=0xc0420337b8 sp=0xc042033760 pc=0x6ae92064
runtime.mallocgc(0x10, 0x6af1f520, 0x1, 0xc04200c2c0)
	c:/go/src/runtime/malloc.go:740 +0x582 fp=0xc042033858 sp=0xc0420337b8 pc=0x6ae8f202
runtime.convT2Estring(0x6af1f520, 0xc0420338e8, 0x40, 0x6af52100)
	c:/go/src/runtime/iface.go:357 +0x71 fp=0xc042033890 sp=0xc042033858 pc=0x6ae8d361
runtime.preprintpanics(0xc042033978)
	c:/go/src/runtime/panic.go:402 +0xe6 fp=0xc042033908 sp=0xc042033890 pc=0x6aea8896
panic(0x6af26ee0, 0x6afb2ec0)
	c:/go/src/runtime/panic.go:543 +0x41b fp=0xc0420339b0 sp=0xc042033908 pc=0x6aea8edb
runtime.panicmem()
	c:/go/src/runtime/panic.go:63 +0x59 fp=0xc0420339d0 sp=0xc0420339b0 pc=0x6aea7bc9
runtime.sigpanic()
	c:/go/src/runtime/signal_windows.go:161 +0x83 fp=0xc042033a00 sp=0xc0420339d0 pc=0x6aeb9cb3
os.(*File).write(0x0, 0xc04200e068, 0x4, 0x8, 0x0, 0x0, 0xc2034028)
	c:/go/src/os/file_windows.go:224 +0x29 fp=0xc042033a48 sp=0xc042033a00 pc=0x6aeec0a9
os.(*File).Write(0x0, 0xc04200e068, 0x4, 0x8, 0x0, 0x0, 0x0)
	c:/go/src/os/file.go:140 +0x73 fp=0xc042033ac0 sp=0xc042033a48 pc=0x6aeeb943
fmt.Fprintln(0x6af520a0, 0x0, 0xc042033ba8, 0x1, 0x1, 0x4b3600, 0xc042033c48, 0x3)
	c:/go/src/fmt/print.go:255 +0x89 fp=0xc042033b28 sp=0xc042033ac0 pc=0x6af08109
fmt.Println(0xc042033ba8, 0x1, 0x1, 0x0, 0x4b3600, 0xc042033c90)
	c:/go/src/fmt/print.go:264 +0x5e fp=0xc042033b78 sp=0xc042033b28 pc=0x6af081ce
main.Foo()
	C:/dev/go-sandbox/dll/dll.go:10 +0x64 fp=0xc042033bc8 sp=0xc042033b78 pc=0x6af0f334
main._cgoexpwrap_e001a1a948c6_Foo()
	b002/_cgo_gotypes.go:45 +0x27 fp=0xc042033bd8 sp=0xc042033bc8 pc=0x6af0f2b7
runtime.call32(0x0, 0x22fc90, 0x22fe0f, 0x0)
	c:/go/src/runtime/asm_amd64.s:509 +0x42 fp=0xc042033c08 sp=0xc042033bd8 pc=0x6aecde12
runtime.cgocallbackg1(0x0)
	c:/go/src/runtime/cgocall.go:316 +0x1aa fp=0xc042033c88 sp=0xc042033c08 pc=0x6ae82c0a
runtime.cgocallbackg(0x0)
	c:/go/src/runtime/cgocall.go:194 +0xef fp=0xc042033cf0 sp=0xc042033c88 pc=0x6ae829bf
runtime.cgocallback_gofunc(0x402150, 0x44ac60, 0x4b3948, 0xc042033d50)
	c:/go/src/runtime/asm_amd64.s:762 +0xaf fp=0xc042033d10 sp=0xc042033cf0 pc=0x6aecf46f

goroutine 1 [runnable, locked to thread]:
syscall.Syscall(0x775b54e0, 0x3, 0x3, 0x1, 0x0, 0x0, 0x0, 0x0)
	c:/go/src/runtime/syscall_windows.go:171 +0xf9
syscall.SetHandleInformation(0x3, 0x1, 0x30, 0x6af31900)
	c:/go/src/syscall/zsyscall_windows.go:936 +0x6b
syscall.CloseOnExec(0x3)
	c:/go/src/syscall/exec_windows.go:125 +0x3b
syscall.getStdHandle(0xfffffffffffffff6, 0x6af41949)
	c:/go/src/syscall/syscall_windows.go:370 +0x45

As far as I can see, panic occur at (*File) write(b []byte)

func (f *File) write(b []byte) (n int, err error) {
	n, err = f.pfd.Write(b)
	runtime.KeepAlive(f)
	return n, err
}

Maybe, poller in dll is conflicting with main process.

@mattn mattn changed the title cmd/go: panic with calling poller in dll cmd/go: panic with calling function using poller in dll Oct 10, 2017
@mattn
Copy link
Member Author

mattn commented Oct 10, 2017

If calling from C, works fine for me.

#include <windows.h>

typedef void (*pfn)(void);

int
main(int argc, char* argv[]) {
    pfn fn;
    HMODULE h = LoadLibrary("foo.dll");
    fn = (pfn) GetProcAddress(h, "Foo");
    fn();
    return 0;
}

@alexbrainman
Copy link
Member

I suspect what is happening here is that you cannot have 2 Go runtimes coexist in a single process. In particular the way we access TLS has to be changed to allow 2 or more runtimes in one process. I will let Ian decide what to do here.

Alex

@ianlancetaylor
Copy link
Contributor

The intent of c-shared is to permit loading a Go DLL into a C program. You seem to be using c-shared to load a Go DLL into a Go program. That is problematic. The expectation is that you would be using -buildmode=plugin or -buildmode=shared here.

But I can see why Windows developers would expect this to work, especially since neither -buildmode=plugin nor -buildmode=shared currently work on Windows.

But I don't know how to make it work. I'm not sure what we should do.

@ianlancetaylor ianlancetaylor changed the title cmd/go: panic with calling function using poller in dll cmd/go: loading c-shared into Go program crashes on Windows Oct 10, 2017
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows labels Oct 10, 2017
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Oct 10, 2017
@mattn
Copy link
Member Author

mattn commented Oct 11, 2017

@ianlancetaylor This is caused by that runtime.islibrary is set to 1 when it called from exe built with Go. When it called from Go, all of global variables are not initialized (include os.Stdout). As alex mentioned, add way to access TLS or put global variables into bss section?

@ianlancetaylor
Copy link
Contributor

In general, Go requires that the complete executable image have only a single copy of the runtime package in the overall executable image. I don't know how we can make it work if there are multiple copies. A c-shared is intended to have its own copy of the runtime, since a c-shared is expected to be loaded by a C program that will not, of course, have a copy of the runtime. Loading a c-shared by a Go program inevitably means multiple copies of the runtime. Windows does not have ELF-style interposition, so I see no simple way to make the c-shared used the Go program's copy of the runtime.

@mattn
Copy link
Member Author

mattn commented Oct 11, 2017

How about to install main-runtime into copies using function exported from DLL? For example, DLL export runtime.install like below.

type runtime_share strcut {
	name string
	ptr uintptr
}

var runtime_shared []runtime_share

func runtime_install(data []runtime_share)
	for _, rs := range runtime_share
		*((*unsafe.Pointer)(find_runtime_share(rs[i].name).ptr)) =
			*((*unsafe.Pointer)(rs.ptr))
	}
}

And runtime_shared is appended from each packages used in DLL.

os/file.go

func init() {
	runtime_install([]runtime_share{
		{"os.Stdout", &os.Stdout},
		{"os.Stdin", &os.Stdin},
		...
	})
}

@ianlancetaylor
Copy link
Contributor

That suggestion seems to get the initial values the same, which is a start, but we really need to have a single instance of each variable. When one instance of the runtime changes a variable, the other instance of the runtime has to see that change.

@mattn
Copy link
Member Author

mattn commented Oct 11, 2017

Yes, also runtime functions in DLL should be replated to runtime functions in main.

@mattn
Copy link
Member Author

mattn commented Oct 11, 2017

This is talking about -buildmode=shared. Let's focus to c-shared. At least, about c-shared, we need to initialize runtime as same as call from C.

@mattn
Copy link
Member Author

mattn commented Oct 11, 2017

Sorry, I was confused. runtime.islibrary is not related on this issue. This seems to be conflicting of bss section?

@alexbrainman
Copy link
Member

But I can see why Windows developers would expect this to work, especially since neither -buildmode=plugin nor -buildmode=shared currently work on Windows.

What is involved implementing either of those? Do I start with correspondent /misc/cgo/test... and see where it takes me? Do you think it is hard? Do you think it is possible to implement them on WIndows?

I will also try to see why @mattn example fails (when I have time). Maybe it is something simple.

I also wonder what happens, if two or more Go DLLs are loaded from C program. I suspect we might get a similar crash too.

Alex

@ianlancetaylor
Copy link
Contributor

Unfortunately I don't remember all of the details of Windows DLLs. My vague recollection is that a Windows DLL needs to have a clear notion of whether a symbol is defined in a different DLL at compile time. If that is correct, it doesn't fit will with the current approach, in which we only really know whether a symbol is defined in a different DLL at link time. That would be the first question to resolve.

@alexbrainman
Copy link
Member

My vague recollection is that a Windows DLL needs to have a clear notion of whether a symbol is defined in a different DLL at compile time.

Windows DLL is similar to Windows EXE regarding external symbols - all external symbols (functions) are kept in a DLL file. There are 2 ways that external DLL functions are found:

  • executable or dll can find them manually by calling LoadLibrary and GetProcAddress Windows APIs;
  • executable or dll can have "import" section that provides instructions for Windows program loader about how to find all required code.

Perhaps you are talking about something else.

Alex

@ianlancetaylor
Copy link
Contributor

The question is not so much where functions are found, as how the compiler needs to handle the code that refers to those functions. When compiling a reference to a variable defined in a different DLL, does the compiler need to know that the variable is defined elsewhere?

@alexbrainman
Copy link
Member

When compiling a reference to a variable defined in a different DLL, does the compiler need to know that the variable is defined elsewhere?

Compiler does need to know that function lives in external DLL. Like I said before both Windows executables and DLLs have "import" section that is used by Windows to resolve all external call sites when program starts. So as long as Go compiler arranges all those external calls as required, it all just works. That is what we do for Windows executables already.

Alex

@alexbrainman
Copy link
Member

I suspect this issue might be due to us using unsupported TLS slot.

https://devblogs.microsoft.com/oldnewthing/20190418-00/?p=102428

Windows amd64 is using

MOVQ 0x28(GS), CX

instruction to read Arbitrary data slot from https://en.wikipedia.org/wiki/Win32_Thread_Information_Block

But we need to read TLS slots, 4/8 bytes per slot, 64 slots instead. The problem I am having is that TLS slots, 4/8 bytes per slot, 64 slots contains 64 8 byte slots, and slot index is a variable, unlike current 0x28(GS) that we have now. I could not come up with correct amd64 instruction for that.

Our windows-arm port is already doing correct hing - see runtime·save_g and runtime·load_g. But, I still could not copy that code onto amd64. Perhaps amd64 and arm assemblers are different.

@ianlancetaylor can you help? Thank you.

Alex

@ianlancetaylor
Copy link
Contributor

I don't have any simple solutions. I would look into the way that Android supports TLS on amd64 before Android Q.

@whtiehack
Copy link

whtiehack commented Sep 8, 2019

Hi,how is this problem now?

This is another repo that can reproduce this problem,and random errors.

@whtiehack
Copy link

Funny thing: adding a MessageBox before calling the DLL's exported function makes it work.

I think this is your illusion.
The problem still exists.

@DaLynX
Copy link

DaLynX commented Dec 16, 2019

Maybe but the test program does not crash. It runs, prints, opens the dialog and returns fine.

@whtiehack
Copy link

Is there any progress?

@86speed
Copy link

86speed commented Jan 4, 2021

@mattn Can this problem be dealt with

@mattn
Copy link
Member Author

mattn commented Jan 4, 2021

@86speed This problem is hard to fix since current implementation can not use external runtime's goroutine from main goroutine.

@VeNoMouS
Copy link

VeNoMouS commented Jul 21, 2022

1.5 yrs on.... where we at...

@xyhmnb
Copy link

xyhmnb commented Sep 2, 2022

hello, how to solve the problem?

@qmuntal
Copy link
Contributor

qmuntal commented Sep 9, 2022

I suspect this issue might be due to us using unsupported TLS slot.

Our windows-arm port is already doing correct hing - see runtime·save_g and runtime·load_g. But, I still could not copy that code onto amd64. Perhaps amd64 and arm assemblers are different.

@alexbrainman I can confirm that this issue is not reproduced on windows/arm64, at least the basic reproducer in the issue description works well.

I'm also leaning towards a TLS issue. Will give it a try.

@qmuntal
Copy link
Contributor

qmuntal commented Sep 15, 2022

I'm also leaning towards a TLS issue. Will give it a try.

I have a (yet-not-submitted) change that fixes this issue by using a proper TLS slot instead of the arbitrary pointer memory. Still fighting a couple of sharp edges, I'll try to submit it for review during the next week if I time allows.

@gopherbot
Copy link

Change https://go.dev/cl/431775 mentions this issue: runtime,cmd/internal/obj/x86: use TEB TLS slots on windows/amd64

@cherrymui
Copy link
Member

Sorry for being late here (just saw this). But with the current design, I don't think the c-shared build mode works well (and intended to) with loading from a Go program. Currently it is designed to be the only instance of the Go runtime in the program. It doesn't know other instances of the Go runtime, and doesn't do anything to combine the various metadata the runtime needs (some needs to be deduplicated, some needs to be merged, etc.)

It is possible that it works for some simple use cases, if we carefully keep the two Go world completely separated. (And the TLS patch would help on that.) But it wouldn't be surprised that things would go wrong with more complex programs, especially with passing references around between two Go worlds. I'm not sure it is the right direction to pursue this path.

@qmuntal
Copy link
Contributor

qmuntal commented Sep 19, 2022

It is possible that it works for some simple use cases, if we carefully keep the two Go world completely separated. (And the TLS patch would help on that.) But it wouldn't be surprised that things would go wrong with more complex programs, especially with passing references around between two Go worlds. I'm not sure it is the right direction to pursue this path.

Your call. The patch is there in case someone wants to keep pursuing this path.

Additionally, the test coverage for loading Go shared libraries on other Go processes is next to nothing, so I wouldn't feel confortable saying that this scenario is supported, even if my patch is merged and the runtime metadata is combined.

@cherrymui
Copy link
Member

Thanks.

Additionally, the test coverage for loading Go shared libraries on other Go processes is next to nothing

Right. It is nothing because it is not currently intended to be supported. If in the future we support this mode we will need more tests.

@alexbrainman
Copy link
Member

@cherrymui

Regardless of your decision if calling Go DLL from Go executable, I think you should consider using TlsAlloc Windows API to allocate TLS slots ( what https://go.dev/cl/431775 does ).

Currently Go uses unsupported way to store TLS register ( #22192 (comment) ). If any DLL imported by Go program decides to use the same unsupported trick, chaos ensues. Go program does not have control of what DLLs it imports, because these DLLs are installed by companies IT departments and people who don't realise what they are installing (think of antivirus and any other tricky software).

So there is a good chance we fix some unexplained bugs by replacing current TLS implementation with TlsAlloc.

Just MHO.

Alex

@cherrymui
Copy link
Member

Yeah, I'm not opposing using TlsAlloc, which I think is probably a good thing to do by itself. I'm just saying this doesn't necessarily make loading c-shared by a Go program (this issue) work. Removed the Hold on the CL. Thanks.

@alexbrainman
Copy link
Member

@mattn

Please try https://go-review.googlesource.com/c/go/+/431775 to see if it fixes your problem.

Thank you.

Alex

@qmuntal
Copy link
Contributor

qmuntal commented Oct 3, 2022

@aarzilli @derekparker take a look at https://go.dev/cl/431775. If it is finally merged, Delve assumption about G always being at 0x28 bytes from TLS will no longer be true. It will have to be inferred using runtime.tls_g like it is already happing in windows/arm64.

@mattn
Copy link
Member Author

mattn commented Oct 3, 2022

@alexbrainman @qmuntal I confirmed the CL fixes my problem. Thank you.

@mattn
Copy link
Member Author

mattn commented Oct 3, 2022

It seems that cl431775 fixes #11100 too. Right?

@qmuntal
Copy link
Contributor

qmuntal commented Oct 3, 2022

It seems that cl431775 fixes #11100 too. Right?

I don't see how CL 431775 could directly fix #11100. dlclose is a Linux API which is not aware of Windows TEB nor Windows-specific TLS handling.

@mattn
Copy link
Member Author

mattn commented Oct 3, 2022

Ah, issue #1110 is not for only Windows. Ignore this.

@anacrolix
Copy link
Contributor

I don't think this issue is constrained to Windows only. It also occurs on Darwin. Reproducing isn't the same, but any loading a c-shared dynamic library from Go will have issues.

gopherbot pushed a commit that referenced this issue Nov 14, 2022
This CL redesign how we get the TLS pointer on windows/amd64.

We were previously reading it from the [TEB] arbitrary data slot,
located at 0x28(GS), which can only hold 1 TLS pointer.

With this CL, we will read the TLS pointer from the TEB TLS slot array,
located at 0x1480(GS). The TLS slot array can hold multiple
TLS pointers, up to 64, so multiple Go runtimes running on the
same thread can coexists with different TLS.

Each new TLS slot has to be allocated via [TlsAlloc],
which returns the slot index. This index can then be used to get the
slot offset from GS with the following formula: 0x1480 + index*8

The slot index is fixed per Go runtime, so we can store it
in runtime.tls_g and use it latter on to read/update the TLS pointer.

Loading the TLS pointer requires the following asm instructions:

  MOVQ runtime.tls_g, AX
  MOVQ AX(GS), AX

Notice that this approach is also implemented on windows/arm64.

[TEB]: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block
[TlsAlloc]: https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-tlsalloc

Updates #22192

Change-Id: Idea7119fd76a3cd083979a4d57ed64b552fa101b
Reviewed-on: https://go-review.googlesource.com/c/go/+/431775
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Alex Brainman <alex.brainman@gmail.com>
@VeNoMouS
Copy link

5 years... and were all still waiting for this to get sorted...

@cherrymui
Copy link
Member

cherrymui commented Mar 27, 2023

@Linkangyis proposing alternative solutions is always welcome, but having intensive discussion about your project is probably not going to solve this issue. If you have any idea that helps this issue, you're welcome to discuss or contribute a CL. Otherwise, please keep the discussion about your project on your own issue tracker or a forum. Thanks.

@Linkangyis
Copy link

@cherrymui Okay, I have deleted the relevant information. Sorry for the inconvenience

以上为机翻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows
Projects
None yet
Development

No branches or pull requests