runtime: deadlock at _cgo_wait_runtime_init_done #42190

Open
tungnh28 opened this issue Oct 24, 2020 · 27 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. OS-Windows

@tungnh28

tungnh28 commented Oct 24, 2020

What version of Go are you using (go version)?

$ go version
go version go1.15.3 windows/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\nguyenha\AppData\Local\go-build
set GOENV=C:\Users\nguyenha\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=C:\Users\nguyenha\go\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\nguyenha\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=D:\work\projects\mimic\mimic-core\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\nguyenha\AppData\Local\Temp\go-build126925810=/tmp/go-build -gno-record-gcc-switches

What did you do?

set CGO_ENABLED=1
set CC=clang
go build -buildmode=c-archive -o bin/libcrawler.lib -v core/export

Go side:
package main

import "C"

//export Init
func Init() {
	print("something")
}

// Required by cgo.
func main() {
}

From C++ side:
#include <iostream>
#include "libcrawler.h"
#include

using namespace std;

int main() {
    cout << "Hello, world1!" << endl;

    // x_cgo_notify_runtime_init_done(NULL);
    Init();
    cout << "Hello, world!" << endl;
    return 0;
}

I use clang-cl to compile the C++ project.

What did you expect to see?

The Go function Init() can be called from C++.

What did you see instead?

A deadlock at _cgo_wait_runtime_init_done. As a test, I tried calling x_cgo_notify_runtime_init_done(NULL); first. That got past the wait loop, but the program crashed later.

@ianlancetaylor ianlancetaylor changed the title Cgo: deadlock at _cgo_wait_runtime_init_done runtime: deadlock at _cgo_wait_runtime_init_done Oct 24, 2020
@ianlancetaylor
Contributor

Is the code you are showing us the complete test case? How exactly are you building and running the program? Please try to show us exactly how we can reproduce the problem ourselves. Thanks.

@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 24, 2020
@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone Oct 24, 2020
@tungnh28
Author

tungnh28 commented Oct 24, 2020

The build command, export.go, and client.cpp are exactly as I posted.
For the C++ side, I will create a simple project and send it to you.

@tungnh28
Author

tungnh28 commented Oct 24, 2020

Test.zip.
Here you go: a Visual Studio project, compiled using clang-cl. If you find anything, or have any hint that I could try on my side, please let me know :). @ianlancetaylor

@ianlancetaylor
Contributor

Thanks, but, sorry, I can't use a Visual Studio project. I'm not a Windows user at all.

Is there a way that you can show me the exact clang commands that you use to build the executable? Thanks.

@tungnh28
Author

tungnh28 commented Oct 26, 2020

The commands on Windows:

cd your_project_root
clang -Xlinker /libpath:D:\work\projects\test\Test\Test\lib -Xlinker libcrawler.a Test.cpp libcrawler.h

It looks like the problem is specific to Windows and to buildmode=c-archive. I have tried buildmode=c-shared and it works.

@dzonerzy

Same issue here. It works fine with c-shared, but with c-archive _cgo_wait_runtime_init_done just hangs in an infinite loop because runtime_init_done is never set to 1, so the event is never triggered. I'm still trying to understand the root cause.

@james-li

james-li commented Jan 21, 2021

Same issue with c-shared on Linux.

I compiled a .so file to use with LD_PRELOAD. The deadlock occurs when the .so file is preloaded:

[golib]# LD_PRELOAD=`pwd`/libshellhook.so bash  grepconf.sh -c
<hang here>

When I debug this with gdb, the stack shows:

(gdb) bt
#0  0x00007fdf8d79a995 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fdf8e313723 in _cgo_wait_runtime_init_done () at gcc_libinit.c:40
#2  0x00007fdf8e313446 in execve (filename=0x1909350 "/usr/libexec/grepconf.sh", argv=0x1908b00, envp=0x1907d10) at _cgo_export.c:26
#3  0x000000000042fc02 in shell_execve ()

(gdb) info threads
  Id   Target Id         Frame 
* 1    Thread 0x7fdf8e856740 (LWP 25485) "bash" 0x00007fdf8d79a995 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0

go version

$ go version
go version go1.14.9 linux/amd64

It works fine on Fedora 32 and CentOS 6, but hangs on Ubuntu 18.04 and CentOS 7 (3.10.0-862.11.6.el7.x86_64.debug).

@james-li

james-li commented Jan 21, 2021

I've found some clues. I put some debug code in /usr/lib/golang/src/runtime/cgo/gcc_libinit.c to trace the x_cgo_notify_runtime_init_done and _cgo_wait_runtime_init_done.

$ diff -u  ./src/runtime/cgo/gcc_libinit.c /usr/lib/golang/src/runtime/cgo/gcc_libinit.c
--- ./src/runtime/cgo/gcc_libinit.c	2021-01-21 13:31:03.192576533 +0800
+++ /usr/lib/golang/src/runtime/cgo/gcc_libinit.c	2021-01-21 16:08:13.426097949 +0800
@@ -11,6 +11,8 @@
 #include <stdlib.h>
 #include <string.h> // strerror
 #include <time.h>
+#include <unistd.h>
+#include <sys/types.h>
 #include "libcgo.h"
 #include "libcgo_unix.h"
 
@@ -28,7 +30,9 @@
 	if (err != 0) {
 		fprintf(stderr, "pthread_create failed: %s", strerror(err));
 		abort();
-	}
+	}else{
+		fprintf(stderr, "pthread_create success: %d:%x\n", getpid(), (unsigned)p);
+        }
 }
 
 uintptr_t
@@ -37,7 +41,15 @@
 
 	pthread_mutex_lock(&runtime_init_mu);
 	while (runtime_init_done == 0) {
-		pthread_cond_wait(&runtime_init_cond, &runtime_init_mu);
+                struct timespec ts;
+                clock_gettime(CLOCK_REALTIME, &ts);
+                ts.tv_sec += 1;
+		if(pthread_cond_timedwait(&runtime_init_cond, &runtime_init_mu, &ts)){
+                        fprintf(stderr, "wait runtime init done at thread %d:%u, value address %p\n", 
+                                (int)getpid(), (unsigned)pthread_self(), &runtime_init_done);
+                        continue;
+                }
+//                pthread_cond_wait(&runtime_init_cond, &runtime_init_mu);
 	}
 
 	// TODO(iant): For the case of a new C thread calling into Go, such
@@ -49,7 +61,10 @@
 	// initialization to be complete anyhow, later, by waiting for
 	// main_init_done to be closed in cgocallbackg1. We should wait here
 	// instead. See also issue #15943.
-	pfn = cgo_context_function;
+        if(runtime_init_done)
+                pfn = cgo_context_function;
+        else
+                pfn = nil;
 
 	pthread_mutex_unlock(&runtime_init_mu);
 	if (pfn != nil) {
@@ -64,10 +79,14 @@
 
 void
 x_cgo_notify_runtime_init_done(void* dummy __attribute__ ((unused))) {
-	pthread_mutex_lock(&runtime_init_mu);
-	runtime_init_done = 1;
-	pthread_cond_broadcast(&runtime_init_cond);
-	pthread_mutex_unlock(&runtime_init_mu);
+        int tries = 0;
+        while(tries ++ < 3){
+                fprintf(stderr, "notify runtime init done at thread %d:%u, value address %p\n", (int)getpid(), (unsigned)pthread_self(), &runtime_init_done);
+                pthread_mutex_lock(&runtime_init_mu);
+                runtime_init_done = 1;
+                pthread_cond_broadcast(&runtime_init_cond);
+                pthread_mutex_unlock(&runtime_init_mu);
+        }
 }
 
 // Sets the context function to call to record the traceback context

Here is the log

notify runtime init done at thread 32626:1276888832, value address 0x7ff74cea5268
wait runtime init done at thread 32630:1292597056, value address 0x7ff74cea5268

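The notify and the wait clearly happen in two different processes (parent and child): the PIDs differ (32626 vs. 32630) even though the variable address is the same. The waiting process never receives the notification, so the deadlock is unavoidable.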
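
To illustrate why a notification in one process cannot wake a waiter in another, here is a minimal sketch assuming a POSIX system. It does not use the Go runtime at all: `init_done` is a hypothetical stand-in for `runtime_init_done`, and `flag_seen_by_child` is a made-up helper. After `fork()`, parent and child have separate copies of the flag at the same virtual address, so a write made in the parent never reaches the child.

```c
// Sketch only: a flag set in the parent after fork() is invisible to the child.
// All names are hypothetical; this is not code from the Go runtime.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile int init_done = 0; // stand-in for runtime_init_done

// Forks, sets the flag in the parent only, and returns the value of the
// flag as observed by the child (or -1 on error).
int flag_seen_by_child(void) {
    int pipefd[2];
    init_done = 0;
    if (pipe(pipefd) != 0) {
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0) {
        // Child: block until the parent reports that it has set the flag.
        char c;
        close(pipefd[1]);
        if (read(pipefd[0], &c, 1) != 1) {
            _exit(2);
        }
        // The child's copy of init_done was never written.
        _exit(init_done);
    }
    // Parent: set the flag, then tell the child to look at its own copy.
    close(pipefd[0]);
    init_done = 1;
    if (write(pipefd[1], "x", 1) != 1) {
        close(pipefd[1]);
        return -1;
    }
    close(pipefd[1]);
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The helper returns the child's view of the flag, which stays 0 even though the parent set it to 1. In addition, `fork()` duplicates only the calling thread, so a background thread that would have done the notifying does not exist in the child.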

@ianlancetaylor
Contributor

@james-li Using LD_PRELOAD is interesting, but I think it is a completely different problem from the rest of this issue, which is about Windows and does not use LD_PRELOAD. I encourage you to take this topic to a different issue, or to a forum. Thanks. (I don't really expect LD_PRELOAD to work, but perhaps there is some way that it could.)

@james-li

@ianlancetaylor Thanks. I have posted another issue: #43836

@dmitshur dmitshur modified the milestones: Go1.16, Backlog Feb 9, 2021
@Ishwar428

Hi,

I am also facing this issue on Windows. Tested with go1.12 as well as go1.16. Below are the details of my application scenario.

I have generated a c-shared library (say, gocshared.dll) from Go code using buildmode=c-shared. The module hierarchy is as follows:
LegacyApp (executable) -> dynamically loads legacy.dll -> dynamically loads interface.dll -> dynamically loads gocshared.dll.

When I call an exported function of gocshared.dll, the call never returns. An application dump taken after this call shows the following entries in the call stack:

00000000009f3138 00007ffd0d6c8ba3 ntdll!ZwWaitForSingleObject+0x14
00000000009f3140 00000000649f615f KERNELBASE!WaitForSingleObjectEx+0x93
00000000009f31e0 00000000649f5def gocshared!cgo_wait_runtime_init_done+0x3f
00000000009f3240 00007ffcf6e11fe8 gocshared!SetConfg+0x2f
00000000009f32f0 00007ffcd9a957a9 interface!SetConfigW+0x10c
00000000009f33c0 00007ffcd9a96236 legacy!SetFeatureConfig+0xb409

Note: LegacyApp(Executable), legacy.dll and interface.dll are developed using C++.

When I use this shared library from a test app, in the hierarchy shown below, it works as expected, i.e. the call does not hang.
TestApp (executable) -> dynamically loads interface.dll -> dynamically loads gocshared.dll

Any idea what is wrong?

@ianlancetaylor
Contributor

I don't know what is wrong.

What is supposed to happen is that the .ctors section of the archive will have a pointer to _rt0_amd64_windows_lib. Putting that address in the .ctors section is expected to cause the function to be invoked as a global constructor. That function will start a new thread of execution. That new thread will eventually call x_cgo_notify_runtime_init_done which will cause the event wait in cgo_wait_runtime_init_done to complete.

Perhaps simply using a .ctors section works for a DLL but not for an archive. I don't know.
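
The handshake described above can be sketched as follows, assuming a POSIX system and a GCC-compatible compiler. This is an illustration, not the Go runtime's code: `__attribute__((constructor))` stands in for the .ctors entry, and every name (`start_fake_runtime`, `fake_runtime_main`, `wait_init_done`, `init_done`) is hypothetical.

```c
// Sketch only: global constructor -> new thread -> notify -> waiter wakes up.
// All names are hypothetical; this is not code from the Go runtime.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t init_mu = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t init_cond = PTHREAD_COND_INITIALIZER;
static int init_done = 0;

// Runs on the background thread. Stands in for runtime start-up finishing
// and notifying anyone who is waiting for it.
static void *fake_runtime_main(void *arg) {
    (void)arg;
    pthread_mutex_lock(&init_mu);
    init_done = 1;
    pthread_cond_broadcast(&init_cond);
    pthread_mutex_unlock(&init_mu);
    return NULL;
}

// Global constructor. The C runtime has to invoke this before main;
// if it never does, wait_init_done() below blocks forever.
__attribute__((constructor))
static void start_fake_runtime(void) {
    pthread_t t;
    int err = pthread_create(&t, NULL, fake_runtime_main, NULL);
    if (err != 0) {
        fprintf(stderr, "pthread_create failed: %s\n", strerror(err));
        abort();
    }
    pthread_detach(t);
}

// What an exported entry point would do first: block until start-up
// has been signalled, then report the flag.
int wait_init_done(void) {
    pthread_mutex_lock(&init_mu);
    while (init_done == 0) {
        pthread_cond_wait(&init_cond, &init_mu);
    }
    int done = init_done;
    pthread_mutex_unlock(&init_mu);
    return done;
}
```

Calling `wait_init_done()` from `main` returns 1 once the background thread has signalled. Removing the constructor attribute means nothing ever sets the flag, so the same call blocks indefinitely. That is the shape of the hang reported in this issue.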

@Ishwar428

Thanks for the response, @ianlancetaylor.
Actually, I am using a c-shared DLL, not a c-archive. Also, the call does not hang with my test app. Could this have something to do with other modules or dependencies in my legacy app?

I tried debugging the test app with WinDbg to see the x_cgo_notify_runtime_init_done call, but since no PDB is generated for gocshared.dll, the debugger cannot find symbols inside the shared library. Is there a way I can debug this on Windows?

@ianlancetaylor
Contributor

I'm sorry, I don't know the answers to your questions.

@hajimehoshi
Member

hajimehoshi commented May 2, 2022

Same issue here. VC++ with a static library (-buildmode=c-archive) doesn't work. Perhaps _rt0_go or _rt0_amd64_windows_lib is not called: even x_cgo_init and runtime.args are not invoked. I confirmed this by inserting OutputDebugStringA calls there.

CgoTest.cpp:

#include <iostream>

extern "C" int AddFromGo(int a, int b);

int main()
{
    std::cout << "Hello World!\n" << AddFromGo(1, 2) << "\n";
}

extern "C" void __mingw_vfprintf() {}

main.go:

package main

import "C"

//export AddFromGo
func AddFromGo(a, b C.int) C.int {
        return a + b
}

func main() {
}
  1. Create a VC++ project (named CgoTest or whatever)
  2. Rewrite CgoTest.cpp as the above C++ code
  3. Create a static library with the above Go code with MinGW (I used scoop's MinGW) named cgo.a
  4. Add cgo.a to the project as a library
  5. Run it. Confirmed _cgo_wait_runtime_init_done got stuck.
  • Visual Studio: Professional 2017 Version 15.9.44
  • gcc.exe (GCC) 11.2.0

@hajimehoshi
Member

Is this related to #35006?

@hajimehoshi
Member

Calling _rt0_amd64_windows_lib() explicitly in the C main function worked (at least in the console example above).

@hajimehoshi
Member

In rt0_windows_amd64.s, the comment says:

For static libraries it is called when the final executable starts, during the C runtime initialization phase.

My question is whether this is true with the VC++ runtime.

@hajimehoshi
Member

/CC @thanm

@thanm
Contributor

thanm commented May 2, 2022

Hi,

I read through this issue.

My guess would be that this problem is not related to #35006. That issue (and the various CLs I submitted to the linker) has to do mainly with cgo internal linking to help with programs that use the race detector ("go build -race") in combination with contemporary versions of GCC and clang.

Given what you mention in this comment and what Ian said previously, I think the problem has to do with the .ctors section, and why this mechanism is not working properly in this case.

Thanks, Than

@hajimehoshi
Member

Thanks.

Is this a duplicate of #30347?

@qiqi0625

Hi, same issue here.

Binding Go functions for an iOS application doesn't work, but it is OK in a simple demo project.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 7, 2022
@ks75vl

ks75vl commented Apr 14, 2023

CRT initialization can be used to invoke _rt0_amd64_windows_lib, which can remove the deadlock in _cgo_wait_runtime_init_done:

#ifdef _MSC_VER
#ifdef __cplusplus
extern "C" {
#endif

	void _rt0_amd64_windows_lib();

	__pragma(section(".CRT$XCU", read));
	__declspec(allocate(".CRT$XCU")) void (*init1)() = _rt0_amd64_windows_lib;
	__pragma(comment(linker, "/include:init1"));

#ifdef __cplusplus
}
#endif
#endif

int main() { return 0; }

#30347

@cdahlberg1

This works great for 64 bit. How about 32 bit? I tried replacing _rt0_amd64_windows_lib with _rt0_386_windows_lib such as

#ifdef _MSC_VER
#ifdef __cplusplus
extern "C" {
#endif

void _rt0_386_windows_lib();

__pragma(section(".CRT$XCU", read));
__declspec(allocate(".CRT$XCU")) void (*init1)() = _rt0_386_windows_lib;
__pragma(comment(linker, "/include:init1"));

#ifdef __cplusplus
}
#endif
#endif

but that gets an unresolved external symbol for _rt0_386_windows_lib

@jat001

jat001 commented Nov 1, 2023

Thanks @ks75vl. I have successfully built both static and shared libraries with MSVC.

These are build commands:
https://github.com/jat001/ddns4cdn?tab=readme-ov-file#msvc
https://github.com/jat001/ddns4cdn?tab=readme-ov-file#msvc-1

@ks75vl

ks75vl commented Nov 2, 2023

> This works great for 64 bit. How about 32 bit? I tried replacing _rt0_amd64_windows_lib with _rt0_386_windows_lib such as
>
> #ifdef _MSC_VER
> #ifdef __cplusplus
> extern "C" {
> #endif
>
> void _rt0_386_windows_lib();
>
> __pragma(section(".CRT$XCU", read));
> __declspec(allocate(".CRT$XCU")) void (*init1)() = _rt0_386_windows_lib;
> __pragma(comment(linker, "/include:init1"));
>
> #ifdef __cplusplus
> }
> #endif
> #endif
>
> but that gets an unresolved external symbol for _rt0_386_windows_lib

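It's about name mangling: on 32-bit x86, MSVC prepends an underscore to C symbol names, so the declaration drops the leading underscore and the /include directive adds one. Here is a patched version for both 32-bit and 64-bit builds.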

#ifdef _MSC_VER
#ifdef __cplusplus
extern "C" {
#endif

	__pragma(section(".CRT$XCU", read));
#ifdef _WIN64
	void _rt0_amd64_windows_lib();
	__declspec(allocate(".CRT$XCU")) void (*init1)() = _rt0_amd64_windows_lib;
	__pragma(comment(linker, "/include:init1"));
#else
	void rt0_386_windows_lib();
	__declspec(allocate(".CRT$XCU")) void (*init1)() = rt0_386_windows_lib;
	__pragma(comment(linker, "/include:_init1"));
#endif
#ifdef __cplusplus
}
#endif
#endif

int main() { return 0; }
