Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: SIGSEGV after performing clone(CLONE_PARENT) via C constructor prior to runtime start #65625

Open
lifubang opened this issue Feb 9, 2024 · 18 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@lifubang
Copy link

lifubang commented Feb 9, 2024

Go version

go version go1.22.0 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/root/.cache/go-build'
GOENV='/root/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/root/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/root/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.22.0'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build720171492=/tmp/go-build -gno-record-gcc-switches'

What did you do?

The main language of runc is go, but we are using c to enter some linux namespaces. Recently, go 1.22.0 has been released, when we want to bump go version to 1.22.0(opencontainers/runc#4193), the CI is fail, it seems that after we are calling clone(2) in c, the children process can't return to go if the first process exited in c.

The test code is in https://github.com/lifubang/cgoclone2/blob/main/main.go
I think we should see From main! in the last line.

What did you see happen?

root@acmcoder:/home/acmcoder/cgo# go version
go version go1.22.0 linux/amd64
root@acmcoder:/home/acmcoder/cgo# go run main.go
STAGE_PARENT
STAGE_CHILD
STAGE_INIT
This from nsexec

What did you expect to see?

root@acmcoder:/home/acmcoder/cgo# go version
go version go1.21.1 linux/amd64
root@acmcoder:/home/acmcoder/cgo# go run main.go
STAGE_PARENT
STAGE_CHILD
STAGE_INIT
This from nsexec
From main!

@kolyshkin
Copy link
Contributor

I can't repro this locally (Fedora 39, kernel 6.7.3-200.fc39.x86_64).

@lifubang
Copy link
Author

lifubang commented Feb 9, 2024

I test two kernels:

  1. github codespaces:
@lifubang ➜ ~/cgoclone2 (main) $ go version
go version go1.22.0 linux/amd64
@lifubang ➜ ~/cgoclone2 (main) $ go run main.go
STAGE_PARENT
STAGE_CHILD
STAGE_INIT
This from nsexec
@lifubang ➜ ~/cgoclone2 (main) $ uname -a
Linux codespaces-f28aa2 6.2.0-1018-azure #18~22.04.1-Ubuntu SMP Tue Nov 21 19:25:02 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
  1. my pc
root@acmcoder:/home/acmcoder# uname -a
Linux acmcoder 5.15.0-92-generic #102~20.04.1-Ubuntu SMP Mon Jan 15 13:09:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
root@acmcoder:/home/acmcoder# go version
go version go1.22.0 linux/amd64
root@acmcoder:/home/acmcoder# cd cgo
root@acmcoder:/home/acmcoder/cgo# go run main.go
STAGE_PARENT
STAGE_CHILD
STAGE_INIT
This from nsexec
From main!
root@acmcoder:/home/acmcoder/cgo# go run main.go
STAGE_PARENT
STAGE_CHILD
STAGE_INIT
This from nsexec
root@acmcoder:/home/acmcoder/cgo#

It seems that there is a race condition? But I only can see 1 success correct log. Most of time it's wrong.

@kolyshkin
Copy link
Contributor

  1. It appears the bug is only visible when using some older distros/kernels (e.g. Ubuntu 20.04).
  2. The repro provided by @lifubang segfaults when compiled by Go 1.22 and run on an old distro/kernel.
  3. I've bisected this issue to commit 52e17c2 (CL 519457); see details below.
  4. When using go compiled from go1.22.0 source tree plus a revert of (CL 519457), I can't reproduce this.

Bisecting details

For bisecting, I was using code provided by @lifubang (see above), an x86_64 VM with Ubuntu 20.04 installed, and the following script:

#!/bin/bash

set -e -u -o pipefail

# Build go.
(cd src && ./make.bash)

# Build our repro.
./bin/go build -o main-tst ~/cgoclone2/main.go

# Run it; check if "main" is printed.
./main-tst | grep main

The bisect itself (cd to go source repo first) can be run like this:

git checkout go1.22.0
git bisect start
git bisect bad
git bisect good go1.21.0
git bisect run ./script

@prattmic
Copy link
Member

prattmic commented Feb 9, 2024

Most of the code in CL 519457 is dealing with libc/pthreads. I wonder if this actually related to kernel version or glibc version?

Could you test using a newer version of glibc on the machine with the older kernel version. (e.g., by running in a Docker container using a newer distro which has a newer version of glibc).

Could you also provide strace -f output from a failing run, and a backtrace (e.g., with gdb) if there is a crash (unclear)? Thanks!

@prattmic prattmic changed the title cgo maybe broken with clone(2) runtime: crash/exit when performing clone via C constructor prior to runtime start Feb 9, 2024
@prattmic
Copy link
Member

prattmic commented Feb 9, 2024

cc @golang/runtime

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Feb 9, 2024
@kolyshkin
Copy link
Contributor

@prattmic it looks like it is indeed related to glibc:

Here's a backtrace from a failed run:

kir@kir-Standard-PC-Q35-ICH9-2009:~/cgoclone2$ gdb ./main-1.23 core-main-1.23-404682-1707498496
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./main-1.23...
[New LWP 404682]
[New LWP 404680]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./main-1.23'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __GI___libc_free (mem=0x100) at malloc.c:3102
3102	malloc.c: No such file or directory.
[Current thread is 1 (LWP 404682)]
warning: File "/home/kir/git/go/src/runtime/runtime-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /home/kir/git/go/src/runtime/runtime-gdb.py
line to your configuration file "/home/kir/.gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/home/kir/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
(gdb) bt
#0  __GI___libc_free (mem=0x100) at malloc.c:3102
#1  0x00007f24608c6cd1 in __pthread_attr_destroy (attr=attr@entry=0x7ffc5fec0740) at pthread_attr_destroy.c:38
#2  0x0000000000482f0a in x_cgo_getstackbound (bounds=bounds@entry=0xe1d6b0) at gcc_stack_unix.c:36
#3  0x0000000000482982 in _cgo_set_stacklo (g=g@entry=0x533140 <runtime.g0>, pbounds=pbounds@entry=0xe1d6b0) at gcc_libinit.c:97
#4  0x0000000000482bac in x_cgo_init (g=0x533140 <runtime.g0>, setg=<optimized out>, tlsg=0x0, tlsbase=0x0) at gcc_linux_amd64.c:44
#5  0x0000000000461db1 in runtime.rt0_go () at /home/kir/git/go/src/runtime/asm_amd64.s:220
#6  0x0000000000000000 in ?? ()
(gdb) 

and the last few lines from strace -f:

brk(NULL)                               = 0xe1d000
brk(0xe3e000)                           = 0xe3e000
write(1, "STAGE_PARENT\n", 13STAGE_PARENT
)          = 13
clone(child_stack=0x7ffc5fec06d0, flags=CLONE_PARENT|SIGCHLD) = 404681
exit_group(0)                           = ?
strace: Process 404681 attached
[pid 404680] +++ exited with 0 +++
write(1, "STAGE_CHILD\n", 12STAGE_CHILD
)           = 12
clone(child_stack=0x7ffc5fec06d0, flags=CLONE_PARENT|SIGCHLD) = 404682
strace: Process 404682 attached
[pid 404681] exit_group(0)              = ?
[pid 404682] write(1, "STAGE_INIT\n", 11STAGE_INIT
) = 11
[pid 404682] write(1, "This from nsexec\n", 17 <unfinished ...>
[pid 404681] +++ exited with 0 +++
This from nsexec
<... write resumed>)                    = 17
openat(AT_FDCWD, "/proc/self/maps", O_RDONLY|O_CLOEXEC) = 3
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "00400000-00402000 r--p 00000000 "..., 1024) = 1024
read(3, "460a1a000-7f2460a1e000 r--p 001e"..., 1024) = 1024
read(3, "0a59000-7f2460a7c000 r-xp 000010"..., 1024) = 809
close(3)                                = 0
sched_getaffinity(404680, 32, 0xe1dd40) = -1 ESRCH (No such process)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xf8} ---
+++ killed by SIGSEGV (core dumped) +++

This is glibc 2.31 (libc6:amd64 2.31-0ubuntu9.14)

I will run it with newer glibc later today.

@prattmic
Copy link
Member

prattmic commented Feb 9, 2024

We initialize the attr passed to pthread_attr_destroy at https://cs.opensource.google/go/go/+/master:src/runtime/cgo/gcc_stack_unix.c;l=25, but don't check for errors.

I bet that call is failing when the sched_getaffinity fails. man 3 pthread_getattr_np says "In addition, if thread refers to the main thread, then pthread_getattr_np() can fail because of errors from various underlying calls: fopen(3), if /proc/self/maps can't be opened; and getrlimit(2), if the RLIMIT_STACK resource limit is not supported." sched_getaffinity isn't mentioned here, but we do see the /proc/self/maps and prlimit64, so presumably this could be within pthread_getattr_np.

sched_getaffinity is passed PID 404680, which is the PID of one of the exited children (exit message above). That seems wrong. Did pthreads get confused about what the current PID is?

@prattmic
Copy link
Member

prattmic commented Feb 9, 2024

Did pthreads get confused about what the current PID is?

After reading the reproducer more closely, I can see how this could happen.

clone(CLONE_PARENT) is effectively fork; the only difference is that the parent of the new PID is the same as the parent of the current PID (rather than being the current PID).

This reproducer is doing clone(CLONE_PARENT), letting the original process exit, and letting the new process continue to Go and execute as normal. If some part of pthreads stores global state (such as the current PID) prior to the clone(), that state will propagate into the child, which would be problematic if it is no longer accurate.

I don't have a reproducer environment, but I suspect if you adjust the program like this:

void __attribute__((constructor)) init(void) {
	nsexec();

	pthread_attr_t attr;
	int ret = pthread_getattr_np(pthread_self(), &attr);
	if (ret < 0) {
		abort();
	}
}

That it will hit the abort. And that even extracting this program from the Go file and making it 100% C would have the same effect.

Why working on some versions of glibc and not on others? I suspect that the old version of glibc sets some global state prior to clone() (either in a glibc constructor, on inside clone() itself), while the new version of glibc has either removed this early initialization of global state or learned to clear it after a fork-ish clone.

@prattmic prattmic changed the title runtime: crash/exit when performing clone via C constructor prior to runtime start runtime: SIGSEGV after performing clone(CLONE_PARENT) via C constructor prior to runtime start Feb 9, 2024
@dr2chase dr2chase added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Feb 9, 2024
@kolyshkin
Copy link
Contributor

That it will hit the abort. And that even extracting this program from the Go file and making it 100% C would have the same effect.

In my testing it doesn't, and pthread_getattr_np(pthread_self(), &attr) returns 3 (with either go1.21.6 or go 1.22.0).

Going to try a different glibc now.

@kolyshkin
Copy link
Contributor

Hmm, glibc 2.31 from an older release of Fedora (Fedora 32) works fine (I tried glibc-2.31-6.fc32.x86_64 rpm, glibc-2.31-2.fc32.x86_64 rpm, and glibc-2.31 compiled from source).

glibc 2.32 from Fedora 33 also works.

glibc 2.31-13+deb11u7 from Debian 11 also works. Older versions (2.31-13+deb11u5, 2.31-7, 2.31-1) also work.

My guess, the issue is somehow specific to Ubuntu (it's either their glibc patches, or gcc patches, or some configuration detail, but not the kernel).

I also tried bisecting upstream glibc but it's quite challenging.

@kolyshkin
Copy link
Contributor

@dr2chase could you please specify what kind of info do you need me to provide (except for what was provided above)?

@prattmic
Copy link
Member

To reproduce this, I added this Dockerfile to https://github.com/lifubang/cgoclone2, and placed a GOROOT source tree in the go directory.

FROM ubuntu:20.04

RUN apt-get update && apt-get install -y curl git gcc strace

RUN curl -L "https://go.dev/dl/go1.22.0.linux-amd64.tar.gz" > gobootstrap.tar.gz

RUN tar xf gobootstrap.tar.gz

RUN mv go gobootstrap

COPY ./go /go

RUN cd /go/src/ && GOROOT_BOOTSTRAP=/gobootstrap ./make.bash

COPY main.go /main.go

RUN CGO_ENABLED=1 /go/bin/go build -o /cgoclone2 /main.go

ENTRYPOINT ["strace", "-f", "/cgoclone2"]
$ docker build -t cgoclone2 .
$ docker run --rm --init cgoclone2 
...
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "00400000-00402000 r--p 00000000 "..., 1024) = 1024
read(3, "/libc-2.31.so\n7fcd51845000-7fcd5"..., 1024) = 1024
read(3, "d-2.31.so\n7fcd51898000-7fcd518a0"..., 1024) = 630
close(3)                                = 0
sched_getaffinity(10, 32, 0x6cd940)     = -1 ESRCH (No such process)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x40000000f8} ---
+++ killed by SIGSEGV (core dumped) +++

@prattmic prattmic removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Feb 12, 2024
@prattmic
Copy link
Member

My theory of a stale PID from #65625 (comment) is indeed the immediate cause of issues:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Attaching after Thread 0x7f8544400740 (LWP 980) fork to child process 981]
[New inferior 12 (process 981)]
[Detaching after fork from parent process 980]
[Inferior 11 (process 980) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Attaching after Thread 0x7f8544400740 (LWP 981) fork to child process 982]
[New inferior 13 (process 982)]
[Detaching after fork from parent process 981]
[Inferior 12 (process 981) detached]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[Switching to Thread 0x7f8544400740 (LWP 982)]

Thread 13.1 "cgoclone2" hit Breakpoint 1, pthread_getattr_np (thread_id=140210352424768, attr=attr@entry=0x7ffd7104ead0) at pthread_getattr_np.c:34
(gdb) s
(gdb) p thread->tid
$11 = 980

We are on TID/PID 982, but pthreads thinks the TID is 980, and ultimately passes this to sched_getaffinity. I haven't looked at why my C-only reproducer doesn't fail yet.

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Feb 12, 2024
@prattmic
Copy link
Member

OK, I think I've gotten to the bottom of this. tl;dr, it is indeed the same issue in #65625 (comment).

This C program reproduces the problem:

#define _GNU_SOURCE
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <limits.h>
#include <pthread.h>
#include <sched.h>
#include <setjmp.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

#include <sys/ioctl.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <linux/limits.h>
#include <linux/netlink.h>
#include <linux/types.h>

#include <sched.h>

#define STAGE_SETUP  -1
#define STAGE_PARENT  0
#define STAGE_CHILD   1
#define STAGE_INIT    2

int current_stage = STAGE_SETUP;

struct clone_t {
        char stack[4096] __attribute__((aligned(16)));
        char stack_ptr[0];

        jmp_buf *env;
        int jmpval;
};

static int child_func(void *arg) __attribute__((noinline));
static int child_func(void *arg)
{
        struct clone_t *ca = (struct clone_t *)arg;
        longjmp(*ca->env, ca->jmpval);
}

static int clone_parent(jmp_buf *env, int jmpval) __attribute__((noinline));
static int clone_parent(jmp_buf *env, int jmpval)
{
        struct clone_t ca = {
                .env = env,
                .jmpval = jmpval,
        };

        return clone(child_func, ca.stack_ptr, CLONE_PARENT | SIGCHLD, &ca);
}

static void nsexec() {
        jmp_buf env;
    current_stage = setjmp(env);
    switch (current_stage) {
                case STAGE_PARENT: {
                        printf("STAGE_PARENT\n");
                        clone_parent(&env, STAGE_CHILD);
                        exit(0);
                }
                break;
                case STAGE_CHILD: {
                        printf("STAGE_CHILD\n");
                        clone_parent(&env, STAGE_INIT);
                        exit(0);
                }
                break;
                case STAGE_INIT: {
                        printf("STAGE_INIT\n");
                }
                break;
        }

        printf("This from nsexec\n");
        return;
}

void __attribute__((constructor)) init(void) {
        nsexec();
        pthread_attr_t attr;
        int ret = pthread_getattr_np(pthread_self(), &attr);
        if (ret != 0) {
                printf("pthread_getattr_np: %s\n", strerror(ret));
                /* Try to destroy attr anyway. Bad idea, because getattr fails, but this is what Go does. */
                pthread_attr_destroy(&attr);
                abort();
        }
}

int main(void) {
        printf("Hello from main!\n");
}

With glibc 2.31, it crashes in the same way as Go:

sched_getaffinity(1267, 32, 0x5643749ccd20) = -1 ESRCH (No such process)
write(1, "pthread_getattr_np: No such proc"..., 36pthread_getattr_np: No such process
) = 36
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0xf0b5f7} ---
+++ killed by SIGSEGV (core dumped) +++

With glibc 2.37 (the local version I happen to have), it still gets ESRCH, but doesn't SIGSEGV:

sched_getaffinity(815888, 32, 0x555d05817e20) = -1 ESRCH (No such process)
write(1, "pthread_getattr_np: No such proc"..., 36pthread_getattr_np: No such process
) = 36
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
gettid()                                = 815890
getpid()                                = 815890
tgkill(815890, 815890, SIGABRT)         = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=815890, si_uid=200669} ---
+++ killed by SIGABRT +++

In #65625 (comment), I assumed that pthread_getattr_np used errno, but it turns out that it just directly returns positive errnos. So when @kolyshkin said that it returned 3, that was indeed reproducing the issue! errno 3 is ESRCH.

So newer glibc is still getting confused about the current TID, it just isn't crashing in pthread_attr_destroy. I believe the difference is bminor/glibc@86ed077: starting in glibc 2.32, pthread_getattr_np zeroes the attr argument on entry. pthread_attr_destroy does nothing on a zeroed structure, so no crash. Without the zeroing, the uninitialized value may go down the if (iattr->extension != NULL) path and try to free some garbage values.

In my opinion, this is ultimately a bug in the C program (which I assume is extracted from runc). i.e., it is not safe to do anything not async-signal-safe after clone(CLONE_PARENT), just as after fork. Certainly not continuing execution as if we are the parent. Perhaps it could be argued that this is a glibc bug, which should allow safe execution after clone(CLONE_PARENT), but this is not the issue tracker to litigate that :).

I don't think that Go should work around this. I do think we should check for errors from pthread_getattr_np, but my preference is to abort the program on error. In theory, we could add a fallback that just estimates the stack bounds on error. But at the end of the day, pthread has corrupted state and I feel pretty strongly that we shouldn't just try to keep executing with corrupt state. That is a recipe for having even more difficult to debug crashes later in execution when some other pthread code crashes.

@gopherbot
Copy link

Change https://go.dev/cl/563379 mentions this issue: WIP: runtime/cgo: check pthread calls for errors

@mknyszek
Copy link
Contributor

In triage, we're of the opinion at this point that this is generally not a bug in the Go project. However, we'll keep the issue open until we land a planned change to add error checking to some of the glibc calls made (that @prattmic described in the previous comment), and at least try to fail with a nicer error.

@mknyszek mknyszek added this to the Backlog milestone Feb 14, 2024
stgraber added a commit to zabbly/incus that referenced this issue Feb 16, 2024
Go 1.22 currently causes crashes on older Debian/Ubuntu systems.

lxc/incus#497
golang/go#65625
opencontainers/runc#4193

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
stgraber added a commit to zabbly/incus that referenced this issue Feb 16, 2024
Go 1.22 currently causes crashes on older Debian/Ubuntu systems.

lxc/incus#497
golang/go#65625
opencontainers/runc#4193

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
@lifubang
Copy link
Author

lifubang commented Apr 1, 2024

In my opinion, this is ultimately a bug in the C program (which I assume is extracted from runc). i.e., it is not safe to do anything not async-signal-safe after clone(CLONE_PARENT), just as after fork. Certainly not continuing execution as if we are the parent. Perhaps it could be argued that this is a glibc bug, which should allow safe execution after clone(CLONE_PARENT), but this is not the issue tracker to litigate that :).

If we remove CLONE_PARENT, your provided C code still fail in glibc 2.31. So I think this is not related to CLONE_PARENT, but only related to clone(2).

@cyphar
Copy link

cyphar commented Apr 4, 2024

I managed to finally root-cause the reason for the stale pthread_self() value. For those interested, see this longer comment for more details.

@lifubang pointed out that the PID cache was removed from glibc in 2016, which also removed the code they had to update THREAD_SELF->tid after a clone(2) syscall. fork(2) happens to have the correct pthread_self() (and thus using fork(2) happens to fix this specific issue) because they use CLONE_CHILD_SETTID to write the child TID into the TLS cache (THREAD_SELF->tid) in their fork(2) implementation. The same is not true for clone(2) and there is no nice way to implement this as a user (there is the option of prctl(PR_GET_TID_ADDRESS) but that is pretty dodgy).

Critically, this means the issue is not with CLONE_PARENT, it's with the clone(2) wrapper provided by glibc and this issue was first introduced in glibc 2.25.

Ultimately I think the core issue is that we are violating signal-safety(7) in runc, as outlined in above comments. Thanks for helping us debug this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
Status: In Progress
Development

No branches or pull requests

8 participants