runtime: Should respect/understand the process limit when managing threads #14835

Closed
mithro opened this issue Mar 16, 2016 · 10 comments

mithro commented Mar 16, 2016

Currently, if the Go runtime tries to create a new system thread and is unable
to do so, it aborts with an error like:

18:22:18.752169 [go test -timeout 600s -v -race ./common/paniccatcher] was slow: 3m11.377s
runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x7f54695bdcc9

One reason this can occur is that the system has a low "process limit". For a
long time it was fairly common for systems to allow 10k or more, but with
systemd and Linux 4.3 the default limit can be as little as 512.

Most of the code that calls pthread_create in src/runtime/cgo seems to do
something like:

    err = pthread_create(&p, &attr, threadentry, ts);

    pthread_sigmask(SIG_SETMASK, &oset, nil);

    if (err != 0) {
        fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err));
        abort();
    }

This actually seems reasonable, as recovering from a failed thread creation is
pretty hard. Also, creating more threads than your system's process limit does
feel like a "just don't do that" kind of thing.

However, from what I can see, the goroutine scheduler will create up to
sched.maxmcount threads, and this is initialized to 10k in proc.go at
line 425 (https://github.com/golang/go/blob/master/src/runtime/proc.go#L425).

Linux provides an API for getting the current thread limit: the getrlimit call
with RLIMIT_NPROC (see http://man7.org/linux/man-pages/man2/setrlimit.2.html).
getrlimit already seems to be exposed to Go code as syscall.Getrlimit, but the
RLIMIT_NPROC constant needed to get the information is missing.

This is similar to the idea of respecting the memory limit (see
https://github.com/golang/go/blob/master/src/runtime/os1_linux.go#L270) and is
probably related to #5049.

@mithro
Author

mithro commented Mar 16, 2016

I started trying to make this happen at mithro@7538394 but don't yet understand the Go runtime code well enough to know how to make the syscall.getrlimit call...

@mithro
Author

mithro commented Mar 16, 2016

I also uploaded the patch to https://go-review.googlesource.com/20751 in case anyone wants to comment and suggest how it could be done.

@minux
Member

minux commented Mar 16, 2016 via email

@bradfitz
Contributor

@minux, I believe @mithro is proposing that Go read (not write) the max thread count and use it to avoid creating new threads when it would fail anyway.

@mithro
Author

mithro commented Mar 16, 2016

I'm still trying to create a simple test case which reliably causes the abort I listed above. I think it should be as simple as using ulimit to set the number of processes to something small and then running Go code with loads of goroutines, but I'm having trouble making it occur.

If I understand correctly (which is a big if; this is my first time looking at this code), the sched code seems to spawn native threads to run goroutines on? If so, it seems like it would make sense to read the max process count from the OS and not spawn more than XX% of that? I don't really understand under what conditions it decides it needs more native threads.

The other option, which feels a lot harder, is to make an unsuccessful pthread_create call not a hard failure (and deal with the consequences)?

@minux
Member

minux commented Mar 16, 2016 via email

@ianlancetaylor
Contributor

@mithro You should read the discussion on #4056.

@mithro
Author

mithro commented Mar 18, 2016

Thanks @ianlancetaylor, #4056 does include a lot of discussion on this topic. The current summary seems to be that it is preferable for the runtime to abort here rather than risk a deadlock, and that users should manage their goroutines to prevent this from happening.

With that in mind, I'm going to close this bug and open a new one just about letting Go code read the current thread limit, so that user code is able to use this information when managing its goroutines.

@minux
Member

minux commented Mar 18, 2016 via email

@mithro
Author

mithro commented Mar 18, 2016

@minux The RLIMIT_NPROC constant is currently missing from the syscall package. I believe the fix is just to update the regex in src/syscall/mkerrors.sh and rerun it. Let's move the conversation about getting that fixed to #14854.

@mithro mithro closed this as completed Mar 18, 2016
@golang golang locked and limited conversation to collaborators Mar 19, 2017