runtime: NumCPU() should respect zone CPU cap on illumos #35199

Closed
jclulow opened this issue Oct 27, 2019 · 3 comments


jclulow commented Oct 27, 2019

In the machine-independent runtime bits:

// NumCPU returns the number of logical CPUs usable by the current process.
//
// The set of available CPUs is checked by querying the operating system
// at process startup. Changes to operating system CPU allocation after
// process startup are not reflected.
func NumCPU() int {
        return int(ncpu)
}

On illumos, this is initialised using sysconf(3C) with the _SC_NPROCESSORS_ONLN variable, returning the "number of processors online":

func getncpu() int32 {
        n := int32(sysconf(__SC_NPROCESSORS_ONLN))
        if n < 1 {
                return 1
        }
        return n
}

func osinit() {
        ncpu = getncpu()
...

The operating system affords several different mechanisms for controlling (capping) the resources used by a zone. One is to place the zone in a processor set, which would generally exclude other zones from sharing the CPUs in the set. When a zone is configured to run under a processor set, the _SC_NPROCESSORS_ONLN value reflects the number of processors in that set, which is fine.

Another common limiting mechanism is a CPU cap, which works differently. All available processors are still visible as online, and are potentially available for use. The cap is enforced by preventing processes from running for more than the configured quantity of CPU seconds per second. For example, if a system has 48 processors and the zone has a cap equivalent to 2 CPUs, the zone may expend its cap running just two threads on two actual CPUs, or running 48 threads, each for 1/24 of a second per wall-clock second.
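To make the arithmetic in that example concrete, here is a minimal Go sketch. The function name `cpuSecondsPerThread` is hypothetical, and it assumes the cap is spread evenly across the runnable threads, which the scheduler does not guarantee. The cap is expressed the way `zone.cpu-cap` is, as a percentage of a single CPU, so 200 stands for 2 CPUs.

```go
package main

import "fmt"

// cpuSecondsPerThread returns how many CPU-seconds each thread could consume
// per wall-clock second if a cap were shared evenly between the given number
// of threads. capPercent is a percentage of a single CPU (100 = one CPU).
// Hypothetical helper for illustration only; not part of the Go runtime.
func cpuSecondsPerThread(capPercent uint64, threads int) float64 {
	if threads < 1 {
		return 0
	}
	share := float64(capPercent) / 100 / float64(threads)
	if share > 1 {
		// A single thread cannot run on more than one CPU at a time.
		share = 1
	}
	return share
}

func main() {
	// A cap equivalent to 2 CPUs, as in the 48-processor example.
	fmt.Printf("2 threads:  %.4f\n", cpuSecondsPerThread(200, 2))
	fmt.Printf("48 threads: %.4f\n", cpuSecondsPerThread(200, 48))
}
```

With 48 threads each one gets 1/24 of a second per second, which is why sizing a worker pool from the online CPU count can oversubscribe a capped zone.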

It should be reasonably easy to enhance the getncpu() function to check for a CPU cap when determining the count of usable CPUs.
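A rough sketch of what that adjustment might look like, written as a standalone function rather than as runtime code. The name `usableCPUs` is hypothetical, a cap of 0 is taken to mean "no cap", and the result is clamped to at least 1; these are assumptions of this sketch. It truncates when converting the percentage to a CPU count, but rounding up is an equally defensible choice.

```go
package main

import "fmt"

// usableCPUs combines the online CPU count with a zone CPU cap.
//
// online is the value reported by sysconf(_SC_NPROCESSORS_ONLN).
// capPercent is a zone.cpu-cap style value (100 = one whole CPU);
// 0 is treated as "no cap" in this sketch.
func usableCPUs(online int32, capPercent uint64) int32 {
	if online < 1 {
		online = 1
	}
	if capPercent == 0 {
		return online // uncapped
	}

	capped := capPercent / 100 // truncating division: 336 -> 3
	if capped < 1 {
		capped = 1 // assumption: never report fewer than one CPU
	}
	if capped >= uint64(online) {
		return online // cap is larger than the machine; ignore it
	}
	return int32(capped)
}

func main() {
	fmt.Println(usableCPUs(8, 0))    // no cap
	fmt.Println(usableCPUs(48, 336)) // capped below the online count
	fmt.Println(usableCPUs(4, 800))  // cap exceeds the online count
}
```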


jclulow commented Oct 27, 2019

I wrote a short C program that demonstrates roughly what we'll need to do to check for the cap:

#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <rctl.h>
#include <err.h>
#include <errno.h>

int
main(int argc, char *argv[])
{
	int32_t sysconf_ncpus = (int32_t)sysconf(_SC_NPROCESSORS_ONLN);

	printf("ncpu = %d\n", sysconf_ncpus);

	rctlblk_t *rblks[] = {
		calloc(1, rctlblk_size()),
		calloc(1, rctlblk_size()),
	};

	/*
	 * As per resource_controls(5), the "zone.cpu-cap" resource control is
	 * "the percentage of a single CPU that can be used by all user threads
	 * in a zone, expressed as an integer."  That is, a value of 100 means
	 * one whole CPU on this system.  If there is no cap, or the cap is
	 * greater than the number of actual detected CPUs, we just use the CPU
	 * count value.
	 */
	uint64_t capval = 0;

	for (unsigned i = 0; ; i++) {
		/* The first call fetches the first value; later calls walk the rest. */
		int flag = i == 0 ? RCTL_FIRST : RCTL_NEXT;
		rctlblk_t *rblk = rblks[i % 2];
		rctlblk_t *rblkprev = rblks[(i + 1) % 2];

		if (getrctl("zone.cpu-cap", rblkprev, rblk, flag) != 0) {
			if (errno == ENOENT) {
				break;
			}
			err(1, "getrctl");
		}

		int s = 0;
		if (!(rctlblk_get_local_flags(rblk) & RCTL_LOCAL_MAXIMAL) &&
		    rctlblk_get_local_action(rblk, &s) == RCTL_LOCAL_DENY) {
			rctl_qty_t v = rctlblk_get_value(rblk);

			if (capval == 0 || capval > v) {
				capval = v;
			}
		}
	}

	if (capval == 0) {
		printf("uncapped, so %d CPUs\n", sysconf_ncpus);
	} else {
		int cap_ncpus = capval / 100;

		if (cap_ncpus > sysconf_ncpus) {
			printf("cap too large, so %d CPUs\n", sysconf_ncpus);
		} else {
			printf("capped, at %d CPUs\n", cap_ncpus);
		}
	}
}

Run on a system with 8 CPUs and no cap configured:

$ ./ncpus 
ncpu = 8
uncapped, so 8 CPUs

Run on the current illumos buildlet host:

[root@gobuild1 /var/tmp]# ./ncpus 
ncpu = 48
capped, at 3 CPUs
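The selection rule inside the loop of the C program can be isolated from the `getrctl` calls. The Go sketch below models it over an in-memory slice; `rctlEntry` and `lowestCap` are hypothetical names, the struct is a simplified stand-in for an `rctlblk_t`, and the sample values are illustrative rather than read from a real system.

```go
package main

import "fmt"

// rctlEntry is a hypothetical, simplified stand-in for one rctlblk_t value
// in the zone.cpu-cap sequence.
type rctlEntry struct {
	value   uint64 // percentage of a single CPU
	maximal bool   // RCTL_LOCAL_MAXIMAL is set in the local flags
	deny    bool   // the local action is RCTL_LOCAL_DENY
}

// lowestCap mirrors the loop in the C program: it ignores entries that are
// maximal or whose action is not deny, and keeps the smallest remaining
// value. It returns 0 when no entry qualifies, meaning "uncapped".
func lowestCap(entries []rctlEntry) uint64 {
	var capval uint64
	for _, e := range entries {
		if e.maximal || !e.deny {
			continue
		}
		if capval == 0 || e.value < capval {
			capval = e.value
		}
	}
	return capval
}

func main() {
	capped := []rctlEntry{
		{value: 336, deny: true},                       // illustrative cap
		{value: 4290000000, maximal: true, deny: true}, // illustrative system ceiling
	}
	uncapped := capped[1:]

	fmt.Println(lowestCap(capped))   // 336
	fmt.Println(lowestCap(uncapped)) // 0
}
```

Like the C program, this uses 0 as the "no cap" sentinel, so a genuine cap value of 0 would be indistinguishable from an uncapped zone.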

@gopherbot

Change https://golang.org/cl/203758 mentions this issue: runtime: NumCPU() should respect zone CPU cap on illumos


jclulow commented Oct 27, 2019

I tested this on a couple of systems with a simple program:

package main

import (
        "fmt"
        "runtime"
)

func main() {
        fmt.Printf("runtime.NumCPU() = %v\n", runtime.NumCPU())
}

With no cap and 8 CPUs:

$ psrinfo | wc -l
       8
$ prctl -n zone.cpu-cap $$
process: 14207: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
zone.cpu-cap
        usage               0    
        system          4.29G     inf   deny                                 -
$ ./printncpus 
runtime.NumCPU() = 8

With a cap and 48 CPUs:

$ psrinfo | wc -l
48
$ prctl -n zone.cpu-cap $$
process: 962118: -bash
NAME    PRIVILEGE       VALUE    FLAG   ACTION                       RECIPIENT
zone.cpu-cap
        usage              13    
        privileged        336       -   deny                                 -
        system          4.29G     inf   deny                                 -
$ /var/tmp/printncpus 
runtime.NumCPU() = 3

@golang golang locked and limited conversation to collaborators Oct 27, 2020