runtime: optionally (reliably) avoid netpoller #32009

tamird · 2019-05-13T18:42:42Z

The gVisor project implements a user-space kernel, and its implementation is performance-sensitive, which forces us to avoid the netpoller manually by steering clear of certain APIs.

It would be nice to automate and enforce this avoidance, either by exposing an API that could be used in a test to assert that the netpoller has never been used, or by exposing a build tag that guarantees the netpoller is inactive. As of this writing, the concrete goal appears to be never incrementing netpollWaiters.

cc @iangudger @nlacasse @prattmic @amscanne

andybons · 2019-05-13T18:44:48Z

This seems like the beginnings of a proposal, but there are no concrete next steps.

What changes would you like to be made specifically?

Thanks

@randall77 @ianlancetaylor

randall77 · 2019-05-13T19:07:10Z

Do you want gVisor to never use the net poller full stop, or does this need to apply only to certain operations within gVisor?

The whole point of the netpoller is to be more efficient (particularly, in the # of OS threads needed) than just blocking an OS thread on each read/write. I'm curious in what circumstances netpoll needs to be avoided. Maybe we could solve that problem instead of this one.

iangudger · 2019-05-13T19:20:56Z

We never want gVisor to use netpoll, full stop. One approach we have discussed is adding to the runtime some way to detect whether netpoll was ever used, then running all of our tests and failing any that use it.

golang.org/cl/78915 has more context on why netpoll is a problem.

randall77 · 2019-05-13T19:27:36Z

Could you just check runtime.netpollInited? Or does that get set despite netpoll never being used?

> golang.org/cl/78915 has more context on why netpoll is a problem.

That CL has been merged for a long time now. I don't see any comments there about other issues besides the one that was fixed in the CL.

iangudger · 2019-05-13T19:33:53Z

> Could you just check runtime.netpollInited? Or does that get set despite netpoll never being used?

Correct. runtime.netpollInited gets initialized when files are created with os.File regardless of whether netpoll is actually used.

> > golang.org/cl/78915 has more context on why netpoll is a problem.
>
> That CL has been merged for a long time now. I don't see any comments there about other issues besides the one that was fixed in the CL.

The benefits documented in the CL are only if netpoll is not in use.

randall77 · 2019-05-13T19:50:40Z

> Correct. runtime.netpollInited gets initialized when files are created with os.File regardless of whether netpoll is actually used.

Bummer.

> The benefits documented in the CL are only if netpoll is not in use.

So you want the 12% improvement to CPU usage that this CL provides? But you only get that 12% if you never use netpoll? Or are you interested in the 0.5% latency improvement?

How close is your app to those benchmarks? They are really corner case benchmarks, with lots of very quick trips into and out of the poller/channel ops/scheduler, with no work on top of that.

iangudger · 2019-05-13T20:41:48Z

It is mostly the CPU usage. gVisor is used in high-density environments. gVisor's CPU usage is much higher than Linux's, and we are currently looking into other options for reducing CPU usage as well.

Latency is important too though. We measure latency in nanoseconds and shaving even a few nanoseconds in a hot path can be a win for us.

prattmic · 2019-05-13T21:57:26Z

> How close is your app to those benchmarks? They are really corner case benchmarks, with lots of very quick trips into and out of the poller/channel ops/scheduler, with no work on top of that.

At the time golang.org/cl/78915 was written, it reduced total runtime of a TensorFlow model training benchmark running inside gVisor by 5% (and total CPU usage by 10%).

TensorFlow can be extremely futex heavy, as it coordinates very small units of work (size depends on the model) on a threadpool, where workers contend on resources. When an application calls futex inside gVisor and it actually blocks, that ultimately becomes a wait on a channel. Since new work is likely to be available very soon, this application becomes very sensitive to the overall latency and CPU usage of the Go scheduler when it wakes the goroutine back up.

google/gvisor#205 is a similar situation. In general, I think overall scheduler improvements (such as making netpoll cheaper) for these cases would be a possible alternative to an explicit API.
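To make the futex path described above more concrete, here is a minimal sketch of how a blocking futex wait can end up as a wait on a Go channel. It is not gVisor's implementation: the `futexTable` type and its `Wait` and `Wake` methods are hypothetical names invented for this illustration, and the real code is considerably more involved.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// futexTable is a hypothetical, heavily simplified futex table.
// Each blocked waiter parks on its own channel.
type futexTable struct {
	mu      sync.Mutex
	waiters map[*uint32][]chan struct{}
}

func newFutexTable() *futexTable {
	return &futexTable{waiters: make(map[*uint32][]chan struct{})}
}

// Wait blocks only if *addr still equals val, loosely following
// FUTEX_WAIT. It reports whether the caller actually blocked.
func (t *futexTable) Wait(addr *uint32, val uint32) bool {
	t.mu.Lock()
	if atomic.LoadUint32(addr) != val {
		t.mu.Unlock()
		return false
	}
	ch := make(chan struct{})
	t.waiters[addr] = append(t.waiters[addr], ch)
	t.mu.Unlock()
	<-ch // The goroutine parks here until the Go scheduler wakes it.
	return true
}

// Wake wakes up to n waiters on addr and returns how many it woke.
func (t *futexTable) Wake(addr *uint32, n int) int {
	t.mu.Lock()
	defer t.mu.Unlock()
	q := t.waiters[addr]
	woken := 0
	for woken < n && len(q) > 0 {
		close(q[0])
		q = q[1:]
		woken++
	}
	if len(q) == 0 {
		delete(t.waiters, addr)
	} else {
		t.waiters[addr] = q
	}
	return woken
}

func main() {
	ft := newFutexTable()
	var word uint32
	done := make(chan bool)
	go func() { done <- ft.Wait(&word, 0) }()
	// Retry until the waiter has parked and can be woken.
	for ft.Wake(&word, 1) == 0 {
		runtime.Gosched()
	}
	fmt.Println("waiter blocked and was woken:", <-done)
}
```

Every blocked `Wait` costs one channel receive plus a scheduler wakeup, so a workload that blocks and wakes at a high rate is dominated by scheduler overhead rather than by its own work.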

nlacasse · 2019-06-12T16:23:43Z

Ideally we'd like to prevent the netpoller from ever running, but I'd be happy just with a way to check whether the netpoller has run.

Here are two proposals:

1. Add a new "implementation" of netpoll that just panics, and put that behind a nonetpoll build tag.
2. Add a boolean that is set to true whenever netpoll is called. Users that want to avoid netpoll could link against this boolean and check that it is false (likely inside tests).

Are there preferences/objections, or better alternatives to this?
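As a rough illustration of what the two proposals above would give a caller, here is a user-space sketch. Nothing in it is runtime API: `poll`, `pollerUsed`, `panicOnPoll`, and the helpers are hypothetical stand-ins, and a plain variable plays the role of the proposed nonetpoll build tag so that the example fits in one file.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// pollerUsed is a hypothetical stand-in for the proposed runtime boolean.
// It becomes 1 the first time poll is called.
var pollerUsed uint32

// panicOnPoll stands in for the proposed nonetpoll build tag.
// When true, any call to poll panics.
var panicOnPoll bool

// poll is a stand-in for the runtime's netpoll entry point.
func poll() {
	if panicOnPoll {
		panic("netpoll called in a nonetpoll build")
	}
	atomic.StoreUint32(&pollerUsed, 1)
}

// pollerEverUsed reports whether poll has been called since the last reset.
func pollerEverUsed() bool {
	return atomic.LoadUint32(&pollerUsed) != 0
}

// resetPollerState restores the initial state, for use between tests.
func resetPollerState() {
	atomic.StoreUint32(&pollerUsed, 0)
	panicOnPoll = false
}

// didPanic runs f and reports whether it panicked.
func didPanic(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	// Proposal 2: a test checks the flag after exercising the code under test.
	fmt.Println("used before poll:", pollerEverUsed())
	poll()
	fmt.Println("used after poll:", pollerEverUsed())

	// Proposal 1: with the tag enabled, the first use fails loudly.
	panicOnPoll = true
	fmt.Println("poll panics with tag set:", didPanic(poll))
}
```

The flag reports after the fact and leaves production behavior unchanged, while the panicking variant stops at the first offending call, which makes the culprit easier to find in a stack trace.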

ianlancetaylor · 2019-06-13T00:19:01Z

From my perspective this is so special purpose that it's hard to get excited about having to maintain some publicly visible API for it. I'm pretty skeptical that it would ever have more than one user.

We've talked about having some sort of runtime package stats access (#15490). Perhaps we could make sure that those stats include some data on use of the netpoller. Then your tests could use that.
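For readers on newer toolchains: Go 1.16 and later ship a runtime/metrics package, and a stats-based check along these lines could be sketched with it. This assumes nothing about a netpoller metric actually existing. The sketch only shows how a test could look for one by name and read a counter, with `findMetrics` and `readUint64` being hypothetical helpers.

```go
package main

import (
	"fmt"
	"runtime/metrics"
	"strings"
)

// findMetrics returns the names of all supported runtime metrics
// whose name contains substr.
func findMetrics(substr string) []string {
	var names []string
	for _, d := range metrics.All() {
		if strings.Contains(d.Name, substr) {
			names = append(names, d.Name)
		}
	}
	return names
}

// readUint64 reads one uint64-valued metric. ok is false if the
// metric is unsupported or has a different kind.
func readUint64(name string) (v uint64, ok bool) {
	s := []metrics.Sample{{Name: name}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindUint64 {
		return 0, false
	}
	return s[0].Value.Uint64(), true
}

func main() {
	// This list may well be empty. Whether the runtime exposes any
	// netpoller data is exactly the open question in this thread.
	fmt.Println("netpoll-related metrics:", findMetrics("netpoll"))

	// A metric that does exist, to show the read path.
	if n, ok := readUint64("/sched/goroutines:goroutines"); ok {
		fmt.Println("goroutines:", n)
	}
}
```

A test built this way would need to skip, or fail with a clear message, when the metric it wants is not supported by the toolchain in use.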

gopherbot · 2019-07-22T21:01:49Z

Change https://golang.org/cl/187137 mentions this issue: runtime: keep track of netpoll usage

andybons added the NeedsInvestigation label May 13, 2019

andybons added this to the Unplanned milestone May 13, 2019

nlacasse mentioned this issue Jun 18, 2019 in "Add basic code for nfs filesystem." google/gvisor#258 (Closed)

gopherbot added the compiler/runtime label Jul 7, 2022

mknyszek added this to Go Compiler / Runtime Jul 7, 2022

mknyszek removed this from Go Compiler / Runtime Jul 13, 2022
