runtime: improve scheduler fairness
Currently global runqueue is starved if a group of goroutines
constantly respawn each other (local runqueue never becomes empty).
Fixes issue 5639.
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c File src/pkg/runtime/proc.c (right): https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#newcode1164 src/pkg/runtime/proc.c:1164: if((int32)(p->tick - p->tickglobcheck) > 0) // be careful with ...
11 years, 10 months ago
(2013-06-06 23:56:10 UTC)
#2
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c
File src/pkg/runtime/proc.c (right):
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#new...
src/pkg/runtime/proc.c:1164: if((int32)(p->tick - p->tickglobcheck) > 0) // be
careful with overflow
I don't really see the point of tickglobcheck. Pulling in another goroutine
every so often shouldn't be too expensive, and it's hard for me to believe that
adjusting tickglobcheck based on the size of the P's run queue will matter very
much. Suppose you just say something like
if ((p->tick & 0x3f) == 0)
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#new...
src/pkg/runtime/proc.c:2209: // Try get one G from the global queue.
s/Try get/Try to get/
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#new...
src/pkg/runtime/proc.c:2213: gp = runtime·sched.runqhead;
This introduces another spot that needs to manage this list, but it's not
performance critical at all. Suppose we just call globrunqget here. Or, if you
really don't want that many g's (though I don't know why that would matter), add
an argument to globrunqget: the maximum number of g's to move. For the other
calls pass, I don't know, 32 or something. Here, pass 1.
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c File src/pkg/runtime/proc.c (right): https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#newcode1164 src/pkg/runtime/proc.c:1164: if((int32)(p->tick - p->tickglobcheck) > 0) // be careful with ...
11 years, 10 months ago
(2013-06-07 14:42:06 UTC)
#3
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c
File src/pkg/runtime/proc.c (right):
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#new...
src/pkg/runtime/proc.c:1164: if((int32)(p->tick - p->tickglobcheck) > 0) // be
careful with overflow
On 2013/06/06 23:56:10, iant wrote:
> I don't really see the point of tickglobcheck. Pulling in another goroutine
> every so often shouldn't be too expensive, and it's hard for me to believe
that
> adjusting tickglobcheck based on the size of the P's run queue will matter
very
> much. Suppose you just say something like
> if ((p->tick & 0x3f) == 0)
That's probably fine. But this it too simple check, I was actually able to
exploit it with the following program, so that the global queue is never polled.
Is it possible to do a less exploitable check w/o using division?
func TestTimerFairness2(t *testing.T) {
done := make(chan bool)
c := make(chan bool)
for i := 0; i < 2; i++ {
go func() {
timer := time.After(20 * time.Millisecond)
var buf [1]byte
for {
syscall.Read(0, buf[0:0])
select {
case c <- true:
case <-c:
case <-timer:
done <- true
return
}
}
}()
}
<-done
<-done
}
Added a comment explaining why we don't want to poll global queue too often. https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c ...
11 years, 10 months ago
(2013-06-07 15:10:57 UTC)
#4
Added a comment explaining why we don't want to poll global queue too often.
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c
File src/pkg/runtime/proc.c (right):
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#new...
src/pkg/runtime/proc.c:2213: gp = runtime·sched.runqhead;
On 2013/06/06 23:56:10, iant wrote:
> This introduces another spot that needs to manage this list, but it's not
> performance critical at all. Suppose we just call globrunqget here. Or, if
you
> really don't want that many g's (though I don't know why that would matter),
add
> an argument to globrunqget: the maximum number of g's to move. For the other
> calls pass, I don't know, 32 or something. Here, pass 1.
Done.
11 years, 10 months ago
(2013-06-07 15:12:10 UTC)
#5
On 2013/06/07 14:42:06, dvyukov wrote:
> https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c
> File src/pkg/runtime/proc.c (right):
>
>
https://codereview.appspot.com/10042044/diff/11001/src/pkg/runtime/proc.c#new...
> src/pkg/runtime/proc.c:1164: if((int32)(p->tick - p->tickglobcheck) > 0) //
be
> careful with overflow
> On 2013/06/06 23:56:10, iant wrote:
> > I don't really see the point of tickglobcheck. Pulling in another goroutine
> > every so often shouldn't be too expensive, and it's hard for me to believe
> that
> > adjusting tickglobcheck based on the size of the P's run queue will matter
> very
> > much. Suppose you just say something like
> > if ((p->tick & 0x3f) == 0)
>
>
> That's probably fine. But this it too simple check, I was actually able to
> exploit it with the following program, so that the global queue is never
polled.
> Is it possible to do a less exploitable check w/o using division?
A good check would be something like (tick%59)==0 || (tick%61)==0 (both prime
numbers), but I don't want 2 divisions on scheduler fast path.
> func TestTimerFairness2(t *testing.T) {
> done := make(chan bool)
> c := make(chan bool)
> for i := 0; i < 2; i++ {
> go func() {
> timer := time.After(20 * time.Millisecond)
> var buf [1]byte
> for {
> syscall.Read(0, buf[0:0])
> select {
> case c <- true:
> case <-c:
> case <-timer:
> done <- true
> return
> }
> }
> }()
> }
> <-done
> <-done
> }
On Fri, Jun 7, 2013 at 8:12 AM, <dvyukov@google.com> wrote: > > A good check ...
11 years, 9 months ago
(2013-06-10 17:38:25 UTC)
#6
On Fri, Jun 7, 2013 at 8:12 AM, <dvyukov@google.com> wrote:
>
> A good check would be something like (tick%59)==0 || (tick%61)==0 (both
> prime numbers), but I don't want 2 divisions on scheduler fast path.
Well, how would you feel about some multiplications on the fast path?
Division by an unsigned constant is equivalent to multiplication by a
large unsigned constant. Look at the code that GCC -O2 generates for
x % 59.
Ian
*** Submitted as https://code.google.com/p/go/source/detail?r=81e554ab7787 *** runtime: improve scheduler fairness Currently global runqueue is starved if ...
11 years, 9 months ago
(2013-06-15 12:06:39 UTC)
#11
Issue 10042044: code review 10042044: runtime: improve scheduler fairness
(Closed)
Created 11 years, 10 months ago by dvyukov
Modified 11 years, 9 months ago
Reviewers:
Base URL:
Comments: 9