Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: stopped Ms can't become dedicated or fractional GC workers #44313

Closed
prattmic opened this issue Feb 16, 2021 · 10 comments
Closed

runtime: stopped Ms can't become dedicated or fractional GC workers #44313

prattmic opened this issue Feb 16, 2021 · 10 comments
Assignees
Labels
FrozenDueToAge GarbageCollector NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@prattmic
Copy link
Member

prattmic commented Feb 16, 2021

When a GC starts, Ms are enlisted to run a dedicated or fractional GC worker via a call to findRunnableGCWorker early in schedule.

This is problematic for Ms blocked in findrunnable, which never calls findRunnableGCWorker, and thus cannot run dedicated or fractional GC workers. findrunnable can run idle-priority GC workers, and thus the GC work should get done regardless, but this is going to become particularly problematic if idle-priority GC is removed (#44163).

The typical flow should look something like:

  1. Normal goroutine calls gcStart.
  2. During the process of starting GC, startTheWorldWithSema calls wakep, waking an M stopped in findrunnable.
  3. The woken M runs a GC mark worker at idle-priority.
  4. The M from (1) eventually enters the scheduler and calls findRunnableGCWorker, running a dedicated GC mark worker, and then calls wakep from schedule, waking another M stopped in findrunnable.
  5. The woken M runs a GC mark worker at idle-priority.

If the Ms at (3) and (5) are not spinning (which they won't be if it will be fully stopped), I don't believe there will be anymore wakep calls, thus limiting the total number of workers to a maximum of three.

FWIW, I don't think it should be fundamentally difficult to move a findRunnableGCWorker check into findrunnable.

This is somewhat related to #39004, another case of a source of work not reachable from findrunnable (though in that case there is no wakep at all).

cc @mknyszek @aclements

@prattmic prattmic added GarbageCollector NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Feb 16, 2021
@prattmic prattmic added this to the Go1.17 milestone Feb 16, 2021
@prattmic prattmic self-assigned this Feb 16, 2021
@gopherbot
Copy link

Change https://golang.org/cl/307914 mentions this issue: runtime: refactor findrunnable spinning recheck

@gopherbot
Copy link

Change https://golang.org/cl/307913 mentions this issue: runtime: refactor work stealing to dedicated function

@gopherbot
Copy link

Change https://golang.org/cl/307910 mentions this issue: runtime: move delta computation closer to use

@gopherbot
Copy link

Change https://golang.org/cl/307911 mentions this issue: runtime: remove redudant tryWakeP component

@gopherbot
Copy link

Change https://golang.org/cl/307912 mentions this issue: runtime: clarify which work needs spinning coordination

gopherbot pushed a commit that referenced this issue Apr 16, 2021
findrunnable has a couple places where delta is recomputed from a new
pollUntil value. This proves to be a pain in refactoring, as it is easy
to forget to do properly.

Move computation of delta closer to its use, where it is more logical
anyways.

This CL should have no functional changes.

For #43997.
For #44313.

Change-Id: I89980fd7f40f8a4c56c7540cae03ff99e12e1422
Reviewed-on: https://go-review.googlesource.com/c/go/+/307910
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
gopherbot pushed a commit that referenced this issue Apr 16, 2021
Here tryWakeP can't already be true, so there is no need to combine the
values.

This CL should have no functional changes.

For #43997.
For #44313.

Change-Id: I640c7bb88a5f70c8d22f89f0b5b146b3f60c0136
Reviewed-on: https://go-review.googlesource.com/c/go/+/307911
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
gopherbot pushed a commit that referenced this issue Apr 16, 2021
The overview comments discuss readying goroutines, which is the most
common source of work, but timers and idle-priority GC work also require
the same synchronization w.r.t. spinning Ms.

This CL should have no functional changes.

For #43997
Updates #44313

Change-Id: I7910a7f93764dde07c3ed63666277eb832bf8299
Reviewed-on: https://go-review.googlesource.com/c/go/+/307912
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
gopherbot pushed a commit that referenced this issue Apr 20, 2021
findrunnable has grown very large and hard to follow over the years.
Parts we can split out into logical chunks should help make it more
understandable and easier to change in the future.

The work stealing loop is one such big chunk that is fairly trivial to
split out into its own function, and even has the advantage of
simplifying control flow by removing a goto around work stealing.

This CL should have no functional changes.

For #43997.
For #44313.

Change-Id: Ie69670c7bc60bd6c114e860184918717829adb22
Reviewed-on: https://go-review.googlesource.com/c/go/+/307913
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Chris Hines <chris.cs.guy@gmail.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
gopherbot pushed a commit that referenced this issue Apr 20, 2021
Break the main components of the findrunnable spinning -> non-spinning
recheck out into their own functions, which simplifies both findrunnable
and the new functions, which can make use of fancy features like early
returns.

This CL should have no functional changes.

For #43997
For #44313

Change-Id: I6d3060fcecda9920a3471ff338f73d53b1d848a3
Reviewed-on: https://go-review.googlesource.com/c/go/+/307914
Trust: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@mknyszek
Copy link
Contributor

I don't think anything else is going to be done here in this release. @prattmic does that sound right? Optimistically kicking to 1.18.

@mknyszek mknyszek modified the milestones: Go1.17, Go1.18 May 24, 2021
@prattmic
Copy link
Member Author

prattmic commented Jun 1, 2021

Agreed. I don't think this warrants a backport/late fix.

@mknyszek
Copy link
Contributor

It's probably too late to do anything here for this cycle. Moving this to backlog since it was already moved back one cycle. Feel free to move it back into 1.19 if you plan to do some work on it.

@gopherbot
Copy link

Change https://go.dev/cl/393880 mentions this issue: runtime: move scheduling decisions by schedule into findrunnable

@gopherbot
Copy link

Change https://go.dev/cl/395634 mentions this issue: runtime: disable idle mark workers with at least one dedicated worker

gopherbot pushed a commit that referenced this issue Apr 26, 2022
This change completes the proposal laid out in #44163. With #44313
resolved, we now ensure that stopped Ms are able to wake up and become
dedicated GC workers. As a result, idle GC workers are in theory no
longer required to be a proxy for scheduling dedicated mark workers.

And, with at least one dedicated mark worker running (which is
non-preemptible) we ensure the GC makes progress in all circumstances
when at least one is running. Currently we ensure at least one idle mark
worker is available at all times because it's possible before #44313
that a dedicated worker doesn't ever get scheduled, leading to a
deadlock if user goroutines block on a GC completing. But now that extra
idle mark worker should be unnecessary to ensure GC progress when at
least one dedicated mark worker is going to be scheduled.

Fixes #44163.

Change-Id: I62889ef2db4e69d44da883e8e6eebcfe5398c86d
Reviewed-on: https://go-review.googlesource.com/c/go/+/395634
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@prattmic prattmic self-assigned this Jun 24, 2022
@golang golang locked and limited conversation to collaborators Jun 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge GarbageCollector NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

3 participants