Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: allocSpan called from wbufFlush that runs with an unwired P #42339

Closed
Helflym opened this issue Nov 2, 2020 · 6 comments
Closed

runtime: allocSpan called from wbufFlush that runs with an unwired P #42339

Helflym opened this issue Nov 2, 2020 · 6 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@Helflym
Copy link
Contributor

Helflym commented Nov 2, 2020

There is a new failure on aix/ppc64 builder related to getMCache:
https://build.golang.org/log/fa90b60048ac7f99bc84523d236948a1f090b4c3
https://build.golang.org/log/a5515473b2094744ca0ff0d19c330fa66ca6184a

It seems to appear only on AIX for now.

According to @mknyszek, this isn't related to #42305.

@mknyszek mknyszek self-assigned this Nov 2, 2020
@mknyszek mknyszek added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels Nov 2, 2020
@mknyszek mknyszek added this to the Go1.16 milestone Nov 2, 2020
@mknyszek
Copy link
Contributor

mknyszek commented Nov 2, 2020

OK I see what's going on here. We're going into a syscall and handing off our P, but the GC is trying to terminate and so asks each P to flush its workbuf. Technically we're flushing the workbuf with a P, but this P has already been released from the G and the M (via handoffp(releasep()) in entersyscallblock_handoff) and I had assumed that the page/span allocator is always called with a P, but this is a rare case where it isn't (and it didn't technically have this requirement before).

Unfortunately, I believe this means we need to just support this case, so there needs to be a fallback for updating the stats when we don't have a P (either that or we plumb the mcache through, though that sounds onerous).

Also, this is a potential problem on all platforms, not just aix/ppc64. Definitely a release-blocker.

@mknyszek mknyszek added the NeedsFix The path to resolution is known, but the work has not been done. label Nov 2, 2020
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 2, 2020
@mknyszek mknyszek changed the title runtime: getMCache called with no P or outside bootstrapping on aix/ppc64 runtime: allocSpan called from wbufFlush that runs with an unwired P Nov 2, 2020
@mknyszek
Copy link
Contributor

mknyszek commented Nov 2, 2020

CC @prattmic @aclements

@gopherbot
Copy link

Change https://golang.org/cl/267157 mentions this issue: runtime: make getMCache inlineable

@gopherbot
Copy link

Change https://golang.org/cl/267158 mentions this issue: runtime: decouple consistent stats from mcache and allow P-less update

gopherbot pushed a commit that referenced this issue Nov 2, 2020
This change moves the responsibility of throwing if an mcache is not
available to the caller, because the inlining cost of throw is set very
high in the compiler. Even if it was reduced down to the cost of a usual
function call, it would still be too expensive, so just move it out.

This choice also makes sense in the context of #42339 since we're going
to have to handle the case where we don't have an mcache to update stats
in a few contexts anyhow.

Also, add getMCache to the list of functions that should be inlined to
prevent future regressions.

getMCache is called on the allocation fast path and because its not
inlined actually causes a significant regression (~10%) in some
microbenchmarks.

Fixes #42305.

Change-Id: I64ac5e4f26b730bd4435ea1069a4a50f55411ced
Reviewed-on: https://go-review.googlesource.com/c/go/+/267157
Trust: Michael Knyszek <mknyszek@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Go Bot <gobot@golang.org>
@mknyszek
Copy link
Contributor

mknyszek commented Nov 2, 2020

The aix/ppc64 builder seems to pass! Thanks @Helflym for the prompt report and for filing an issue.

@mknyszek
Copy link
Contributor

mknyszek commented Nov 2, 2020

Ah, it occurs to me that this failure isn't exactly common on aix/ppc64 so that's not necessarily proof of a fix, but I definitely do understand the problem. I suppose I'll land and we'll reopen this issue if we see it again?

@golang golang locked and limited conversation to collaborators Nov 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

No branches or pull requests

3 participants