runtime: fatal error: releasep: invalid p state #15246
/CC @aclements (from the dying message |
I just hit this as well: go version devel +a4650a2 Mon Apr 11 21:38:18 2016 +0000 linux/amd64
This feels like a recent regression. |
I suspect a4650a2, /cc @josharian |
Ugh. Will take a look, although today promises to be busy. How reliably can you reproduce? Any hints for doing so? If reliably, can you confirm that it was the append changes? |
It's not specific to Solaris. It did start right after a4650a2. It's not immediately obvious to me how that would have caused this, but statistically speaking, it's a likely candidate.
|
I'm puzzled too, although spidey sense says missing VARDEF. When I'm next at a computer (or if one of you beats me to it), we can temporarily disable the in-place append code with a well-placed && false to get this stable again. |
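For readers unfamiliar with the "well-placed && false" trick, here is a minimal sketch of the idea in ordinary Go. It is not the compiler code touched by the CL: `canAppendInPlace` and `appendInts` are hypothetical names invented for illustration. Appending `&& false` to the guard makes the optimized branch unreachable while leaving its code in the tree, so it can be re-enabled with a one-token change.

```go
package main

import "fmt"

// canAppendInPlace is a hypothetical stand-in for whatever condition selects
// the optimized path. It is NOT the compiler's real check.
func canAppendInPlace(s []int, extra int) bool {
	return len(s)+extra <= cap(s)
}

// appendInts chooses between a fast path and the general path and reports
// which one ran.
func appendInts(s []int, vals ...int) ([]int, string) {
	// The trailing "&& false" forces the condition to false, so the fast
	// path below is never taken while the bug is investigated.
	if canAppendInPlace(s, len(vals)) && false {
		n := len(s)
		s = s[:n+len(vals)]
		copy(s[n:], vals)
		return s, "in-place"
	}
	// General path: always correct, possibly slower.
	return append(s, vals...), "general"
}

func main() {
	s := make([]int, 0, 8)
	s, path := appendInts(s, 1, 2, 3)
	fmt.Println(s, path)
}
```

Deleting `&& false` restores the fast path, which keeps the temporary rollback small and easy to review.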
I can't seem to reproduce on darwin, which is going to make this a bear to track down. Mailed CL 21891 to disable in the meantime. |
CL https://golang.org/cl/21891 mentions this issue. |
I'll see if I can reproduce on linux/amd64. |
That was surprisingly easy. It looks like it was indeed the append optimization. With the append optimization (811ebb6^) I get 3 failures out of 20 runs in |
0 failures out of 100 runs with the append optimization disabled. |
I can't reproduce locally. Maybe my five-year-old laptop is too slow, or maybe it's darwin. In any case, I'd appreciate it greatly if someone who can get this to fail reliably would try patching in CL 21964. Thanks! |
Hmm. 24 "invalid p state" failures out of 1,476 runs with CL 21964. Plus 45 other failures. :( I'm going to do an overnight run without CL 21964 to see if I can reproduce it without the append optimization in a large number of runs. |
Ouch. Thanks so much for doing this, @aclements, much appreciated. I look forward to the results. |
Thanks, @aclements. I was able to trigger this once, but wasn't able to trigger it again before the CL was rolled back. |
@josharian it is possible that the problem doesn't manifest on osx because of dumb luck and/or inadvertent memory barriers caused by differences in other runtime interactions with the operating system. |
0 "invalid p state" failures out of 2,787 runs without CL 21964 (3f66d8c, the parent of CL 21964). (20 other failures; mostly "bind: address already in use" at dial_test.go:753 in net.) |
Thanks, @aclements. I have some new theories, but I clearly need to be able to test on my own now instead of bothering you. :) I tried to reproduce on GCE with a small machine running jessie; after an overnight run of 218 with CL 21964, I have no failures. At the rate you reported, I should have seen a handful. I'll leave it running for a while longer, but suggestions/ideas for setting up to reproduce myself would be welcome. |
I am set up to reproduce, but my reproduction rate is not as high as yours, @aclements. If it is easy (and only if it is easy), would you mind patching in CL 22197 and giving it a stress run as well? If not, no prob, I'll leave mine running overnight anyway. |
My machine's busy running benchmarks, but I've queued up a stress test of master on it, which should run tonight. |
My stress tests look good, and the CL has just been submitted, so I think
|
Ran overnight on my machine with 0 failures (other than the flaky net failures) out of ~2300 runs. |
On the solaris-amd64-smartos buildbot. See http://build.golang.org/log/3e4aa690c01a4123a508595aa248329b26499b18.
/CC looks like runtime/GC gatekeepers (who?). It expected _Prunning but got _Pgcstop.
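To make the failure mode concrete, here is a toy model of the kind of state check behind the message. It is a sketch only: the type and constant names are my own, loosely modeled on the `_Prunning` and `_Pgcstop` states named above, and the real check lives inside the Go runtime (which aborts the program with a fatal error rather than returning an `error`).

```go
package main

import "fmt"

// pStatus is a hypothetical model of a processor's scheduling state.
type pStatus int

const (
	pIdle pStatus = iota
	pRunning
	pSyscall
	pGCStop
	pDead
)

// String makes the states readable in error messages.
func (s pStatus) String() string {
	names := [...]string{"idle", "running", "syscall", "gcstop", "dead"}
	if int(s) < len(names) {
		return names[s]
	}
	return fmt.Sprintf("pStatus(%d)", int(s))
}

// checkRelease models the invariant: a processor may only be released while
// it is in the running state. Any other state means the scheduler's
// bookkeeping has been corrupted somewhere earlier.
func checkRelease(status pStatus) error {
	if status != pRunning {
		return fmt.Errorf("releasep: invalid p state: want %v, got %v", pRunning, status)
	}
	return nil
}

func main() {
	fmt.Println(checkRelease(pRunning)) // the healthy case
	fmt.Println(checkRelease(pGCStop))  // the situation described in this report
}
```

The check itself is rarely the culprit in such reports; it only detects that something else changed the state unexpectedly, which is why the investigation above focuses on a compiler change rather than the scheduler.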