runtime: runtime:cpu124 test crash or stall when GO_GCFLAGS=-N -l #15853

Closed
quentinmit opened this issue May 26, 2016 · 4 comments
Labels
FrozenDueToAge, NeedsInvestigation

Comments

@quentinmit
Contributor

Starting with @aclements's CL 23391 ("runtime: pass gcWork to scanstack") yesterday, the runtime tests are consistently timing out on the linux-amd64-noopt builder. See, e.g., https://build.golang.org/log/c70503b10af6b554372e2e11f257f7a0d8524678

It looks like these tests are running in anywhere from 10s to 100s on the other builders.

Austin, can you take a look at the traceback and see if this is an important regression for Go 1.7?

@quentinmit added this to the Go1.7 milestone May 26, 2016
@quentinmit added the NeedsInvestigation label May 26, 2016
@aclements
Member

aclements commented May 27, 2016

Repro:

GO_GCFLAGS="-N -l" ./make.bash
GOMAXPROCS=2 go test runtime -cpu=1,2,4 -short

It looks like it's usually not a timeout, but rather a segfault. (Which is good; that's probably easier. :)

@aclements changed the title from "runtime: tests consistently timing out on linux-amd64-noopt" to "runtime: runtime:cpu124 test crash or stall when GO_GCFLAGS=-N -l" May 27, 2016
@aclements modified the milestones: Go1.7Beta, Go1.7 May 27, 2016
@aclements
Member

One definite problem is that markrootFreeGStacks calls shrinkstack on a preemptible, growable user stack, which means we may corrupt the internal stack allocation structures. The fix for this is trivial, but I'm working on confirming that this is in fact the root cause of this failure.

@aclements
Member

I've confirmed that there's a stack growth happening in the middle of stackcacherelease when it calls lock, after it's already picked up the mcache. The stack growth also accesses the mcache, and intertwining the two operations corrupts the mcache. 3be48b4 triggered it because it grew the stack of markroot from 0x68 bytes to 0x70 bytes on the noopt builder, and markroot is on the path to the stackcacherelease when the growth happens (the added argument that grew the markroot stack frame isn't, but that's the noopt builder for you. :)
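The hazard described above can be sketched with a toy model. This is not runtime code: `toyCache`, `alloc`, `release`, and the `lock` callback are hypothetical names invented for illustration. It assumes only what the comment states: one operation picks up cache state, calls something that can re-enter the same cache, and then continues with the state it picked up earlier.

```go
package main

import "fmt"

// toyCache is a hypothetical stand-in for a per-thread cache of free slots.
// It is NOT the runtime's mcache; it only models "state read before a call".
type toyCache struct {
	slots [4]int
	n     int // number of valid entries in slots
}

// alloc hands out the most recently cached slot (what a nested operation,
// such as a stack growth in the real bug, might do).
func (c *toyCache) alloc() int {
	c.n--
	return c.slots[c.n]
}

// release flushes every cached slot. It snapshots c.n, then calls lock,
// and only afterwards uses the snapshot. If lock re-enters the cache,
// the snapshot is stale.
func (c *toyCache) release(lock func()) []int {
	n := c.n // state picked up before the call
	lock()   // may re-enter the cache in this toy model
	flushed := append([]int(nil), c.slots[:n]...)
	c.n = 0
	return flushed
}

// run performs one release; when reenter is true, the lock callback
// allocates from the same cache in the middle of the release.
func run(reenter bool) (taken int, flushed []int) {
	c := &toyCache{slots: [4]int{10, 20, 30}, n: 3}
	taken = -1 // -1 means nothing was handed out
	flushed = c.release(func() {
		if reenter {
			taken = c.alloc()
		}
	})
	return taken, flushed
}

// doubleOwned reports whether the slot handed out during the release
// was also flushed, i.e. it now has two owners.
func doubleOwned(taken int, flushed []int) bool {
	for _, s := range flushed {
		if s == taken {
			return true
		}
	}
	return false
}

func main() {
	for _, reenter := range []bool{false, true} {
		taken, flushed := run(reenter)
		fmt.Println(reenter, taken, flushed, doubleOwned(taken, flushed))
	}
}
```

In the re-entrant run, the slot handed out in the middle of `release` is also flushed, so it ends up with two owners. That is the kind of corruption that interleaving two operations on the same cache can produce, and why code holding such state must not be allowed to trigger the nested operation.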

@gopherbot

CL https://golang.org/cl/23511 mentions this issue.

@golang locked and limited conversation to collaborators May 27, 2017