Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: regression in race-instrumentation causes deadline races #17547

Closed
dsnet opened this issue Oct 21, 2016 · 5 comments
Closed

cmd/compile: regression in race-instrumentation causes deadline races #17547

dsnet opened this issue Oct 21, 2016 · 5 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@dsnet
Copy link
Member

dsnet commented Oct 21, 2016

8b3194a attempts to fix #17449.

In a large RPC test, this change causes a GET request to report a "deadline exceeded error" immediately with some high probability even though the deadline has not passed.

Presently, I do not have a smaller reproduction that I can provide. Information is sparse; more to come.

Anyone opposed to reverting?

/cc @mdempsky @Dhananjay92

@dsnet dsnet added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 21, 2016
@dsnet dsnet added this to the Go1.8 milestone Oct 21, 2016
@JayNakrani
Copy link
Contributor

As a temporary fix, I am okay with reverting. Can you point me to the test though?

@dsnet
Copy link
Member Author

dsnet commented Oct 22, 2016

The test is proprietary. I have suspicions that it has to deal with different regression caused by https://go-review.googlesource.com/#/c/31173/, which causes similar failures. It's possible that CL/31317 caused a bug and your CL simply made them occur more frequently.

@mdempsky
Copy link
Member

@dsnet That CL only affects -race builds. Can you confirm the internal test is using -race?

It seems unlikely to be the actual problem (the new code is more intuitively correct IMO), but I'm okay with reverting it if we can't find another fix.

@JayNakrani
Copy link
Contributor

JayNakrani commented Oct 24, 2016

As discussed in #17559, the deadline_exceeded regression is because of race between abortPendingRead() and backgroundRead() net.Conn implementation not respecting semantics. The race instrumentation change is not playing any role in this. I sent out https://golang.org/cl/31755 which I believe takes care of that race and another deadlock.

@dsnet
Copy link
Member Author

dsnet commented Oct 24, 2016

The source of this issue is due to a buggy implementation of net.Conn that started exhibit failures once https://go-review.googlesource.com/#/c/31173/ landed.

https://go-review.googlesource.com/#/c/31317/ was incorrectly identified as the cause because flakiness in whether the problem exhibited itself caused the bisect to incorrectly identify a slightly later CL as the issue.

@dsnet dsnet closed this as completed Oct 24, 2016
@golang golang locked and limited conversation to collaborators Oct 24, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants