New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sync: WaitGroup showing up with race detected #8054
Labels
Milestone
Comments
There are cases where the code could Wait() before all Add()s are invoked. However, the document states that Wait() may exit early which I am willing to live with. This is following that same pattern that httptest.Server uses which would mean that code is potentially broke as well. In this case however, that is not the case. I am sending in a request that is Add()ed and then invoking a Wait(). |
For this test case, no. I added some printouts to prove to myself that Adds happened before Wait. In general, I cannot guarantee that since there is a window of possibility which I am willing to live with. The below shows the Data Race again, but this time I printed before/after around my Adds/Waits. https://travis-ci.org/cloudfoundry/gorouter/jobs/25689283 |
I did some more testing just to satisfy myself. On my mac, and linux VMs, I could never reproduce the problem with 1.2.2 or 1.3beta2. This "race" only seems to appear on Travis. It doesn't seem to matter how many times I try. I even broke down and wrote an easier testcase which couldn't recreate the issue. Even if I force Adds after Wait but before Done, I never got a data race which I did think was a bit odd. |
Given that the Add call is happening from proxy.ServeHTTP called from the http server, it seems clear that the Add is concurrent with the Wait: the Wait can't possibly know when a new request is going to come in. It sounds like you are saying that you want this Wait to apply to all the Adds that come in before it and any subset of the ones racing with it but don't care about the race. Not caring about the race doesn't mean that it's not a race. This is usually a mistake and probably worth fixing in your code just so that you can get meaningful reports from the race detector about other races. The fix would be to have a "current" wait group, and have that protected by a mutex, and your Wait method would lock the current wait group, swap in a brand new one, and call Wait on the old one. Status changed to WorkingAsIntended. |
I hate to belabor the point but it seems you just invalidated the WaitGroup implementation. I understand the rules that are in place when using WaitGroup. All Add()s before a Wait(). I am assuming Done() is allowed any time after a corresponding Add() including after Wait(). The problem I have is the implementation of Done() which invalidates the rules of use. Done() is simply an Add(-1) which means Add() can occur after Wait(). Or are we really saying that you cannot allow the count to go to zero and then above 0 with an outstanding Wait()? The moment I have to wrap this all with a Mutex I might as well count myself and use a buffered channel or Condition or even a WaitGroup of 1 which I trigger when my count is 0. But like I stated in the beginning net/http/httptest/Server.go has the same race since it too makes the same bad assumption on how to use a WaitGroup. |
The rule for WaitGroup is: all Add's that can transition the counter from 0 to >0 must happen before Wait. Add's with negative values, or Add's with positive values which start when counter >0 can be executed at any time. > Or are we really saying that you cannot allow the count to go to zero and then above 0 with an outstanding Wait()? I don't see how it can be useful. |
The documentation sentence "Note that calls with positive delta must happen before the call to Wait, or else Wait may wait for too small a group." is notoriously confusing. Given comment #10 here, could that that be clarified too? People read that as "all calls" or "some calls" depending on what they wish it would do. |
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
The text was updated successfully, but these errors were encountered: