New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: data race in TestTransportResponseHeaderTimeout #7264
Labels
FrozenDueToAge
Testing
An issue that has been verified to require only test changes, not just a test failure.
Milestone
Comments
I don't see where. Also, you're not even clear about what you think is at fault--- that test or httptest.Server? If no HTTP request has been received, there is nothing to wait for in httptest.Server.Close. This test makes 3 HTTP requests. How will the wg.Add not be called 3 times? I also can't reproduce this. |
The test makes 3 requests, but it does not know that the servers handler for /slow request has started. It may not be even received from Listen. Or received from Listen and the handler goroutine is kicked off, but it has not yet executed wg.Add. The test is at fault, it must do something along the lines of: received := make(chan bool) mux.HandleFunc("/slow", func(w ResponseWriter, r *Request) { received <- true time.Sleep(2 * time.Second) }) <-received // Only NOW we do know that wg.Add has happened! ts.Close() ------------------------------------- Essentially what happens here is: var wg sync.WaitGroup go func() { // this is slow handler wg.Add(1) time.Sleep(2 * time.Second) wg.Done() }() wg.Wait() |
In the general case, I agree that the wg.Add(1) could be earlier in httptest, and I could imagine scenarios that would exploit that, but you wrote that the test (TestTransportResponseHeaderTimeout) is at fault and I kinda disagree. The test makes three synchronous HTTP requests. They're not in goroutines. How can I reproduce the data race you saw? Did you have local modifications? Normally it's very easy to repro races, but I can't make this one trigger, and an eyeball analysis of all the relevant code seems fine too. I totally understand what you're talking about with wg.Add(1) being done in a goroutine (and I'll send a CL for httptest to do the wg.Add(1) earlier), but I don't see how this can happen here, at least in the success paths, unless the test is already failing for other reasons. Maybe I could imagine a scenario where the second HTTP request actually connects, but the machine is so loaded, the Go scheduler doesn't run the http Handler goroutine for seconds, meanwhile the client code goroutine does run and passes, the func exits, and then the Handler runs. Was the builder very overloaded? I guess I'm never surprised by how slow the build machines can be. |
Sent https://golang.org/cl/70950043 Owner changed to @bradfitz. Status changed to Started. |
I will try to provide more reliable reproducer today. > Maybe I could imagine a scenario where the second HTTP request actually connects, but the machine is so loaded, the Go scheduler doesn't run the http Handler goroutine for seconds, meanwhile the client code goroutine does run and passes, the func exits, and then the Handler runs. The race detector does not care about timings, it watches when things are not synchronized. > I totally understand what you're talking about with wg.Add(1) being done in a goroutine (and I'll send a CL for httptest to do the wg.Add(1) earlier) Just doing wd.Add after Accept won't help. Because if you do http.Get with timeout, then you don't know that even Accept has returned when timeout fires. |
CL https://golang.org/cl/70950043 references this issue. |
Assigning to Dmitry to create a repro. Owner changed to @dvyukov. Status changed to Accepted. |
Mailed CL with small and reliable reproducer: https://golang.org/cl/86770043 Owner changed to @bradfitz. |
CL https://golang.org/cl/86770043 mentions this issue. |
CL https://golang.org/cl/86770043 mentions this issue. |
CL https://golang.org/cl/86770043 mentions this issue. |
This issue was updated by revision 388d533. LGTM=bradfitz R=golang-codereviews, bradfitz CC=golang-codereviews https://golang.org/cl/86770043 |
CL https://golang.org/cl/87640043 mentions this issue. |
CL https://golang.org/cl/87640043 mentions this issue. |
CL https://golang.org/cl/87640043 mentions this issue. |
This bug finally reproduces consistently for me, and I can confirm that https://golang.org/cl/70950043 fixes the race. ba12:src bradfitz$ go test -v -race -run=TestTransportResponseHeaderTimeout net/http === RUN TestTransportResponseHeaderTimeout ================== WARNING: DATA RACE Write by goroutine 5: sync.raceWrite() /Users/bradfitz/go/src/pkg/sync/race.go:41 +0x35 sync.(*WaitGroup).Wait() /Users/bradfitz/go/src/pkg/sync/waitgroup.go:122 +0x176 net/http/httptest.(*Server).Close() /Users/bradfitz/go/src/pkg/net/http/httptest/server.go:168 +0x74 net/http_test.TestTransportResponseHeaderTimeout() /Users/bradfitz/go/src/pkg/net/http/transport_test.go:1307 +0xe8e testing.tRunner() /Users/bradfitz/go/src/pkg/testing/testing.go:422 +0x10f Previous read by goroutine 11: sync.raceRead() /Users/bradfitz/go/src/pkg/sync/race.go:37 +0x35 sync.(*WaitGroup).Add() /Users/bradfitz/go/src/pkg/sync/waitgroup.go:60 +0xbe net/http/httptest.(*waitGroupHandler).ServeHTTP() /Users/bradfitz/go/src/pkg/net/http/httptest/server.go:198 +0x5e net/http.serverHandler.ServeHTTP() /Users/bradfitz/go/src/pkg/net/http/server.go:1675 +0x1e8 net/http.(*conn).serve() /Users/bradfitz/go/src/pkg/net/http/server.go:1176 +0xf16 Goroutine 5 (running) created at: testing.RunTests() /Users/bradfitz/go/src/pkg/testing/testing.go:503 +0xb4d testing.Main() /Users/bradfitz/go/src/pkg/testing/testing.go:435 +0xa8 main.main() /var/folders/zr/clvw_31d1wdcd404ybr3pmch0000gn/T/go-build455016757/net/http/_test/_testmain.go:477 +0xdc Goroutine 11 (running) created at: net/http.(*Server).Serve() /Users/bradfitz/go/src/pkg/net/http/server.go:1723 +0x339 ================== --- PASS: TestTransportResponseHeaderTimeout (2.01 seconds) PASS Found 1 data race(s) exit status 66 FAIL net/http 3.048s ba12:src bradfitz$ hg clpatch 70950043 70950043: net/http/httptest: lock Server's WaitGroup earlier, at Accept time Fixes issue #7264 Reviewer: dvyukov@google.com CC: golang-codereviews@googlegroups.com Files: src/pkg/net/http/httptest/server.go ba12:src bradfitz$ go test -v -race -run=TestTransportResponseHeaderTimeout net/http === RUN TestTransportResponseHeaderTimeout --- PASS: TestTransportResponseHeaderTimeout (2.01 seconds) PASS ok net/http 3.049s ba12:src bradfitz$ go test -v -race -run=TestTransportResponseHeaderTimeout net/http === RUN TestTransportResponseHeaderTimeout --- PASS: TestTransportResponseHeaderTimeout (2.01 seconds) PASS ok net/http 3.049s I will reply on that CL to try to convince Dmitry why I think this is correct. |
This issue was updated by revision 172acae. Races: http://build.golang.org/log/a2e401fdcd4903a61a3375bff5da702a20ddafad http://build.golang.org/log/ec4c69e92076a747ac6d5df7eb7b382b31ab3d43 I think this is the first time I've actually seen a manifestation of issue #7264, and one that I can reproduce. I don't know why it triggers on this test and not any others just like it, or why I can't reproduce issue #7264 independently, even when Dmitry gives me minimal repros. Work around it for now with some synchronization to make the race detector happy. The proper fix will probably be in net/http/httptest itself, not in all hundred some tests. LGTM=rsc R=rsc CC=dvyukov, golang-codereviews https://golang.org/cl/87640043 |
CL https://golang.org/cl/87970045 mentions this issue. |
This issue was closed by revision 6ddd995. Status changed to Fixed. |
This issue was closed by revision 50fc0f1. |
dvyukov
added
fixed
Testing
An issue that has been verified to require only test changes, not just a test failure.
labels
Sep 8, 2014
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jun 25, 2018
Now it's failing on Windows: panic: httptest: failed to listen on a port: listen tcp 127.0.0.1:0: listen: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full. Since we can't seem to understand what the test is trying to test, and because it is causing problems on multiple systems, delete it. Fixes golang#7264. TBR=bradfitz CC=brainman, golang-codereviews https://golang.org/cl/141210043
wheatman
pushed a commit
to wheatman/go-akaros
that referenced
this issue
Jul 9, 2018
Now it's failing on Windows: panic: httptest: failed to listen on a port: listen tcp 127.0.0.1:0: listen: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full. Since we can't seem to understand what the test is trying to test, and because it is causing problems on multiple systems, delete it. Fixes golang#7264. TBR=bradfitz CC=brainman, golang-codereviews https://golang.org/cl/141210043
This issue was closed.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Labels
FrozenDueToAge
Testing
An issue that has been verified to require only test changes, not just a test failure.
The text was updated successfully, but these errors were encountered: