Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: data race in tip after CL 461675 #60041

Closed
davecheney opened this issue May 8, 2023 · 13 comments
Closed

net/http: data race in tip after CL 461675 #60041

davecheney opened this issue May 8, 2023 · 13 comments
Assignees
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@davecheney
Copy link
Contributor

What version of Go are you using (go version)?

devel

Does this issue reproduce with the latest release?

Go 1.20.x: no
Tip: yes

What operating system and processor architecture are you using (go env)?

darwin

What did you do?

package main

import (
	"context"
	"errors"
	"net/http"
	"net/http/httptest"
	"strings"
	"testing"
	"time"
)

func TestHTTPRace(t *testing.T) {
	errCh := make(chan error, 1)
	svr := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		t.Logf("request received: %s", r.URL.Path)
		select {
		case <-r.Context().Done():
			errCh <- r.Context().Err()
			return
		case <-time.After(1 * time.Second):
			errCh <- errors.New("request should have been cancelled")
			return
		}
	}))
	svr.EnableHTTP2 = true
	svr.StartTLS()

	ctx, cancel := context.WithCancel(context.Background())
	time.AfterFunc(20*time.Millisecond, func() {
		t.Log("cancelling request")
		cancel()
		t.Log("cancelled request")
	})
	req, err := http.NewRequestWithContext(
		ctx,
		"POST",
		"https://"+svr.Listener.Addr().String()+"/",
		strings.NewReader(`request body triggers the race`),
	)
	if err != nil {
		t.Fatal(err)
	}

	t.Logf("sending request to %s", req.URL)
	_, err = svr.Client().Do(req)
	if !errors.Is(err, context.Canceled) {
		t.Fatalf("expected context cancelled error, got: %v", err)
	}
	if !errors.Is(<-errCh, context.Canceled) {
		t.Fatalf("expected context cancelled error, got: %v", err)
	}
}
% go test -race -count=20 -cpu=1,3,5,7,11

What did you expect to see?

Test pass

What did you see instead?

(~/devel/http-data-race) % go test -race -count=20 -cpu=1,3,5,7,9
==================
WARNING: DATA RACE
Write at 0x00c000010359 by goroutine 24:
  net/http.(*readTrackingBody).Close()
      /Users/davecheney/go/src/net/http/transport.go:657 +0x2f
  net/http.(*http2clientStream).closeReqBodyLocked.func1()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7469 +0x5d

Previous write at 0x00c000010359 by goroutine 6:
  net/http.(*readTrackingBody).Close()
      /Users/davecheney/go/src/net/http/transport.go:657 +0x2f
  net/http.(*Request).closeBody()
      /Users/davecheney/go/src/net/http/request.go:1438 +0xf87
  net/http.(*Transport).roundTrip()
      /Users/davecheney/go/src/net/http/transport.go:629 +0xf48
  net/http.(*Transport).RoundTrip()
      /Users/davecheney/go/src/net/http/roundtrip.go:17 +0x33
  net/http.send()
      /Users/davecheney/go/src/net/http/client.go:260 +0x943
  net/http.(*Client).send()
      /Users/davecheney/go/src/net/http/client.go:181 +0x14e
  net/http.(*Client).do()
      /Users/davecheney/go/src/net/http/client.go:724 +0x1173
  net/http.(*Client).Do()
      /Users/davecheney/go/src/net/http/client.go:590 +0x624
  http-data-race.TestHTTPRace()
      /Users/davecheney/devel/http-data-race/main_test.go:46 +0x60e
  testing.tRunner()
      /Users/davecheney/go/src/testing/testing.go:1595 +0x238
  ??()
      -:0 +0x1177b64

Goroutine 24 (running) created at:
  net/http.(*http2clientStream).closeReqBodyLocked()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7468 +0x170
  net/http.(*http2clientStream).abortStreamLocked()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7443 +0x9e
  net/http.(*http2clientStream).abortStream()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7434 +0xc5
  net/http.(*http2ClientConn).RoundTrip()
      /Users/davecheney/go/src/net/http/h2_bundle.go:8308 +0xb6a
  net/http.(*http2Transport).RoundTripOpt()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7582 +0x624
  net/http.(*http2Transport).RoundTrip()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7534 +0x35
  net/http.(*Transport).roundTrip()
      /Users/davecheney/go/src/net/http/transport.go:602 +0xcee
  net/http.(*Transport).RoundTrip()
      /Users/davecheney/go/src/net/http/roundtrip.go:17 +0x33
  net/http.send()
      /Users/davecheney/go/src/net/http/client.go:260 +0x943
  net/http.(*Client).send()
      /Users/davecheney/go/src/net/http/client.go:181 +0x14e
  net/http.(*Client).do()
      /Users/davecheney/go/src/net/http/client.go:724 +0x1173
  net/http.(*Client).Do()
      /Users/davecheney/go/src/net/http/client.go:590 +0x624
  http-data-race.TestHTTPRace()
      /Users/davecheney/devel/http-data-race/main_test.go:46 +0x60e
  testing.tRunner()
      /Users/davecheney/go/src/testing/testing.go:1595 +0x238
  ??()
      -:0 +0x1177b64

Goroutine 6 (running) created at:
  testing.(*T).Run()
      /Users/davecheney/go/src/testing/testing.go:1648 +0x835
  testing.runTests.func1()
      /Users/davecheney/go/src/testing/testing.go:2054 +0x84
  testing.tRunner()
      /Users/davecheney/go/src/testing/testing.go:1595 +0x238
  testing.runTests()
      /Users/davecheney/go/src/testing/testing.go:2052 +0x896
  testing.(*M).Run()
      /Users/davecheney/go/src/testing/testing.go:1925 +0xb57
  main.main()
      _testmain.go:47 +0x2bd
==================
--- FAIL: TestHTTPRace (0.03s)
    main_test.go:45: sending request to https://127.0.0.1:62439/
    main_test.go:16: request received: /
    main_test.go:31: cancelling request
    main_test.go:33: cancelled request
    testing.go:1465: race detected during execution of test
==================
WARNING: DATA RACE
Read at 0x00c000010551 by goroutine 376:
  net/http.(*Transport).roundTrip()
      /Users/davecheney/go/src/net/http/transport.go:625 +0xf25
  net/http.(*Transport).RoundTrip()
      /Users/davecheney/go/src/net/http/roundtrip.go:17 +0x33
  net/http.send()
      /Users/davecheney/go/src/net/http/client.go:260 +0x943
  net/http.(*Client).send()
      /Users/davecheney/go/src/net/http/client.go:181 +0x14e
  net/http.(*Client).do()
      /Users/davecheney/go/src/net/http/client.go:724 +0x1173
  net/http.(*Client).Do()
      /Users/davecheney/go/src/net/http/client.go:590 +0x624
  http-data-race.TestHTTPRace()
      /Users/davecheney/devel/http-data-race/main_test.go:46 +0x60e
  testing.tRunner()
      /Users/davecheney/go/src/testing/testing.go:1595 +0x238
  ??()
      -:0 +0x1177b64

Previous write at 0x00c000010551 by goroutine 385:
  net/http.(*readTrackingBody).Close()
      /Users/davecheney/go/src/net/http/transport.go:657 +0x2f
  net/http.(*http2clientStream).cleanupWriteRequest()
      /Users/davecheney/go/src/net/http/h2_bundle.go:8545 +0x1d6
  net/http.(*http2clientStream).doRequest()
      /Users/davecheney/go/src/net/http/h2_bundle.go:8322 +0x44
  ??()
      -:0 +0x13ffd24

Goroutine 376 (running) created at:
  testing.(*T).Run()
      /Users/davecheney/go/src/testing/testing.go:1648 +0x835
  testing.runTests.func1()
      /Users/davecheney/go/src/testing/testing.go:2054 +0x84
  testing.tRunner()
      /Users/davecheney/go/src/testing/testing.go:1595 +0x238
  testing.runTests()
      /Users/davecheney/go/src/testing/testing.go:2052 +0x896
  testing.(*M).Run()
      /Users/davecheney/go/src/testing/testing.go:1925 +0xb57
  main.main()
      _testmain.go:47 +0x2bd

Goroutine 385 (running) created at:
  net/http.(*http2ClientConn).RoundTrip()
      /Users/davecheney/go/src/net/http/h2_bundle.go:8250 +0x7ea
  net/http.(*http2Transport).RoundTripOpt()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7582 +0x624
  net/http.(*http2Transport).RoundTrip()
      /Users/davecheney/go/src/net/http/h2_bundle.go:7534 +0x35
  net/http.(*Transport).roundTrip()
      /Users/davecheney/go/src/net/http/transport.go:602 +0xcee
  net/http.(*Transport).RoundTrip()
      /Users/davecheney/go/src/net/http/roundtrip.go:17 +0x33
  net/http.send()
      /Users/davecheney/go/src/net/http/client.go:260 +0x943
  net/http.(*Client).send()
      /Users/davecheney/go/src/net/http/client.go:181 +0x14e
  net/http.(*Client).do()
      /Users/davecheney/go/src/net/http/client.go:724 +0x1173
  net/http.(*Client).Do()
      /Users/davecheney/go/src/net/http/client.go:590 +0x624
  http-data-race.TestHTTPRace()
      /Users/davecheney/devel/http-data-race/main_test.go:46 +0x60e
  testing.tRunner()
      /Users/davecheney/go/src/testing/testing.go:1595 +0x238
  ??()
      -:0 +0x1177b64
==================
--- FAIL: TestHTTPRace (0.03s)
    main_test.go:45: sending request to https://127.0.0.1:62482/
    main_test.go:16: request received: /
    main_test.go:31: cancelling request
    main_test.go:33: cancelled request
    testing.go:1465: race detected during execution of test
^Csignal: interrupt

I believe I have bisected it to this change, #49621, which introduce the type assertion to req.Body.(*readTrackingBody) which elides the read body lock.

/cc @bradfitz @neild

@davecheney davecheney changed the title affected/package: net/http data race in tip after affected/package: net/http data race in tip after https://github.com/golang/go/issues/49621 May 8, 2023
@davecheney davecheney changed the title affected/package: net/http data race in tip after https://github.com/golang/go/issues/49621 net/http data race in tip after https://github.com/golang/go/issues/49621 May 8, 2023
@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 8, 2023
@bcmills bcmills changed the title net/http data race in tip after https://github.com/golang/go/issues/49621 net/http data race in tip after CL 461675 May 8, 2023
@bcmills bcmills added this to the Go1.21 milestone May 9, 2023
@dmitshur dmitshur changed the title net/http data race in tip after CL 461675 net/http: data race in tip after CL 461675 May 10, 2023
@heschi
Copy link
Contributor

heschi commented May 17, 2023

@neild do you have an opinion on whether this is a release blocker? Getting very close to the freeze and RC now.

@gopherbot
Copy link

Change https://go.dev/cl/496016 mentions this issue: net/http: avoid race when closing request body

@neild
Copy link
Contributor

neild commented May 17, 2023

Don't know if it's a release blocker, but I believe I have a fix.

@gopherbot
Copy link

Change https://go.dev/cl/496335 mentions this issue: http2: close request bodies before RoundTrip error return

gopherbot pushed a commit to golang/net that referenced this issue May 24, 2023
When returning an error from RoundTrip, wait for the close
of the request body to complete before returning.

This avoids a race between the HTTP/2 transport closing
the request body and the net/http retry loop examining
the readTrackingBody to see if it has been closed.

For golang/go#60041

Change-Id: I8be69ff5056806406716e01e02d1f631deeca088
Reviewed-on: https://go-review.googlesource.com/c/net/+/496335
Run-TryBot: Damien Neil <dneil@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Bryan Mills <bcmills@google.com>
@mknyszek
Copy link
Contributor

We decided in the release meeting that this is a release blocker (and an RC1 release blocker). What's left to do here? Does this get resolved by vendoring x/net/http, or does https://go.dev/cl/496016 also need to land?

Thanks.

@eliben
Copy link
Member

eliben commented Jun 7, 2023

@dneil can you please confirm the status here? This is currently marked as an RC1 release blocker

https://go-review.googlesource.com/c/go/+/496016 hasn't made progress in 3 weeks.

@neild
Copy link
Contributor

neild commented Jun 7, 2023

https://go.dev/cl/496335 supersedes https://go.dev/cl/496016, should fix this issue, and has made it from x/net into the go tree. Closing.

@neild neild closed this as completed Jun 7, 2023
@WGH-
Copy link

WGH- commented Jul 21, 2023

I believe it wasn't fixed properly. I have managed to reproduce this with somewhat different test case on both Go 1.20.6 and Go 1.21rc3.

Here multiple requests are created with http.ReadRequest from a single io.Reader, and fed into RoundTrip. RoundTrip still trying to discard the request body asynchronously even after RoundTrip returns, so it creates data race against bufio.Reader with next http.ReadRequest.

package tmpgbudljeu3i_test

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"testing"
)

func writeChunk(w io.Writer, b []byte) {
	fmt.Fprintf(w, "%x\r\n", len(b))
	w.Write(b)
	fmt.Fprint(w, "\r\n")
}

func startRequestGenerator(url string) io.ReadCloser {
	r, w := io.Pipe()
	chunk := bytes.Repeat([]byte("A"), 64)
	go func() {
		for {
			_, err := fmt.Fprintf(w, "POST %s HTTP/1.1\r\n", url)
			if err != nil {
				return
			}
			fmt.Fprint(w, "Host: localhost\r\n")
			fmt.Fprint(w, "Transfer-Encoding: chunked\r\n")
			fmt.Fprint(w, "\r\n")
			writeChunk(w, chunk)
			writeChunk(w, nil)
		}
	}()

	return r
}

func TestHTTPRace(t *testing.T) {
	svr := httptest.NewUnstartedServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// cause cs.abortRequestBodyWrite()
		// https://github.com/golang/go/blob/go1.20.6/src/net/http/h2_bundle.go#L8205-L8216
		w.WriteHeader(http.StatusBadRequest)
	}))
	svr.EnableHTTP2 = true
	svr.StartTLS()

	r := startRequestGenerator("https://" + svr.Listener.Addr().String() + "/")
	defer r.Close()
	b := bufio.NewReader(r)
	for i := 0; i < 3; i++ {
		req, err := http.ReadRequest(b)
		if err != nil {
			t.Fatal(err)
		}
		req.RequestURI = ""
		_, err = svr.Client().Transport.RoundTrip(req)
		if err != nil {
			t.Log(err)
		}
	}
}
$ /tmp/go1.21rc3/bin/go test -race -count 100 -v -failfast
[...]
==================
WARNING: DATA RACE
Write at 0x00c00034e590 by goroutine 660:
  bufio.(*Reader).ReadSlice()
      /tmp/go1.21rc3/src/bufio/bufio.go:385 +0x634
  bufio.(*Reader).ReadLine()
      /tmp/go1.21rc3/src/bufio/bufio.go:408 +0x33
  net/textproto.(*Reader).readLineSlice()
      /tmp/go1.21rc3/src/net/textproto/reader.go:56 +0xc4
  net/textproto.(*Reader).ReadLine()
      /tmp/go1.21rc3/src/net/textproto/reader.go:39 +0xc8
  net/http.readRequest()
      /tmp/go1.21rc3/src/net/http/request.go:1020 +0xc9
  net/http.ReadRequest()
      /tmp/go1.21rc3/src/net/http/request.go:1003 +0x2e
  aaa_test.TestHTTPRace()
      /tmp/tmp.gbudljeU3I/lol_test.go:52 +0x4c5
  testing.tRunner()
      /tmp/go1.21rc3/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /tmp/go1.21rc3/src/testing/testing.go:1648 +0x44

Previous write at 0x00c00034e590 by goroutine 680:
  bufio.(*Reader).Discard()
      /tmp/go1.21rc3/src/bufio/bufio.go:185 +0x5d
  net/http.(*body).readTrailer()
      /tmp/go1.21rc3/src/net/http/transfer.go:909 +0x544
  net/http.(*body).readLocked()
      /tmp/go1.21rc3/src/net/http/transfer.go:845 +0x15c
  net/http.bodyLocked.Read()
      /tmp/go1.21rc3/src/net/http/transfer.go:1038 +0x9e
  io.discard.ReadFrom()
      /tmp/go1.21rc3/src/io/io.go:658 +0x91
  io.(*discard).ReadFrom()
      <autogenerated>:1 +0x4c
  io.copyBuffer()
      /tmp/go1.21rc3/src/io/io.go:416 +0x1be
  io.Copy()
      /tmp/go1.21rc3/src/io/io.go:389 +0x30f
  net/http.(*body).Close()
      /tmp/go1.21rc3/src/net/http/transfer.go:1002 +0x2da
  net/http.(*readTrackingBody).Close()
      /tmp/go1.21rc3/src/net/http/transport.go:658 +0x4f
  net/http.(*http2clientStream).closeReqBodyLocked.func1()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:7470 +0x5d

Goroutine 660 (running) created at:
  testing.(*T).Run()
      /tmp/go1.21rc3/src/testing/testing.go:1648 +0x82a
  testing.runTests.func1()
      /tmp/go1.21rc3/src/testing/testing.go:2054 +0x84
  testing.tRunner()
      /tmp/go1.21rc3/src/testing/testing.go:1595 +0x238
  testing.runTests()
      /tmp/go1.21rc3/src/testing/testing.go:2052 +0x896
  testing.(*M).Run()
      /tmp/go1.21rc3/src/testing/testing.go:1925 +0xb57
  main.main()
      _testmain.go:47 +0x2bd

Goroutine 680 (running) created at:
  net/http.(*http2clientStream).closeReqBodyLocked()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:7469 +0x170
  net/http.(*http2clientStream).abortStreamLocked()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:7444 +0x9e
  net/http.(*http2clientStream).abortStream()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:7435 +0xc5
  net/http.(*http2clientConnReadLoop).processResetStream()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:9935 +0x278
  net/http.(*http2clientConnReadLoop).run()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:9314 +0x6a5
  net/http.(*http2ClientConn).readLoop()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:9171 +0x9e
  net/http.(*http2Transport).newClientConn.func3()
      /tmp/go1.21rc3/src/net/http/h2_bundle.go:7843 +0x33
==================

@heschi
Copy link
Contributor

heschi commented Jul 21, 2023

Since we're closing in on the final release, reopening and conservatively marking a release blocker.

@heschi heschi reopened this Jul 21, 2023
@eliben
Copy link
Member

eliben commented Jul 26, 2023

@neild any updates here?
We'll need to decide the severity of this (is it a release blocker for 1.21 final?) and also whether it can be promptly fixed.

@bcmills
Copy link
Contributor

bcmills commented Jul 26, 2023

I have managed to reproduce this with somewhat different test case on both Go 1.20.6 and Go 1.21rc3.

@WGH-, if you are able to reproduce that failure mode with Go 1.20.6, then I don't see how it can possibly be the same regression as the one originally reported for this issue, and it should not be a release-blocker.

I suggest filing a new issue with the new reproducer, and we can consider backports for that once it is fixed.

@WGH-
Copy link

WGH- commented Jul 26, 2023

Strictly speaking, yeah, it does appear in slightly different conditions. It's just underlying problem is the same: HTTP/2 round-tripper discards the request body asynchronously in some cases. One of the cases has been duct-tape-fixed, then duct tape fell apart, and we've got this regression. Some other case has been there for a while.

Anyway, I can't close the issue for I'm not the original author. Once you do, I'll create a new issue, no problem.

@neild
Copy link
Contributor

neild commented Jul 26, 2023

I can reproduce the race reported in #60041 (comment) with Go 1.20, so this is not a regression (although it is a bug).

I've filed #61596 to track the new report, re-closing this issue as still fixed.

@neild neild closed this as completed Jul 26, 2023
@neild neild removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels Jul 26, 2023
@dmitshur dmitshur added the NeedsFix The path to resolution is known, but the work has not been done. label Dec 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

9 participants