Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: data race in http/2 server #20704

Closed
rhysh opened this issue Jun 16, 2017 · 5 comments
Closed

net/http: data race in http/2 server #20704

rhysh opened this issue Jun 16, 2017 · 5 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@rhysh
Copy link
Contributor

rhysh commented Jun 16, 2017

It looks like the race is on the net/http.(*http2responseWriterState).bw field, which survives while the *http2responseWriterState value is passed through a sync.Pool. The goroutine running the handler does a write on that bufio.Writer, while another goroutine reads from it as part of an asynchronous frame write.

http2responseWriter.handlerDone puts rws into the pool, but does not appear to check if there's an async write occurring.


What version of Go are you using (go version)?

$ go1.8 version
go version go1.8.3 darwin/amd64
$ go-tip version
go version devel +952ecbe0a2 Wed Jun 14 21:44:01 2017 +0000 darwin/amd64

What operating system and processor architecture are you using (go env)?

$ go1.8 env
GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/rhys/go"
GORACE=""
GOROOT="/Users/rhys/go/version/go1.8"
GOTOOLDIR="/Users/rhys/go/version/go1.8/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/49/zmds5zsn75z1283vtzxyfr5hj7yjq4/T/go-build295970843=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"

What did you do?

I ran the following program with the race detector: https://play.golang.org/p/Q9uoDKSNpf

package main

import (
	"context"
	"crypto/tls"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"time"
)

const (
	itemSize  = 1 << 10
	itemCount = 100
)

func main() {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for i := 0; i < itemCount; i++ {
			w.Write(make([]byte, itemSize))
		}
	})

	cl, srv := buildClientServer(handler)
	defer srv.Close()

	for i := 0; i < 10000; i++ {
		err := makeRequest(context.Background(), cl, srv.URL)
		if err != nil {
			log.Fatalf("err = %q", err)
		}
	}
}

func buildClientServer(h http.Handler) (*http.Client, *httptest.Server) {
	srv := httptest.NewUnstartedServer(h)

	srv.TLS = &tls.Config{
		NextProtos: []string{"h2", "http/1.1"},
	}
	srv.StartTLS()

	tr := &http.Transport{}
	cl := &http.Client{Transport: tr}
	// make a request to trigger HTTP/2 autoconfiguration
	resp, err := cl.Get(srv.URL)
	if err == nil {
		resp.Body.Close()
	}
	// now allow the client to connect to the ad-hoc test server
	tr.TLSClientConfig.InsecureSkipVerify = true

	return cl, srv
}

func makeRequest(ctx context.Context, cl *http.Client, url string) error {
	ctx, cancel := context.WithTimeout(ctx, 100*time.Millisecond)
	defer cancel()

	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return err
	}

	resp, err := cl.Do(req.WithContext(ctx))
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	// read only part of the response
	for i := 0; i < 2; i++ {
		_, err := io.ReadFull(resp.Body, make([]byte, itemSize))
		if err != nil && err != io.EOF {
			return err
		}
	}

	return nil
}

What did you expect to see?

I expected no data races to be detected.

What did you see instead?

With both go1.8.3 and go1.9beta1, the race detector finds data races within net/http's HTTP/2 code (bundled from x/net/http2).

With go1.8 build -gcflags="-trimpath=$PWD" -i -race -o /tmp/h2_race . && stress -p=1 /tmp/h2_race, the race detector finds data races in about 1 in 5 runs of the program. Here's one from go1.8.3:

==================
WARNING: DATA RACE
Write at 0x00c420791000 by goroutine 105:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:160 +0x0
  bufio.(*Writer).Write()
      /usr/local/go/src/bufio/bufio.go:610 +0x431
  net/http.(*http2responseWriter).write()
      /usr/local/go/src/net/http/h2_bundle.go:5095 +0x18f
  net/http.(*http2responseWriter).Write()
      /usr/local/go/src/net/http/h2_bundle.go:5069 +0x7a
  main.main.func1()
      h2_race.go:21 +0x8f
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:1942 +0x51
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2568 +0xbc
  net/http.initNPNRequest.ServeHTTP()
      /usr/local/go/src/net/http/server.go:3088 +0x109
  net/http.(*initNPNRequest).ServeHTTP()
      <autogenerated>:312 +0x98
  net/http.(Handler).ServeHTTP-fm()
      /usr/local/go/src/net/http/h2_bundle.go:4331 +0x64
  net/http.(*http2serverConn).runHandler()
      /usr/local/go/src/net/http/h2_bundle.go:4611 +0x96

Previous read at 0x00c420791000 by goroutine 58:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:160 +0x0
  net/http.(*http2Framer).WriteDataPadded()
      /usr/local/go/src/net/http/h2_bundle.go:1180 +0x329
  net/http.(*http2Framer).WriteData()
      /usr/local/go/src/net/http/h2_bundle.go:1151 +0x8f
  net/http.(*http2writeData).writeFrame()
      /usr/local/go/src/net/http/h2_bundle.go:7478 +0xbd
  net/http.(*http2serverConn).writeFrameAsync()
      /usr/local/go/src/net/http/h2_bundle.go:3471 +0x58

Goroutine 105 (running) created at:
  net/http.(*http2serverConn).processHeaders()
      /usr/local/go/src/net/http/h2_bundle.go:4343 +0x847
  net/http.(*http2serverConn).processFrame()
      /usr/local/go/src/net/http/h2_bundle.go:3985 +0x71e
  net/http.(*http2serverConn).processFrameFromReader()
      /usr/local/go/src/net/http/h2_bundle.go:3944 +0x8be
  net/http.(*http2serverConn).serve()
      /usr/local/go/src/net/http/h2_bundle.go:3562 +0xad0
  net/http.(*http2Server).ServeConn()
      /usr/local/go/src/net/http/h2_bundle.go:3187 +0xcd5
  net/http.http2ConfigureServer.func1()
      /usr/local/go/src/net/http/h2_bundle.go:3065 +0xe7
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:1746 +0x1952

Goroutine 58 (running) created at:
  net/http.(*http2serverConn).startFrameWrite()
      /usr/local/go/src/net/http/h2_bundle.go:3775 +0x33f
  net/http.(*http2serverConn).scheduleFrameWrite()
      /usr/local/go/src/net/http/h2_bundle.go:3862 +0x3e1
  net/http.(*http2serverConn).wroteFrame()
      /usr/local/go/src/net/http/h2_bundle.go:3823 +0x1d3
  net/http.(*http2serverConn).serve()
      /usr/local/go/src/net/http/h2_bundle.go:3560 +0xa1d
  net/http.(*http2Server).ServeConn()
      /usr/local/go/src/net/http/h2_bundle.go:3187 +0xcd5
  net/http.http2ConfigureServer.func1()
      /usr/local/go/src/net/http/h2_bundle.go:3065 +0xe7
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:1746 +0x1952
==================
Found 1 data race(s)

And an example of the problem in go1.9beta1:

==================
WARNING: DATA RACE
Write at 0x00c421154000 by goroutine 73:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:160 +0x0
  bufio.(*Writer).Write()
      /usr/local/go/src/bufio/bufio.go:611 +0x420
  net/http.(*http2responseWriter).write()
      /usr/local/go/src/net/http/h2_bundle.go:6230 +0x1e5
  net/http.(*http2responseWriter).Write()
      /usr/local/go/src/net/http/h2_bundle.go:6204 +0x7a
  main.main.func1()
      h2_race.go:21 +0x8b
  net/http.HandlerFunc.ServeHTTP()
      /usr/local/go/src/net/http/server.go:1914 +0x51
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:2610 +0xbc
  net/http.initNPNRequest.ServeHTTP()
      /usr/local/go/src/net/http/server.go:3155 +0x109
  net/http.(*initNPNRequest).ServeHTTP()
      <autogenerated>:1 +0x8f
  net/http.(Handler).ServeHTTP-fm()
      /usr/local/go/src/net/http/h2_bundle.go:5460 +0x64
  net/http.(*http2serverConn).runHandler()
      /usr/local/go/src/net/http/h2_bundle.go:5745 +0x96

Previous read at 0x00c421154000 by goroutine 116:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:160 +0x0
  net/http.(*http2Framer).WriteDataPadded()
      /usr/local/go/src/net/http/h2_bundle.go:1965 +0x35c
  net/http.(*http2Framer).WriteData()
      /usr/local/go/src/net/http/h2_bundle.go:1925 +0x8f
  net/http.(*http2writeData).writeFrame()
      /usr/local/go/src/net/http/h2_bundle.go:8722 +0xbd
  net/http.(*http2serverConn).writeFrameAsync()
      /usr/local/go/src/net/http/h2_bundle.go:4414 +0x58

Goroutine 73 (running) created at:
  net/http.(*http2serverConn).processHeaders()
      /usr/local/go/src/net/http/h2_bundle.go:5479 +0x82c
  net/http.(*http2serverConn).processFrame()
      /usr/local/go/src/net/http/h2_bundle.go:5026 +0x5d0
  net/http.(*http2serverConn).processFrameFromReader()
      /usr/local/go/src/net/http/h2_bundle.go:4984 +0x80b
  net/http.(*http2serverConn).serve()
      /usr/local/go/src/net/http/h2_bundle.go:4509 +0xfec
  net/http.(*http2Server).ServeConn()
      /usr/local/go/src/net/http/h2_bundle.go:4118 +0xd81
  net/http.http2ConfigureServer.func1()
      /usr/local/go/src/net/http/h2_bundle.go:3956 +0xe7
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:1718 +0x19fa

Goroutine 116 (finished) created at:
  net/http.(*http2serverConn).startFrameWrite()
      /usr/local/go/src/net/http/h2_bundle.go:4786 +0x327
  net/http.(*http2serverConn).scheduleFrameWrite()
      /usr/local/go/src/net/http/h2_bundle.go:4887 +0x383
  net/http.(*http2serverConn).wroteFrame()
      /usr/local/go/src/net/http/h2_bundle.go:4848 +0x1d3
  net/http.(*http2serverConn).serve()
      /usr/local/go/src/net/http/h2_bundle.go:4507 +0x1092
  net/http.(*http2Server).ServeConn()
      /usr/local/go/src/net/http/h2_bundle.go:4118 +0xd81
  net/http.http2ConfigureServer.func1()
      /usr/local/go/src/net/http/h2_bundle.go:3956 +0xe7
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:1718 +0x19fa
==================
Found 1 data race(s)

@vcabbage
Copy link
Member

Simplified reproduction. Race is triggered on the first run for me.

package main

import (
	"crypto/tls"
	"net/http"
	"net/http/httptest"

	"golang.org/x/net/http2"
)

const (
	itemSize  = 1 << 10
	itemCount = 100
)

func main() {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for i := 0; i < itemCount; i++ {
			_, err := w.Write(make([]byte, itemSize))
			if err != nil {
				return
			}
		}
	})

	srv := httptest.NewUnstartedServer(handler)
	srv.TLS = &tls.Config{
		NextProtos: []string{"h2"},
	}
	srv.StartTLS()

	cl := &http.Client{
		Transport: &http2.Transport{
			TLSClientConfig: &tls.Config{
				InsecureSkipVerify: true,
			},
		},
	}

	for i := 0; i < 10000; i++ {
		resp, err := cl.Get(srv.URL)
		if err != nil {
			panic(err)
		}
		resp.Body.Close()
	}
}

@bradfitz bradfitz added this to the Go1.9 milestone Jun 17, 2017
@bradfitz bradfitz added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 17, 2017
@bradfitz
Copy link
Contributor

/cc @tombergan

@bradfitz
Copy link
Contributor

@rhysh, @vcabbage, thanks for the repro on a silver platter. Repros great for me. I converted it into a http2 unit test where it fails. Looking into it now.

@gopherbot
Copy link

CL https://golang.org/cl/46008 mentions this issue.

gopherbot pushed a commit to golang/net that referenced this issue Jun 19, 2017
With Tom Bergan.

Updates golang/go#20704

Change-Id: Ib71202801f8c72af2f22865899c93df1f3753fdd
Reviewed-on: https://go-review.googlesource.com/46008
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tom Bergan <tombergan@google.com>
@gopherbot
Copy link

CL https://golang.org/cl/46093 mentions this issue.

c3mb0 pushed a commit to c3mb0/net that referenced this issue Apr 2, 2018
With Tom Bergan.

Updates golang/go#20704

Change-Id: Ib71202801f8c72af2f22865899c93df1f3753fdd
Reviewed-on: https://go-review.googlesource.com/46008
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Tom Bergan <tombergan@google.com>
@golang golang locked and limited conversation to collaborators Jun 19, 2018
gopherbot pushed a commit to golang/net that referenced this issue Nov 6, 2023
CL 46008 fixed golang/go#20704 by not recycling the responseWriterState
if any previous Write call failed, as there could be outstanding
goroutines referencing the responseWriterState memory.

More recently, CL 467355 fixed a variant of the same issue by not
referencing that memory after exiting the Write call. This fix
supersedes the fix in CL 46008, as it is more general and does not
require the caller to know whether any previous Write calls failed.

This CL partially reverts CL 46008 just leaving the test case to ensure
that golang/go#20704 does not regress.

Change-Id: I18ea4d27420265a94cc7af21f1dffa3f7dc3bd34
Reviewed-on: https://go-review.googlesource.com/c/net/+/534315
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Damien Neil <dneil@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Run-TryBot: Quim Muntal <quimmuntal@gmail.com>
Commit-Queue: Damien Neil <dneil@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants