
bufio: nil pointer dereference #53808

Open
szuecs opened this issue Jul 12, 2022 · 31 comments · May be fixed by #66594
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@szuecs

szuecs commented Jul 12, 2022

What version of Go are you using (go version)?

$ go version
go1.18.3

Does this issue reproduce with the latest release?

I don't know; it's a single occurrence in a system running about 500 instances every day, 24x7.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/share/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/share/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.18.3"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOWORK=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build813518630=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Ran the binary in a container; this was a single occurrence after months of running different versions of the binary (we regularly update it ~weekly).

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x52160a]
container=skipper-ingress version=v0.13.220 container_id=22847bac9b2e73e49c21b85e150d93c67919bfa4e306ba2442d9d2ee48e1d6f6 pod=skipper-ingress-7d7bc75576-5nzn9 stream=stderr component=ingress application=skipper-ingress namespace=kube-system
goroutine 31240317 [running]:
bufio.(*Writer).Available(...)
/usr/local/go/src/bufio/bufio.go:645
bufio.(*Writer).WriteString(0x0, {0x1118d77?, 0x7fdacf6cf0f8?})
/usr/local/go/src/bufio/bufio.go:735 +0x6a
net/http.(*expectContinueReader).Read(0xc009fd4560, {0xc002833980, 0x5c, 0x5c})
/usr/local/go/src/net/http/server.go:908 +0xfd
io.(*LimitedReader).Read(0xc009fbee28, {0xc002833980?, 0xc0056d3580?, 0xc005368930?})
/usr/local/go/src/io/io.go:476 +0x45
io.copyBuffer({0x129a7a0, 0xc0056d3580}, {0x1298f00, 0xc009fbee28}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:426 +0x1b2
io.Copy(...)
/usr/local/go/src/io/io.go:385
net.genericReadFrom({0x1299120?, 0xc009f91e00?}, {0x1298f00, 0xc009fbee28})
/usr/local/go/src/net/net.go:662 +0x6a
net.(*TCPConn).readFrom(0xc009f91e00, {0x1298f00, 0xc009fbee28})
/usr/local/go/src/net/tcpsock_posix.go:54 +0x78
net.(*TCPConn).ReadFrom(0xc009f91e00, {0x1298f00?, 0xc009fbee28?})
/usr/local/go/src/net/tcpsock.go:130 +0x36
io.copyBuffer({0x1299120, 0xc009f91e00}, {0x1298f00, 0xc009fbee28}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:412 +0x14b
io.Copy(...)
/usr/local/go/src/io/io.go:385
net/http.persistConnWriter.ReadFrom({0x438f3c?}, {0x1298f00?, 0xc009fbee28?})
/usr/local/go/src/net/http/transport.go:1775 +0x5b
bufio.(*Writer).ReadFrom(0xc0051a1d40, {0x1298f00, 0xc009fbee28})
/usr/local/go/src/bufio/bufio.go:768 +0x1ab
io.copyBuffer({0x1296cc0, 0xc0051a1d40}, {0x1298f00, 0xc009fbee28}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:412 +0x14b
io.Copy(...)
/usr/local/go/src/io/io.go:385
net/http.(*transferWriter).doBodyCopy(0xc000d999a0, {0x1296cc0?, 0xc0051a1d40?}, {0x1298f00?, 0xc009fbee28?})
/usr/local/go/src/net/http/transfer.go:411 +0x4d
net/http.(*transferWriter).writeBody(0xc000d999a0, {0x1296cc0, 0xc0051a1d40})
/usr/local/go/src/net/http/transfer.go:369 +0x3d5
net/http.(*Request).write(0xc009880a00, {0x1296cc0, 0xc0051a1d40}, 0x0, 0x0, 0xc009fbede0)
/usr/local/go/src/net/http/request.go:698 +0xb46
net/http.(*persistConn).writeLoop(0xc003a07e60)
/usr/local/go/src/net/http/transport.go:2395 +0x185
created by net/http.(*Transport).dialConn
/usr/local/go/src/net/http/transport.go:1751 +0x1791

What did you expect to see?

No panic within the stdlib.

What did you see instead?

A panic

@seankhliao seankhliao added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jul 12, 2022
@mdlayher mdlayher changed the title affected/package: net/io/bufio bufio: nil pointer dereference Jul 12, 2022
@mvdan
Member

mvdan commented Jul 12, 2022

It looks like your program may have a data race causing memory corruption. Please try running your program with -race: https://go.dev/doc/articles/race_detector

@szuecs
Author

szuecs commented Jul 12, 2022

@mvdan Where do you see a data race?
@mdlayher are you sure it's a nil pointer dereference in bufio?
I would assume that the address would show 0x0 for a nil pointer dereference, and the message only says "... or nil pointer dereference".

@mvdan
Member

mvdan commented Jul 12, 2022

A segmentation violation means you tried to dereference a bad pointer. Unless your code uses unsafe, that is most likely caused by memory corruption, often caused by data races.

@dominikh
Member

dominikh commented Jul 13, 2022

I feel the need to clarify @mvdan's statement. An ordinary nil pointer dereference also causes a segmentation violation, and that of course doesn't only happen due to unsafe or data races. But as @szuecs points out, in this instance, the bad pointer has address 0x18, not 0, which would indeed point towards memory corruption of some sort.

The access that panics is in

func (b *Writer) Available() int { return len(b.buf) - b.n }

and I don't see any safe way in which b could be 0x18.

Edit: both @mvdan and I need to freshen up on what nil pointer dereferences can look like. 0x18 is of course the offset of the field being accessed.
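
For illustration, here is a minimal sketch of that effect (the exact fault address depends on bufio.Writer's field layout in the Go release used):

package main

import "bufio"

func main() {
	var w *bufio.Writer // nil receiver

	// Available returns len(b.buf) - b.n. With a nil receiver, reading
	// len(b.buf) dereferences address 0 plus the offset of the slice's len
	// field, so the runtime reports a small non-zero fault address such as
	// addr=0x18 rather than 0x0, matching the signature of the panic above.
	_ = w.Available()
}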

@AlexanderYastrebov
Contributor

The cause looks to be a body read after the connection has been closed, which nils out bufw on the final flush:

c.bufw = nil

The read then fails on the attempt to write:

w.conn.bufw.WriteString("HTTP/1.1 100 Continue\r\n\r\n")

It can be reproduced via:

package main

import (
	"log"
	"net/http"
	"time"
)

func main() {
	s := &http.Server{
		Addr: ":8888",
		Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			go func() {
				time.Sleep(1000 * time.Millisecond)
				r.Body.Read(make([]byte, 1))
			}()
		}),
	}
	log.Fatal(s.ListenAndServe())
}
# trigger use of expectContinueReader
curl -v -H "Expect: 100-continue" -d"test" localhost:8888
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x4fb40a]

goroutine 5 [running]:
bufio.(*Writer).Available(...)
        /usr/lib/go-1.18/src/bufio/bufio.go:645
bufio.(*Writer).WriteString(0x0, {0x666d76?, 0x0?})
        /usr/lib/go-1.18/src/bufio/bufio.go:735 +0x6a
net/http.(*expectContinueReader).Read(0xc00006a040, {0xc00012c27b, 0x1, 0x1})
        /usr/lib/go-1.18/src/net/http/server.go:908 +0xfd
main.main.func1.1()

@szuecs
Author

szuecs commented Jul 13, 2022

@AlexanderYastrebov thanks. I also tested with 1.18.4; the same panic is triggered.

@seankhliao
Member

Does it trigger with the race detector?

@szuecs
Author

szuecs commented Jul 13, 2022

[sszuecs@sandor-lab:~/tmp/go/issue-53808-panic-expect-continue]% go run -race main.go
==================
WARNING: DATA RACE
Read at 0x00c000094a58 by goroutine 8:
  net/http.(*expectContinueReader).Read()
      /usr/share/go/src/net/http/server.go:908 +0x12e
  main.main.func1.1()
      /home/sszuecs/tmp/go/issue-53808-panic-expect-continue/main.go:15 +0x87

Previous write at 0x00c000094a58 by goroutine 7:
  net/http.(*conn).finalFlush()
      /usr/share/go/src/net/http/server.go:1698 +0xee
  net/http.(*conn).close()
      /usr/share/go/src/net/http/server.go:1704 +0x2e
  net/http.(*conn).serve.func1()
      /usr/share/go/src/net/http/server.go:1836 +0x2d1
  runtime.deferreturn()
      /usr/share/go/src/runtime/panic.go:436 +0x32
  net/http.(*Server).Serve.func3()
      /usr/share/go/src/net/http/server.go:3071 +0x58

Goroutine 8 (running) created at:
  main.main.func1()
      /home/sszuecs/tmp/go/issue-53808-panic-expect-continue/main.go:13 +0x8e
  net/http.HandlerFunc.ServeHTTP()
      /usr/share/go/src/net/http/server.go:2084 +0x4d
  net/http.serverHandler.ServeHTTP()
      /usr/share/go/src/net/http/server.go:2916 +0x896
  net/http.(*conn).serve()
      /usr/share/go/src/net/http/server.go:1966 +0xbaa
  net/http.(*Server).Serve.func3()
      /usr/share/go/src/net/http/server.go:3071 +0x58

Goroutine 7 (finished) created at:
  net/http.(*Server).Serve()
      /usr/share/go/src/net/http/server.go:3071 +0x80c
  net/http.(*Server).ListenAndServe()
      /usr/share/go/src/net/http/server.go:2968 +0xbc
  main.main()
      /home/sszuecs/tmp/go/issue-53808-panic-expect-continue/main.go:19 +0x104
==================
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x57d99c]

goroutine 18 [running]:
panic({0x7516a0, 0x997010})
        /usr/share/go/src/runtime/panic.go:941 +0x397 fp=0xc00004ce58 sp=0xc00004cd98 pc=0x4668d7
runtime.panicmem(...)
        /usr/share/go/src/runtime/panic.go:220
runtime.sigpanic()
        /usr/share/go/src/runtime/signal_unix.go:818 +0x336 fp=0xc00004cea8 sp=0xc00004ce58 pc=0x47d976
bufio.(*Writer).Available(...)
        /usr/share/go/src/bufio/bufio.go:645
bufio.(*Writer).WriteString(0x0?, {0x79490f?, 0x19?})
        /usr/share/go/src/bufio/bufio.go:735 +0x9c fp=0xc00004cf28 sp=0xc00004cea8 pc=0x57d99c
net/http.(*expectContinueReader).Read(0xc000110020, {0xc0001000af, 0x1, 0x1})
        /usr/share/go/src/net/http/server.go:908 +0x149 fp=0xc00004cfa0 sp=0xc00004cf28 pc=0x709369
main.main.func1.1()
        /home/sszuecs/tmp/go/issue-53808-panic-expect-continue/main.go:15 +0x88 fp=0xc00004cfe0 sp=0xc00004cfa0 pc=0x72c5a8
runtime.goexit()
        /usr/share/go/src/runtime/asm_amd64.s:1571 +0x1 fp=0xc00004cfe8 sp=0xc00004cfe0 pc=0x497141
created by main.main.func1
        /home/sszuecs/tmp/go/issue-53808-panic-expect-continue/main.go:13 +0x8f

yes

@seankhliao
Member

To me that example looks like #30580, which right now doesn't really look like the reported issue

@AlexanderYastrebov
Contributor

The example #53808 (comment) is clearly wrong, as it reuses the body after leaving the Handler.

The subject software (Skipper) is an HTTP proxy and passes the incoming request to the http.Transport RoundTrip.
E.g. in #29321 the problem was that the transport goroutine (in ReverseProxy) still used the request body after RoundTrip returned.
I am wondering if there is still a similar edge case in the transport.

@szuecs
Author

szuecs commented Jul 13, 2022

I can't easily reproduce the problem with a skipper build with -race. I also don't see where we would have a goroutine similar to your example, @AlexanderYastrebov, in the request/response handling.

@seankhliao seankhliao added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Jul 29, 2022
@seankhliao seankhliao added this to the Unplanned milestone Aug 20, 2022
@szuecs
Author

szuecs commented Aug 26, 2022

@seankhliao what info are you looking for?
I think it's a clear bug, as #53808 (comment) pointed out, maybe more clearly than I did.

@seankhliao
Member

#53808 (comment) pointed out, maybe more clearly than I did.

That comment (and the one above) points to data corruption, which more likely comes from your code than from anything in the standard library (or we'd see more reports of similar problems).

@szuecs
Author

szuecs commented Sep 5, 2022

@seankhliao I have to disagree when the panic traces only show stdlib code. It would be a different matter if the trigger were part of our code in the trace.

It's one of those issues you won't see unless you run hundreds of instances with high traffic. Most people will not follow up on a one-time occurrence like this.

@seankhliao seankhliao added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Dec 17, 2022
@gopherbot

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@szuecs
Author

szuecs commented Jan 23, 2023

I have no way to reproduce it, but bugs won't disappear just because a bot closes them. I don't think it's a healthy strategy to have bots closing bugs. The panic doesn't show any code related to my project, so it's unlikely to be a bug on our side.

@randall77
Contributor

I'll reopen, no problem. But I don't think we have any way to move forward with diagnosing/fixing this bug without more information, particularly a way to reproduce.

@randall77 randall77 reopened this Jan 23, 2023
@mvdan mvdan removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jan 23, 2023
@mvdan
Member

mvdan commented Jan 23, 2023

To add to what @randall77 said: if you're the only user running into a segmentation violation panic, and there are no steps to reproduce the panic, that usually means that there's a problem with the end user code, like a data race or memory corruption.

You're right that bots closing issues doesn't mean the bug disappears. But leaving the issue open without more information or investigation on your part means that no progress will be made. That's why the "waiting for info" label was added, and the issue automatically closed after some time.

@szuecs
Author

szuecs commented Jan 23, 2023

@randall77 @mvdan I agree with you, but if we find another occurrence it will be easier to correlate if this stays open. Finding closed issues is hard because you also have to filter out all the issues that were fixed.
I am fine with having a label that can be filtered from the Go project's point of view, so that this does not need to be reviewed until you/we find another, similar-looking bug that is easier to reproduce.

@randall77
Contributor

@szuecs Ok, I'm going to assign you as owner so no one has to look at this until more information arrives.

@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 28, 2023
@szuecs
Author

szuecs commented Sep 21, 2023

Found another occurrence with Go v1.21.1

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x1457a8] 
goroutine 6459559 [running]:
bufio.(*Writer).Available(...)
/usr/local/go/src/bufio/bufio.go:659
bufio.(*Writer).WriteString(0x0, {0x1305a03?, 0x40036b17d8?})
/usr/local/go/src/bufio/bufio.go:752 +0x68
net/http.(*expectContinueReader).Read(0x4002739440, {0x40044f7000, 0x7a8, 0x7a8})
/usr/local/go/src/net/http/server.go:919 +0x140
io.(*LimitedReader).Read(0x4005003878, {0x40044f7000?, 0x4002f1f170?, 0x1ccbc?})
/usr/local/go/src/io/io.go:480 +0x50
io.copyBuffer({0x157e460, 0x4002f1f170}, {0x157dfc0, 0x4005003878}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:430 +0x168
io.Copy(...)
/usr/local/go/src/io/io.go:389
net.genericReadFrom({0x157e500?, 0x40034727d8?}, {0x157dfc0, 0x4005003878})
/usr/local/go/src/net/net.go:671 +0x68
net.(*TCPConn).readFrom(0x40034727d8, {0x157dfc0, 0x4005003878})
/usr/local/go/src/net/tcpsock_posix.go:54 +0x70
net.(*TCPConn).ReadFrom(0x40034727d8, {0x157dfc0?, 0x4005003878?})
/usr/local/go/src/net/tcpsock.go:130 +0x30
io.copyBuffer({0x157e500, 0x40034727d8}, {0x157dfc0, 0x4005003878}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:416 +0x108
io.Copy(...)
/usr/local/go/src/io/io.go:389
net/http.persistConnWriter.ReadFrom({0x40036b1ab8?}, {0x157dfc0, 0x4005003878})
/usr/local/go/src/net/http/transport.go:1801 +0x5c
bufio.(*Writer).ReadFrom(0x40036fcc00, {0x157dfc0, 0x4005003878})
/usr/local/go/src/bufio/bufio.go:797 +0x148
io.copyBuffer({0x157e220, 0x40036fcc00}, {0x157dfc0, 0x4005003878}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:416 +0x108
io.Copy(...)
/usr/local/go/src/io/io.go:389
net/http.(*transferWriter).doBodyCopy(0x4000500be0, {0x157e220?, 0x40036fcc00?}, {0x157dfc0?, 0x4005003878?})
/usr/local/go/src/net/http/transfer.go:412 +0x50
net/http.(*transferWriter).writeBody(0x4000500be0, {0x157e220, 0x40036fcc00})
/usr/local/go/src/net/http/transfer.go:370 +0x344
net/http.(*Request).write(0x400134c600, {0x157e220, 0x40036fcc00}, 0x0, 0x0, 0x40036b1f50)
/usr/local/go/src/net/http/request.go:738 +0x960
net/http.(*persistConn).writeLoop(0x4004bcb560)
/usr/local/go/src/net/http/transport.go:2424 +0x110
created by net/http.(*Transport).dialConn in goroutine 6459539
/usr/local/go/src/net/http/transport.go:1777 +0x118c

@szuecs
Author

szuecs commented Mar 13, 2024

We observed this twice more (the fleet size is 700 instances, so it's a really rare edge case, especially given it has been 6 months since the last report):

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x156238]
goroutine 129605465 [running]:
bufio.(*Writer).Available(...)
/usr/local/go/src/bufio/bufio.go:656
bufio.(*Writer).WriteString(0x0, {0x13cbe35?, 0x1000180?})
/usr/local/go/src/bufio/bufio.go:749 +0x68
net/http.(*expectContinueReader).Read(0x40035498a0, {0x4005ca67e0, 0x5c, 0x5c})
/usr/local/go/src/net/http/server.go:924 +0x140
io.(*LimitedReader).Read(0x4005bb0408, {0x4005ca67e0?, 0x40005f5998?, 0x20be64?})
/usr/local/go/src/io/io.go:479 +0x50
io.copyBuffer({0xffff6be508f8, 0x400326f508}, {0x165b160, 0x4005bb0408}, {0x0, 0x0, 0x0})
/usr/local/go/src/io/io.go:429 +0x18c
io.Copy(...)
/usr/local/go/src/io/io.go:388
net/http.persistConnWriter.ReadFrom({0x40005f5a58?}, {0x165b160?, 0x4005bb0408?})
/usr/local/go/src/net/http/transport.go:1823 +0x6c
bufio.(*Writer).ReadFrom(0x4008769b40, {0x165b160, 0x4005bb0408})
/usr/local/go/src/bufio/bufio.go:794 +0x140
io.copyBuffer({0x165b460, 0x4008769b40}, {0x165b160, 0x4005bb0408}, {0x4000c68000, 0x8000, 0x8000})
/usr/local/go/src/io/io.go:415 +0x14c
io.CopyBuffer({0x165b460?, 0x4008769b40?}, {0x165b160?, 0x4005bb0408?}, {0x4000c68000?, 0x400051b008?, 0x4005bb0408?})
/usr/local/go/src/io/io.go:402 +0x38
net/http.(*transferWriter).doBodyCopy(0x4000449d60, {0x165b460, 0x4008769b40}, {0x165b160, 0x4005bb0408})
/usr/local/go/src/net/http/transfer.go:416 +0xb4
net/http.(*transferWriter).writeBody(0x4000449d60, {0x165b460, 0x4008769b40})
/usr/local/go/src/net/http/transfer.go:371 +0x308
net/http.(*Request).write(0x4004f85d40, {0x165b460, 0x4008769b40}, 0x0, 0x0, 0x40005f5f20)
/usr/local/go/src/net/http/request.go:755 +0x8f0
net/http.(*persistConn).writeLoop(0x4004f85e60)
/usr/local/go/src/net/http/transport.go:2446 +0x114
created by net/http.(*Transport).dialConn in goroutine 129605454
/usr/local/go/src/net/http/transport.go:1799 +0x1060

@randall77
Contributor

@neild
I just looked at this a bit and I think it might have something to do with 100-continue processing.
The server is trying to write 100-continue but the output buffer has already been dropped, as the connection has already been closed? I don't entirely follow the logic, but there's definitely some subtle multithreading going on here.

@AlexanderYastrebov
Contributor

AlexanderYastrebov commented Mar 13, 2024

@randall77 I also suspect that this is related to the proxy use case, where the incoming request body is sent to the backend and the backend response is sent to the server's client.

@AlexanderYastrebov
Contributor

I've found a reproducer:

package main

import (
	"bytes"
	"io"
	"log"
	"net/http"
	"strings"
	"time"
)

// Broken backend returns a response with a truncated body.
func brokenBackend() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		conn, bfw, err := w.(http.Hijacker).Hijack()
		if err != nil {
			log.Fatalf("hijack failed: %v", err)
		}
		bfw.WriteString("HTTP/1.1 200 OK\r\n")
		bfw.WriteString("Content-Length: 10\r\n\r\n")
		// body truncated
		bfw.Flush()

		conn.Close()
	})
	http.ListenAndServe("localhost:8081", nil)
}

// Proxy forwards requests to the broken backend.
func proxy() {
	tr := http.Transport{}
	p := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		pr, _ := http.NewRequestWithContext(r.Context(), r.Method, "http://localhost:8081", r.Body)
		pr.ContentLength = r.ContentLength

		resp, err := tr.RoundTrip(pr)
		if err != nil {
			w.WriteHeader(http.StatusBadGateway)
			return
		}

		w.WriteHeader(resp.StatusCode)
		io.Copy(w, resp.Body)

		resp.Body.Close()
	})
	http.ListenAndServe("localhost:8080", p)
}

func main() {
	log.Println("Starting proxy and backend")

	go brokenBackend()
	go proxy()

	// wait for the servers to start
	time.Sleep(1 * time.Second)

	log.Println("Start sending requests")

	const contentLength = 100
	body := make([]byte, contentLength)
	for {
		req, err := http.NewRequest("PUT", "http://localhost:8080", bytes.NewReader(body))
		if err != nil {
			log.Fatalf("create request failed: %v", err)
		}
		// activate use of *expectContinueReader
		req.Header.Set("Expect", "100-continue")
		// without Content-Length, the proxy will panic
		// at a different place inside of net/http.(*transferWriter).writeBody
		req.ContentLength = contentLength

		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			if strings.Contains(err.Error(), "connection refused") {
				log.Println("Done sending requests")
				break
			}
		} else {
			resp.Body.Close()
		}
	}
}
$ go version
go version go1.22.1 linux/amd64
$ go run main.go
2024/03/13 18:31:52 Starting proxy and backend
2024/03/13 18:31:53 Start sending requests
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x5d2c84]

goroutine 5476 [running]:
bufio.(*Writer).Available(...)
        /redacted/bin/go/src/bufio/bufio.go:656
bufio.(*Writer).WriteString(0x0, {0x6df733?, 0x70?})
        /redacted/bin/go/src/bufio/bufio.go:749 +0x64
net/http.(*expectContinueReader).Read(0xc0003800c0, {0xc0003922a0, 0x64, 0x64})
        /redacted/bin/go/src/net/http/server.go:924 +0xfd
io.(*LimitedReader).Read(0xc000012228, {0xc0003922a0?, 0x4161bf?, 0xc0000e0078?})
        /redacted/bin/go/src/io/io.go:479 +0x43
io.copyBuffer({0x757c20, 0xc0000e0078}, {0x756f60, 0xc000012228}, {0x0, 0x0, 0x0})
        /redacted/bin/go/src/io/io.go:429 +0x191
io.Copy(...)
        /redacted/bin/go/src/io/io.go:388
net.genericReadFrom(0xc0002b0280?, {0x756f60, 0xc000012228})
        /redacted/bin/go/src/net/net.go:689 +0x58
net.(*TCPConn).readFrom(0xc0000e0070, {0x756f60, 0xc000012228})
        /redacted/bin/go/src/net/tcpsock_posix.go:54 +0x65
net.(*TCPConn).ReadFrom(0xc0000e0070, {0x756f60?, 0xc000012228?})
        /redacted/bin/go/src/net/tcpsock.go:130 +0x30
io.copyBuffer({0x7f850dfe78b8, 0xc0000e0070}, {0x756f60, 0xc000012228}, {0x0, 0x0, 0x0})
        /redacted/bin/go/src/io/io.go:415 +0x151
io.Copy(...)
        /redacted/bin/go/src/io/io.go:388
net/http.persistConnWriter.ReadFrom({0x0?}, {0x756f60?, 0xc000012228?})
        /redacted/bin/go/src/net/http/transport.go:1824 +0x5b
bufio.(*Writer).ReadFrom(0xc00047e5c0, {0x756f60, 0xc000012228})
        /redacted/bin/go/src/bufio/bufio.go:794 +0x17c
io.copyBuffer({0x7570e0, 0xc00047e5c0}, {0x756f60, 0xc000012228}, {0xc0003aa000, 0x8000, 0x8000})
        /redacted/bin/go/src/io/io.go:415 +0x151
io.CopyBuffer({0x7570e0?, 0xc00047e5c0?}, {0x756f60?, 0xc000012228?}, {0xc0003aa000?, 0x18?, 0xc00008e808?})
        /redacted/bin/go/src/io/io.go:402 +0x36
net/http.(*transferWriter).doBodyCopy(0xc000388280, {0x7570e0, 0xc00047e5c0}, {0x756f60, 0xc000012228})
        /redacted/bin/go/src/net/http/transfer.go:416 +0xe8
net/http.(*transferWriter).writeBody(0xc000388280, {0x7570e0, 0xc00047e5c0})
        /redacted/bin/go/src/net/http/transfer.go:371 +0x370
net/http.(*Request).write(0xc000386b40, {0x7570e0, 0xc00047e5c0}, 0x0, 0xc000516c00, 0x0)
        /redacted/bin/go/src/net/http/request.go:755 +0xb0d
net/http.(*persistConn).writeLoop(0xc000386c60)
        /redacted/bin/go/src/net/http/transport.go:2447 +0x18e
created by net/http.(*Transport).dialConn in goroutine 5119
        /redacted/bin/go/src/net/http/transport.go:1800 +0x1585
exit status 2

@AlexanderYastrebov
Contributor

AlexanderYastrebov commented Mar 13, 2024

I could not reproduce this with the stdlib reverse proxy

func httputilProxy() {
	target, _ := url.Parse("http://localhost:8081")
	p := httputil.NewSingleHostReverseProxy(target)
	p.ErrorLog = log.New(io.Discard, "", 0)
	http.ListenAndServe("localhost:8080", p)
}

and it looks like that is because the proxy request body is closed before returning from the handler; see https://go-review.googlesource.com/c/go/+/333191 and #46866 (comment)

The reproducer also no longer triggers the panic after I added:

if pr.Body != nil {
	defer pr.Body.Close()
}
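
For clarity, this is roughly how that guard slots into the proxy() handler from the reproducer above (a sketch that reuses that example's imports, transport, and backend URL):

func proxy() {
	tr := http.Transport{}
	p := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		pr, _ := http.NewRequestWithContext(r.Context(), r.Method, "http://localhost:8081", r.Body)
		pr.ContentLength = r.ContentLength

		// Closing the outbound body (the wrapped inbound body) before the
		// handler returns marks the expectContinueReader as closed, so reads
		// after the handler exits report an error instead of attempting the
		// "100 Continue" write. This stops the reproducer above from
		// panicking, although a race with an in-flight RoundTrip read may
		// still be possible, as discussed below.
		if pr.Body != nil {
			defer pr.Body.Close()
		}

		resp, err := tr.RoundTrip(pr)
		if err != nil {
			w.WriteHeader(http.StatusBadGateway)
			return
		}

		w.WriteHeader(resp.StatusCode)
		io.Copy(w, resp.Body)
		resp.Body.Close()
	})
	http.ListenAndServe("localhost:8080", p)
}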

AlexanderYastrebov added a commit to zalando/skipper that referenced this issue Mar 14, 2024
A broken backend that closes the connection before sending the response body
may trigger a proxy panic if the client sends an `Expect: 100-continue` header,
see golang/go#53808

Signed-off-by: Alexander Yastrebov <alexander.yastrebov@zalando.de>
@randall77 randall77 modified the milestones: Unplanned, Go1.23 Mar 15, 2024
@AlexanderYastrebov
Contributor

AlexanderYastrebov commented Mar 18, 2024

The reproducer also no longer triggers the panic after I added pr.Body.Close()

OTOH from https://pkg.go.dev/net/http#Request

// For client requests, a nil body means the request has no
// body, such as a GET request. The HTTP Client's Transport
// is responsible for calling the Close method.

I.e. we have a valid reproducer, #53808 (comment), that causes the panic.

@neild
Contributor

neild commented Mar 20, 2024

Simpler reproduction case:

func TestServerReadAfterHandlerDone100Continue(t *testing.T) {
        run(t, testServerReadAfterHandlerDone100Continue)
}
func testServerReadAfterHandlerDone100Continue(t *testing.T, mode testMode) {
        readyc := make(chan struct{})
        cst := newClientServerTest(t, mode, HandlerFunc(func(w ResponseWriter, r *Request) {
                go func() {
                        <-readyc
                        io.ReadAll(r.Body)
                        <-readyc
                }()
        }))

        req, _ := NewRequest("GET", cst.ts.URL, strings.NewReader("body"))
        req.Header.Set("Expect", "100-continue")
        res, err := cst.c.Do(req)
        if err != nil {
                t.Fatalf("Get(%q) = %v", cst.ts.URL, err)
        }
        res.Body.Close()
        readyc <- struct{}{} // server starts reading from the request body
        readyc <- struct{}{} // server finishes reading from the request body
}

As a general statement, writing to a response body after a server handler returns can panic.

Reading from a request body when Expect: 100-continue is set can trigger a write, which can cause a panic if the read happens after the server handler has returned. (Reading from the request body after a server handler returns is unsafe for other reasons, but more on that below.)

In the reverse proxy case, the body of an inbound request is being used as the body of an outbound request, RoundTrip does not guarantee that it is done reading from a request body when it returns, and ReverseProxy doesn't take the (rather convoluted) steps necessary to verify that RoundTrip has finished with the body.

ReverseProxy does close the inbound request body before returning, with a comment stating that "in practice, it's safe to read after closing it":
https://go.googlesource.com/go/+/refs/tags/go1.22.1/src/net/http/httputil/reverseproxy.go#369

It does seem to be the case that it's safe to read from a request body after closing it, even after the server handler has returned. I believe what's happening here is a race condition where the outbound RoundTrip is reading from the body at the same time as the ReverseProxy handler is closing it.


I'm not immediately sure what the right fix is.

A fairly limited fix might be to say that closing an inbound request body and returning from a server handler should always be safe, no matter what operations are in progress on that body. This would, I think, fix ReverseProxy. I think this might just require a bit better synchronization between expectContinueReader.Read and expectContinueReader.Close.

Or perhaps we should step back a bit and say that operations on request/response bodies after a handler returns should be safe: They can return errors, but they shouldn't panic. This would be a larger change, but would make it harder for misbehaving server handlers to cause crashes. (By redefining current "misbehavior" as "perfectly safe".)

@AlexanderYastrebov
Contributor

AlexanderYastrebov commented Mar 29, 2024

I think this might just require a bit better synchronization between expectContinueReader.Read and expectContinueReader.Close.

Looks like expectContinueReader is never closed; I changed the code to:

diff --git a/src/net/http/server.go b/src/net/http/server.go
index 32b4130c22..18bd091b1e 100644
--- a/src/net/http/server.go
+++ b/src/net/http/server.go
@@ -935,8 +935,7 @@ func (ecr *expectContinueReader) Read(p []byte) (n int, err error) {
 }
 
 func (ecr *expectContinueReader) Close() error {
-       ecr.closed.Store(true)
-       return ecr.readCloser.Close()
+       panic("expectContinueReader.Close")
 }

and tests still pass.

This is because the server stores and closes a reference to the original request body only (added by https://go-review.googlesource.com/c/go/+/16572):

go/src/net/http/server.go

Lines 1673 to 1675 in bb523c9

// Close the body (regardless of w.closeAfterReply) so we can
// re-use its bufio.Reader later safely.
w.reqBody.Close()

If it closed the expectContinueReader as well, then I think the workaround in ReverseProxy (added by https://go-review.googlesource.com/c/go/+/333191) would not be needed.

The subject nil pointer panic in expectContinueReader.Read is caused by a nil w.conn.bufw:

w.conn.bufw.WriteString("HTTP/1.1 100 Continue\r\n\r\n")

which is set to nil by finalFlush:
c.bufw = nil

called either from the deferred connection c.close:

go/src/net/http/server.go

Lines 1894 to 1912 in bb523c9

	defer func() {
		if err := recover(); err != nil && err != ErrAbortHandler {
			const size = 64 << 10
			buf := make([]byte, size)
			buf = buf[:runtime.Stack(buf, false)]
			c.server.logf("http: panic serving %v: %v\n%s", c.remoteAddr, err, buf)
		}
		if inFlightResponse != nil {
			inFlightResponse.cancelCtx()
		}
		if !c.hijacked() {
			if inFlightResponse != nil {
				inFlightResponse.conn.r.abortPendingRead()
				inFlightResponse.reqBody.Close()
			}
			c.close()
			c.setState(c.rwc, StateClosed, runHooks)
		}
	}()

or from c.closeWriteAndWait, before the body is wrapped by expectContinueReader:

go/src/net/http/server.go

Lines 1982 to 1985 in bb523c9

const publicErr = "431 Request Header Fields Too Large"
fmt.Fprintf(c.rwc, "HTTP/1.1 "+publicErr+errorHeaders+publicErr)
c.closeWriteAndWait()
return

or after w.finishRequest(), which closes the original request body:

go/src/net/http/server.go

Lines 2048 to 2055 in bb523c9

		w.finishRequest()
		c.rwc.SetWriteDeadline(time.Time{})
		if !w.shouldReuseConnection() {
			if w.requestBodyLimitHit || w.closedRequestBodyEarly() {
				c.closeWriteAndWait()
			}
			return
		}

all of which happen at serve exit.

Maybe the proper fix would be to close the expectContinueReader right after w.finishRequest(), before serve exits.

To avoid double-closing the original request body, expectContinueReader could provide an unexported close/finish method and have an empty Close() (in case the handler calls req.Body.Close()); a rough sketch of this shape follows.
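
A rough sketch of that idea (hypothetical shape, not the actual net/http change):

package sketch

import (
	"io"
	"sync/atomic"
)

// Hypothetical sketch of the proposal above, not the real net/http code:
// the server would call finish() itself right after w.finishRequest(),
// while the exported Close becomes a no-op so a handler calling
// req.Body.Close() cannot double-close the underlying body.
type expectContinueReader struct {
	readCloser io.ReadCloser
	closed     atomic.Bool
}

func (ecr *expectContinueReader) finish() error {
	ecr.closed.Store(true)
	return ecr.readCloser.Close()
}

// Close is what a handler sees via req.Body.Close().
func (ecr *expectContinueReader) Close() error {
	return nil
}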

Another option is to reset canWriteContinue after w.finishRequest():

if w.canWriteContinue.Load() {
    w.writeContinueMu.Lock()
    w.canWriteContinue.Store(false)
    w.writeContinueMu.Unlock()
}

like it's done in a couple of places already (added by https://go-review.googlesource.com/c/go/+/242598 and https://go-review.googlesource.com/c/go/+/269997).

AlexanderYastrebov added a commit to AlexanderYastrebov/go that referenced this issue Mar 29, 2024
When the client supplies an "Expect: 100-continue" header,
the server wraps the request body in an expectContinueReader
that writes the 100 Continue status on the first body read.

When a handler acts as a reverse proxy and passes the incoming
request (or body) to the client (or transport), it may happen
that the request body is read after the handler exits, which may
cause a nil pointer panic on a connection write if the server has
already closed the connection.

This change disables the write of the 100 Continue status by expectContinueReader
after the handler exits and before the connection is closed.

It also fixes racy access to w.wroteContinue.

Fixes golang#53808
Updates golang#46866
@gopherbot

Change https://go.dev/cl/575196 mentions this issue: net/http: disable 100 continue status after handler finished

@AlexanderYastrebov
Contributor

AlexanderYastrebov commented Mar 29, 2024

Another option is to reset canWriteContinue after w.finishRequest()

Implemented this in https://go.dev/cl/575196
