Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net, net/http: unpredictable slowness of TestSendfile and TestOnlyWriteTimeout w/ squeezed TCP windows #19032

Open
mikioh opened this issue Feb 11, 2017 · 2 comments
Milestone

Comments

@mikioh
Copy link
Contributor

mikioh commented Feb 11, 2017

This is probably not an issue for the standard library; for the runtime package which retrieves data readable/writable notifications from the kernel, for the net package which invokes read/write system calls, and for the net/http package which adds, drops and manages in-flight connections and IO calls for customer traffic. This happens, as fa as I can observe, on any combination of FreeBSD 10 through 12 and go1.7 through tip.

When we run a script like the following:

#!/usr/bin/env bash

function die() {
        rc=$?
        sudo sysctl -w net.inet.tcp.recvspace=65536
        sudo sysctl -w net.inet.tcp.sendspace=32768
        exit $rc
}

for (( wnd=65536; wnd >= 1024; wnd /= 2 )); do
        sudo sysctl -w net.inet.tcp.recvspace=$wnd
        sudo sysctl -w net.inet.tcp.sendspace=$wnd
        go test -v -run TestSendfile net || die
#        go test -v -run TestOnlyWriteTimeout net/http || die
done

die

TestSendfile takes a bit long time with some window size.

net.inet.tcp.recvspace: 65536 -> 65536
net.inet.tcp.sendspace: 32768 -> 65536
=== RUN   TestSendfile
--- PASS: TestSendfile (0.11s)
net.inet.tcp.recvspace: 65536 -> 32768
net.inet.tcp.sendspace: 65536 -> 32768
=== RUN   TestSendfile
--- PASS: TestSendfile (15.24s)
net.inet.tcp.recvspace: 32768 -> 16384
net.inet.tcp.sendspace: 32768 -> 16384
=== RUN   TestSendfile
--- PASS: TestSendfile (0.21s)
net.inet.tcp.recvspace: 16384 -> 8192
net.inet.tcp.sendspace: 16384 -> 8192
=== RUN   TestSendfile
--- PASS: TestSendfile (83.76s)
net.inet.tcp.recvspace: 8192 -> 4096
net.inet.tcp.sendspace: 8192 -> 4096
=== RUN   TestSendfile
--- PASS: TestSendfile (14.93s)
net.inet.tcp.recvspace: 4096 -> 2048
net.inet.tcp.sendspace: 4096 -> 2048
=== RUN   TestSendfile
--- PASS: TestSendfile (25.12s)
net.inet.tcp.recvspace: 2048 -> 1024
net.inet.tcp.sendspace: 2048 -> 1024
=== RUN   TestSendfile
--- PASS: TestSendfile (45.75s)

With a quick glance at system call trace, it looks like the kevent system call sometimes takes a long breath.

Not sure the reason; sb{r,w}wakeup miss some important condition? tcp_input miscalculates some threshold for sb{r,w}wakeup? tcp over ip over loopback interface does some special short-circuit/fuse stuff? or runtime-integrated network poller is not invoked appropriately? (the last one is unlikely)

Note that I ran the above test under some VM stuff on my '09 laptop and I guess that test results may vary on circumstances.

@mikioh
Copy link
Contributor Author

mikioh commented Feb 11, 2017

One thing to be added; with the script TestOnlyWriteTimeout crashes like the following:

httptest.Server blocked in Close after 5 seconds, waiting for connections:
  *net.TCPConn 0xc42000e0e8 127.0.0.1:54827 in state active
http: panic serving 127.0.0.1:54827: runtime error: invalid memory address or nil pointer dereference
goroutine 12 [running]:
net/http.(*conn).serve.func1(0xc420019b80)
	/go/src/net/http/server.go:1721 +0xd0
panic(0x866020, 0xb08d60)
	/go/src/runtime/panic.go:489 +0x26e
net/http_test.TestOnlyWriteTimeout.func1(0xae37a0, 0xc4200880e0, 0xc420102000)
	/go/src/net/http/serve_test.go:603 +0x18b
net/http.HandlerFunc.ServeHTTP(0xc4200d5280, 0xae37a0, 0xc4200880e0, 0xc420102000)
	/go/src/net/http/server.go:1942 +0x44
net/http.serverHandler.ServeHTTP(0xc420072420, 0xae37a0, 0xc4200880e0, 0xc420102000)
	/go/src/net/http/server.go:2572 +0x92
net/http.(*conn).serve(0xc420019b80, 0xae4060, 0xc42004b640)
	/go/src/net/http/server.go:1825 +0x6b7
created by net/http.(*Server).Serve
	/go/src/net/http/server.go:2672 +0x2ce
--- FAIL: TestOnlyWriteTimeout (15.54s)
	serve_test.go:631: timeout waiting for Get error
FAIL
exit status 1
FAIL	net/http	15.564s

@gopherbot
Copy link

CL https://golang.org/cl/36735 mentions this issue.

@bradfitz bradfitz added this to the Unplanned milestone Mar 21, 2017
gopherbot pushed a commit that referenced this issue Mar 22, 2017
TestOnlyWriteTimeout assumes wrongly that:
- the Accept method of trackLastConnListener is called only once
- the shared variable conn never becomes nil
and crashes on some circumstances.

Updates #19032.

Change-Id: I61de22618cd90b84a2b6401afdb6e5d9b3336b12
Reviewed-on: https://go-review.googlesource.com/36735
Run-TryBot: Mikio Hara <mikioh.mikioh@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants