Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: TestServerConnState is flaky #32329

Closed
bcmills opened this issue May 30, 2019 · 13 comments
Closed

net/http: TestServerConnState is flaky #32329

bcmills opened this issue May 30, 2019 · 13 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented May 30, 2019

dragonfly-amd64: https://build.golang.org/log/c79adae2958dc28dd62ca163328fd6dc526ea37f

2019/05/27 18:04:00 http: TLS handshake error from 127.0.0.1:53789: read tcp 127.0.0.1:59376->127.0.0.1:53789: use of closed network connection
2019/05/27 18:04:00 http: TLS handshake error from 127.0.0.1:50713: read tcp 127.0.0.1:59376->127.0.0.1:50713: use of closed network connection
2019/05/27 18:04:00 http: TLS handshake error from 127.0.0.1:56210: read tcp 127.0.0.1:59376->127.0.0.1:56210: use of closed network connection
2019/05/27 18:04:00 http: TLS handshake error from 127.0.0.1:53175: read tcp 127.0.0.1:59376->127.0.0.1:53175: use of closed network connection
2019/05/27 18:04:00 http: TLS handshake error from 127.0.0.1:59361: write tcp 127.0.0.1:59376->127.0.0.1:59361: use of closed network connection
--- FAIL: TestServerConnState (1.00s)
    serve_test.go:4256: Unexpected events.
        Got log:
        Conn 1: new active idle active 
        Conn 2: new active idle active closed 
        Conn 3: new active hijacked 
        Conn 4: new active hijacked 
        Conn 5: new closed 
        Conn 6: new active closed 
        Conn 7: new active idle closed 
        
           Want:
        Conn 1: new active idle active closed 
        Conn 2: new active idle active closed 
        Conn 3: new active hijacked 
        Conn 4: new active hijacked 
        Conn 5: new closed 
        Conn 6: new active closed 
        Conn 7: new active idle closed 
        
FAIL
FAIL	net/http	12.507s

CC @bradfitz

@bcmills bcmills added Testing An issue that has been verified to require only test changes, not just a test failure. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels May 30, 2019
@bcmills bcmills added this to the Go1.13 milestone May 30, 2019
@bradfitz
Copy link
Contributor

Maybe the test is wrong, but Dragonfly network tests in general are flaky (#29583 (comment)) so I'm not super eager to debug this until I've seen it fail on another OS.

@andybons andybons modified the milestones: Go1.13, Go1.14 Jul 8, 2019
@bcmills
Copy link
Contributor Author

bcmills commented Oct 1, 2019

until I've seen it fail on another OS.

solaris-amd64-smartosbuildlet, with the same failure mode: https://build.golang.org/log/44bf73c2404dd6260a9d545b0e5f3a291e399b00

@bcmills bcmills changed the title net/http: TestServerConnState flake on dragonfly-amd64 net/http: TestServerConnState is flaky Oct 1, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@bcmills
Copy link
Contributor Author

bcmills commented Oct 25, 2019

@hbhasker
Copy link

hbhasker commented Dec 9, 2019

just as an FYI. This also flakes under gVisor (https://github.com/google/gvisor). But the failure is pretty consistent. The states for connection 5 and 6 are swapped . This might totally be due to difference in implementation of gvisor network stack vs linux but I think given that this test is flaky on other OS as well it probably is depending on some implementation detail of linux.

I will try and debug to see if I can understand why this happens under gVisor as we do want to be linux compliant.

--- FAIL: TestServerConnState (0.52s)
serve_test.go:4250: Unexpected events.
Got log:
Conn 1: new active idle active closed
Conn 2: new active idle active closed
Conn 3: new active hijacked
Conn 4: new active hijacked
Conn 5: new active closed
Conn 6: new closed
Conn 7: new active idle closed

       Want:
    Conn 1: new active idle active closed
    Conn 2: new active idle active closed
    Conn 3: new active hijacked
    Conn 4: new active hijacked
    Conn 5: new closed
    Conn 6: new active closed
    Conn 7: new active idle closed

@bcmills
Copy link
Contributor Author

bcmills commented Dec 9, 2019

There appear to be multiple distinct failure modes, resulting in different diffs between the traces.

Also note that the crypto/tls flakes on Dragonfly seem to be confined to net.Dial, and have been worked around in the crypto/tls test (by retrying net.Dial errors that correspond to known flakes).

2019-12-06T03:14:41-5fd7581/freebsd-amd64-race
2019-11-23T00:06:55-688aa74/dragonfly-amd64
2019-11-05T00:00:43-ea0b4e7/dragonfly-amd64
2019-11-01T20:45:04-d0160bc/dragonfly-amd64
2019-10-29T20:10:15-1fb7d54/dragonfly-amd64
2019-10-29T19:58:24-cc47b0d/dragonfly-amd64-5_6
2019-10-28T17:28:12-503bccb/dragonfly-amd64
2019-10-25T23:25:31-1b79afe/dragonfly-amd64
2019-10-25T20:46:57-46e0d72/dragonfly-amd64
2019-10-24T03:34:24-a52c0a1/dragonfly-amd64-5_6
2019-10-21T23:36:02-a59808e/dragonfly-amd64-5_6
2019-10-17T19:53:52-9b80791/dragonfly-amd64
2019-10-01T07:16:47-4f13a9c/solaris-amd64-smartosbuildlet
2019-09-26T18:27:38-87e2b34/plan9-arm
2019-09-11T20:34:54-d5df4d6/dragonfly-amd64
2019-09-11T03:32:35-904f046/dragonfly-amd64
2019-09-06T12:15:16-c7dc5e9/dragonfly-amd64
2019-09-04T15:53:59-7b294cd/dragonfly-amd64
2019-09-04T15:10:03-033299f/dragonfly-amd64
2019-09-02T09:45:08-9c295bd/dragonfly-amd64
2019-08-30T17:07:53-9d480ed/dragonfly-amd64
2019-08-28T19:47:04-515bb01/dragonfly-amd64
2019-08-28T04:44:32-e7a4ab4/dragonfly-amd64
2019-08-01T15:39:45-ea8b0ac/dragonfly-amd64
2019-07-14T20:15:47-89d300b/dragonfly-amd64
2019-07-09T19:21:43-84fce98/dragonfly-amd64
2019-06-26T19:48:42-9caaac2/dragonfly-amd64
2019-06-23T22:20:39-c290cb6/dragonfly-amd64
2019-06-05T17:53:53-e9a136d/dragonfly-amd64
2019-06-04T17:35:20-fff4f59/dragonfly-amd64
2019-05-27T17:58:39-1531192/dragonfly-amd64
2019-05-17T19:17:29-5c3f3fb/dragonfly-amd64
2019-05-16T22:00:47-4e7bef8/dragonfly-amd64
2019-05-14T15:01:49-1956b28/dragonfly-amd64
2019-05-03T18:41:45-1033065/dragonfly-amd64
2019-05-02T14:47:22-f03b333/dragonfly-amd64
2019-04-30T19:22:08-c66ab9b/dragonfly-amd64
2019-04-27T18:05:25-4abf359/dragonfly-amd64

@hbhasker
Copy link

hbhasker commented Dec 9, 2019

In case of gVisor I believe I understand what's going on. What happens is connection 6 get's delivered and accepted sometimes before connection 5 (see strace logs below accept4() clearly shows that connection 6 was delivered before 5). In gVisor this can happen because each connection runs in a goroutine and the scheduling of which can sometimes cause this reordering. So even though handshake was completed for 5 before it got delivered after 6. This is what causes the reordering.

I believe this reordering is not possible on linux loopback and is probably why the test is consistent. I believe other failures might be similarly related due to a different ordering of how connections are delivered/accepted.

In case of other OSes it's possible there are slight differences in how connections are completed and delivered. An strace w/ some tcpdump traces of the tests would be useful to debug it.

A better solution to deflake this would just be to break it up into separate tests then it should not matter what order the connections are accepted.

// New->Closed
{
	c, err := net.Dial("tcp", ts.Listener.Addr().String())
	if err != nil {
		t.Fatal(err)
	}
	c.Close()
}

// New->Active->Closed
{
	c, err := net.Dial("tcp", ts.Listener.Addr().String())
	if err != nil {
		t.Fatal(err)
	}
	if _, err := io.WriteString(c, "BOGUS REQUEST\r\n\r\n"); err != nil {
		t.Fatal(err)
	}
	c.Read(make([]byte, 1)) // block until server hangs up on us
	c.Close()
}

I1209 21:51:32.865461 1 x:0] send tcp 127.0.0.1:26122 -> 127.0.0.1:27015 len:0 id:0000 flags:0x02 ( S ) seqnum: 4281293669 ack: 0 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:1110474215 TSEcr:0 SACKPermitted:true}
I1209 21:51:32.865647 1 x:0] recv tcp 127.0.0.1:26122 -> 127.0.0.1:27015 len:0 id:0000 flags:0x02 ( S ) seqnum: 4281293669 ack: 0 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:1110474215 TSEcr:0 SACKPermitted:true}
I1209 21:51:32.865666 1 x:0] [ 1687] http.test X madvise(0xc000a38000, 0x12000, 0x4) = 0x0 (61.175µs)
I1209 21:51:32.865705 1 x:0] [ 1687] http.test E futex(0xc000050f48, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0)
I1209 21:51:32.865723 1 x:0] [ 1687] http.test X futex(0xc000050f48, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0) = 0x0 (1.174µs)
I1209 21:51:32.865344 1 x:0] [ 1697] http.test E epoll_pwait(0x5, 0x7f5ad9fff738, 0x80, 0xffffffff, null, 0x3)
I1209 21:51:32.865595 1 x:0] [ 1691] http.test E futex(0xc000050f48, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, null, 0x0, 0x0)
I1209 21:51:32.865766 1 x:0] [ 1687] http.test E futex(0xdbe720, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, 0xc000660ed8 {sec=0 nsec=1440985}, 0x0, 0x0)
I1209 21:51:32.865774 1 x:0] [ 1691] http.test X futex(0xc000050f48, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, null, 0x0, 0x0) = 0x0 errno=11 (resource temporarily unavailable) (3.568µs)
I1209 21:51:32.865804 1 x:0] [ 1688] http.test E nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, 0x0)
I1209 21:51:32.865808 1 x:0] [ 1691] http.test E futex(0xc000050f48, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, null, 0x0, 0x0)
I1209 21:51:32.865744 1 x:0] send tcp 127.0.0.1:27015 -> 127.0.0.1:26122 len:0 id:0000 flags:0x12 ( S A ) seqnum: 2833956016 ack: 4281293670 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:2278112982 TSEcr:1110474215 SACKPermitted:true}
I1209 21:51:32.865855 1 x:0] recv tcp 127.0.0.1:27015 -> 127.0.0.1:26122 len:0 id:0000 flags:0x12 ( S A ) seqnum: 2833956016 ack: 4281293670 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:2278112982 TSEcr:1110474215 SACKPermitted:true}
I1209 21:51:32.865910 1 x:0] send tcp 127.0.0.1:26122 -> 127.0.0.1:27015 len:0 id:0000 flags:0x10 ( A ) seqnum: 4281293670 ack: 2833956017 win: 511 xsum:0x0 options: {TS:true TSVal:1110474215 TSEcr:2278112982 SACKBlocks:[]}
I1209 21:51:32.865920 1 x:0] [ 1688] http.test X nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, null) = 0x0 (98.306µs)
I1209 21:51:32.865952 1 x:0] [ 1688] http.test E futex(0xdba750, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, 0x7f5ae3dffd78 {sec=60 nsec=0}, 0x0, 0x0)
I1209 21:51:32.865923 1 x:0] recv tcp 127.0.0.1:26122 -> 127.0.0.1:27015 len:0 id:0000 flags:0x10 ( A ) seqnum: 4281293670 ack: 2833956017 win: 511 xsum:0x0 options: {TS:true TSVal:1110474215 TSEcr:2278112982 SACKBlocks:[]}
I1209 21:51:32.865993 1 x:0] [ 1697] http.test X epoll_pwait(0x5, 0x7f5ad9fff738, 0x80, 0xffffffff, null, 0x3) = 0x1 (237.1µs)
I1209 21:51:32.866018 1 x:0] [ 1697] http.test E futex(0xdba750, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0)
I1209 21:51:32.866029 1 x:0] [ 1697] http.test X futex(0xdba750, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0) = 0x1 (1.227µs)
I1209 21:51:32.866058 1 x:0] [ 1697] http.test E getsockopt(0x4 socket:[4540], 0x1, 0x4, 0xc0001373dc, 0xc0001373d8)
I1209 21:51:32.866072 1 x:0] [ 1697] http.test X getsockopt(0x4 socket:[4540], 0x1, 0x4, 0xc0001373dc, 0xc0001373d8) = 0x0 (4.746µs)
I1209 21:51:32.866095 1 x:0] [ 1697] http.test E getpeername(0x4 socket:[4540], 0xc000137370, 0xc00013736c {length=112})
I1209 21:51:32.866112 1 x:0] [ 1697] http.test X getpeername(0x4 socket:[4540], 0xc000137370 {Family: AF_INET, Addr: 127.0.0.1, Port: 27015}, 0xc00013736c {length=16}) = 0x0 (3.861µs)
I1209 21:51:32.866135 1 x:0] [ 1697] http.test E getsockname(0x4 socket:[4540], 0xc0001374d0, 0xc0001374cc {length=112})
I1209 21:51:32.866151 1 x:0] [ 1697] http.test X getsockname(0x4 socket:[4540], 0xc0001374d0 {Family: AF_INET, Addr: 127.0.0.1, Port: 26122}, 0xc0001374cc {length=16}) = 0x0 (3.139µs)
I1209 21:51:32.866175 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4540], 0x6, 0x1, 0xc0001376dc, 0x4)
I1209 21:51:32.866188 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4540], 0x6, 0x1, 0xc0001376dc, 0x4) = 0x0 (3.31µs)
I1209 21:51:32.866209 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4540], 0x1, 0x9, 0xc0001379c4, 0x4)
I1209 21:51:32.866221 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4540], 0x1, 0x9, 0xc0001379c4, 0x4) = 0x0 (2.931µs)
I1209 21:51:32.866241 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4540], 0x6, 0x5, 0xc0001379ac, 0x4)
I1209 21:51:32.866253 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4540], 0x6, 0x5, 0xc0001379ac, 0x4) = 0x0 (2.92µs)
I1209 21:51:32.866272 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4540], 0x6, 0x4, 0xc0001379ac, 0x4)
I1209 21:51:32.866284 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4540], 0x6, 0x4, 0xc0001379ac, 0x4) = 0x0 (2.686µs)
I1209 21:51:32.866306 1 x:0] [ 1697] http.test E epoll_ctl(0x5, 0x2, 0x4 socket:[4540], 0xc000137bcc)
I1209 21:51:32.866318 1 x:0] [ 1697] http.test X epoll_ctl(0x5, 0x2, 0x4 socket:[4540], 0xc000137bcc) = 0x0 (3.156µs)
I1209 21:51:32.866336 1 x:0] [ 1697] http.test E close(0x4 socket:[4540])
I1209 21:51:32.866357 1 x:0] send tcp 127.0.0.1:26122 -> 127.0.0.1:27015 len:0 id:0000 flags:0x11 (F A ) seqnum: 4281293670 ack: 2833956017 win: 8192 xsum:0x0 options: {TS:true TSVal:1110474216 TSEcr:2278112982 SACKBlocks:[]}
I1209 21:51:32.866368 1 x:0] recv tcp 127.0.0.1:26122 -> 127.0.0.1:27015 len:0 id:0000 flags:0x11 (F A ) seqnum: 4281293670 ack: 2833956017 win: 8192 xsum:0x0 options: {TS:true TSVal:1110474216 TSEcr:2278112982 SACKBlocks:[]}
I1209 21:51:32.866392 1 x:0] [ 1697] http.test X close(0x4 socket:[4540]) = 0x0 (44.855µs)
I1209 21:51:32.866427 1 x:0] [ 1697] http.test E socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP)
I1209 21:51:32.866454 1 x:0] [ 1697] http.test X socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 0x4 (17.305µs)
I1209 21:51:32.866475 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4541], 0x1, 0x6, 0xc000137618, 0x4)
I1209 21:51:32.866486 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4541], 0x1, 0x6, 0xc000137618, 0x4) = 0x0 (2.983µs)
I1209 21:51:32.866512 1 x:0] [ 1697] http.test E connect(0x4 socket:[4541], 0xc0000d428c {Family: AF_INET, Addr: 127.0.0.1, Port: 27015}, 0x10)
I1209 21:51:32.866533 1 x:0] [ 1697] http.test X connect(0x4 socket:[4541], 0xc0000d428c {Family: AF_INET, Addr: 127.0.0.1, Port: 27015}, 0x10) = 0x0 errno=115 (operation now in progress) (11.551µs)
I1209 21:51:32.866554 1 x:0] [ 1697] http.test E epoll_ctl(0x5, 0x1, 0x4 socket:[4541], 0xc000137334)
I1209 21:51:32.866568 1 x:0] [ 1697] http.test X epoll_ctl(0x5, 0x1, 0x4 socket:[4541], 0xc000137334) = 0x0 (4.921µs)
I1209 21:51:32.866588 1 x:0] [ 1697] http.test E epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0x0, null, 0x1)
I1209 21:51:32.866598 1 x:0] [ 1697] http.test X epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0x0, null, 0x1) = 0x0 (908ns)
I1209 21:51:32.866624 1 x:0] [ 1697] http.test E epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0xffffffff, null, 0x1)
I1209 21:51:32.866675 1 x:0] send tcp 127.0.0.1:26123 -> 127.0.0.1:27015 len:0 id:0000 flags:0x02 ( S ) seqnum: 2424997976 ack: 0 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:3538454480 TSEcr:0 SACKPermitted:true}
I1209 21:51:32.866687 1 x:0] recv tcp 127.0.0.1:26123 -> 127.0.0.1:27015 len:0 id:0000 flags:0x02 ( S ) seqnum: 2424997976 ack: 0 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:3538454480 TSEcr:0 SACKPermitted:true}
I1209 21:51:32.866754 1 x:0] send tcp 127.0.0.1:27015 -> 127.0.0.1:26123 len:0 id:0000 flags:0x12 ( S A ) seqnum: 3903942871 ack: 2424997977 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:2326455496 TSEcr:3538454480 SACKPermitted:true}
I1209 21:51:32.866783 1 x:0] recv tcp 127.0.0.1:27015 -> 127.0.0.1:26123 len:0 id:0000 flags:0x12 ( S A ) seqnum: 3903942871 ack: 2424997977 win: 65408 xsum:0x0 options: {MSS:65495 WS:7 TS:true TSVal:2326455496 TSEcr:3538454480 SACKPermitted:true}
I1209 21:51:32.866816 1 x:0] send tcp 127.0.0.1:26123 -> 127.0.0.1:27015 len:0 id:0000 flags:0x10 ( A ) seqnum: 2424997977 ack: 3903942872 win: 511 xsum:0x0 options: {TS:true TSVal:3538454480 TSEcr:2326455496 SACKBlocks:[]}
I1209 21:51:32.866826 1 x:0] recv tcp 127.0.0.1:26123 -> 127.0.0.1:27015 len:0 id:0000 flags:0x10 ( A ) seqnum: 2424997977 ack: 3903942872 win: 511 xsum:0x0 options: {TS:true TSVal:3538454480 TSEcr:2326455496 SACKBlocks:[]}
I1209 21:51:32.866847 1 x:0] [ 1697] http.test X epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0xffffffff, null, 0x1) = 0x1 (213.659µs)
I1209 21:51:32.866871 1 x:0] [ 1697] http.test E getsockopt(0x4 socket:[4541], 0x1, 0x4, 0xc0001373dc, 0xc0001373d8)
I1209 21:51:32.866885 1 x:0] [ 1697] http.test X getsockopt(0x4 socket:[4541], 0x1, 0x4, 0xc0001373dc, 0xc0001373d8) = 0x0 (4.632µs)
I1209 21:51:32.866907 1 x:0] [ 1697] http.test E getpeername(0x4 socket:[4541], 0xc000137370, 0xc00013736c {length=112})
I1209 21:51:32.866924 1 x:0] [ 1697] http.test X getpeername(0x4 socket:[4541], 0xc000137370 {Family: AF_INET, Addr: 127.0.0.1, Port: 27015}, 0xc00013736c {length=16}) = 0x0 (3.461µs)
I1209 21:51:32.866946 1 x:0] [ 1697] http.test E getsockname(0x4 socket:[4541], 0xc0001374d0, 0xc0001374cc {length=112})
I1209 21:51:32.866962 1 x:0] [ 1697] http.test X getsockname(0x4 socket:[4541], 0xc0001374d0 {Family: AF_INET, Addr: 127.0.0.1, Port: 26123}, 0xc0001374cc {length=16}) = 0x0 (3.037µs)
I1209 21:51:32.866985 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4541], 0x6, 0x1, 0xc0001376dc, 0x4)
I1209 21:51:32.866998 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4541], 0x6, 0x1, 0xc0001376dc, 0x4) = 0x0 (4.077µs)
I1209 21:51:32.867019 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4541], 0x1, 0x9, 0xc0001379c4, 0x4)
I1209 21:51:32.867030 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4541], 0x1, 0x9, 0xc0001379c4, 0x4) = 0x0 (2.839µs)
I1209 21:51:32.867050 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4541], 0x6, 0x5, 0xc0001379ac, 0x4)
I1209 21:51:32.867062 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4541], 0x6, 0x5, 0xc0001379ac, 0x4) = 0x0 (2.718µs)
I1209 21:51:32.867083 1 x:0] [ 1697] http.test E setsockopt(0x4 socket:[4541], 0x6, 0x4, 0xc0001379ac, 0x4)
I1209 21:51:32.867094 1 x:0] [ 1697] http.test X setsockopt(0x4 socket:[4541], 0x6, 0x4, 0xc0001379ac, 0x4) = 0x0 (2.536µs)
I1209 21:51:32.867120 1 x:0] [ 1697] http.test E write(0x4 socket:[4541], 0xc0000d4300 "BOGUS REQUEST\r\n\r\n", 0x11)
I1209 21:51:32.867138 1 x:0] send tcp 127.0.0.1:26123 -> 127.0.0.1:27015 len:17 id:1531 flags:0x18 ( PA ) seqnum: 2424997977 ack: 3903942872 win: 8192 xsum:0x0 options: {TS:true TSVal:3538454481 TSEcr:2326455496 SACKBlocks:[]}
I1209 21:51:32.867149 1 x:0] recv tcp 127.0.0.1:26123 -> 127.0.0.1:27015 len:17 id:1531 flags:0x18 ( PA ) seqnum: 2424997977 ack: 3903942872 win: 8192 xsum:0x0 options: {TS:true TSVal:3538454481 TSEcr:2326455496 SACKBlocks:[]}
I1209 21:51:32.867171 1 x:0] [ 1697] http.test X write(0x4 socket:[4541], ..., 0x11) = 0x11 (42.47µs)
I1209 21:51:32.867192 1 x:0] [ 1697] http.test E read(0x4 socket:[4541], 0xc000168158, 0x1)
I1209 21:51:32.867204 1 x:0] [ 1697] http.test X read(0x4 socket:[4541], 0xc000168158, 0x1) = 0x0 errno=11 (request would block) (2.817µs)
I1209 21:51:32.867224 1 x:0] [ 1697] http.test E epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0x0, null, 0x0)
I1209 21:51:32.867235 1 x:0] [ 1697] http.test X epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0x0, null, 0x0) = 0x0 (884ns)
I1209 21:51:32.867254 1 x:0] [ 1697] http.test E epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0xffffffff, null, 0x0)
I1209 21:51:32.867282 1 x:0] [ 1697] http.test X epoll_pwait(0x5, 0x7f5ad9fff748, 0x80, 0xffffffff, null, 0x0) = 0x1 (17.186µs)
I1209 21:51:32.867282 1 x:0] [ 1688] http.test X futex(0xdba750, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, 0x7f5ae3dffd78 {sec=60 nsec=0}, 0x0, 0x0) = 0x0 (1.314134ms)
I1209 21:51:32.867315 1 x:0] [ 1697] http.test E accept4(0x3 socket:[4531], 0xc000170ab0, 0xc000170a94 {length=112}, SOCK_CLOEXEC|SOCK_NONBLOCK)
I1209 21:51:32.867349 1 x:0] [ 1697] http.test X accept4(0x3 socket:[4531], 0xc000170ab0 {Family: AF_INET, Addr: 127.0.0.1, Port: 26123}, 0xc000170a94 {length=16}, SOCK_CLOEXEC|SOCK_NONBLOCK) = 0x6 (13.501µs)
I1209 21:51:32.867354 1 x:0] send tcp 127.0.0.1:27015 -> 127.0.0.1:26122 len:0 id:0000 flags:0x10 ( A ) seqnum: 2833956017 ack: 4281293671 win: 8192 xsum:0x0 options: {TS:true TSVal:2278112984 TSEcr:1110474215 SACKBlocks:[]}
I1209 21:51:32.867368 1 x:0] send tcp 127.0.0.1:27015 -> 127.0.0.1:26123 len:0 id:0000 flags:0x10 ( A ) seqnum: 3903942872 ack: 2424997994 win: 8191 xsum:0x0 options: {TS:true TSVal:2326455497 TSEcr:3538454481 SACKBlocks:[]}
I1209 21:51:32.867386 1 x:0] [ 1697] http.test E epoll_ctl(0x5, 0x1, 0x6 socket:[4542], 0xc000170b94)
I1209 21:51:32.867395 1 x:0] recv tcp 127.0.0.1:27015 -> 127.0.0.1:26123 len:0 id:0000 flags:0x10 ( A ) seqnum: 3903942872 ack: 2424997994 win: 8191 xsum:0x0 options: {TS:true TSVal:2326455497 TSEcr:3538454481 SACKBlocks:[]}
I1209 21:51:32.867410 1 x:0] [ 1697] http.test X epoll_ctl(0x5, 0x1, 0x6 socket:[4542], 0xc000170b94) = 0x0 (7.202µs)
I1209 21:51:32.867351 1 x:0] [ 1688] http.test E nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, 0x0)
I1209 21:51:32.867441 1 x:0] [ 1697] http.test E getsockname(0x6 socket:[4542], 0xc000170bd0, 0xc000170bcc {length=112})
I1209 21:51:32.867385 1 x:0] recv tcp 127.0.0.1:27015 -> 127.0.0.1:26122 len:0 id:0000 flags:0x10 ( A ) seqnum: 2833956017 ack: 4281293671 win: 8192 xsum:0x0 options: {TS:true TSVal:2278112984 TSEcr:1110474215 SACKBlocks:[]}
I1209 21:51:32.867349 1 x:0] [ 1687] http.test X futex(0xdbe720, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, 0xc000660ed8 {sec=0 nsec=1440985}, 0x0, 0x0) = 0x0 errno=110 (connection timed out) (1.562874ms)
I1209 21:51:32.867460 1 x:0] [ 1697] http.test X getsockname(0x6 socket:[4542], 0xc000170bd0 {Family: AF_INET, Addr: 127.0.0.1, Port: 27015}, 0xc000170bcc {length=16}) = 0x0 (3.256µs)
I1209 21:51:32.867547 1 x:0] [ 1687] http.test E futex(0xc000050f48, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0)
I1209 21:51:32.867565 1 x:0] [ 1687] http.test X futex(0xc000050f48, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0) = 0x1 (1.724µs)
I1209 21:51:32.867579 1 x:0] [ 1688] http.test X nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, null) = 0x0 (132.134µs)
I1209 21:51:32.867592 1 x:0] [ 1687] http.test E madvise(0xc000a36000, 0x2000, 0x4)
I1209 21:51:32.867603 1 x:0] [ 1697] http.test E setsockopt(0x6 socket:[4542], 0x6, 0x1, 0xc000170c8c, 0x4)
I1209 21:51:32.867622 1 x:0] [ 1691] http.test X futex(0xc000050f48, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, null, 0x0, 0x0) = 0x0 (1.782404ms)
I1209 21:51:32.867634 1 x:0] [ 1697] http.test X setsockopt(0x6 socket:[4542], 0x6, 0x1, 0xc000170c8c, 0x4) = 0x0 (8.721µs)
I1209 21:51:32.867640 1 x:0] [ 1688] http.test E nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, 0x0)
I1209 21:51:32.867712 1 x:0] [ 1697] http.test E setsockopt(0x6 socket:[4542], 0x1, 0x9, 0xc000170c8c, 0x4)
I1209 21:51:32.867742 1 x:0] [ 1697] http.test X setsockopt(0x6 socket:[4542], 0x1, 0x9, 0xc000170c8c, 0x4) = 0x0 (7.016µs)
I1209 21:51:32.867655 1 x:0] [ 1691] http.test E epoll_pwait(0x5, 0x7f5ae1fff748, 0x80, 0x0, null, 0x0)
I1209 21:51:32.867793 1 x:0] [ 1691] http.test X epoll_pwait(0x5, 0x7f5ae1fff748, 0x80, 0x0, null, 0x0) = 0x2 (3.236µs)
I1209 21:51:32.867796 1 x:0] [ 1697] http.test E setsockopt(0x6 socket:[4542], 0x6, 0x5, 0xc000170c74, 0x4)
I1209 21:51:32.867822 1 x:0] [ 1691] http.test E epoll_pwait(0x5, 0x7f5ae1fff748, 0x80, 0xffffffff, null, 0x3)
I1209 21:51:32.867631 1 x:0] [ 1687] http.test X madvise(0xc000a36000, 0x2000, 0x4) = 0x0 (26.832µs)
I1209 21:51:32.867844 1 x:0] [ 1688] http.test X nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, null) = 0x0 (133.88µs)
I1209 21:51:32.867824 1 x:0] [ 1697] http.test X setsockopt(0x6 socket:[4542], 0x6, 0x5, 0xc000170c74, 0x4) = 0x0 (6.93µs)
I1209 21:51:32.867870 1 x:0] [ 1688] http.test E nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, 0x0)
I1209 21:51:32.867878 1 x:0] [ 1687] http.test E futex(0xc000080bc8, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0)
I1209 21:51:32.867896 1 x:0] [ 1697] http.test E setsockopt(0x6 socket:[4542], 0x6, 0x4, 0xc000170c74, 0x4)
I1209 21:51:32.867902 1 x:0] [ 1687] http.test X futex(0xc000080bc8, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0) = 0x1 (3.413µs)
I1209 21:51:32.867916 1 x:0] [ 1697] http.test X setsockopt(0x6 socket:[4542], 0x6, 0x4, 0xc000170c74, 0x4) = 0x0 (5.755µs)
I1209 21:51:32.867953 1 x:0] [ 1687] http.test E futex(0xdbe720, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, 0xc000660ed8 {sec=0 nsec=1379809}, 0x0, 0x0)
I1209 21:51:32.867965 1 x:0] [ 1697] http.test E accept4(0x3 socket:[4531], 0xc000170ab0, 0xc000170a94 {length=112}, SOCK_CLOEXEC|SOCK_NONBLOCK)
I1209 21:51:32.868044 1 x:0] [ 1699] http.test X futex(0xc000080bc8, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, null, 0x0, 0x0) = 0x0 (2.450194ms)
I1209 21:51:32.868065 1 x:0] [ 1688] http.test X nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, null) = 0x0 (178.336µs)
I1209 21:51:32.868085 1 x:0] [ 1699] http.test E futex(0xc000050848, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0)
I1209 21:51:32.868102 1 x:0] [ 1699] http.test X futex(0xc000050848, FUTEX_WAKE|FUTEX_PRIVATE_FLAG, 0x1, null, 0x0, 0x0) = 0x1 (2.051µs)
I1209 21:51:32.868120 1 x:0] [ 1688] http.test E nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, 0x0)
I1209 21:51:32.868140 1 x:0] [ 1689] http.test X futex(0xc000050848, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, null, 0x0, 0x0) = 0x0 (2.733808ms)
I1209 21:51:32.868045 1 x:0] [ 1697] http.test X accept4(0x3 socket:[4531], 0xc000170ab0 {Family: AF_INET, Addr: 127.0.0.1, Port: 26122}, 0xc000170a94 {length=16}, SOCK_CLOEXEC|SOCK_NONBLOCK) = 0x7 (27.904µs)
I1209 21:51:32.868150 1 x:0] [ 1699] http.test E read(0x6 socket:[4542], 0xc000226000, 0x1000)
I1209 21:51:32.868191 1 x:0] [ 1697] http.test E epoll_ctl(0x5, 0x1, 0x7 socket:[4543], 0xc000170b94)
I1209 21:51:32.868196 1 x:0] [ 1689] http.test E futex(0xc000050848, FUTEX_WAIT|FUTEX_PRIVATE_FLAG, 0x0, null, 0x0, 0x0)
I1209 21:51:32.868203 1 x:0] [ 1699] http.test X read(0x6 socket:[4542], 0xc000226000 "BOGUS REQUEST\r\n\r\n", 0x1000) = 0x11 (7.087µs)
I1209 21:51:32.868247 1 x:0] [ 1688] http.test X nanosleep(0x7f5ae3dffdd8 {sec=0 nsec=20000}, null) = 0x0 (110.405µs)
I1209 21:51:32.868223 1 x:0] [ 1691] http.test X epoll_pwait(0x5, 0x7f5ae1fff748, 0x80, 0xffffffff, null, 0x3) = 0x1 (386.699µs)
I1209 21:51:32.868310 1 x:0] [ 1691] http.test E epoll_pwait(0x5, 0x7f5ae1fff748, 0x80, 0xffffffff, null, 0x3)
I1209 21:51:32.868214 1 x:0] [ 1697] http.test X epoll_ctl(0x5, 0x1, 0x7 socket:[4543], 0xc000170b94) = 0x0 (10.954µs)
I1209 21:51:32.868322 1 x:0] [ 1699] http.test E write(0x6 socket:[4542], 0xc0000fc230 "HTTP/1.1 400 Bad Request\r\nContent-Type: text/plain; charset=utf-8\r\nConnection: close\r\n\r\n400 Bad Request", 0x67)
I1209 21:51:32.868371 1 x:0] [ 1697] http.test E getsockname(0x7 socket:[4543], 0xc000170bd0, 0xc000170bcc {length=112})
I1209 21:51:32.868379 1 x:0] send tcp 127.0.0.1:27015 -> 127.0.0.1:26123 len:103 id:1532 flags:0x18 ( PA ) seqnum: 3903942872 ack: 2424997994 win: 8192 xsum:0x0 options: {TS:true TSVal:2326455498 TSEcr:3538454481 SACKBlocks:[]}
I1209 21:51:32.868395 1 x:0] [ 1697] http.test X getsockname(0x7 socket:[4543], 0xc000170bd0 {Family: AF_INET, Addr: 127.0.0.1, Port: 27015}, 0xc000170bcc {length=16}) = 0x0 (5.569µs)

@gopherbot
Copy link

Change https://golang.org/cl/210618 mentions this issue: net/http: deflake TestServerConnState

@gopherbot
Copy link

Change https://golang.org/cl/210717 mentions this issue: net/http: await state traces earlier in TestServerConnState

gopherbot pushed a commit that referenced this issue Dec 10, 2019
This approach attempts to ensure that the log for each connection is
complete before the next sequence of states begins.

Updates #32329

Change-Id: I25150d3ceab6568af56a40d2b14b5f544dc87f61
Reviewed-on: https://go-review.googlesource.com/c/go/+/210717
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@bcmills
Copy link
Contributor Author

bcmills commented Dec 10, 2019

@hbhasker, can you confirm whether this is still flaky under gVisor? (We can watch the Go builders for the other platforms.)

@hbhasker
Copy link

Will run the modified test and update.

@hbhasker
Copy link

Confirming that the test is not flaky under gVisor w/ the new changes. Thanks!

@hbhasker
Copy link

root@9fed391e8304:/go# dmesg
[ 0.000000] Starting gVisor...
[ 0.279882] Checking naughty and nice process list...
[ 0.471310] Segmenting fault lines...
[ 0.721855] Waiting for children...
[ 0.995291] Reticulating splines...
[ 1.175507] Synthesizing system calls...
[ 1.551559] Granting licence to kill(2)...
[ 1.617623] Creating bureaucratic processes...
[ 1.896969] Mounting deweydecimalfs...
[ 2.343648] Generating random numbers by fair dice roll...
[ 2.479958] Searching for needles in stacks...
[ 2.576862] Ready!
root@9fed391e8304:/go# go test net/http -run TestServerConnState -count 1000
ok net/http 49.684s
root@9fed391e8304:/go#

@bcmills bcmills added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Dec 12, 2019
@bcmills
Copy link
Contributor Author

bcmills commented Dec 12, 2019

Closing on the theory that the failures on the other platforms have a similar root cause as the one on gVisor. (We can open a new issue if it recurs.)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Testing An issue that has been verified to require only test changes, not just a test failure. WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

6 participants