Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: error message in case of bad certificate leaks implementation details in Go 1.13 #34066

Open
cuonglm opened this issue Sep 4, 2019 · 12 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@cuonglm
Copy link
Member

cuonglm commented Sep 4, 2019

What version of Go are you using (go version)?

$ go version
go version go1.13 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/cuonglm/Library/Caches/go-build"
GOENV="/Users/cuonglm/Library/Application Support/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/cuonglm/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/Users/cuonglm/sdk/go1.13"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/Users/cuonglm/sdk/go1.13/pkg/tool/darwin_amd64"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/y4/hs76ltbn7sb66lw_6934kq4m0000gn/T/go-build839999394=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

$ export GO111MODULE=off
$ go1.13 get -d github.com/loadimpact/k6
$ cd $GOPATH/src/github.com/loadimpact/k6/js
$ go1.13 test -count=1 -race -run=TestVUIntegrationClientCerts/Unauthenticated
--- FAIL: TestVUIntegrationClientCerts (5.16s)
    --- FAIL: TestVUIntegrationClientCerts/Unauthenticated (0.56s)
        --- FAIL: TestVUIntegrationClientCerts/Unauthenticated/Source (0.11s)
        Error Trace:    runner_test.go:1168
        Error:      	Error message not equal:
                    	expected: "GoError: Get https://127.0.0.1:51407: remote error: tls: bad certificate"
                    	actual  : "GoError: Get https://127.0.0.1:51407: readLoopPeekFailLocked: remote error: tls: bad certificate"
        Test:       	TestVUIntegrationClientCerts/Unauthenticated/Source
        --- FAIL: TestVUIntegrationClientCerts/Unauthenticated/Archive (0.11s)
        Error Trace:    runner_test.go:1168
        Error:      	Error message not equal:
                    	expected: "GoError: Get https://127.0.0.1:51407: remote error: tls: bad certificate"
                    	actual  : "GoError: Get https://127.0.0.1:51407: readLoopPeekFailLocked: remote error: tls: bad certificate"
        Test:       	TestVUIntegrationClientCerts/Unauthenticated/Archive
FAIL
exit status 1
FAIL	github.com/loadimpact/k6/js	5.344s

What did you expect to see?

Test passed.

What did you see instead?

Test failed.

Note

  • The test will pass with go1.11 to go1.12.9 or go1.13 without -race.
  • The problem also occurs in linux/amd64.
@mvdan
Copy link
Member

mvdan commented Sep 4, 2019

If you write tests that compare error string values for exact matches, don't you already run into failures every other Go release?

@cuonglm
Copy link
Member Author

cuonglm commented Sep 4, 2019

@mvdan No, it works consistently at least from go1.11.x. Also, go1.13 does not have the issue without -race, the problem happens when run test under -race.

I chased down the code and point to https://github.com/golang/go/blob/release-branch.go1.13/src/net/http/transport.go#L2060

It seems that readLoopPeekFailLocked: was appended because peekError == io.EOF is false.

@cuonglm
Copy link
Member Author

cuonglm commented Sep 4, 2019

@mvdan those lines are there since 7fa9846, so it's high chance that something changes in go1.13 that break the test.

@bcmills
Copy link
Contributor

bcmills commented Sep 4, 2019

The line in question hasn't changed in four years, but this may be related to CL 179457 (#32310), which changed some nearby code.

@bcmills
Copy link
Contributor

bcmills commented Sep 4, 2019

I tend to agree with @mvdan that tests should not rely on the error strings.

On the other hand, the fact that the string now contains readLoopPeekFailLocked (an implementation detail of the net/http package!) seems like a broken abstraction boundary, at the very least.

@bcmills
Copy link
Contributor

bcmills commented Sep 4, 2019

CC @bradfitz @dmitshur @ianlancetaylor, plus @FiloSottile for TLS.

@bcmills bcmills changed the title go1.13 break k6 test net/http: error message in case of bad certificate leaks implementation details in Go 1.13 Sep 4, 2019
@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 4, 2019
@bcmills bcmills added this to the Go1.14 milestone Sep 4, 2019
@FiloSottile
Copy link
Contributor

I think what changed is that the TLS error used to be returned by conn.Handshake() but is now returned by a conn.Read() because of the structure of TLS 1.3. See the last paragraph of https://golang.org/doc/go1.12#tls_1_3. Not sure why -race makes a difference though, maybe scheduler timing?

This is not a regression because we don't guarantee error messages across versions, but we should figure out the -race difference and make the error prettier.

@cuonglm
Copy link
Member Author

cuonglm commented Sep 4, 2019

@FiloSottile From I observed, the error seems to be the same, but the condition peekErr == io.EOF false causing readLoopPeekFailLocked: was appended to current error https://github.com/golang/go/blob/release-branch.go1.13/src/net/http/transport.go#L2060

	if peekErr == io.EOF {
		// common case.
		pc.closeLocked(errServerClosedIdle)
	} else {
		pc.closeLocked(fmt.Errorf("readLoopPeekFailLocked: %v", peekErr))
	}

@ianlancetaylor
Copy link
Contributor

I agree that an error message reported back to the user for a failure like this shouldn't say readLoopPeekFailLocked. @bradfitz this was added in https://golang.org/cl/18282 back in 2016. Can we just remove it?

I also agree that this test should be fixed to not expect a specific error message. The Go 1 compatibility guarantee does not extend to the exact text of error messages. Especially not from the net/http package.

@dmitshur
Copy link
Contributor

dmitshur commented Sep 4, 2019

In that CL, the new errServerClosedIdle variable was documented with this comment:

// error values for debugging and testing, not seen by users.

If errServerClosedIdle wasn't meant to be seen by users, then it's likely fmt.Errorf("readLoopPeekFailLocked: %v", peekErr) wasn't intended to be seen by users either.

@cuonglm
Copy link
Member Author

cuonglm commented Sep 10, 2019

I agree that an error message reported back to the user for a failure like this shouldn't say readLoopPeekFailLocked. @bradfitz this was added in https://golang.org/cl/18282 back in 2016. Can we just remove it?

I also agree that this test should be fixed to not expect a specific error message. The Go 1 compatibility guarantee does not extend to the exact text of error messages. Especially not from the net/http package.

I fixed the test already.

I just want to insist that something internal was changed, because the test is still passed without -race.

cuonglm added a commit to grafana/k6 that referenced this issue Sep 13, 2019
cuonglm added a commit to grafana/k6 that referenced this issue Sep 13, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
srguglielmo pushed a commit to srguglielmo/k6 that referenced this issue Nov 3, 2019
@cuonglm
Copy link
Member Author

cuonglm commented Nov 17, 2019

I'm trying digging more into this problem. As I can see, with -race, this condition:

if pc.numExpectedResponses == 0 {

becomes true. The only place that increase numExpectedResponse is:

pc.numExpectedResponses++

Without race, roundTrip always happens before readLoop. What's the right logic for numExpectedResponse? Must it be increased before readLoop run?

cc @bradfitz @bcmills

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants