x/build/kubernetes/gke: investigate if viable to get 4 tests to run on builders again #28543

Open
dmitshur opened this issue Nov 2, 2018 · 7 comments
Labels
Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@dmitshur
Contributor

dmitshur commented Nov 2, 2018

What version of Go are you using (go version)?

$ go version
go version go1.11.1 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/dmitshur/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/dmitshur/go"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build290116553=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I ran go test golang.org/x/build/... on GCE.

Note that this doesn't happen in other environments, because those tests are skipped when not running on GCE. For example, here's the verbose test output on my Mac:

$ go test -v golang.org/x/build/kubernetes/gke
=== RUN   TestNewClient
--- SKIP: TestNewClient (0.05s)
    gke_test.go:119: not on GCE; skipping
=== RUN   TestDialPod
--- SKIP: TestDialPod (0.00s)
    gke_test.go:119: not on GCE; skipping
=== RUN   TestDialService
--- SKIP: TestDialService (0.00s)
    gke_test.go:119: not on GCE; skipping
=== RUN   TestGetNodes
--- SKIP: TestGetNodes (0.00s)
    gke_test.go:119: not on GCE; skipping
PASS
ok  	golang.org/x/build/kubernetes/gke	0.103s

What did you expect to see?

All tests to pass.

What did you see instead?

--- FAIL: TestNewClient (0.67s)
    gke_test.go:154: x509 client key pair could not be generated: tls: failed to find any PEM data in certificate input
--- FAIL: TestDialPod (0.60s)
    gke_test.go:154: x509 client key pair could not be generated: tls: failed to find any PEM data in certificate input
--- FAIL: TestDialService (0.60s)
    gke_test.go:154: x509 client key pair could not be generated: tls: failed to find any PEM data in certificate input
--- FAIL: TestGetNodes (0.59s)
    gke_test.go:154: x509 client key pair could not be generated: tls: failed to find any PEM data in certificate input
FAIL
FAIL    golang.org/x/build/kubernetes/gke       2.519s
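
The tail of that message matches the error crypto/tls's X509KeyPair returns when the certificate input contains no PEM blocks. A minimal sketch that reproduces the same text, assuming (this is not confirmed by the report) that the client certificate data handed to the test was empty; clientKeyPair is a hypothetical helper, not the gke package's code:

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// clientKeyPair is a hypothetical helper for illustration only.
// It wraps tls.X509KeyPair with the same prefix seen in the failure log.
func clientKeyPair(certPEM, keyPEM []byte) (tls.Certificate, error) {
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return tls.Certificate{}, fmt.Errorf("x509 client key pair could not be generated: %v", err)
	}
	return cert, nil
}

func main() {
	// Assumption: the certificate and key inputs are empty,
	// so there is no PEM block for X509KeyPair to decode.
	_, err := clientKeyPair(nil, nil)
	fmt.Println(err)
}
```

If the assumption holds, the failure says more about what credentials the environment provided than about the Go version in use.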
@dmitshur dmitshur added Builders x/build issues (builders, bots, dashboards) NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Nov 2, 2018
@dmitshur dmitshur added this to the Unreleased milestone Nov 2, 2018
@dmitshur
Contributor Author

dmitshur commented Dec 18, 2019

These tests no longer fail on GCE (with a newer version of Go):

$ go version
go version go1.13.5 linux/amd64
$ go test -v golang.org/x/build/kubernetes/gke
=== RUN   TestNewClient
--- PASS: TestNewClient (0.70s)
=== RUN   TestDialPod
--- PASS: TestDialPod (0.64s)
    gke_test.go:61: Dialed "go"/"coordinator-deployment-64bd4ccd94-vt6vs"/80.
=== RUN   TestDialService
--- PASS: TestDialService (0.53s)
    gke_test.go:104: Dialed cluster "buildlets" service "kubernetes".
=== RUN   TestGetNodes
--- PASS: TestGetNodes (0.56s)
    gke_test.go:175: 2 nodes in cluster buildlets
PASS
ok  	golang.org/x/build/kubernetes/gke	2.504s

I wonder what changed. The only recent change in the kubernetes/gke directory was golang/build@366373d, which is not relevant (these tests are not being skipped).

We can close this issue after confirming that the fix is intentional.

@dmitshur
Contributor Author

> These tests no longer fail on GCE (with a newer version of Go)

I tested with the same Go 1.11.1 version as in the original report, and the tests now pass with that version of Go too:

$ go1.11.1 test -v golang.org/x/build/kubernetes/gke
=== RUN   TestNewClient
--- PASS: TestNewClient (0.62s)
=== RUN   TestDialPod
--- PASS: TestDialPod (0.57s)
    gke_test.go:61: Dialed "go"/"coordinator-deployment-864c84478b-2x4r7"/80.
=== RUN   TestDialService
--- PASS: TestDialService (0.57s)
    gke_test.go:104: Dialed cluster "buildlets" service "kubernetes".
=== RUN   TestGetNodes
--- PASS: TestGetNodes (0.58s)
    gke_test.go:175: 2 nodes in cluster buildlets
PASS
ok  	golang.org/x/build/kubernetes/gke	2.393s

@dmitshur
Contributor Author

dmitshur commented Dec 18, 2019

Perhaps this change is related to us starting to use a dedicated service account on GCE (CL 210958 plus other changes that happened as part of that). /cc @toothrot @bradfitz

@dmitshur
Contributor Author

The test is failing on the linux-amd64-longtest builder, as can be seen on the build dashboard:

--- FAIL: TestNewClient (0.78s)
    gke_test.go:148: googleapi: Error 403: Required "container.clusters.list" permission(s) for "projects/symbolic-datum-552"., forbidden
--- FAIL: TestDialPod (0.60s)
    gke_test.go:148: googleapi: Error 403: Required "container.clusters.list" permission(s) for "projects/symbolic-datum-552"., forbidden
--- FAIL: TestDialService (0.60s)
    gke_test.go:148: googleapi: Error 403: Required "container.clusters.list" permission(s) for "projects/symbolic-datum-552"., forbidden
--- FAIL: TestGetNodes (0.60s)
    gke_test.go:148: googleapi: Error 403: Required "container.clusters.list" permission(s) for "projects/symbolic-datum-552"., forbidden
FAIL
FAIL	golang.org/x/build/kubernetes/gke	2.586s

(Source: https://build.golang.org/log/940578001742621d3e9378b5e5a2a6bd95e61fc1)

I'm not familiar with what the test is trying to do and the relevant constraints, but it seems like it currently requires the "container.clusters.list" permission in the local environment, and that's not a portable testing strategy.

@bradfitz
Contributor

Yeah, I think this broke when the service account lost some permissions.

I'd just put this behind an extra -manual-tests flag for now.
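
A rough sketch of what such an opt-in gate could look like. The flag name comes from the suggestion above; the skipReason helper, its parameters, and the wiring are assumptions for illustration, not the code that was eventually submitted:

```go
package main

import (
	"flag"
	"fmt"
)

// manualTests mirrors the suggested -manual-tests flag (hypothetical wiring).
var manualTests = flag.Bool("manual-tests", false, "run tests that need real GKE credentials and a cluster")

// skipReason returns a non-empty reason when a GKE test should be skipped.
// manual reports whether the opt-in flag was set; onGCE reports whether
// the process is running on GCE (how that is detected is left out here).
func skipReason(manual, onGCE bool) string {
	switch {
	case !manual:
		return "skipping; pass -manual-tests to run tests that need GKE permissions"
	case !onGCE:
		return "not on GCE; skipping"
	}
	return ""
}

func main() {
	flag.Parse()
	// In a _test.go file the equivalent would be:
	//   if r := skipReason(*manualTests, onGCE); r != "" { t.Skip(r) }
	if r := skipReason(*manualTests, false); r != "" {
		fmt.Println(r)
	}
}
```

With a gate like this, the builders skip the four tests by default, while someone with the right permissions can still run them by hand by passing the flag.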

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Dec 19, 2019
@dmitshur dmitshur changed the title x/build/kubernetes/gke: 4 test failures x/build/kubernetes/gke: 4 tests that require container.clusters.list permission should be made opt-in Dec 19, 2019
@gopherbot

Change https://golang.org/cl/215299 mentions this issue: kubernetes/gke: skip 4 tests on builders

@dmitshur dmitshur changed the title x/build/kubernetes/gke: 4 tests that require container.clusters.list permission should be made opt-in x/build/kubernetes/gke: investigate if viable to get 4 tests to run on builders again Jan 17, 2020
@dmitshur dmitshur added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed NeedsFix The path to resolution is known, but the work has not been done. labels Jan 17, 2020
gopherbot pushed a commit to golang/build that referenced this issue Jan 17, 2020
The current GKE tests require to be run on GCE and with Application
Default Credentials that have at least the container.clusters.list
permission, at least one GKE cluster, and possibly more.

The builders that run these tests don't have sufficient permissions,
which means these tests never pass. Skip these tests, they can be
re-enabled if/when we decide they're worth running automatically and
make it possible to do so. Until then, they can be run manually.

Updates golang/go#28543
Updates golang/go#11811

Change-Id: Ib76f9d4f93ece2b922c099a21dec4ceefb45b546
Reviewed-on: https://go-review.googlesource.com/c/build/+/215299
Reviewed-by: Alexander Rakoczy <alex@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Alexander Rakoczy <alex@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@gopherbot

Change https://go.dev/cl/408956 mentions this issue: kubernetes/gke: skip tests unless specifically requested

gopherbot pushed a commit to golang/build that referenced this issue May 27, 2022
Just running on GCE is not enough for them to work.

Updates golang/go#28543.

Change-Id: I79a0702f9c2dfaf256a872557836258ba2ab4d0d
Reviewed-on: https://go-review.googlesource.com/c/build/+/408956
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Run-TryBot: Heschi Kreinick <heschi@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>