cmd/go/internal/test: test concurrency limited by sequential printing #61233

howardjohn · 2023-07-07T20:05:14Z

What version of Go are you using (`go version`)?

$ go version
go version go1.20 linux/amd64

Does this issue reproduce with the latest release?

Yes

What did you do?

Ran go test -debug-trace=trace -debug-actiongraph=actiongraph ./...

What did you expect to see?

Tests are run with maximum concurrency

What did you see instead?

test run tasks are started but not doing any work, blocking other work from progressing.

Example simple reproducer:

for i in a b c d; do
	mkdir -p $i
	cat <<EOF > "$i/${i}_test.go"
package $i

import (
	"testing"
	"time"
)

func TestSleep(t *testing.T) {
	if "$i" == "b" {
		time.Sleep(time.Second)
	}
}
EOF
done
for i in e f g; do
	mkdir -p $i
	cat <<EOF > "$i/${i}.go"
package $i

EOF
done
GOMAXPROCS=2 go test -debug-trace=trace ./...

Basically this just makes a few packages, one with a slow test, and a few empty packages.

When we run this, I would expect to have one worker blocked on the b test (sleep 1), while the rest execute in the other worker and quickly complete.

Instead, we see this:

This looks odd - the e package doesn't have any tests, so why is test run e taking 1.3s?

The reason seems to be related to https://go-review.googlesource.com/c/go/+/448357 (although I am not certain it worked different before that CL). One the runTestActor starts, it is immediately blocked on <-r.prev. This, in turn, means it is blocked until the previous test (d) finishes.

Ultimately, this blocks an entire worker doing nothing.

A real world example of this can be found here (warning: 170mb). You can see it occurring in a few places, most prominently around 300s we have almost all 16 workers blocked, most/all on "no test files" packages.

I think based on the way actions are pushed into the ready queue, any action that has zero remaining dependencies may be pushed onto the queue. In some of my testing, I saw the occasionally some of the very last packages in the test ordering would get scheduled first - blocking an entire worker for almost the entire process.

I believe this new logic is only need for -json mode. One option is to just enable it only if -json is enabled.

I tested this locally against the same repo as the "real world example" above.
With Go 1.20, the tests took 70s to execute. With the patch, they took 30s.

With the same patch, the simple reproducer looks as expected:

Before trace After trace (warning: 150mb)

Another approach is to add each test run action as a dependency for the previous, to serialize the ordering. However, this makes things far worse, as this is serializing things much stricter -- the current serialization just ensures the starts are serialized, while this approach would make sure the entire test execution is as well - not what we want.

In my testing this made things worse.

A similar approach may be to make another action type test start, but I didn't test this out

The text was updated successfully, but these errors were encountered:

howardjohn · 2023-07-07T20:23:35Z

I opened https://go-review.git.corp.google.com/c/go/+/508515 with approach 1

gopherbot · 2023-07-07T20:43:47Z

Change https://go.dev/cl/508515 mentions this issue: cmd/go: only serialize test start events in -json mode

howardjohn · 2023-09-01T15:35:21Z

I know this isn't the most productive comment but its a bit frustrating to have found a regression causing tests to take 2x as long, and submit a fix, and not have any comment at all within 2 months.

Is there anyway to move this forward?

ianlancetaylor · 2023-09-01T15:41:39Z

CC @bcmills @matloob

bcmills · 2023-09-01T16:15:59Z

Sorry for the delay — I think I had mentally lumped this in with #60203, although it's clearly a separate issue. 😅

I've replied on the CL.

evanj · 2023-10-02T17:18:00Z

In case it is helpful: I think we have run into this with the tests for https://github.com/DataDog/datadog-agent (thanks to @hush-hush for debugging this). Going from Go 1.19 to Go 1.20 makes the tests about 20-30% slower (an extra ~17 seconds for a 60 second test run). We can help verify a fix once one is ready, although it looks like the conversation on https://go.dev/cl/508515 has stalled (last comment Sept 6th).

howardjohn · 2024-09-12T20:11:05Z

I'd love to pick this up again if there is some guidance on what would be an acceptable solution.

matloob · 2024-09-27T18:19:22Z

I don't have a lot of context on this but I think bcmills's suggestion to log an event that contains the information about the ordering of the test packages sounds reasonable? @aclements , what do you think?

jsm · 2025-02-01T02:38:31Z

We've also run into this issue. Running a large set of test packages slowed down about 6x due to this, from 15 minutes to 90 minutes.

bcmills added ToolSpeed NeedsInvestigation GoCommand labels Jul 10, 2023

bcmills added this to the Backlog milestone Jul 10, 2023

bcmills changed the title ~~internal/test: test concurrency limited by sequential printing~~ cmd/go/internal/test: test concurrency limited by sequential printing Jul 10, 2023

howardjohn mentioned this issue Jul 12, 2023

proposal: cmd/go: add an option to sequentially run tests #61318

Open

howardjohn mentioned this issue Aug 15, 2023

New formatter idea: worker-oriented view gotestyourself/gotestsum#355

Open

jsm mentioned this issue Mar 31, 2025

proposal: testing: prevent worker starvation via ordered scheduling using barrier tasks #73106

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/go/internal/test: test concurrency limited by sequential printing #61233

cmd/go/internal/test: test concurrency limited by sequential printing #61233

howardjohn commented Jul 7, 2023 •

edited by gabyhelp

Loading

howardjohn commented Jul 7, 2023

gopherbot commented Jul 7, 2023

howardjohn commented Sep 1, 2023

ianlancetaylor commented Sep 1, 2023

bcmills commented Sep 1, 2023

evanj commented Oct 2, 2023

howardjohn commented Sep 12, 2024

matloob commented Sep 27, 2024

jsm commented Feb 1, 2025

cmd/go/internal/test: test concurrency limited by sequential printing #61233

cmd/go/internal/test: test concurrency limited by sequential printing #61233

Comments

howardjohn commented Jul 7, 2023 • edited by gabyhelp Loading

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What did you do?

What did you expect to see?

What did you see instead?

howardjohn commented Jul 7, 2023

gopherbot commented Jul 7, 2023

howardjohn commented Sep 1, 2023

ianlancetaylor commented Sep 1, 2023

bcmills commented Sep 1, 2023

evanj commented Oct 2, 2023

howardjohn commented Sep 12, 2024

matloob commented Sep 27, 2024

jsm commented Feb 1, 2025

howardjohn commented Jul 7, 2023 •

edited by gabyhelp

Loading

What version of Go are you using (`go version`)?