time: TestAfterQueuing fails on windows builders: #10680
Comments
I think that with the new-style builders, the 386 and amd64 builds run on different VMs.
|
Correct. Each Windows build is isolated. |
Yes, the builders are running in separate VMs now. Thanks for correcting me, @minux and @bradfitz. I am not sure why this test fails more often in VMs. As I mentioned before, I can break this test reliably if I add this:
the output
The timers fire correctly: I verified this by adding debug prints to runtime/time.go (not shown). But when sendTime puts the timer events into the channel returned by time.After, the events are received (in await) out of order, as the output shows. I think this is just the way our scheduler behaves when there is other runnable code or, perhaps, when running in a VM. I don't know why this only happens on Windows. Perhaps it has to do with the Windows timer granularity being around 15ms. The only fix I can think of is to increase delta further, from 100ms to 200ms or even more. This makes the adjusted test pass. Looking for suggestions. Thank you. Alex |
OpenBSD is still flaky too, even after bumping retries up to 100ms (#9903, d12b532). It probably makes sense that for a second retry (i.e., third attempt) we can try an even longer delta period, like 500ms or a full second. Since the tests are only occasionally flaky right now, an extra conservative delta duration shouldn't affect our average testing times much. Overall, the test as written is just inherently flaky. There's no guarantee the An alternative non-flaky way to write the test would be to |
Thank you, @mdempsky, for the input. I agree with your conclusion that it is reasonable to increase delta to 0.5s or 1s. These failures are the exception rather than the rule, so the longer delta shouldn't increase test time much. I will see what others say and send the change afterwards. Alex |
It looks like general heavy load can throw this test off. Is there any way to get this test to run in isolation? Maybe somewhere in test/... instead of as a Test* (which gets run in parallel with lots of other stuff)? |
I really don't want package-specific tests to be placed under /test. Tests under /test are supposed to test the compiler and, to a lesser extent, the runtime.
|
Can we use testing.TestMain somehow? Alex |
The problem is that on multi-core machines, go test -short std will build and run multiple tests in parallel. It's not about contention from tests in the same package.
|
I'm not sure that's happening on our builders, but regardless the test shouldn't care. Let's focus on making the test robust and not running it in isolation. |
CL https://golang.org/cl/9795 mentions this issue. |
TestAfterQueuing has been failing on the Windows builders recently:
http://build.golang.org/log/b9cb3c3357a4a8d874bf0669d6ee52f39f0be277
http://build.golang.org/log/62b0ebbe08b4b4b099b301a426e793286c2ed155
http://build.golang.org/log/bb3bbce19d8e09e3377629c491ca584faecc8ed0
with
The test does not fail for me here. But I suspect the builders are CPU-bound while building Go and running tests, since we are running both windows-386 and windows-amd64 in parallel. I added code similar to cpuHogger from runtime/pprof to TestAfterQueuing, and with it I can make TestAfterQueuing fail in a VM here (but not on real hardware). I suggest we increase delta to 200 * Millisecond (it is currently 100 * Millisecond). That fixes my failure, and it might make the builders more reliable. How does that sound?
Alex