build: misc/cgo/testcshared failure #23784
Comments
That test runs these commands from the misc/cgo/testcshared directory:
(Running those commands will leave the files
Can you confirm that running those commands by themselves demonstrates the same failure?
It's possible that the test will run the wrong version of
This version is taken from
Yes, it gives:
See also
Thanks for the strace output. That was going to be my next question. But what we need is actually the output from
@ianlancetaylor, oh, you're right, here it is:
Thanks. That is strange. I see this:
and then the call to
What could possibly cause
No, there are no FUSE file systems mounted inside my home directory (
I've also noticed that increasing the number of iterations in
Do you have an unusually slow system? 1000 iterations of a sleep for 1000000 nanoseconds is a full second. How could it possibly take a full second for
I have 32×2.2GHz CPU cores with small
I'm confused because now I can't reproduce the bug even with 1000 iterations. However, I reproduced it with 300 iterations, here is the
Well, I have reproduced it with 1000 iterations again after 340 more runs of
Thanks. That output does clearly show that
By comparison, on my laptop, I see
so the
I'm grasping at straws, but can you try adding sched_yield(); before the call to
Since it's very difficult to reproduce the bug now (for unknown reasons), I've instead collected some statistics on how many iterations it takes for different versions of the code to finish with
Here is the histogram for n = 5000 runs; the X-axis (tail stripped) denotes the number of iterations to finish, and the Y-axis denotes the number of runs. See also the five extreme values:
So, according to the above, I don't see any significant difference at all.
Change https://golang.org/cl/93895 mentions this issue:
Well, I don't know what's going on, but it sounds like an approach like https://golang.org/cl/93895 will at least make it more likely to pass.
Yep, seems like a reasonable approach.
This is still failing:
Trybot failures that just happened:
The test currently uses file descriptor 100, which requires the process's fdtable to be expanded. Somehow that seems to take a long time. The other threads running (the main thread written in C and the sysmon thread) are somehow blocking that. I don't have enough kernel-fu to understand how. On my system the
I'll try a different workaround, which is to only dup to file descriptor 30. That is less than the Linux kernel's default fdtable size of 32 or 64, so the
Change https://golang.org/cl/108537 mentions this issue:
@ianlancetaylor this breaks the trybots on the 1.10 release branch as well, and is probably (?) also an issue on 1.9, can you look into backporting the fix? https://go-review.googlesource.com/c/go/+/111715#message-c7b7c5f6de1d1ad25c8c7db15599e99e29a392a7
@gopherbot please open a backport issue for 1.10 and 1.9.
Backport issue(s) opened: #25277 (for 1.10), #25278 (for 1.9). Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://golang.org/wiki/MinorReleases.
Change https://golang.org/cl/111995 mentions this issue:
Change https://golang.org/cl/111996 mentions this issue:
…for TestUnexportedSymbols
Backport of CL 108537 to 1.10 release branch.
We were using file descriptor 100, which requires the Linux kernel to grow the fdtable size. That step may sometimes require a long time, causing the test to fail. Switch to file descriptor 30, which should not require growing the fdtable.
Updates #23784
Fixes #25277
Change-Id: I9d25986f3b59bdeb04aa52407b24aa94712aedff
Reviewed-on: https://go-review.googlesource.com/111995
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
…or TestUnexportedSymbols
Backport of CL 108537 to 1.9 release branch.
We were using file descriptor 100, which requires the Linux kernel to grow the fdtable size. That step may sometimes require a long time, causing the test to fail. Switch to file descriptor 30, which should not require growing the fdtable.
Updates #23784
Fixes #25278
Change-Id: I19ea6ab1724ec1807643d5111c44631e20be76b0
Reviewed-on: https://go-review.googlesource.com/111996
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Hi everyone! I've just downloaded the go code and ran the
I've tried two more times, and the message on the last line (after "cshared_test.go:376") changed each time, suggesting that random values were being read from fd 30. I recompiled the binaries to use fd 100, as it was originally, and the tests passed. go env:
I'm testing Go on the new Google Pixelbook. I wonder if, instead of choosing an arbitrary number, we could have a safer way to select the fd? Or use another technique to run this test?
@danicat This issue is closed. Please open a new issue for that problem. Thanks.
Sure @ianlancetaylor. Done!
What version of Go are you using (go version)?
I'm using go version go1.9.4 linux/amd64 to build the sources from the latest master, i.e. go version devel +829b64c1ea linux/amd64.
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
What did you do?
cd go/src
./all.bash
What did you expect to see?
What did you see instead?
...
This test is very unstable, though: it fails or passes with a probability of roughly 1/2.