-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime/pprof: new test TestLabelSystemstack is flaky #50007
Comments
It looks like this is exposing a legitimate bug. Every one of these logs I've looked at looks something like this:
Notice that the first entry is missing the label. Every runs seems to have the first sample missing the label. Perhaps we are dropping that somewhere? |
😞 Sorry this is flaky again. I'll do some debugging now to see if I can figure out what's going on. Thanks for the details you provided @prattmic . Looks like there are no linux/amd64 failures so far? |
I am looking as well to see if I can find the discrepancy. |
Found the bug!
I will send a CL shortly. |
@prattmic wow, that was fast! I only got as far as verifying that the tagPtr is still intact when Given what you found, do you think this this will fix the test flakiness? Or would it just help with the tag accuracy of the first batch of samples? |
I'm not certain. I hope it is the only bug and perhaps could even resolve this TODO, but I don't know yet. |
Change https://golang.org/cl/369741 mentions this issue: |
Just took your patch for a spin, and it looks very promising! I'd be happy to send a change for removing that TODO once your patch lands. Or feel free to make it part of your change if you think it's worth a shot. https://gist.github.com/felixge/b3a2436035ccfc24d3a0aa0fe98b3c57 |
Change https://golang.org/cl/369744 mentions this issue: |
Change https://golang.org/cl/369984 mentions this issue: |
https://golang.org/cl/369744 is expected to fix this. Will file a new bug if more issues come up. |
profBuf.write uses an index in b.tags for each entry, even if that entry has no tag (that slice entry just remains 0). profBuf.read similarly returns a tags slice with exactly as many entries as there are records in data. profileBuilder.addCPUData iterates through the tags in lockstep with the data records. Except in the special case of the first record, where it forgets to increment tags. Thus the first read of profiling data has all tags off-by-one. To help avoid regressions, addCPUData is changed to assert that tags contains exactly the correct number of tags. For #50007. Change-Id: I5f32f93003297be8d6e33ad472c185d924a63256 Reviewed-on: https://go-review.googlesource.com/c/go/+/369741 Reviewed-by: Austin Clements <austin@google.com> Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
With https://golang.org/issue/50007 resolved, there are no known issues with pprof labels remaining. Thus, the 10% allowed error in TestLabelSystemstack should not be required. Drop it in favor of an explicit assertion that all samples containing labelHog are properly labeled. This is no flaky in my local testing. It is possible that other bugs will appear at larger testing scale, in which case this CL will be reverted, but then at least we will be aware of additional failure modes. For #50007. Change-Id: I1ef530c303bd9a01af649b8b08d4b35505e8aada Reviewed-on: https://go-review.googlesource.com/c/go/+/369744 Reviewed-by: Bryan Mills <bcmills@google.com> Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org>
Looks like there is at least one failure mode remaining — filed #50050. |
This makes TestLabelSystemstack much more strict, enabling it to detect any misplacement of labels. Unfortunately, there are several edge cases where we may not have an obviously correct stack trace, so we generally except the runtime package, with the exception of background goroutines that we know should not be labeled. For #50007 For #50032 Change-Id: I8dce7e7da04f278ce297422227901efe52782ca0 Reviewed-on: https://go-review.googlesource.com/c/go/+/369984 Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
This makes TestLabelSystemstack much more strict, enabling it to detect any misplacement of labels. Unfortunately, there are several edge cases where we may not have an obviously correct stack trace, so we generally except the runtime package, with the exception of background goroutines that we know should not be labeled. For golang#50007 For golang#50032 Change-Id: I8dce7e7da04f278ce297422227901efe52782ca0 Reviewed-on: https://go-review.googlesource.com/c/go/+/369984 Trust: Michael Pratt <mpratt@google.com> Run-TryBot: Michael Pratt <mpratt@google.com> TryBot-Result: Gopher Robot <gobot@golang.org> Reviewed-by: Austin Clements <austin@google.com>
greplogs --dashboard -md -l -e 'TestLabelSystemstack'
:2021-12-05T12:50:44-ecf6b52/freebsd-arm-paulzhol
2021-12-05T12:50:44-ecf6b52/linux-mipsle-rtrk
2021-12-04T17:16:07-3396878/freebsd-arm-paulzhol
2021-12-04T17:16:07-3396878/openbsd-mips64-jsing
2021-12-04T11:59:35-1876b38/freebsd-arm-paulzhol
2021-12-04T11:59:35-1876b38/openbsd-mips64-jsing
2021-12-04T04:50:55-549cfef/freebsd-arm-paulzhol
2021-12-04T01:07:21-821bf04/linux-386
2021-12-03T21:25:05-d20a0bf/linux-ppc64le-buildlet
2021-12-03T21:25:05-d20a0bf/openbsd-mips64-jsing
2021-12-03T16:11:38-9b0de08/linux-riscv64-unmatched
Test originally added in https://golang.org/cl/351751 for #48577 and reverted in https://golang.org/cl/360757 because it was extremely flaky. Re-added https://golang.org/cl/367200. It is much less flaky now, but still a bit flaky.
cc @felixge @bcmills
The text was updated successfully, but these errors were encountered: