-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/build: add MSAN and ASAN builders #70054
Comments
Related Issues and Documentation
(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.) |
Change https://go.dev/cl/622322 mentions this issue: |
This change adds two new builder types. It depends on https://chromium-review.googlesource.com/c/infra/infra/+/5967115 to function. The goal of this change is to test ASAN and MSAN at a baseline, to make sure the Go parts work. We recently broke ASAN but didn't notice until we got feedback from someone else; our internal tests are too simplistic. While we're here, sort the RUN_MODS list in the source code. For golang/go#70054. Change-Id: Ib515a2c908fbfdc58abf6e9d4c9c7a170bec0499 Reviewed-on: https://go-review.googlesource.com/c/build/+/622322 Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Currently there are many tests that fail with ASAN/MSAN enabled, I looked a bit into this in separate issues. It seems that most of the failures are related to the number of allocations. |
Yeah, I expected some issues. The alloc numbers are probably because of when the instrumentation happens (before inlining?) and so some things don't get inlined as expected, and we get some more allocs. (Correct me if I'm wrong, anyone.) I don't think fixing the allocs is worth it; we should just skip those tests. There are other tests we should just skip as well. I'll send CLs today. |
Actually... that doesn't really make sense. Inlining and escape analysis happen long before SSA. Maybe it's calls in the runtime that aren't properly |
Okay, drilling down on one of the tests, |
Just kidding, I totally missed that the test is skipped entirely if |
Thanks for investigating! I believe I already have a branch with changes to skip the tests, but there are so many skips that, at some point, it felt like I wasn’t doing things right. I can open a CL with the changes. Let me know if you want to send one yourself. |
That is one of the reasons we introduced internal packages for msan/asan as well |
Thanks. I missed your message; I just sent a CL with (almost) all the skips. A good chunk of the skips do align with |
Change https://go.dev/cl/622855 mentions this issue: |
OK, so the vast majority of test changes are skips, but there were some real bugs with ASAN that just weren't caught because we weren't running all of our tests. I've fixed a bunch of them, but there may be more to fix. Good progress so far, definitely much cleaner than when I started. |
Change https://go.dev/cl/623176 mentions this issue: |
These fail for the same reason as for the race detector, and is the most frequently failing test in both. For #70054. For #64257. For #64256. Change-Id: I3649e58069190b4450f9d4deae6eb8eca5f827a3 Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-asan-clang15,gotip-linux-amd64-msan-clang15 Reviewed-on: https://go-review.googlesource.com/c/go/+/623176 TryBot-Bypass: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
On a whim I just tried skipping |
Change https://go.dev/cl/623336 mentions this issue: |
On a whim I decided to investigate the possibility of whether the flakiness on the asan builder was due to a concurrently executing test. Of the most recent failures there were a few candidates, and this test was one of them. After disabling each candidate one by one, we had a winner: this test causes other concurrently executing tests, running pure Go code, to spuriously fail. I do not know why yet, but this test doesn't seem like it would have incredibly high value for ASAN, and does funky things like MAP_FIXED in recently unmapped regions, so I think it's fine. For #70054. For #64257. Change-Id: Ib9a84d9b69812e76c390d99b00698710ee1ece1a Cq-Include-Trybots: luci.golang.try:gotip-linux-amd64-asan-clang15 Reviewed-on: https://go-review.googlesource.com/c/go/+/623336 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Auto-Submit: Michael Knyszek <mknyszek@google.com> Reviewed-by: Cherry Mui <cherryyz@google.com>
Thanks for adding the builders and fixing tests at tip. The builders are currently failing on release branches 1.23 and 1.22: (CC @cagedmantis) https://ci.chromium.org/p/golang/g/go-go1.23/console @mknyszek Do you think it's viable and worthwhile to try to fix the release branches too? Or is it better to just constrain these new builders to only run at 1.24 branch and newer? Making these new builders 1.24+ only seems fine to me, but up to you. |
Hm... the benefit of backporting is to make sure that backports don't break MSAN/ASAN. But I think such a situation is quite rare. I also think it would be fine to make them 1.24+ only, though if it turns out the backports are easy then why not. I can try to just backport the changes naively and see what happens. |
Change https://go.dev/cl/623957 mentions this issue: |
Change https://go.dev/cl/627536 mentions this issue: |
Backporting the patches necessary to make these builders pass seem not worth the effort. The main problem is the patches touch many parts of std, which results in a lot of conflicts. Just disable them for now. For golang/go#70054. Change-Id: I5a304dbe71b4887d0a59ec2dbc7b1cc6e9129804 Reviewed-on: https://go-review.googlesource.com/c/build/+/627536 Reviewed-by: Dmitri Shuralyov <dmitshur@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Auto-Submit: Michael Knyszek <mknyszek@google.com> LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
linux-arm64 is a first class port with no coverage of ASAN or MSAN modes. We already have a clang15 builder, so adding ASAN and MSAN should be trivial (fingers crossed). For golang/go#71395. For golang/go#70054. Change-Id: I6a6a636c7a41147b1b22933db946ca838d3696f4 Reviewed-on: https://go-review.googlesource.com/c/build/+/643918 LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com> Commit-Queue: Michael Pratt <mpratt@google.com> Auto-Submit: Michael Pratt <mpratt@google.com> Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org> Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Change https://go.dev/cl/643918 mentions this issue: |
I recently broke ASAN mode because the tests in
cmd/cgo/internal/testsanitizers
are too simple and can't catch issues with Go's instrumentation of user code or issues with the instrumentation of the Go runtime. In this particular case, it was the latter: some allocation codepaths were broken under ASAN (did not notify ASAN of changes correctly), but more sophisticated allocation patterns were required to trigger the issue.I propose we set up ASAN and MSAN builders, akin to race mode builders, that run all Go tests under MSAN and ASAN. At the very least, this will catch problems with how Go hooks into the sanitizer runtimes. I suspect these weren't set up in the past because ASAN/MSAN do little to help catch issues in Go code (not much opportunity for a use-after-free with a garbage collector, nor is uninitialized memory access likely to occur in regular Go code (I'm not even sure if we instrument Go code for the latter at all)). As long as we have spare machine resources, I think it's worth it to try and catch these lower-level issues early. However, since they're not super high value builders, I think running them only in postsubmit is sufficient.
The text was updated successfully, but these errors were encountered: