-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/go: do not instrument packages which are non-deterministic #46410
Comments
When instrumenting for fuzzing, I wonder if we could make Some form of caching will always be possible - for example, encoding/json uses a global |
Object-reuse is a common source of One alternative might be to hook the scheduler to make it deterministic (based on some seed — #46221?) even for otherwise-nondeterministic code. My question is, though: what's the harm in exercising nondeterminism? That might mean that the fuzzer explores some redundant parts of the input space, but if the behavior of the function is nondeterministic then it's good to explore some redundant parts of the input space in order to catch scheduling-sensitive bugs (compare #22569, #43794). |
I think exercising non-deterministic behavior of the functions being fuzzed is definitely expected behavior, my concern is with non-deterministic behavior of code that isn't being fuzzed. Going back to my initial example, if I'm fuzzing a function which simply does Perhaps that level of noise is acceptable though? It's somewhat hard to tell without a good set of targets to test against whether this will significantly impede the performance of the fuzzing engine. |
I think when it comes to fuzzing, not all forms of non-determinism are equally helpful. Randomness or pseudo-randomness is important of course within the mutator, and randomness in how threads or goroutines are scheduled can be helpful, especially if running under a race detector, but it can be less helpful to have randomness in the coverage metrics or randomness in background code that is not under test (e.g., GC runtime code) that effectively drives randomness in the coverage. FWIW, here is a slightly older snapshot of some of the instrumentation exclusions for dvyukov/go-fuzz: ignore := map[string]bool{
"runtime": true, // lots of non-determinism and irrelevant code paths (e.g. different paths in mallocgc, chans and maps)
"runtime/internal/atomic": true, // runtime depends on it
"runtime/internal/sys": true, // runtime depends on it
"unsafe": true, // nothing to see here (also creates import cycle with go-fuzz-dep)
"errors": true, // nothing to see here (also creates import cycle with go-fuzz-dep)
"syscall": true, // creates import cycle with go-fuzz-dep (and probably nothing to see here)
"internal/syscall/windows/sysdll": true, //syscall depends on it
"sync": true, // non-deterministic and not interesting (also creates import cycle with go-fuzz-dep)
"sync/atomic": true, // not interesting (also creates import cycle with go-fuzz-dep)
"time": true, // creates import cycle with go-fuzz-dep
"internal/bytealg": true, // runtime depends on it
"internal/cpu": true, // runtime depends on it
"internal/race": true, // runtime depends on it
"runtime/cgo": true, // why would we instrument it?
"runtime/pprof": true, // why would we instrument it?
"runtime/race": true, // why would we instrument it?
} (You can of course also look at the current code, but it is more dynamic now, without the associated per-package comments). |
If you're importing a package that does non-trivial It still seems important to test the transient states, since (for example) a logical race between a background ticker and the API under test would not be diagnosed if we only tested steady states. But we could at least treat the coverage from transients different from the coverage for steady states, or try to run the program to a steady state before fuzzing to get a baseline for the “steady-state coverage”. |
What packages are currently getting instrumented? From the initial writeup of this issue, I was incorrectly interpreting that it was instrumenting the runtime and all of the stdlib, but it looks like that it currently avoids the runtime: https://github.com/golang/go/blob/dev.fuzz/src/cmd/compile/internal/base/flag.go#L240-L241 For example, is it instrumenting For a beta / early MVP, it might be better to start by notching out the "noisy" / non-deterministic packages that go-fuzz was notching out from the stdlib (e.g., |
Change https://golang.org/cl/342993 mentions this issue: |
…tation Counters in these packages are incremented by background goroutines for testing and internal/fuzz. They cause some inputs to seem "interesting" when they don't directly provide new coverage. Updates #46410 Change-Id: Ibe6bb3177f3b2ba23382a1693a4c6a576f94a423 Reviewed-on: https://go-review.googlesource.com/c/go/+/342993 Trust: Jay Conrod <jayconrod@google.com> Run-TryBot: Jay Conrod <jayconrod@google.com> Reviewed-by: Roland Shoemaker <roland@golang.org>
During development I've noticed that reported coverage will often decrease after restarting fuzzing, indicating that counters are being incremented by functionality not triggered by the code being fuzzed.
Doing some manual testing while working on a
encoding/json
indicates thatsync
in particular is quite noisy and will cause new edges to be triggered for the first couple of seconds.We probably can't automate this for third party code (although we should probably document that if you're fuzzing code that i.e. spawns background goroutines in an
init
it may be quite noisy) we may want to make an effort to find packages in the stdlib which have this problem. The simplest approach to this would be to run a fuzz target which does nothing (f.Fuzz(func(_ *testing.T, _b_ []byte) {})
) and incrementally instrument packages. Anything that then increases counted edges can probably be considered noisy, and should be considered for exclusion (there will, likely, be packages we cannot exclude, despite being noisy).The text was updated successfully, but these errors were encountered: