Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: testing: custom mutator support for fuzzing #48815

Open
s3nt3 opened this issue Oct 6, 2021 · 21 comments
Open

proposal: testing: custom mutator support for fuzzing #48815

s3nt3 opened this issue Oct 6, 2021 · 21 comments
Labels
FeatureRequest fuzz Issues related to native fuzzing support NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal
Milestone

Comments

@s3nt3
Copy link

s3nt3 commented Oct 6, 2021

As the official fuzzer implementation provided by golang, the native fuzzer should be well suited for various usage scenarios. However, currently native fuzzers only support general mutation algorithms for built-in types. Therefore, in many cases, the native fuzzer cannot efficiently generate test inputs. For example, when testing a DSL parser, the mutator will generate a large amount of output that cannot pass the syntax or semantic check. A possible solution is to provide support for custom mutators, so that users can implement custom mutators for various fuzz targets and reuse other parts of the native fuzzer.

I tried on the existing code and designed the following interface:

 type CustomMutator interface {
     Marshal() ([]byte, error)
     Unmarshal([]byte) error
     Mutate() error
 }

The object that implements the above interface can be passed as an argument to the testing.F.Fuzz method, and it will call the Mutate method to use the custom mutation algorithm. The Marshal and Unmarshal methods ensure that it can be imported/exported to a corpus file.

I think supporting custom mutator will bring the following benefits:

  • the usage scenarios of native fuzzer are expanded
  • it is more convenient for the community to test and optimize mutation algorithms
  • the interface of custom mutator is very similar to the struct mutation interface mentioned in the draft proposal, so implementing a custom mutator may be a way to support struct mutation

In addition, custom mutator will also bring some side effects, such as the custom mutator code will be instrumented, which may affect performance and the accuracy of coverage statistics.

/cc @jayconrod

@jayconrod jayconrod changed the title [dev.fuzz] mutator: Add custom mutator support testing: custom mutator support for fuzzing Oct 6, 2021
@jayconrod jayconrod added fuzz Issues related to native fuzzing support NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Oct 6, 2021
@jayconrod jayconrod added this to the Backlog milestone Oct 6, 2021
@jayconrod
Copy link
Contributor

cc @golang/fuzzing

@s3nt3
Copy link
Author

s3nt3 commented Dec 1, 2021

A friendly ping: any update?

@rolandshoemaker
Copy link
Member

This is unlikely to be implemented for 1.18, but might be considered for 1.19.

@s3nt3
Copy link
Author

s3nt3 commented Dec 2, 2021

Thanks for your reply, I will maintain an implementation in my local branch(https://github.com/s3nt3/go/tree/dev.fuzz.custom_mutator). Looking forward to supporting this feature in the new development cycle.

@icholy
Copy link

icholy commented May 28, 2022

The interface should probably satisfy encoding.{TextMarshaler,TextUnmarshaler}.

 type CustomMutator interface {
     Mutate() error
     MarshalText() ([]byte, error)
     UnmarshalText([]byte) error
 }

@nevkontakte
Copy link
Contributor

I believe Go 1.20 development cycle is currently open, and implementation in https://github.com/s3nt3/go/tree/dev.fuzz.custom_mutator looks like a very compact change. Any chance of this being included in 1.20? I could definitely use this capability in some of my projects :)

@rhansen
Copy link

rhansen commented Apr 10, 2023

Additional background: Structure-Aware Fuzzing with libFuzzer

@rhansen
Copy link

rhansen commented Apr 27, 2023

The interface should probably satisfy encoding.{TextMarshaler,TextUnmarshaler}.

Users might be fuzzing binary data such as images or protobufs. Wouldn't binary marshaler/unmarshaler be a better choice?

type CustomMutator interface {
	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler
	Mutate() error
}

Also, would it be useful to supply an int64 seed, rand.Source, or rand.Rand to help with reproducibility? For example:

type CustomMutator interface {
	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler
	Mutate(seed int64) error
}

@icholy
Copy link

icholy commented Apr 27, 2023

@rhansen the corpus data is stored in a text format

@rhansen
Copy link

rhansen commented Apr 27, 2023

@rhansen the corpus data is stored in a text format

True, but if I understand correctly, that's an implementation detail. (At least I couldn't find an official specification of the testdata/fuzz/FuzzTestName/* corpus file format.) Either way, Go already encodes binary input values for writing to the corpus file. I would expect it to treat a CustomMutator value the same as a []byte value, aside from the marshal/unmarshal calls and a unique type identifier string.

rhansen added a commit to rhansen/go that referenced this issue May 7, 2023
This commit extends F.Fuzz to support fuzz functions with custom input
parameter types.  Custom input types must implement the MarshalBinary,
UnmarshalBinary, and Mutate methods.  This enables structure-aware
fuzzing; see [1] for background.

Custom values are encoded to disk like this:

    go test fuzz v1
    custom("*example.com/mod/pkg.Typename", "bytes go here")

[1] https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md

Fixes golang#48815
rhansen added a commit to rhansen/go that referenced this issue May 7, 2023
This commit extends F.Fuzz to support fuzz functions with custom input
parameter types.  Custom input types must implement the MarshalBinary,
UnmarshalBinary, and Mutate methods.  This enables structure-aware
fuzzing; see [1] for background.

Custom values are encoded to disk like this:

    go test fuzz v1
    custom("*example.com/mod/pkg.Typename", "bytes go here")

[1] https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md

Fixes golang#48815

Change-Id: Ib5c8dfadd2df9456b8a82ceec7d8e289a4c7c3de
@gopherbot
Copy link

Change https://go.dev/cl/493304 mentions this issue: testing: add support for fuzzing custom input types

rhansen added a commit to rhansen/go that referenced this issue May 7, 2023
This commit extends F.Fuzz to support fuzz functions with custom input
parameter types.  Custom input types must implement the MarshalBinary,
UnmarshalBinary, and Mutate methods.  This enables structure-aware
fuzzing; see [1] for background.

Custom values are encoded to disk like this:

    go test fuzz v1
    custom("*example.com/mod/pkg.Typename", "bytes go here")

[1] https://github.com/google/fuzzing/blob/master/docs/structure-aware-fuzzing.md

Fixes golang#48815

Change-Id: Ib5c8dfadd2df9456b8a82ceec7d8e289a4c7c3de
@bcmills bcmills changed the title testing: custom mutator support for fuzzing proposal: testing: custom mutator support for fuzzing May 8, 2023
@bcmills bcmills modified the milestones: Backlog, Proposal May 8, 2023
@gopherbot
Copy link

Change https://go.dev/cl/493637 mentions this issue: design/48815-custom-fuzz-input-types.md: new design proposal

@rhansen
Copy link

rhansen commented May 10, 2023

I started a design doc. Feedback would be appreciated.

@nightlyone
Copy link
Contributor

How should fuzzing of complex, nested data types work where you don't control the implementation of its methods but nevertheless want/need to use them as data types in your types?

@rhansen
Copy link

rhansen commented May 11, 2023

How should fuzzing of complex, nested data types work where you don't control the implementation of its methods but nevertheless want/need to use them as data types in your types?

@nightlyone As long as it is possible to marshal/unmarshal the type to/from binary (such as JSON or protobuf), you should be able to fuzz it. For example, if the type you want to fuzz is *foo.Foo, and that type implements the encoding/json.Marshaler and encoding/json.Unmarshaler interfaces, you could fuzz it like this:

package foo_test

import (
	"context"
	"testing"

	"me.example/mod/foo"
)

type fuzzInput struct {
	foo.Foo
}

func (v *fuzzInput) MarshalBinary() ([]byte, error) { return v.MarshalJSON() }
func (v *fuzzInput) UnmarshalBinary(d []byte) error { return v.UnmarshalJSON(d) }
func (v *fuzzInput) Mutate(ctx context.Context, seed int64) error {
	// mutate v.Foo as desired
	return nil
}

func FuzzFoo(f *testing.F) {
	// Assumption: the zero value of foo.Foo is valid. If that is not the
	// case, call f.Add or create a `testdata/fuzz/FuzzFoo/*` seed file
	// so that f.Fuzz has a valid value to start with.
	f.Fuzz(func(t *testing.T, v *fuzzInput) {
		if err := foo.FunctionYouWantToFuzz(&v.Foo); err != nil {
			t.Fatal(err)
		}
	})
}

If the third-party type is not directly marshalable, you could instead marshal a machine-readable sequence of instructions that recreate the desired state. (For example, marshal a protobuf message that lists function calls and their arguments. These calls would be replayed when unmarshaling to reconstruct the state from scratch.) The Mutate method would effectively mutate the sequence of instructions.

gopherbot pushed a commit to golang/proposal that referenced this issue May 11, 2023
For golang/go#48815.

Change-Id: I021e4517940ff073254d9d56fcca623f4e2ed460
Reviewed-on: https://go-review.googlesource.com/c/proposal/+/493637
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@rhansen
Copy link

rhansen commented May 11, 2023

Open issues in the design doc:

  • What is the best way to obtain a stable, globally unique, and marshalable identifier from a reflect.Type? The reflect.Type.String method does not guarantee global uniqueness. See https://go.dev/cl/493304 for an initial attempt.
  • Should MarshalBinary not return an error, forcing devs to call panic on error? We can always add support for a returned error in the future if desired.
  • Should Mutate not return an error, forcing devs to call panic on error? We can always add support for a returned error in the future if desired.
  • Should Mutate take a context.Context in case it wants to be cancelable? (Maybe it wants to send RPCs, or otherwise do something expensive.)

@ianlancetaylor
Copy link
Contributor

Personally I think it's clearly useful for a MarshalBinary method to match encoding.BinaryMarshaler. If it seems really important to have a different signature, then the method should have a different name. Otherwise it will be annoying to use fuzzing with a type that has to implement encoding.BinaryMarshaler for its own purposes.

@gopherbot
Copy link

Change https://go.dev/cl/501537 mentions this issue: design/48815-custom-fuzz-input-types.md: revise from feedback

@rhansen
Copy link

rhansen commented Jun 9, 2023

After thinking on this for a while, I think the two best options for Mutate are:

  • Mutate(seed int64): Simple. Naturally hints to developers that the method is expected to be fast, repeatable, and error-free, which increases the effectiveness of fuzzing. Adding a context parameter or error return value (or both) might be YAGNI, but their absence in this option makes complex mutation operations more difficult to implement. The lack of an error return value encourages the use of panic, which is problematic for the reasons discussed in the design doc.
  • Mutate(ctx context.Context, seed int64) error: The error return value discourages the use of panic. The context makes this more future-proof by enabling advanced techniques once Mutate's repeatability requirement is removed. For example, Mutate could send an RPC to a service that feeds automatic crash report data to fuzzing tasks to increase the likelihood of encountering an interesting value. The context parameter and error return value might be YAGNI, but I believe the added implementation complexity and developer cognitive load is minor enough to not worry about it (they can be ignored in most use cases).

The design doc currently proposes Marshal(seed int64) error, which now feels like a half measure. I think we should either fully support complex use cases by passing a context, or we should go for simplicity and remove the error return value. I slightly prefer the option with the context parameter and error return value, so I sent out a merge request to update the design doc.

Anyone else have any opinions?

@gopherbot
Copy link

Change https://go.dev/cl/493302 mentions this issue: internal/fuzz: convert custom mutator panics to returned errors

gopherbot pushed a commit to golang/proposal that referenced this issue Jun 9, 2023
  * Add rationale explaining why the `customMutator` interface is not
    exported.
  * Switch `Mutate` to take a `context.Context`, and expand the
    associated rationale.
  * Minor readability tweaks.
  * Delete the open issues. (They don't describe problems that lack a
    solution; they are more like discussion points that should be
    addressed in the linked issue or in merge requests.)

For golang/go#48815.

Change-Id: I704ab930710d1b82a42b7923dbabc4e10543f5ca
Reviewed-on: https://go-review.googlesource.com/c/proposal/+/501537
Reviewed-by: Ian Lance Taylor <iant@google.com>
@sacheendra
Copy link

Hi! Would it make sense to pass a defaultMutate func([]any) []any as another parameter to Mutate?
The resulting signature would be Mutate(ctx context.Context, seed int64, defaultMutate func([]any) []any) error.
defaultMutate would be the current mutation function for supported types: https://github.com/golang/go/blob/master/src/internal/fuzz/mutator.go#L48

My motivation is that I only want to custom mutate a few fields in a struct for example, and leave the rest to the default mutator.
This is similar to LLVMFuzzerMutate used in LLVMFuzzerCustomMutator.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FeatureRequest fuzz Issues related to native fuzzing support NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Proposal
Projects
Status: No status
Status: Incoming
Development

No branches or pull requests