Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/html: fuzz this package #27848

Open
odeke-em opened this issue Sep 25, 2018 · 13 comments
Open

x/net/html: fuzz this package #27848

odeke-em opened this issue Sep 25, 2018 · 13 comments
Labels
help wanted Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@odeke-em
Copy link
Member

Given a couple of bugs reported by @tr3ee from malformed/incomplete tags
like:

whose reproducers are quite simple and have caused runtime panics or infinite hangs, perhaps fuzzing could help us discover what lurks beyond and even such cases.

/cc @namusyaka @dgryski @dvyukov @bradfitz @nigeltao

@gopherbot gopherbot added this to the Unreleased milestone Sep 25, 2018
@yorelog
Copy link

yorelog commented Sep 25, 2018

This bug has fixed.
See: https://go-review.googlesource.com/136875

@namusyaka
Copy link
Member

The current implementation seems to be incompleted, will be fixed by conforming latest spec.

@bcmills bcmills added the Testing An issue that has been verified to require only test changes, not just a test failure. label Sep 25, 2018
@nigeltao
Copy link
Contributor

@yorelog that CL fixed #27702 in particular, but I agree with @odeke-em that, in general, it could be useful to fuzz x/net/html, over and above reacting to specific bugs.

That's easy for me to say, though. I don't have time to work on this myself.

@empijei
Copy link
Contributor

empijei commented Sep 26, 2018

What would you suggest as a fuzzing strategy? I could run domato against this lib and report crashes/hangs if this makes any sense.

@agnivade
Copy link
Contributor

The idea is to use go-fuzz.

@empijei
Copy link
Contributor

empijei commented Sep 26, 2018

I would happily use go-fuzz but I'm not sure fuzzing an html parser with just random data would cover all interesting paths. It's hard to produce stuff like <math><template><mo><template> (one of the bugs listed above was triggered by that) with a random sequence generator.

Maybe we could use both?

@agnivade
Copy link
Contributor

It's not completely random. You can specify an initial corpus data. go-fuzz will take it from there.

@dvyukov
Copy link
Member

dvyukov commented Sep 26, 2018

It's not just random data, see:
https://go-talks.appspot.com/github.com/dvyukov/go-fuzz/slides/go-fuzz.slide
https://go-talks.appspot.com/github.com/dvyukov/go-fuzz/slides/fuzzing.slide
Also you can pre-bootstrap corpus with some meaningful inputs.

@dvyukov
Copy link
Member

dvyukov commented Sep 26, 2018

Actually I think I already did this in 2015:
https://github.com/dvyukov/go-fuzz-corpus/blob/master/html/html.go
But the corpus is not checked in.

@empijei
Copy link
Contributor

empijei commented Sep 26, 2018

Update:

Running gofuzz but didn't find anything so far (except for the already reported bugs) but I will leave it running for a while

Ran domato against the patched html library and found 3 crashes with a sample size of 10K files. Is anyone interested in looking into the cause of the crash? (The files are big and messy to inspect, will probably take me some time to go through them).

@nigeltao
Copy link
Contributor

I'm not sure fuzzing an html parser with just random data would cover all interesting paths

There's at least a couple of approaches to addressing this.

One is to use a "fuzzing dictionary" and/or "seed corpus", described in https://github.com/google/oss-fuzz/blob/master/docs/ideal_integration.md

Two is to accept arbitrary random bytes as input, and map each byte to a string, a string more likely to tickle interesting code paths in the HTML parser. For example: https://play.golang.org/p/3QE4960bHsa

Doing the reverse map from the existing HTML test cases to this "compressed" format is left as an exercise for the reader.

Once you have a dense mapping like this, where each raw input byte is relatively independent, it might be relatively straightfoward to minimize the repro case, if go-fuzz doesn't already help you do so: cut out random sub-slices of the "compressed", backing off if it no longer crashes.

@empijei
Copy link
Contributor

empijei commented Sep 27, 2018

Nice idea, that is probably going to take a longer while. I'll add info when I have news. Thanks for this.

@empijei
Copy link
Contributor

empijei commented Mar 4, 2019

Just to give a quick update: I gave this a shot a couple of months ago and didn't find any relevant crashes in a couple of weeks of fuzzing.

My plan now is to wait and see how support for oss-fuzz and first-class citizenship for fuzzing discussions will unfold. If fuzzing becomes part of the testing flow in Go I'll provide the needed FuzzXyz functions and write the necessary configurations to have it run on some beefy hardware and cover it properly.
Otherwise I'll setup some machines to fuzz it in other ways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

No branches or pull requests

9 participants