x/tools: random program generator for testing #7985

dvyukov · 2014-05-13T06:21:54Z

There is CSmith (http://embed.cs.utah.edu/csmith/) that generates random correct C
programs, and it's proved to be extremely useful in testing C compilers.

We could do something similar for Go.
The tool can use existing ast/types/ssa packages for program generation and
serialization. The tool can start very small, e.g generate only simple assignment
statement. And then gradually grow.

The tool could be used for testing of a range of programs and packages besides gc/gccgo
compilation:
- gofmt idempotentness (gofmt several times and compare results)
- ast/types/ssa robustness and idempotentness (generate, serialize, parse, serialize,
parse, compare).
- govet robustness
- compilation in race mode which had lots of bugs in past
- compare gc, gccgo, ssa/interp output

I foresee piles of bugs there.

davecheney · 2014-05-13T07:26:31Z

Comment 1:

I would love to see this, do you have any idea where to start? I've looked at the csmith
code before but the logic they use to generate code is very confusing -- i'm tempted to
feed random noise to go/ast and compile anything that can be written to disk validly.

dvyukov · 2014-05-13T07:37:00Z

Comment 2:

I would try recursively construct a random correct program as:
- choose random number of statements
- for each statement choose random statement type
- for each expression in top level statement choose random type
- do the same recursively for subexpressions
This way it's possible to generate random functions, which can be used as expressions in
other functions.
There are complications like that for a slice indexing operations you need to have a
slice in scope. So probably I would need to track what variables are currently in scope.
Similarly it's possible to generate top-level declarations like constants and global
vars.

griesemer · 2014-05-13T20:18:12Z

Comment 3:

Feeding random noise is extremely unlikely to parse w/o errors, and even less likely to
compile. That said, feeding random noise to a compiler is an extremely good way to test
a compiler against crashes.
If it's just random noise (text) the parser is likely to bail quickly and thus the test
won't be very good. Here's a few ideas for this first step:
- create random sequences of valid Go tokens
- make sure the parser passes at least the very first check (package clause present)
- use existing programs (std lib) and modify randomly (parse, and then make a few random
AST changes)
These all would test robustness of parsers/compilers in the presence of syntactically
erroneous input.
dvyukov's suggestion for generation of random programs would be the next step. Those
programs should always parse, but may not compile due to semantic requirements.
The hard part is to track scope contents and types; that information must be used when
generating random programs that actually compile.

dvyukov · 2014-05-14T03:45:43Z

Comment 4:

> That said, feeding random noise to a compiler is an extremely good way to test a
compiler against crashes.
Security people say that they take good inputs and then start fuzzing it, in particular
just do random bit flipping. This has much higher chances to trigger interesting code
paths in software.

dvyukov · 2014-05-14T03:47:08Z

Comment 5:

Robert, do you have suggestions on what packages one needs to use to generate programs
that compile?

griesemer · 2014-05-14T15:56:38Z

Comment 6:

Re #5: I'm not sure there are any packages one would need. One could use go/ast and thus
randomly generate an ast rather than source text, but the difference in effort seems
small. As mentioned before, the "hard" part is tracking of scopes and types; expression
generation will have to choose random identifiers from those scopes and combine them
using random (but valid) operations depending on the types. One could use go/types to
get a scope and type representation, but I'm not sure the (minor) gain is worth the
dependency - the operations on those data structures are likely quite different from
what go/types provides.

dvyukov · 2014-05-14T16:07:23Z

Comment 7:

> One could use go/ast and thus randomly generate an ast rather than source text, but
the difference in effort seems small.
I see. On a second thought, just dumping program text using recursive functions like
randExpr(w) and randStmt(w) indeed looks simpler.

dvyukov · 2014-05-14T19:14:23Z

Comment 8:

Here what I have in mind:
http://play.golang.org/p/5d-8mPjr_h
Currently it can generate programs like:
http://play.golang.org/p/0NhOj4PUJc
I would not say that it's too difficult, but it's... tiresome.
Do you think it's the right approach? Do you see any reasons why it can't scale to
handle all other types of statements and expressions?

griesemer · 2014-05-14T19:40:40Z

Comment 9:

Looks not unreasonable. I'd agree, with "hard" I meant "tedious". I don't see why it
shouldn't scale, it's just a matter of actually doing the work.

dvyukov · 2014-05-16T17:17:06Z

Comment 10:

I've commit the prototype to http://gosmith.googlecode.com
It already has found 2 new bugs:
https://golang.org/issue/8011
https://golang.org/issue/8012
and this known bug:
https://golang.org/issue/7998

dvyukov · 2014-05-16T19:08:41Z

Comment 11:

Here we go:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61204

dvyukov · 2014-05-16T20:19:06Z

Comment 12:

Another one in gccgo:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61205

davecheney · 2014-05-17T10:21:57Z

Comment 13:

> I've commit the prototype to http://gosmith.googlecode.com
This is outstanding Dmitry.

Owner changed to @dvyukov.

Status changed to Accepted.

dvyukov · 2014-05-17T11:15:27Z

Comment 14:

Another one:
https://golang.org/issue/8017

dvyukov · 2014-05-18T15:51:46Z

Comment 15:

gofmt was clean because... well, I forgot to actually call the function that verifies
gofmt:
https://golang.org/issue/8021

dvyukov · 2014-05-19T09:08:58Z

Comment 16:

I've started generating several packages with several source files, it instantly kills
gccgo:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61226

dvyukov · 2014-05-20T08:03:35Z

Comment 17:

Two more in gccgo, I've created a list for all GoSmith bugs:
https://gcc.gnu.org/bugzilla/buglist.cgi?quicksearch=GoSmith&list_id=90399

adonovan · 2014-05-21T01:55:19Z

Comment 18:

This is a great start.  Depending on which piece of the stack you're trying to stress, I
can imagine the output you'd want would look very different.  For example, to test the
scanner you'd want all kinds of odd tokens: nested comments, string literals with odd
escape sequences, various numeric literals, even just line noise.  To test the type
checker you'd want well-formed ASTs that are mostly well-typed, but with random types,
assignments, identifiers, constants, etc.  To test code generation, you'd want only
well-typed programs that exercise control flow, operator semantics, etc.  They might all
have a lot in common, but more intelligence or structure is needed in the sentence
generator for later passes.  Looking over the regression tests for each tool might give
you some hints for where to focus your efforts.

dvyukov · 2014-05-26T16:53:58Z

Comment 19:

re #18
Agree, I think the current code mostly targets compiler middle-end.
Generators for different parts of the stack needs to be different programs. A single
program that tries to cover all pieces will sacrifice expressiveness and/or will have
excessive complexity.

ALTree · 2017-01-30T21:14:53Z

Implemented: dvyukov/go-fuzz, so closing this.

dvyukov added accepted labels May 26, 2014

rsc added this to the Unplanned milestone Apr 10, 2015

rsc removed release-none labels Apr 10, 2015

rsc changed the title ~~misc: random program generator for testing~~ x/tools: random program generator for testing Apr 14, 2015

rsc modified the milestones: Unreleased, Unplanned Apr 14, 2015

rsc removed the repo-tools label Apr 14, 2015

ALTree closed this as completed Jan 30, 2017

golang locked and limited conversation to collaborators Jan 30, 2018

gopherbot added the FrozenDueToAge label Jan 30, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

x/tools: random program generator for testing #7985

x/tools: random program generator for testing #7985

dvyukov commented May 13, 2014

davecheney commented May 13, 2014

dvyukov commented May 13, 2014

griesemer commented May 13, 2014

dvyukov commented May 14, 2014

dvyukov commented May 14, 2014

griesemer commented May 14, 2014

dvyukov commented May 14, 2014

dvyukov commented May 14, 2014

griesemer commented May 14, 2014

dvyukov commented May 16, 2014

dvyukov commented May 16, 2014

dvyukov commented May 16, 2014

davecheney commented May 17, 2014

dvyukov commented May 17, 2014

dvyukov commented May 18, 2014

dvyukov commented May 19, 2014

dvyukov commented May 20, 2014

adonovan commented May 21, 2014

dvyukov commented May 26, 2014

ALTree commented Jan 30, 2017

x/tools: random program generator for testing #7985

x/tools: random program generator for testing #7985

Comments

dvyukov commented May 13, 2014

davecheney commented May 13, 2014

dvyukov commented May 13, 2014

griesemer commented May 13, 2014

dvyukov commented May 14, 2014

dvyukov commented May 14, 2014

griesemer commented May 14, 2014

dvyukov commented May 14, 2014

dvyukov commented May 14, 2014

griesemer commented May 14, 2014

dvyukov commented May 16, 2014

dvyukov commented May 16, 2014

dvyukov commented May 16, 2014

davecheney commented May 17, 2014

dvyukov commented May 17, 2014

dvyukov commented May 18, 2014

dvyukov commented May 19, 2014

dvyukov commented May 20, 2014

adonovan commented May 21, 2014

dvyukov commented May 26, 2014

ALTree commented Jan 30, 2017