cmd/compile: statically initialize compute-only data #18663

dsnet · 2017-01-15T04:37:49Z

Consider the LUT:

var ReverseLUT = []byte{
	0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0, 0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0,
	0x08, 0x88, 0x48, 0xc8, 0x28, 0xa8, 0x68, 0xe8, 0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8, 0x78, 0xf8,
	0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4, 0x14, 0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4,
	0x0c, 0x8c, 0x4c, 0xcc, 0x2c, 0xac, 0x6c, 0xec, 0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c, 0xfc,
	0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2, 0x12, 0x92, 0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2,
	0x0a, 0x8a, 0x4a, 0xca, 0x2a, 0xaa, 0x6a, 0xea, 0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa,
	0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6, 0x16, 0x96, 0x56, 0xd6, 0x36, 0xb6, 0x76, 0xf6,
	0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae, 0x6e, 0xee, 0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe,
	0x01, 0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1, 0x11, 0x91, 0x51, 0xd1, 0x31, 0xb1, 0x71, 0xf1,
	0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69, 0xe9, 0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9,
	0x05, 0x85, 0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5, 0x15, 0x95, 0x55, 0xd5, 0x35, 0xb5, 0x75, 0xf5,
	0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed, 0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd,
	0x03, 0x83, 0x43, 0xc3, 0x23, 0xa3, 0x63, 0xe3, 0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3, 0x73, 0xf3,
	0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb, 0x1b, 0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb,
	0x07, 0x87, 0x47, 0xc7, 0x27, 0xa7, 0x67, 0xe7, 0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77, 0xf7,
	0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef, 0x1f, 0x9f, 0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff,
}

This could represented with the compute-only function:

var ReverseLUT = func() (lut [256]byte) {
	for i := range lut {
		b := uint8(i)
		b = (b&0xaa)>>1 | (b&0x55)<<1
		b = (b&0xcc)>>2 | (b&0x33)<<2
		b = (b&0xf0)>>4 | (b&0x0f)<<4
		lut[i] = b
	}
	return lut
}()

For functions that are compute only at initialization, we could consider pre-computing their value at compilation time and statically assigning the resulting in the compiled binary (assuming the space-complexity is a better trade-off compared to the time-complexity).

While the above example is somewhat trivial, it is helpful for more complex LUTs. For example, this commit 5c78589 removed a more complex LUT for huffman decoding and perform the LUT computation at runtime. It was annoying keeping two separate huffman decoder implementations in sync along with using code-generation to create the LUT.

It would have been nice to just do:

var fixedHuffmanDecoder = func() (d huffmanDecoder) {
	var bits [288]int
	for i := 0; i < 144; i++ {
		bits[i] = 8
	}
	for i := 144; i < 256; i++ {
		bits[i] = 9
	}
	for i := 256; i < 280; i++ {
		bits[i] = 7
	}
	for i := 280; i < 288; i++ {
		bits[i] = 8
	}
	d.init(bits[:])
	return
}()

The text was updated successfully, but these errors were encountered:

minux · 2017-01-15T04:44:06Z

this kind of feature is very useful to create DoS attacks on the compiler. Just use //go:generate to write the table as a static data instead. Then the table only needs to be computed once, not every time the package is compiled.

dsnet · 2017-01-15T04:47:23Z

this kind of feature is very useful to create DoS attacks on the compiler.

Yea, It's only worth doing if the compiler has a built-in bound on how long it would execute initialization code for. There's no way for it to know if the function halts.

minux · 2017-01-15T04:58:41Z

On Sat, Jan 14, 2017 at 11:47 PM, Joe Tsai ***@***.***> wrote: this kind of feature is very useful to create DoS attacks on the compiler. Yea, It's only worth doing if the compiler has a built-in bound on how long it would execute initialization code for. There's no way for it to know if the function halts.

Any time bound will introduce non-determinism to the compiler output. e.g. a package compiled on slower machine won't have static initialization, whereas on a faster machine, it will. If we use a step based bound, how do we determine the bound? What's wrong with using go:generate to generate the static data once instead of wasting energy generating it every time the package is compiled? I think this kind of compiler "feature" is on a very slippy road. Soon people will ask for more and more function to be precomputed by the compiler.

dsnet · 2017-01-15T05:12:10Z

What's wrong with using go:generate to generate the static data

Some challenges that I run into:

The data being initialized depended on internal functionality that was not exported. So a generator tool could not trivially call the initialization code. You either had to copy the initialization logic (as was done in 5c78589), or the generator did some code re-writing to make it exported before directly calling it.
Even if the generator could somehow call the initialization code, sometimes the data generated included private data structures from another package. There is no way to get around this via generated code. (e.g., I have a general huffman decoder package, but I want to have statically initialized huffman LUTs, the contents of which is entirely dependent on the compression format).

Any time bound will introduce non-determinism to the compiler output.

That's a pretty fair argument against doing this in the compiler. I'm closing this.

dsnet added the Performance label Jan 15, 2017

dsnet added this to the Unplanned milestone Jan 15, 2017

dsnet closed this as completed Jan 15, 2017

golang locked and limited conversation to collaborators Jan 15, 2018

gopherbot added the FrozenDueToAge label Jan 15, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/compile: statically initialize compute-only data #18663

cmd/compile: statically initialize compute-only data #18663

dsnet commented Jan 15, 2017 •

edited

minux commented Jan 15, 2017 via email

dsnet commented Jan 15, 2017

minux commented Jan 15, 2017 via email

dsnet commented Jan 15, 2017 •

edited

cmd/compile: statically initialize compute-only data #18663

cmd/compile: statically initialize compute-only data #18663

Comments

dsnet commented Jan 15, 2017 • edited

minux commented Jan 15, 2017 via email

dsnet commented Jan 15, 2017

minux commented Jan 15, 2017 via email

dsnet commented Jan 15, 2017 • edited

dsnet commented Jan 15, 2017 •

edited

dsnet commented Jan 15, 2017 •

edited