Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: statically initialize compute-only data #18663

Closed
dsnet opened this issue Jan 15, 2017 · 4 comments
Closed

cmd/compile: statically initialize compute-only data #18663

dsnet opened this issue Jan 15, 2017 · 4 comments

Comments

@dsnet
Copy link
Member

dsnet commented Jan 15, 2017

Consider the LUT:

var ReverseLUT = []byte{
	0x00, 0x80, 0x40, 0xc0, 0x20, 0xa0, 0x60, 0xe0, 0x10, 0x90, 0x50, 0xd0, 0x30, 0xb0, 0x70, 0xf0,
	0x08, 0x88, 0x48, 0xc8, 0x28, 0xa8, 0x68, 0xe8, 0x18, 0x98, 0x58, 0xd8, 0x38, 0xb8, 0x78, 0xf8,
	0x04, 0x84, 0x44, 0xc4, 0x24, 0xa4, 0x64, 0xe4, 0x14, 0x94, 0x54, 0xd4, 0x34, 0xb4, 0x74, 0xf4,
	0x0c, 0x8c, 0x4c, 0xcc, 0x2c, 0xac, 0x6c, 0xec, 0x1c, 0x9c, 0x5c, 0xdc, 0x3c, 0xbc, 0x7c, 0xfc,
	0x02, 0x82, 0x42, 0xc2, 0x22, 0xa2, 0x62, 0xe2, 0x12, 0x92, 0x52, 0xd2, 0x32, 0xb2, 0x72, 0xf2,
	0x0a, 0x8a, 0x4a, 0xca, 0x2a, 0xaa, 0x6a, 0xea, 0x1a, 0x9a, 0x5a, 0xda, 0x3a, 0xba, 0x7a, 0xfa,
	0x06, 0x86, 0x46, 0xc6, 0x26, 0xa6, 0x66, 0xe6, 0x16, 0x96, 0x56, 0xd6, 0x36, 0xb6, 0x76, 0xf6,
	0x0e, 0x8e, 0x4e, 0xce, 0x2e, 0xae, 0x6e, 0xee, 0x1e, 0x9e, 0x5e, 0xde, 0x3e, 0xbe, 0x7e, 0xfe,
	0x01, 0x81, 0x41, 0xc1, 0x21, 0xa1, 0x61, 0xe1, 0x11, 0x91, 0x51, 0xd1, 0x31, 0xb1, 0x71, 0xf1,
	0x09, 0x89, 0x49, 0xc9, 0x29, 0xa9, 0x69, 0xe9, 0x19, 0x99, 0x59, 0xd9, 0x39, 0xb9, 0x79, 0xf9,
	0x05, 0x85, 0x45, 0xc5, 0x25, 0xa5, 0x65, 0xe5, 0x15, 0x95, 0x55, 0xd5, 0x35, 0xb5, 0x75, 0xf5,
	0x0d, 0x8d, 0x4d, 0xcd, 0x2d, 0xad, 0x6d, 0xed, 0x1d, 0x9d, 0x5d, 0xdd, 0x3d, 0xbd, 0x7d, 0xfd,
	0x03, 0x83, 0x43, 0xc3, 0x23, 0xa3, 0x63, 0xe3, 0x13, 0x93, 0x53, 0xd3, 0x33, 0xb3, 0x73, 0xf3,
	0x0b, 0x8b, 0x4b, 0xcb, 0x2b, 0xab, 0x6b, 0xeb, 0x1b, 0x9b, 0x5b, 0xdb, 0x3b, 0xbb, 0x7b, 0xfb,
	0x07, 0x87, 0x47, 0xc7, 0x27, 0xa7, 0x67, 0xe7, 0x17, 0x97, 0x57, 0xd7, 0x37, 0xb7, 0x77, 0xf7,
	0x0f, 0x8f, 0x4f, 0xcf, 0x2f, 0xaf, 0x6f, 0xef, 0x1f, 0x9f, 0x5f, 0xdf, 0x3f, 0xbf, 0x7f, 0xff,
}

This could represented with the compute-only function:

var ReverseLUT = func() (lut [256]byte) {
	for i := range lut {
		b := uint8(i)
		b = (b&0xaa)>>1 | (b&0x55)<<1
		b = (b&0xcc)>>2 | (b&0x33)<<2
		b = (b&0xf0)>>4 | (b&0x0f)<<4
		lut[i] = b
	}
	return lut
}()

For functions that are compute only at initialization, we could consider pre-computing their value at compilation time and statically assigning the resulting in the compiled binary (assuming the space-complexity is a better trade-off compared to the time-complexity).

While the above example is somewhat trivial, it is helpful for more complex LUTs. For example, this commit 5c78589 removed a more complex LUT for huffman decoding and perform the LUT computation at runtime. It was annoying keeping two separate huffman decoder implementations in sync along with using code-generation to create the LUT.

It would have been nice to just do:

var fixedHuffmanDecoder = func() (d huffmanDecoder) {
	var bits [288]int
	for i := 0; i < 144; i++ {
		bits[i] = 8
	}
	for i := 144; i < 256; i++ {
		bits[i] = 9
	}
	for i := 256; i < 280; i++ {
		bits[i] = 7
	}
	for i := 280; i < 288; i++ {
		bits[i] = 8
	}
	d.init(bits[:])
	return
}()
@dsnet dsnet added this to the Unplanned milestone Jan 15, 2017
@minux
Copy link
Member

minux commented Jan 15, 2017 via email

@dsnet
Copy link
Member Author

dsnet commented Jan 15, 2017

this kind of feature is very useful to create DoS attacks on the compiler.

Yea, It's only worth doing if the compiler has a built-in bound on how long it would execute initialization code for. There's no way for it to know if the function halts.

@minux
Copy link
Member

minux commented Jan 15, 2017 via email

@dsnet
Copy link
Member Author

dsnet commented Jan 15, 2017

What's wrong with using go:generate to generate the static data

Some challenges that I run into:

  • The data being initialized depended on internal functionality that was not exported. So a generator tool could not trivially call the initialization code. You either had to copy the initialization logic (as was done in 5c78589), or the generator did some code re-writing to make it exported before directly calling it.
  • Even if the generator could somehow call the initialization code, sometimes the data generated included private data structures from another package. There is no way to get around this via generated code. (e.g., I have a general huffman decoder package, but I want to have statically initialized huffman LUTs, the contents of which is entirely dependent on the compression format).

Any time bound will introduce non-determinism to the compiler output.

That's a pretty fair argument against doing this in the compiler. I'm closing this.

@dsnet dsnet closed this as completed Jan 15, 2017
@golang golang locked and limited conversation to collaborators Jan 15, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants