Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

image/draw: drawPaletted hot path function optimization #22375

Closed
artyom opened this issue Oct 21, 2017 · 5 comments
Closed

image/draw: drawPaletted hot path function optimization #22375

artyom opened this issue Oct 21, 2017 · 5 comments

Comments

@artyom
Copy link
Member

artyom commented Oct 21, 2017

Current implementation of drawPaletted() function has the following calls to sqDiff() function in its hot path:

sum := sqDiff(er, p[0]) + sqDiff(eg, p[1]) + sqDiff(eb, p[2]) + sqDiff(ea, p[3])

Number of executions of this line for each drawPaletted() call is between width×height and width×height×palette size.

Here's how sqDiff() currently implemented:

go/src/image/draw/draw.go

Lines 562 to 574 in a31e0a4

// sqDiff returns the squared-difference of x and y, shifted by 2 so that
// adding four of those won't overflow a uint32.
//
// x and y are both assumed to be in the range [0, 0xffff].
func sqDiff(x, y int32) uint32 {
var d uint32
if x > y {
d = uint32(x - y)
} else {
d = uint32(y - x)
}
return (d * d) >> 2
}

It can be reduced to:

func sqDiff(x, y int32) uint32 {
        d := uint32(x - y)
        return (d * d) >> 2
}

This relies on overflows but produces the same result, see https://play.golang.org/p/6q3Cvqk1k7

While the change itself is rather miniscule, the net effect of it being in the hot path of the drawPaletted() is noticeable in benchmarks, i.e. QuantizedEncode benchmark from the image/gif package shows significant improvements after such change applied:

name               old time/op    new time/op    delta
QuantizedEncode-4     880ms ± 2%     482ms ± 2%  -45.20%  (p=0.008 n=5+5)

name               old speed      new speed      delta
QuantizedEncode-4  1.40MB/s ± 2%  2.55MB/s ± 2%  +82.52%  (p=0.008 n=5+5)

name               old alloc/op   new alloc/op   delta
QuantizedEncode-4     417kB ± 0%     417kB ± 0%     ~     (all equal)

name               old allocs/op  new allocs/op  delta
QuantizedEncode-4      13.0 ± 0%      13.0 ± 0%     ~     (all equal)

Please let me know if this is something that can be accepted as a CL.

@ALTree
Copy link
Member

ALTree commented Oct 21, 2017

This relies on overflows

Looks like the spec guarantees good behaviour for unsigned ints:

For unsigned integer values, the operations +, -, *, and << are computed modulo 2n, where n is the bit width of the unsigned integer's type. Loosely speaking, these unsigned integer operations discard high bits upon overflow, and programs may rely on ``wrap around''.

@crawshaw
Copy link
Member

@nigeltao

@mvdan
Copy link
Member

mvdan commented Oct 21, 2017

Seems like there are two occurences, too:

 $ gogrep -x 'if $x > $y { $_ } else { $_ }; $_' -g '$_ * $_' image/...
color/color.go:316:2: if x > y { d = x - y; } else { d = y - x; }; return (d * d) >> 2
draw/draw.go:568:2: if x > y { d = uint32(x - y); } else { d = uint32(y - x); }; return (d * d) >> 2

@nigeltao
Copy link
Contributor

It's acceptable as a CL, but the optimized func sqDiff would need a comment to explain correctness wrt overflow, and some tests, especially for inputs near 0 (both positive and negative) and near -0x80000000.

If that func sqDiff is copy/pasted twice, just comment and test one of them (probably color/color.go being canonical) and add a shorter comment in the other pointing to the first one.

@gopherbot
Copy link

Change https://golang.org/cl/72373 mentions this issue: image/draw, image/color: optimize hot path sqDiff function

@golang golang locked and limited conversation to collaborators Oct 27, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants