Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto/cipher: optimize safeXORBytes #35381

Open
josharian opened this issue Nov 5, 2019 · 11 comments
Open

crypto/cipher: optimize safeXORBytes #35381

josharian opened this issue Nov 5, 2019 · 11 comments
Labels
help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance Suggested Issues that may be good for new contributors looking for work to do.
Milestone

Comments

@josharian
Copy link
Contributor

The discussion at #31586 includes several techniques for optimizing xor of byte slices, including using encoding/binary (with native endianness) and loop unrolling. It'd be nice to apply them to safeXORBytes, which is used on many non-amd64 platforms. An optimized safeXORBytes might even be fast enough that we could delete fastXORBytes.

Somewhat related: #30553

cc @nhooyr

@josharian josharian added Suggested Issues that may be good for new contributors looking for work to do. Performance help wanted labels Nov 5, 2019
@josharian josharian added this to the Unplanned milestone Nov 5, 2019
@josharian
Copy link
Contributor Author

(Aside: I've wanted encoding/binary.NativeEndian to exist on multiple occasions. Here's another one.)

@nhooyr
Copy link
Contributor

nhooyr commented Nov 6, 2019

Will give this a shot on the weekend.

@nhooyr
Copy link
Contributor

nhooyr commented Nov 6, 2019

(Aside: I've wanted encoding/binary.NativeEndian to exist on multiple occasions. Here's another one.)

I agree this would be very useful. Some discussion in the past: https://groups.google.com/forum/#!topic/golang-nuts/3GEzwKfRRQw

@josharian
Copy link
Contributor Author

Will give this a shot on the weekend.

Awesome. No rush, in any case—the freeze just started, so nothing will go in for three months.

Thanks for the pointer to previous discussion of native endianness. I’ll file a proposal to discuss.

@bcmills
Copy link
Contributor

bcmills commented Nov 6, 2019

If you implement a generic, fast xor function, it could probably stand to be used in x/crypto/sha3 — the current implementation there is somewhat dubious due to alignment assumptions.

(See CL 203837.)

CC @FiloSottile

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 6, 2019
@josharian
Copy link
Contributor Author

@bcmills good point, thanks. If we put it in a place that both x/crypto and crypto/cipher can reach it, that'd presumably also resolve #30553.

@josharian
Copy link
Contributor Author

Proposal at #35398.

nhooyr added a commit to nhooyr/go that referenced this issue Feb 9, 2020
@nhooyr
Copy link
Contributor

nhooyr commented Feb 9, 2020

Added in nhooyr@24ce00e

Results against xorBytesSSE2 on amd64:

$ /Users/nhooyr/src/golang/go/bin/go test -run=_ -bench=XORBytes
goos: darwin
goarch: amd64
pkg: crypto/cipher
BenchmarkXORBytes/8Bytes-8    	176461338	         6.70 ns/op	1194.11 MB/s
BenchmarkXORBytes/128Bytes-8  	100000000	        11.6 ns/op	11062.06 MB/s
BenchmarkXORBytes/2048Bytes-8 	12895827	        95.2 ns/op	21517.42 MB/s
BenchmarkXORBytes/32768Bytes-8         	  627291	      1596 ns/op	20527.27 MB/s
BenchmarkFastSafeXORBytes/8Bytes-8     	147817434	         7.92 ns/op	1009.53 MB/s
BenchmarkFastSafeXORBytes/128Bytes-8   	66896612	        17.2 ns/op	7438.88 MB/s
BenchmarkFastSafeXORBytes/2048Bytes-8  	 6217075	       185 ns/op	11069.10 MB/s
BenchmarkFastSafeXORBytes/32768Bytes-8 	  397446	      3007 ns/op	10897.15 MB/s
PASS
ok  	crypto/cipher	12.014s

So safe fast is still about 2x as slow. Still 10x faster in my testing than basic simple xor though.

@nhooyr
Copy link
Contributor

nhooyr commented Feb 9, 2020

nhooyr@68261d7

Added safe xor to the benchmark:

$ /Users/nhooyr/src/golang/go/bin/go test -run=_ -bench=XORBytes
goos: darwin
goarch: amd64
pkg: crypto/cipher
BenchmarkXORBytes/default/8B-8      	180542707	         6.79 ns/op	1178.83 MB/s
BenchmarkXORBytes/safe/8B-8         	135818607	         8.73 ns/op	 916.11 MB/s
BenchmarkXORBytes/fastSafe/8B-8     	139375618	         8.62 ns/op	 927.86 MB/s
BenchmarkXORBytes/default#01/128B-8 	100000000	        10.9 ns/op	11787.06 MB/s
BenchmarkXORBytes/safe#01/128B-8    	13410099	        83.8 ns/op	1527.65 MB/s
BenchmarkXORBytes/fastSafe#01/128B-8         	60092373	        19.7 ns/op	6490.28 MB/s
BenchmarkXORBytes/default#02/2048B-8         	13841852	        83.5 ns/op	24532.14 MB/s
BenchmarkXORBytes/safe#02/2048B-8            	  974487	      1180 ns/op	1735.13 MB/s
BenchmarkXORBytes/fastSafe#02/2048B-8        	 5222248	       220 ns/op	9314.25 MB/s
BenchmarkXORBytes/default#03/32768B-8        	  855936	      1448 ns/op	22634.24 MB/s
BenchmarkXORBytes/safe#03/32768B-8           	   61302	     19043 ns/op	1720.75 MB/s
BenchmarkXORBytes/fastSafe#03/32768B-8       	  332857	      3440 ns/op	9524.70 MB/s
PASS
ok  	crypto/cipher	17.433s

Sometimes nearly 3x as slow as the current default.

@josharian
Copy link
Contributor Author

cc @FiloSottile for removing assembly in favor of pure Go

Filippo, is 2x-3x in this case within range of switching to pure Go?

There may also be further optimizations available (either in the pure Go or in the compiler); I haven't checked yet, and may not for a little while.

cc @CAFxX who I believe likes optimizing stuff like this

tbeeck added a commit to tbeeck/go that referenced this issue Jan 17, 2022
Optimize safeXORBytes based on this conversation: golang#35381
tbeeck added a commit to tbeeck/go that referenced this issue Jan 17, 2022
Optimize safeXORBytes based on this conversation: golang#35381
@tbeeck
Copy link

tbeeck commented Jan 17, 2022

Hello all! I gave this optimization a shot in the PR above. First time contributing to golang so looking for all kinds of feedback 🙇

tbeeck added a commit to tbeeck/go that referenced this issue Jan 20, 2022
Optimize safeXORBytes based on this conversation: golang#35381
tbeeck added a commit to tbeeck/go that referenced this issue Feb 1, 2022
Optimize safeXORBytes based on this conversation: golang#35381
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance Suggested Issues that may be good for new contributors looking for work to do.
Projects
None yet
Development

No branches or pull requests

4 participants