Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: memhash: use simple loop for short strings #67619

Open
adonovan opened this issue May 23, 2024 · 5 comments
Open

runtime: memhash: use simple loop for short strings #67619

adonovan opened this issue May 23, 2024 · 5 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented May 23, 2024

I noticed in the course of developing starlark-go that maphash.String, which wraps runtime.memhash and thus runtime.aeshashbody, is slower than a trivial loop-based hash such as FNV for short strings (len <= 12). You might want to see whether there are gains to be made by changing runtime.memhash.

In the M1 benchmark below, "hard" is the AES 128-bit hash, soft is the FNV byte-at-a-time hash.

BenchmarkStringHash/hard-1-8         	425380128	         2.724 ns/op // slower
BenchmarkStringHash/soft-1-8         	968800584	         1.253 ns/op
BenchmarkStringHash/hard-2-8         	285782469	         4.155 ns/op // slower
BenchmarkStringHash/soft-2-8         	482670777	         2.543 ns/op
BenchmarkStringHash/hard-4-8         	238793694	         5.426 ns/op // slower
BenchmarkStringHash/soft-4-8         	359522328	         3.318 ns/op
BenchmarkStringHash/hard-8-8         	180634999	         6.658 ns/op // slower
BenchmarkStringHash/soft-8-8         	241373414	         4.995 ns/op
BenchmarkStringHash/hard-16-8        	205045033	         5.823 ns/op
BenchmarkStringHash/soft-16-8        	144273086	         8.322 ns/op
BenchmarkStringHash/hard-32-8        	206178392	         5.844 ns/op
BenchmarkStringHash/soft-32-8        	80871846	        14.96 ns/op
BenchmarkStringHash/hard-64-8        	175336156	         6.891 ns/op
BenchmarkStringHash/soft-64-8        	42610798	        28.30 ns/op
BenchmarkStringHash/hard-128-8       	125781333	         9.542 ns/op
BenchmarkStringHash/soft-128-8       	18739946	        64.47 ns/op
BenchmarkStringHash/hard-256-8       	59081041	        20.63 ns/op
BenchmarkStringHash/soft-256-8       	10185776	       118.5 ns/op
BenchmarkStringHash/hard-512-8       	23744743	        51.08 ns/op
BenchmarkStringHash/soft-512-8       	 5342551	       224.6 ns/op
BenchmarkStringHash/hard-1024-8      	10823535	       111.2 ns/op
BenchmarkStringHash/soft-1024-8      	 2731977	       438.0 ns/op
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 23, 2024
@cagedmantis cagedmantis added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 23, 2024
@cagedmantis
Copy link
Contributor

cc @golang/runtime

@randall77
Copy link
Contributor

What does the benchmark code look like?

BenchmarkStringHash/hard-1-8         	425380128	         2.724 ns/op // slower
BenchmarkStringHash/hard-2-8         	285782469	         4.155 ns/op // slower
BenchmarkStringHash/hard-4-8         	238793694	         5.426 ns/op // slower
BenchmarkStringHash/hard-8-8         	180634999	         6.658 ns/op // slower

This variance surprises me, because the hash for 1-15 bytes uses exactly the same hash operations. It only varies a bit in how it initializes the vector register with 1-15 bytes of string contents.

@adonovan
Copy link
Member Author

adonovan commented May 23, 2024

The benchmark is here, and the soft and hard algorithms are here, but note that I deleted the if len < 12 { soft } fallback logic in the numbers I posted above.

@mknyszek mknyszek added this to the Unplanned milestone May 29, 2024
@randall77
Copy link
Contributor

One of the things to consider here is that the current hash is designed to make it hard for an adversary to engineer collisions. We'd want a similar property for any replacement. As far as I know it is easy to engineer collisions with FNV hashing. (But maybe it would be harder if we picked the prime randomly somehow?)
Maybe for 1 and possibly 2 byte hashes collisions aren't that big a deal.

@ericlagergren
Copy link
Contributor

ericlagergren commented Jun 1, 2024

Naively, the variance makes sense for the arm64 aeshash.

Presumably some of the overhead for 12-byte strings is just because aeshash is implemented in assembly and is opaque to the compiler.

Perhaps it makes sense to use memhashFallback for short strings?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance
Projects
Development

No branches or pull requests

6 participants