Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: maps,cmd/compile: special case and/or add function for getting an arbitrary entry from a map #66361

Open
bradfitz opened this issue Mar 17, 2024 · 22 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal
Milestone

Comments

@bradfitz
Copy link
Contributor

Proposal Details

Sometimes people try to get a random key/value out of a map, writing:

       for k, v = range m {
            break
       }

One example is/was in database/sql. Unfortunately, it uses a lot of CPU:

Screenshot 2024-03-16 at 8 57 01 PM

So I rewrote it in https://go-review.googlesource.com/c/go/+/572119

An alternative is the compiler could recognize that pattern: a range with a body containing only a break.

If so, the compiler could insert a call to a runtime func that takes 2 pointers (either/both of which may be nil) and updates them to a random map value if the map has non-zero length. If it's empty, the pointers are unmodified.

/cc @golang/compiler @maisem @raggi

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Mar 17, 2024
@gopherbot
Copy link

Change https://go.dev/cl/572119 mentions this issue: database/sql: optimize connection request pool

@mdempsky
Copy link
Member

We could add:

package maps

func AnyElem[Key comparable, Val any](m map[Key, Val]) (Key, Val, bool)

And happen to implement that with a compiler optimization that recognizes the way users often write it today. That way existing users benefit, and future users have a more convenient and reliable spelling.

@seankhliao
Copy link
Member

it appears to be somewhat common https://github.com/search?q=language%3AGo+%2Ffor.*%3D.*range.*%5Cn%5Cs*break%24%2F&type=code

@jimmyfrasche
Copy link
Member

An alternate, more general, approach would be to expose some way to get the nth map entry. That may be too general but it's worth exploring, if just for contrast.

Let's say there's

package unsafe
func Entry(m MapType, n IntegerType) (ArbitraryType, ArbitraryType)

It does not provide a fixed order over the map, but as long as you do not modify or iterate over the map the same n will return the same entry. It need not be O(1).

With that selecting an arbitrary element is unsafe.Entry(m, 0) and choosing a random element is unsafe.Entry(m, rand.IntN(len(m))). This would also let you draw k>1 elements from the map.

Of course if you want an arbitrary element a random one will do and you could avoid exposing it by just having a func in rand that calls the equivalent of unsafe.Entry(m, rand.IntN(len(m))) for you, but the lack of generic methods make that a bit cumbersome.

@mdempsky
Copy link
Member

An alternate, more general, approach would be to expose some way to get the nth map entry.

As exposed to users today, map entries are unordered, so there's no unique "nth map entry." We could change that and define some order, but that seems like a substantially broader change than proposed here, and I'm skeptical that's something we want to pursue without use cases.

@Jorropo
Copy link
Member

Jorropo commented Mar 17, 2024

A function would be weird because it becomes O(cap(m)) on map with small ratios.
If you grow a map and remove many elements, this would be expensive operation (because we never shrink maps).

@Jorropo
Copy link
Member

Jorropo commented Mar 17, 2024

@bradfitz I'm not convinced a fast path would improve anything here, scanning the empty slots seems like a slow thing to do.

package a

import "fmt"
import "testing"

var doNotOptimizeK, doNotOptimizeV uint

func benchmark(b *testing.B, cap, len uint) {
	b.Run(fmt.Sprintf("%d/%d", cap, len), func(b *testing.B) {
		m := make(map[uint]uint, cap)
		for i := range len {
			m[i] = i
		}
		b.ResetTimer()
		for range b.N {
			for doNotOptimizeK, doNotOptimizeV = range m {
				break
			}
		}
	})
}

func BenchmarkMap(b *testing.B) {
	for cap := uint(1); cap < 1000000; cap *= 7 {
		for len := uint(1); len <= cap; len *= 7 {
			benchmark(b, cap, len)
		}
	}
}
goos: linux
goarch: amd64
cpu: AMD Ryzen 5 3600 6-Core Processor              
BenchmarkMap/1/1-12         	39464058	        30.63 ns/op
BenchmarkMap/7/1-12         	40099461	        30.96 ns/op
BenchmarkMap/7/7-12         	56495338	        21.27 ns/op
BenchmarkMap/49/1-12        	20483455	        56.14 ns/op
BenchmarkMap/49/7-12        	29672593	        40.95 ns/op
BenchmarkMap/49/49-12       	46328062	        26.57 ns/op
BenchmarkMap/343/1-12       	 5146122	       239.7 ns/op
BenchmarkMap/343/7-12       	13583094	        80.50 ns/op
BenchmarkMap/343/49-12      	28516389	        39.72 ns/op
BenchmarkMap/343/343-12     	42983257	        28.27 ns/op
BenchmarkMap/2401/1-12      	  722722	      1616 ns/op
BenchmarkMap/2401/7-12      	 2753096	       521.7 ns/op
BenchmarkMap/2401/49-12     	14052568	        86.84 ns/op
BenchmarkMap/2401/343-12    	29775351	        38.74 ns/op
BenchmarkMap/2401/2401-12   	41705300	        30.31 ns/op
BenchmarkMap/16807/1-12     	   94954	     12596 ns/op
BenchmarkMap/16807/7-12     	  406827	      3432 ns/op
BenchmarkMap/16807/49-12    	 2267532	       465.1 ns/op
BenchmarkMap/16807/343-12   	10755490	        96.90 ns/op
BenchmarkMap/16807/2401-12  	27650230	        43.03 ns/op
BenchmarkMap/16807/16807-12 	37117978	        32.77 ns/op
BenchmarkMap/117649/1-12    	   11602	    100873 ns/op
BenchmarkMap/117649/7-12    	   38211	     30643 ns/op
BenchmarkMap/117649/49-12   	  325615	      4009 ns/op
BenchmarkMap/117649/343-12  	 1842660	       640.4 ns/op
BenchmarkMap/117649/2401-12 	 9241438	       128.8 ns/op
BenchmarkMap/117649/16807-12         	21193573	        50.61 ns/op
BenchmarkMap/117649/117649-12        	31642425	        38.37 ns/op
BenchmarkMap/823543/1-12             	    2823	    426807 ns/op
BenchmarkMap/823543/7-12             	   10000	    118543 ns/op
BenchmarkMap/823543/49-12            	   67795	     17367 ns/op
BenchmarkMap/823543/343-12           	  466880	      2473 ns/op
BenchmarkMap/823543/2401-12          	 2689250	       422.2 ns/op
BenchmarkMap/823543/16807-12         	 8709276	       124.7 ns/op
BenchmarkMap/823543/117649-12        	18674678	        62.27 ns/op
BenchmarkMap/823543/823543-12        	24420856	        48.04 ns/op
PASS
ok  	command-line-arguments	60.335s

It easily do ~30ns when the fill up ratio is good, so your ~10s suggest you are doing 329 million requests (assuming the ratio is good). I know this is an extremely limited comparison, but are you doing in the range of 3b~32m requests ? If you are doing far less, your map is poorly filled up so there is gonna be lots of scanning of empty cells involved even with a compiler help (more state could be added to skip some scanning but that a change to the runtime more than the compiler).

@bradfitz
Copy link
Contributor Author

@Jorropo, it's a very heavily loaded server. I did the optimization based on data, not on a whim or for fun.

@Jorropo
Copy link
Member

Jorropo commented Mar 17, 2024

The patch you sent make sense.
I'm saying I don't think adding a compiler fast path is gonna be a substantial improvement because a mostly empty sparse map is the wrong datastructure to pick random elements from.
How long was the profile you took ? or how many requests did it handled ? this can be used to estimate if the map is poorly filled up or not.

@pkositsyn
Copy link
Contributor

I think it is generally unobvious, that it's not an even distribution by taking the first element of iteration (https://go.dev/play/p/n50sJuL1O6s and for other lens it's still not even).

Do we actually want to encourage "getting random element directly from a map"? This clearly adds a confusion about the distribution

@gophun
Copy link

gophun commented Mar 18, 2024

@pkositsyn I assume that the intention is to get an 'arbitrary' element (which could be the same each time) instead of a 'random' element. The specification leaves the order undefined. So, it is theoretically possible that another Go implementation (or this one in the future, albeit unlikely) implements an ordered map.

@bboreham
Copy link
Contributor

Anecdata: I spent many hours tracking down odd behaviour in a gossip system due to the skewed distribution of this pattern.

@pkositsyn
Copy link
Contributor

pkositsyn commented Mar 18, 2024

If it is about getting "any" element I suggest renaming the issue to "getting arbitrary element entry in map"

gopherbot pushed a commit that referenced this issue Mar 18, 2024
This replaces a map used as a set with a slice.

We were using a surprising amount of CPU in this code, making mapiters
to pull out a random element of the map. Instead, just rand.IntN to pick
a random element of the slice.

It also adds a benchmark:

                     │    before    │                after                │
                     │    sec/op    │   sec/op     vs base                │
    ConnRequestSet-8   1818.0n ± 0%   452.4n ± 0%  -75.12% (p=0.000 n=10)

(whether random is a good policy is a bigger question, but this
 optimizes the current policy without changing behavior)

Updates #66361

Change-Id: I3d456a819cc720c2d18e1befffd2657e5f50f1e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/572119
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Brad Fitzpatrick <bradfitz@golang.org>
@mdempsky
Copy link
Member

Pedantically, "random" does not imply "uniform distribution", but I appreciate that many users conflate the two. I agree "any", "some", or "arbitrary" seem like better words to use in user-facing naming and documentation to minimize confusion if we don't guarantee a uniform distribution. (I would think we could guarantee range always starts at a uniform random map entry; but that seems like a separate proposal, which I expect has already been discussed and rejected too.)

I don't think it's necessary to retitle the issue though.

@CAFxX
Copy link
Contributor

CAFxX commented Mar 20, 2024

An alternative is the compiler could recognize that pattern: a range with a body containing only a break.

I think it's worth pointing out that the more generic patterns in which a single entry is processed inside the loop seems to be much more common than the simple pattern where the loop body contains only break (as of writing ~18k vs. ~1.2k; while it's true the former count includes the latter, it's still at least an order of magnitude more common). In this case, I would hope the same transformation would apply also to any map iteration loop that we can "easily" prove terminates after a single iteration, so e.g. things like

// delete one element from a non-empty map
for k := range m {
	delete(m, k)
	break
}

// process a single element in a non-empty map
for k, v := range m {
	// some statement operating on m, k, and/or v - e.g. one of the following:
	// - foo(m, k, v) 
	// - m[k] = ...
	// - foo(v); delete(m, k)
	// - if some_condition { ... } (as long as control flow does not diverge)
	break
}

@randall77
Copy link
Contributor

I would think we could guarantee range always starts at a uniform random map entry; but that seems like a separate proposal, which I expect has already been discussed and rejected too.

I don't see any practical way to guarantee uniform distribution in ω(n) time. At least, given multiple iterations from the same map. (Uniform distribution among multiple distinct maps with the same contents, or multiple binary invocations might be easier.)

I think "arbitrary" would be the only way forward here.

@Jorropo
Copy link
Member

Jorropo commented Mar 20, 2024

I don't see any practical way to guarantee uniform distribution in ω(n) time. At least, given multiple iterations from the same map. (Uniform distribution among multiple distinct maps with the same contents, or multiple binary invocations might be easier.)

We can use the number of iterations from the initial random guess until we found a valid value to measure the bias towards that value and use a second random call to pick (or continue iterating) this value based on the bias.

The worst case is gonna be scanning the whole map (like currently) tho.

@randall77
Copy link
Contributor

We can use the number of iterations from the initial random guess until we found a valid value to measure the bias towards that value and use a second random call to pick (or continue iterating) this value based on it's bias.

Well, the bias is not just the number empty slots until you find a valid value, but the number of empty slots since the previous valid value. So you'd have to at least walk in the reverse direction some distance. And then do the bias calculation and compare to len(m)/size(m), and use that to determine whether to continue. But maybe that would be ω(n) overall...
Also maps are currently not simple arrays of present/not-present. There are overflow buckets which throw a wrench into things. Those would also have to be accounted for somehow.

@Jorropo
Copy link
Member

Jorropo commented Mar 20, 2024

Right I did left out all the tricky details.
I don't think we need to iterate in reverse some non linear transform on the bias could make the high bias counts compensate the times we get lucky and measure a low bias for a value with a high bias because we hit near to it.
Edit: now that I think about it, it's probably more expensive than iterating backward unless the map is awfully filed up, would need benchmarks.

@zephyrtronium
Copy link
Contributor

Some mathematical background that might be helpful: The problem of uniformly sampling from a collection is reservoir sampling, and it can be done optimally in O(log n) time for a single sample. The basic idea is to assign each valid item a uniform random number and choose the one with the largest value. The more clever idea is to draw distances between successive largest values from the Gumbel distribution, which requires two floating point logarithms per sample (plus an exponential if drawing more than one sample).

It's been a while since I've looked at Go's maps in particular, but for a basic hash map, I would assume that accounting for empty slots probably still requires O(n) rather than ω(n) iterations. Probably the best improvement you get is exiting early when your next slot position exceeds the size of the map, which can still be implemented in terms of a naïve iterator.

@mknyszek mknyszek added this to the Backlog milestone Mar 20, 2024
@mknyszek mknyszek changed the title cmd/compile: special case getting a random map entry? proposal: maps,cmd/compile: special case and/or add function for getting an arbitrary entry from a map Mar 20, 2024
@adonovan
Copy link
Member

adonovan commented Mar 27, 2024

Most often when I use range m { ...; break}, I know that the map contains a single element and wish there was a library function maps.Sole:

package maps

// Sole returns the sole entry of map m. It panics if len(m) != 1.
func Sole[K comparable, V any](m map[K]V) (K, V) {
   if n := len(m); n > 1 {
     panic(fmt.Sprintf("map contains %d entries", n))
   }
   for k, v := range m { return k, v }
   panic("empty map")
}

bradfitz added a commit to tailscale/go that referenced this issue Apr 3, 2024
This replaces a map used as a set with a slice.

We were using a surprising amount of CPU in this code, making mapiters
to pull out a random element of the map. Instead, just rand.IntN to pick
a random element of the slice.

It also adds a benchmark:

                     │    before    │                after                │
                     │    sec/op    │   sec/op     vs base                │
    ConnRequestSet-8   1818.0n ± 0%   452.4n ± 0%  -75.12% (p=0.000 n=10)

(whether random is a good policy is a bigger question, but this
 optimizes the current policy without changing behavior)

Updates golang#66361

Change-Id: I3d456a819cc720c2d18e1befffd2657e5f50f1e7
Reviewed-on: https://go-review.googlesource.com/c/go/+/572119
Reviewed-by: Cherry Mui <cherryyz@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Emmanuel Odeke <emmanuel@orijtech.com>
Reviewed-by: David Chase <drchase@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Brad Fitzpatrick <bradfitz@golang.org>
@rsc
Copy link
Contributor

rsc commented Apr 3, 2024

I don't think we need to worry about giving every element exactly the same chance for the start of an iteration.

Special-casing this:

   for k, v = range m {
        break
   }

to be faster seems reasonable and does not require a proposal. Is anyone interested?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Proposal
Projects
Status: Incoming
Development

No branches or pull requests