Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: preallocate some overflow buckets when map size hint provided #19931

Closed
josharian opened this issue Apr 11, 2017 · 10 comments
Closed

runtime: preallocate some overflow buckets when map size hint provided #19931

josharian opened this issue Apr 11, 2017 · 10 comments

Comments

@josharian
Copy link
Contributor

When populating a large map, there will ~always be overflow. Currently, the map size hint is used to allocate an initial set of buckets, but not to allocate any overflow buckets. As a result, even if you provide a perfect hint to the runtime, populating a large map allocates a lot of times. A quick simulation suggests that adding 100,000 ints to a perfectly-hinted map allocates about 2700 overflow buckets.

Work require to implement this:

  • Come up with a good estimator for the number of expected overflow buckets based on hint.
  • Find a place to track those pre-allocated overflow buckets, preferably without growing hmap.

cc @randall77

@josharian josharian added this to the Go1.9Maybe milestone Apr 11, 2017
@griesemer
Copy link
Contributor

griesemer commented Apr 11, 2017

See also #19903, #19926.

@josharian
Copy link
Contributor Author

Started on this. Getting a good estimator is proving to be non-trivial; it depends on both the absolute size and how close the resulting map is to loadFactor.

gopherbot pushed a commit that referenced this issue Apr 19, 2017
This simplifies the code, as well as providing
a single place to modify to change the
allocation of new overflow buckets.

Updates #19931
Updates #19992

Change-Id: I77070619f5c8fe449bbc35278278bca5eda780f2
Reviewed-on: https://go-review.googlesource.com/40975
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
gopherbot pushed a commit that referenced this issue Apr 19, 2017
Any change to how we allocate overflow buckets
will require some extra hmap storage,
but we don't want hmap to grow,
particular as small maps usually don't need overflow buckets.

This CL converts the existing hmap overflow field,
which is usually used for pointer-free maps,
into a generic extra field.

This extra field can be used to hold data that is optional.
If it is valuable enough to do have special
handling of overflow buckets, which are medium-sized,
it is valuable enough to pay an extra alloc and two extra words for.

Adding fields to extra would entail adding overhead to pointer-free maps;
any mapextra fields added would need to be weighed against that.
This CL is just rearrangement, though.

Updates #19931
Updates #19992

Change-Id: If8537a206905b9d4dc6cd9d886184ece671b3f80
Reviewed-on: https://go-review.googlesource.com/40976
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
gopherbot pushed a commit that referenced this issue Apr 19, 2017
bmap already has a overflow (getter) method.
Add a setoverflow (setter) method, for readability.

Updates #19931
Updates #19992

Change-Id: I00b3d94037c0d75508a7ebd51085c5c3857fb764
Reviewed-on: https://go-review.googlesource.com/40977
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
gopherbot pushed a commit that referenced this issue Apr 19, 2017
Updates #19931
Updates #19992

Change-Id: Ib2d4e6b9b89a49caa443310d896dce8d6db06050
Reviewed-on: https://go-review.googlesource.com/40978
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
gopherbot pushed a commit that referenced this issue Apr 19, 2017
When allocating a non-small array of buckets for a map,
also preallocate some overflow buckets.

The estimate of the number of overflow buckets
is based on a simulation of putting mid=(low+high)/2 elements
into a map, where low is the minimum number of elements
needed to reach this value of b (according to overLoadFactor),
and high is the maximum number of elements possible
to put in this value of b (according to overLoadFactor).
This estimate is surprisingly reliable and accurate.

The number of overflow buckets needed is quadratic,
for a fixed value of b.
Using this mid estimate means that we will overallocate a few
too many overflow buckets when the actual number of elements is near low,
and underallocate significantly too few overflow buckets
when the actual number of elements is near high.

The mechanism introduced in this CL can be re-used for
other overflow bucket optimizations.

For example, given an initial size hint,
we could estimate quite precisely the number of overflow buckets.
This is #19931.

We could also change from "non-nil means end-of-list"
to "pointer-to-hmap.buckets means end-of-list",
and then create a linked list of reusable overflow buckets
when they are freed by map growth.
That is #19992.

We could also use a similar mechanism to do bulk allocation
of overflow buckets.
All these uses can co-exist with only the one additional pointer
in mapextra, given a little care.

name                  old time/op    new time/op    delta
MapPopulate/1-8         60.1ns ± 2%    60.3ns ± 2%     ~     (p=0.278 n=19+20)
MapPopulate/10-8         577ns ± 1%     578ns ± 1%     ~     (p=0.140 n=20+20)
MapPopulate/100-8       8.06µs ± 1%    8.19µs ± 1%   +1.67%  (p=0.000 n=20+20)
MapPopulate/1000-8       104µs ± 1%     104µs ± 1%     ~     (p=0.317 n=20+20)
MapPopulate/10000-8      891µs ± 1%     888µs ± 1%     ~     (p=0.101 n=19+20)
MapPopulate/100000-8    8.61ms ± 1%    8.58ms ± 0%   -0.34%  (p=0.009 n=20+17)

name                  old alloc/op   new alloc/op   delta
MapPopulate/1-8          0.00B          0.00B          ~     (all equal)
MapPopulate/10-8          179B ± 0%      179B ± 0%     ~     (all equal)
MapPopulate/100-8       3.33kB ± 0%    3.38kB ± 0%   +1.48%  (p=0.000 n=20+16)
MapPopulate/1000-8      55.5kB ± 0%    53.4kB ± 0%   -3.84%  (p=0.000 n=19+20)
MapPopulate/10000-8      432kB ± 0%     428kB ± 0%   -1.06%  (p=0.000 n=19+20)
MapPopulate/100000-8    3.65MB ± 0%    3.62MB ± 0%   -0.70%  (p=0.000 n=20+20)

name                  old allocs/op  new allocs/op  delta
MapPopulate/1-8           0.00           0.00          ~     (all equal)
MapPopulate/10-8          1.00 ± 0%      1.00 ± 0%     ~     (all equal)
MapPopulate/100-8         18.0 ± 0%      17.0 ± 0%   -5.56%  (p=0.000 n=20+20)
MapPopulate/1000-8        96.0 ± 0%      72.6 ± 1%  -24.38%  (p=0.000 n=20+20)
MapPopulate/10000-8        625 ± 0%       319 ± 0%  -48.86%  (p=0.000 n=20+20)
MapPopulate/100000-8     6.23k ± 0%     4.00k ± 0%  -35.79%  (p=0.000 n=20+20)

Change-Id: I01f41cb1374bdb99ccedbc00d04fb9ae43daa204
Reviewed-on: https://go-review.googlesource.com/40979
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@josharian
Copy link
Contributor Author

This is partly done in CL 40979, insofar as the hint guides the number of buckets, and the number of buckets now guides the number of preallocated overflow buckets. It could be refined further, using the mechanisms of CL 40979, by estimating more precisely the number of overflow buckets needed for the hint, rather than taking the midpoint of the sizes for that its bucket range.

@josharian josharian modified the milestones: Unplanned, Go1.9Maybe Apr 19, 2017
@gopherbot
Copy link

CL https://golang.org/cl/40976 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/40975 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/40977 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/40979 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/40978 mentions this issue.

@randall77
Copy link
Contributor

Josh, anything else to do here?

@josharian
Copy link
Contributor Author

We could improve the estimation (along the lines of #19931 (comment)), but it's a low enough priority that I think we can call this done.

@golang golang locked and limited conversation to collaborators Apr 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants