
reflect: map iteration does unnecessary excessive allocation for non-pointer types #32424

Closed
ugorji opened this issue Jun 4, 2019 · 12 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. Performance


ugorji commented Jun 4, 2019

Given a map[string]*T, map[uint16]T, or many other map types, iterating via reflection is expensive, because for very many key and value types a new value is allocated on the heap for each entry.

In a call to reflect.MapKeys, MapIndex, MapIter.Key, MapIter.Value, the implementation includes the following code:

func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		// Copy result so future changes to the map won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

The ifaceIndir call returns true for very many types, including string, intXXX, uintXXX, bool, []uint8 ([]byte), etc.

This causes an allocation each time a map key or value is retrieved.

Iterating through a large map[int][]byte for example, will cause an allocation for each key or value retrieved from the map.

Without reflection, this doesn't happen. And it need not happen.

This causes an explosion in memory use in my benchmarks. Here, I have a map[string]*codec.stringUint64T with about 8192 entries, along with several other maps.

Sample benchmark result:

Benchmark__Json_______Encode-8   	     625	   7503851 ns/op	     432 B/op	       4 allocs/op
Benchmark__Json_______Encode-8   	     297	  15948731 ns/op	  528784 B/op	   32879 allocs/op

The first result uses a manual range loop, the second reflection. The number of allocations jumped from 4 to 32879 per op.


To illustrate, I ran my benchmark which includes a map[string]*codec.stringUint64T with about 8192 entries.

go test -run Nothing -tags "alltests codecgen" -bench "__Json_*En" -benchmem -benchtime 1x  | grep -v -E "^(goos:|goarch:|pkg:|PASS|ok)"

I then instrumented the copyVal function to validate it.


var copyValMap = make(map[*rtype]bool)
var copyValMapNoIface = make(map[*rtype]bool)

// copyVal returns a Value containing the map key or value at ptr,
// allocating a new variable as needed.
func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		if _, ok := copyValMap[typ]; !ok {
			println("copyVal: ifaceIndir: ", typ, typ.String())
			copyValMap[typ] = true
		}
		// Copy result so future changes to the map won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	if _, ok := copyValMapNoIface[typ]; !ok {
		println("copyVal: NO   iface: ", typ, typ.String())
		copyValMapNoIface[typ] = true
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

copyVal: ifaceIndir:  0x14d8880 string
copyVal: ifaceIndir:  0x14d7b00 int64
copyVal: ifaceIndir:  0x14d2380 []uint8
copyVal: NO   iface:  0x1521500 *codec.stringUint64T
copyVal: ifaceIndir:  0x14d8bc0 uint16
copyVal: NO   iface:  0x15213c0 *codec.TestStruc
copyVal: ifaceIndir:  0x1553820 codec.TestStruc

I then instrumented copyVal again to track the number of times that the allocation happened.
It was as expected (>32000 times).

var copyValMap = make(map[*rtype]int)
var copyValMapNoIface = make(map[*rtype]int)

// copyVal returns a Value containing the map key or value at ptr,
// allocating a new variable as needed.
func copyVal(typ *rtype, fl flag, ptr unsafe.Pointer) Value {
	if ifaceIndir(typ) {
		copyValMap[typ]++
		if copyValMap[typ]%1000 == 0 {
			println("copyVal: ifaceIndir: ", typ, typ.String(), copyValMap[typ])
		}
		// Copy result so future changes to the map
		// won't change the underlying value.
		c := unsafe_New(typ)
		typedmemmove(typ, c, ptr)
		return Value{typ, c, fl | flagIndir}
	}
	copyValMapNoIface[typ]++
	if copyValMapNoIface[typ]%1000 == 0 {
		println("copyVal: NO   iface: ", typ, typ.String(), copyValMapNoIface[typ])
	}
	return Value{typ, *(*unsafe.Pointer)(ptr), fl}
}

copyVal: ifaceIndir:  0x14d8960 string 32000
copyVal: NO   iface:  0x15215e0 *codec.stringUint64T 32000

What version of Go are you using (go version)?

$ go version
go version devel +1531192272 Mon May 27 17:58:39 2019 +0000 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/bt/526m392x16x8y149jwx28mkm0000gp/T/go-build760118559=/tmp/go-build -gno-record-gcc-switches -fno-common"



ugorji commented Jun 4, 2019

See attached

github_golang_issue_32424_test.go.txt

Please rename the file to .go, and run as below:

mv github_golang_issue_32424_test.go.txt github_golang_issue_32424_test.go
go test -tags "32424" -run Nothing -bench 'Map(Range|ReflectIter|ReflectIndex)' -benchmem  -cpuprofile cpu.out -memprofile mem.out -memprofilerate 1

Got results:

BenchmarkMapRange-8          	   89374	     12812 ns/op	       0 B/op	       0 allocs/op
BenchmarkMapReflectIter-8    	     660	   1805001 ns/op	   16546 B/op	    1026 allocs/op
BenchmarkMapReflectIndex-8   	     576	   2146006 ns/op	   41088 B/op	    1026 allocs/op

For an operation that shouldn't have any allocations, we have a lot.

$ go tool pprof mem.out 
(pprof) top 2
Showing nodes accounting for 38.34MB, 93.02% of 41.21MB total
Dropped 163 nodes (cum <= 0.21MB)
Showing top 2 nodes out of 20
      flat  flat%   sum%        cum   cum%
   22.47MB 54.52% 54.52%    22.47MB 54.52%  reflect.copyVal
   15.87MB 38.50% 93.02%    26.51MB 64.32%  reflect.Value.MapKeys

The allocation comes mostly from copyVal, which I traced as being called each time a map key or value is converted into a reflect.Value; it calls unsafe_New because ifaceIndir(t) returns true.

In the MapIndex variant, the call to MapKeys additionally allocates a slice the same length as the map, which is one reason MapIter fares better. But MapIter still allocates because of the copyVal call.

As shown above, a normal range loop performs no allocation at all; the reflection path's allocations come entirely from copyVal, which performs a seemingly unnecessary allocation to compose a reflect.Value from each map key or value.


ugorji commented Jun 4, 2019

From my understanding, the reason for this is that reflection works off interface{}, and so we MUST create a pointer for each non-pointer that is wrapped in an interface.

I am hoping there is a way to create an API that works around this. For example, could we pass a reflect.Value into each call to MapIter.Key() or MapIter.Value(), to be Set with the value on each iteration?

@randall77

From my understanding, the reason for this is that reflection works off interface{}, and so we MUST create a pointer for each non-pointer that is wrapped in an interface.

Correct.

I am hoping there is a way to create an API that works around this. For example, could we pass a reflect.Value into each call to MapIter.Key() or MapIter.Value(), to be Set with the value on each iteration?

That's a possibility. That would be a pretty ugly API though.

In general there are lots of places (not just map iteration) where reflect is slow because we're putting non-pointers in interfaces. We've historically come down on the side of cleaner API even though there might be faster ways with an uglier API.

@josharian

For single byte things, we could use runtime.staticbytes. For non-large zero-valued things we could use runtime.zeroval. I don’t know how common those cases are.


ugorji commented Jun 4, 2019

@randall77

I am hoping there is a way to create an API that works around this. For example, could we pass a reflect.Value into each call to MapIter.Key() or MapIter.Value(), to be Set with the value on each iteration?

That's a possibility. That would be a pretty ugly API though.

In general there are lots of places (not just map iteration) where reflect is slow because we're putting non-pointers in interfaces. We've historically come down on the side of cleaner API even though there might be faster ways with an uglier API.

It's just that I can work around this for all other types, because I control their location in memory: for int/intN, uintN, floatN, bool, and string, I can put the value in a struct, slice, array, or standalone variable and take an addressable reference to it. For slices and arrays, all elements are already addressable (in general).

However, for maps, because an entry's slot can be reused or reclaimed under the hood, the allocated space is never exposed, and we MUST always re-allocate to expose an entry as an interface{}.

It is thus the case that only map iteration has no workaround.

This is why the option of passing a reference is the only workaround specifically for maps.

I am currently looking to mimic such an interface, leveraging unsafe and culling code from the reflection codebase, for the API below:

// mapIndex looks up key and sets val (expected to be settable).
// It returns false if there is no entry for that key.
mapIndex(mapp reflect.Value, key reflect.Value, val reflect.Value) bool

// mapIter creates an iterator, using key and val (both expected to be
// settable) to receive the next entry on each iteration.
// If key or val is invalid, it is treated as a _ in "for ... range".
//
// If Next() returns true, key and val have been set to the next
// key/value pair; if Next() returns false, there are no more entries.
mapIter(key, val reflect.Value) *mapIter
(*mapIter).Next()

IMO, it is not an inelegant API. Retrofitted to the current reflection API, it would look like:

// Look up key, and set its mapping into val, returning true if mapping exists
Value.MapIndexUsing(key, val reflect.Value) bool
// Create an iterator, using key and val to receive each key/value mapping.
// Treat either as _ if invalid (zero Value), allowing us to simulate
// for k = range, for _, v = range, and for k, v = range.
Value.MapRangeUsing(key, val reflect.Value) *MapIter

Again, I personally do not see this as an inelegant or unwieldy API.

My use case is the go-codec encoding library: https://github.com/ugorji/go. This is the one place where code generation is far and away more performant than the reflection-based path.

Other encoders in the space, e.g. json-iterator, went the way of completely rewriting the reflection package, which I think is fraught with peril: see https://github.com/json-iterator/go and https://github.com/modern-go/reflect2. But they have gotten a lot of mileage from doing so, achieving performance equivalent to statically generated code.

@randall77

We could add methods to *reflect.MapIter:

// SetKey does dst.Set(it.Key()).
func (it *MapIter) SetKey(dst Value) { ... }

(and the same for values)

I think that would provide an allocation-free path.

It seems strange to me that we'd provide this optimized path for map iteration, and not for other things that construct a Value. Just scanning value.go, you could make a case for *Recv, Select, Convert, MapIndex, Zero, and Append.


ugorji commented Jun 4, 2019

Nice - the methods on *reflect.MapIter work nicely!

I would love to have one for MapIndex also, as that is a common use case: that is the Value.MapIndexUsing(key, dst reflect.Value) bool above.

Specifically regarding Zero, I always assumed that it would be allocation-free. I thought it would be sufficient to return a readonly non-addressable and non-settable static value as the zero value. That way, every caller gets the same one, but they cannot do much with it except set something else with it or use it readonly.

Regarding Append, there is a somewhat allocation-free path with AppendSlice (as you can pass a slice of non-pointers). You can also pass Values that are addressable elements of a slice. So there is not much of a gap here.

The remaining gaps would be the chan ones (Recv, Select) and Convert. Those are seemingly less common in typical usage (in my limited experience - YMMV). I also do not have a common use for them, and do not have any good ideas for how to present them in the API.

ugorji added a commit to ugorji/go that referenced this issue Jun 8, 2019
We do this by introducing safe and unsafe 1.12+ variants for
map iteration and map indexing.

This is necessary due to golang/go#32424
- Map Iteration using reflection always causes allocation
  because every key and value might have to be converted to an
  interface{} first, meaning scalars may be allocated on the heap.
- Map Indexing also causes allocation

We work around both by implementing an allocation-free path
where a holder value is passed into the indexing or iterating operation,
and we copy the value into that holder.

Also, improve heuristics of decInferLen

Also, fix lint, staticcheck and other issues raised via static analysis.

Clean out xdebugf comments and remove much old commented code

Finally, make more changes to assist the bounds-check elimination pass of the compiler.
@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 10, 2019
@dmitshur dmitshur added this to the Go1.14 milestone Jun 10, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019

JAicewizard commented Jan 27, 2020

I personally would love this; map iteration is a major allocation pain point where it is used, and is significantly slower than either type-specific code or code generation. I am seeing a ~3x increase in encoding time after removing special cases for common maps (map[string]interface{}, map[interface{}]interface{}, and map[string]string). I was benchmarking map[interface{}]interface{} specifically, but those are the most common map types.

I don't think generic send/recv for channels is very common, so I personally wouldn't see the value if it is at all difficult to implement. But if those are trivial to do, then I would, for the sake of keeping the API consistent.

@randall77

Zero is being handled in #33136.

I agree that the channel ones are probably not worth worrying about, performance-wise.

Convert I'm not sure.

But Slice (and the ones that call Slice, like Append and AppendSlice) would be even more important than the map ones, I suspect. @ugorji I didn't understand how you think Append could be made allocation free - could you explain?


martisch commented May 31, 2020

In the last weeks I came across the reflect.Append case allocating a new SliceHeader in Value.Slice (called unconditionally by the grow function) in an append loop, and optimized it by using reflect.MakeSlice and reflectSliceValue.Index(i).Set(...).

If we can reuse the SliceHeader already pointed to by the reflect.Value and just set its len to the new length, we may be able to avoid an allocation per Append call. This would seem to need a new API, e.g. reflect.AppendInplace(s Value, x ...Value) (no return value), since it would be a behaviour change.


ugorji commented Mar 8, 2021

@randall77 Sorry, I missed this earlier.

Zero is being handled in #33136 .

I agree that the channel ones are probably not worth worrying about, performance-wise.

Convert I'm not sure.

But Slice (and the ones that call Slice, like Append and AppendSlice) would be even more important than the map ones, I suspect. @ugorji I didn't understand how you think Append could be made allocation free - could you explain?

As @martisch alluded to, a new API like AppendInPlace(s Value, x ...Value) will be needed.

In the same vein, AppendSlice() can elide the allocation caused by the call to s.Slice by doing something akin to changing:

Copy(s.Slice(i0, i1), t)

to:

p := (*unsafeheader.Slice)(s.ptr)
p0 := *p
p.Data = arrayAt(p0.Data, i0, typ.elem.Size(), "i < cap")
p.Len = i1 - i0
Copy(s, t)
*p = p0

josharian added a commit to tailscale/go that referenced this issue May 18, 2021
These augment the existing MapIter.Key and MapIter.Value methods.
The existing methods return new Values.
Constructing these new Values often requires allocating.
These methods allow the caller to bring their own storage.

The naming is somewhat unfortunate, in that the spec
uses the word "element" instead of "value",
as do the reflect.Type methods.
In a vacuum, MapIter.SetElem would be preferable.
However, matching the existing methods is more important.

Fixes golang#32424
Fixes golang#46131

Change-Id: I19c4d95c432f63dfe52cde96d2125abd021f24fa
@gopherbot

Change https://golang.org/cl/320929 mentions this issue: reflect: add MapIter.SetKey and MapIter.SetValue

@golang golang locked and limited conversation to collaborators Aug 24, 2022
9 participants