Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/json: unmarshaling into struct is O(n^2) on number of struct fields #33073

Closed
danielchatfield opened this issue Jul 12, 2019 · 8 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Performance
Milestone

Comments

@danielchatfield
Copy link

What version of Go are you using (go version)?

$ go version
go version go1.12.7 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/danielchatfield/Library/Caches/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/danielchatfield"
GOPROXY=""
GORACE=""
GOROOT="/usr/local/Cellar/go/1.12.7/libexec"
GOTMPDIR=""
GOTOOLDIR="/usr/local/Cellar/go/1.12.7/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/fq/s8x0wv5527g1yz9lltnqzds80000gn/T/go-build127136404=/tmp/go-build -gno-record-gcc-switches -fno-common"

What did you do?

I've written a benchmark that illustrates this performance issue here: https://github.com/danielchatfield/jsonbench

The issue boils down to unmarshaling a JSON object into a struct is O(n^2) with respect to the number of struct fields. We recently changed some code that previously used a map to be using an autogenerated struct with a large number of fields and observed a very severe performance regression.

On my machine this takes 1.3s of CPU time to unmarshal 10,000 JSON fields.

What did you expect to see?

I expected it to use a reasonable number of CPU cycles.

What did you see instead?

It used an unreasonable number of CPU cycles.

Proposed fix

I have a patch locally that returns this to being O(n) at the expense of allocating an additional map. I can add a heuristic such that the map is only allocated if the struct has 50 or more fields (where the O(n^2) complexity starts to bite). Does this seem directionally reasonable? If so, I'll prepare the change and submit it.

@ALTree ALTree changed the title JSON unmarshaling into struct is O(n^2) on number of struct fields encoding/json: unmarshaling into struct is O(n^2) on number of struct fields Jul 12, 2019
@ALTree ALTree added this to the Go1.14 milestone Jul 12, 2019
@ALTree ALTree added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jul 12, 2019
@mvdan
Copy link
Member

mvdan commented Jul 12, 2019

Have you tried a Go 1.13 beta or master? This should have been fixed by https://go-review.googlesource.com/c/go/+/172918.

I plan on tweaking the logic a bit to not use a map if the number of elements is small; I have a patch locally, but I haven't sent it yet. The important bit is not doing the expensive case-insensitive matching in the case where a case-sensitive match exists.

@danielchatfield
Copy link
Author

@mvdan I have not tried on Go 1.13 and that patch looks almost identical to mine without the heuristic so I'm confident that that will fix the issue we are seeing.

@mvdan
Copy link
Member

mvdan commented Jul 12, 2019

Cool. In the future, have a look at https://golang.org/doc/contribute.html, which contains information on how to build Go from master. You should use that version to experiment with changes, as the stable release may not contain the latest relevant changes.

@danielchatfield
Copy link
Author

@mvdan One difference between your patch and mine is that I iterated backwards through the fields when generating the map to preserve the existing behaviour that the "first" matching struct field would be filled rather than the last.

For example, suppose I have a type like this:

type Foo {
 A string
}

type Bar {
 Foo
 A string
}

If you JSON unmarshal into Bar then it will fill in Bar.A not Foo.A. I believe with that patch the behaviour will be the reverse since the "last" field will be the one that ends up in the map.

@danielchatfield
Copy link
Author

It might be worthwhile adding a test for this behaviour as I imagine it might be quite easy to break and will be being relied on.

@mvdan
Copy link
Member

mvdan commented Jul 12, 2019

Hmm, from memory I think there are tests covering this case already, but I might be wrong. I'll take a look and update you here.

@mvdan
Copy link
Member

mvdan commented Jul 12, 2019

Yeah, I can't reproduce a regression, sorry. Here is the code I used: https://play.golang.org/p/Ot_EVbu3lpN

$ go version
go version devel +80cca23b59 Wed Jul 10 21:26:21 2019 +0000 linux/amd64
$ go run f.go
main.Bar{Foo:main.Foo{A:""}, A:"foo"}
$ go1 version
go version go1.12.7 linux/amd64
$ go1 run f.go
main.Bar{Foo:main.Foo{A:""}, A:"foo"}

Note that the decoder has always preferred the last case sensitive match (or the last case insensitive match), so master is correct as far as I can tell. If you do encounter a regression, please file a new issue with a program to reproduce the problem, and we'll look into a fix before the final release.

@danielchatfield
Copy link
Author

I'll take a look later and report back, thanks for taking a look.

@golang golang locked and limited conversation to collaborators Jul 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. Performance
Projects
None yet
Development

No branches or pull requests

4 participants