encoding/json: decoding performance regression in 1.9beta1 #20693

Closed
cespare opened this issue Jun 16, 2017 · 14 comments
Labels
FrozenDueToAge, NeedsDecision
Milestone
Go1.9

Comments
@cespare
Contributor

cespare commented Jun 16, 2017

This is with these Go versions:

go version go1.8 linux/amd64
go version go1.9beta1 linux/amd64

The following benchmark unmarshals a JSON object which has a large sub-object that's irrelevant (it's associated with a key that isn't in the destination struct):

package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

type T0 struct {
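	// Foo carries the large sub-object in the marshaled input.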
	Foo map[string]string
}

type T1 struct {
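	// Bar is the only field; there is no Foo, so the decoder must skip
	// the entire large sub-object when unmarshaling into T1.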
	Bar string
}

func Benchmark19Regression(b *testing.B) {
	junk := make(map[string]string)
	for i := 0; i < 100; i++ {
		junk[fmt.Sprintf("key%d", i)] = fmt.Sprintf("value%d", i)
	}
	input, err := json.Marshal(T0{Foo: junk})
	if err != nil {
		b.Fatal(err)
	}
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		var t T1
		if err := json.Unmarshal(input, &t); err != nil {
			b.Fatal(err)
		}
	}
}

It gets about 30% slower for me with Go 1.9beta1:

name            old time/op    new time/op    delta
19Regression-4    20.8µs ± 3%    27.2µs ± 4%  +30.90%  (p=0.001 n=7+7)

name            old alloc/op   new alloc/op   delta
19Regression-4      336B ± 0%      328B ± 0%   -2.38%  (p=0.001 n=7+7)

name            old allocs/op  new allocs/op  delta
19Regression-4      5.00 ± 0%      4.00 ± 0%  -20.00%  (p=0.001 n=7+7)

You can also see that it's not due to increased allocations; they went down in 1.9.

@cespare
Contributor Author

cespare commented Jun 16, 2017

I can investigate this later if nobody else gets to it first.

@cespare
Contributor Author

cespare commented Jun 16, 2017

(Added to the 1.9 milestone since it was discovered while testing the 1.9 beta. Please fix it if that's not right.)

@cespare cespare added this to the Go1.9 milestone Jun 16, 2017
@vcabbage
Member

It appears to be df68afd, the commit that removes the allocation, that causes this benchmark to be slower.

@bradfitz
Contributor

Thanks for the report @cespare and investigation @vcabbage.

I suppose we could just revert df68afd, but then we'd lose some performance in some cases too. It would at least be equivalent to Go 1.8, though. @cespare, could you do a manual revert locally and see what happens performance-wise?

@bradfitz
Contributor

@pascaldekloe, could you investigate too?

@pascaldekloe
Contributor

Sorry guys, my holidays last till the 26th and there are no computers around. If the issue can wait, I'm happy to resolve this once I'm home.

@cespare
Contributor Author

cespare commented Jun 17, 2017

Yeah, reverting df68afd makes the performance hit go away and then some (though it also brings the allocations back to where they were).

This benchmark compares 1.8 vs. [tip with df68afd reverted].

name            old time/op    new time/op    delta
19Regression-4    20.7µs ± 3%    19.0µs ± 5%  -8.47%  (p=0.008 n=5+5)

name            old alloc/op   new alloc/op   delta
19Regression-4      336B ± 0%      336B ± 0%    ~     (all equal)

name            old allocs/op  new allocs/op  delta
19Regression-4      5.00 ± 0%      5.00 ± 0%    ~     (all equal)

I wonder if there's a fix for the CPU time that can maintain the alloc improvement.

@cespare
Contributor Author

cespare commented Jun 21, 2017

By the way, if it comes down to keeping or reverting df68afd for 1.9, I'd vote to keep it. This benchmark isn't critical for us and the benefit of reduced allocs helps us as well.

@pascaldekloe
Contributor

pascaldekloe commented Jun 26, 2017

The root of the problem is that json.decodeState.value leaves the scanner state as it was at the last applicable character. That behavior requires read-ahead for some literals. As a result we have the workaround json.decodeState.nextscan, along with the functions next and nextValue (the originals), in place. I'll try to eliminate these workarounds, including json.scanner.undo.
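To make the read-ahead issue concrete, here is a minimal, hypothetical sketch (not the actual decodeState/scanner code): when scanning a bare literal such as a number, the scanner only learns that the literal has ended after it has consumed the following byte, and that byte then has to be handed back (the "undo" step) so the enclosing value can be scanned correctly.

package main

import "fmt"

// scanNumber is a deliberately simplified, hypothetical scanner step.
// It consumes digits from data starting at i and reports where the literal
// ends. When the number is followed by more input, the scanner has already
// looked at one byte past the literal (for example the '}' in {"x":123});
// that byte belongs to the enclosing value and must be re-scanned by the caller.
func scanNumber(data []byte, i int) (end int, readAhead bool) {
	for ; i < len(data); i++ {
		if c := data[i]; c < '0' || c > '9' {
			return i, true // data[i] was read but is not part of the number
		}
	}
	return i, false // the literal ran to the end of the input
}

func main() {
	data := []byte(`{"x":123}`)
	end, readAhead := scanNumber(data, 5) // the number starts at index 5
	fmt.Println(end, readAhead)           // 8 true: data[8] ('}') must be re-scanned
}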

@bradfitz
Contributor

@pascaldekloe, thanks.

@pascaldekloe
Contributor

pascaldekloe commented Jun 28, 2017

It's starting to work now. Probably too big a change for the 1.9 beta release, though.

benchmark                   old ns/op     new ns/op     delta
Benchmark19Regression-4     29291         21890         -25.27%
benchmark                          old ns/op     new ns/op     delta
BenchmarkCodeEncoder-4             8321258       8571049       +3.00%
BenchmarkCodeMarshal-4             9539306       10014633      +4.98%
BenchmarkCodeDecoder-4             40581703      40218844      -0.89%
BenchmarkDecoderStream-4           426           450           +5.63%
BenchmarkCodeUnmarshal-4           42960681      43142710      +0.42%
BenchmarkCodeUnmarshalReuse-4      40541573      40019672      -1.29%
BenchmarkUnmarshalString-4         432           351           -18.75%
BenchmarkUnmarshalFloat64-4        368           316           -14.13%
BenchmarkUnmarshalInt64-4          337           279           -17.21%
BenchmarkIssue10335-4              489           435           -11.04%
BenchmarkUnmapped-4                1456          1178          -19.09%
BenchmarkNumberIsValid-4           32.4          32.1          -0.93%
BenchmarkNumberIsValidRegexp-4     884           920           +4.07%
BenchmarkEncoderEncode-4           269           265           -1.49%

benchmark                    old MB/s     new MB/s     speedup
BenchmarkCodeEncoder-4       233.19       226.40       0.97x
BenchmarkCodeMarshal-4       203.42       193.76       0.95x
BenchmarkCodeDecoder-4       47.82        48.25        1.01x
BenchmarkCodeUnmarshal-4     45.17        44.98        1.00x

benchmark                    old allocs     new allocs     delta
BenchmarkIssue10335-4        3              3              +0.00%
BenchmarkUnmapped-4          4              4              +0.00%
BenchmarkEncoderEncode-4     0              0              +0.00%

benchmark                    old bytes     new bytes     delta
BenchmarkIssue10335-4        312           184           -41.03%
BenchmarkUnmapped-4          344           216           -37.21%
BenchmarkEncoderEncode-4     0             0             +0.00%
 src/encoding/json/decode.go       | 314 ++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------
 src/encoding/json/scanner.go      |  57 +------------------
 src/encoding/json/scanner_test.go |  37 ------------
 3 files changed, 160 insertions(+), 248 deletions(-)

@gopherbot

CL https://golang.org/cl/47152 mentions this issue.

@bradfitz bradfitz added the NeedsDecision label Jun 29, 2017
@rsc
Contributor

rsc commented Jun 29, 2017

Please revert the change in question. Reducing allocated bytes by 2% (8 out of 336) is not worth a 30% CPU performance regression. Thanks.

@gopherbot

CL https://golang.org/cl/47211 mentions this issue.

gopherbot pushed a commit that referenced this issue Mar 1, 2018
Eliminates the need for an extra scanner, read undo and some other tricks.

name                    old time/op    new time/op    delta
CodeEncoder-12            1.92ms ± 0%    1.91ms ± 1%   -0.65%  (p=0.000 n=17+20)
CodeMarshal-12            2.13ms ± 2%    2.12ms ± 1%   -0.49%  (p=0.038 n=18+17)
CodeDecoder-12            8.55ms ± 2%    8.49ms ± 1%     ~     (p=0.119 n=20+18)
UnicodeDecoder-12          411ns ± 0%     422ns ± 0%   +2.77%  (p=0.000 n=19+15)
DecoderStream-12           320ns ± 1%     307ns ± 1%   -3.80%  (p=0.000 n=18+20)
CodeUnmarshal-12          9.65ms ± 3%    9.58ms ± 3%     ~     (p=0.157 n=20+20)
CodeUnmarshalReuse-12     8.54ms ± 3%    8.56ms ± 2%     ~     (p=0.602 n=20+20)
UnmarshalString-12         110ns ± 1%      87ns ± 2%  -21.53%  (p=0.000 n=16+20)
UnmarshalFloat64-12        101ns ± 1%      77ns ± 2%  -23.08%  (p=0.000 n=19+20)
UnmarshalInt64-12         94.5ns ± 2%    68.4ns ± 1%  -27.60%  (p=0.000 n=20+20)
Issue10335-12              128ns ± 1%     100ns ± 1%  -21.89%  (p=0.000 n=19+18)
Unmapped-12                427ns ± 3%     247ns ± 4%  -42.17%  (p=0.000 n=20+20)
NumberIsValid-12          23.0ns ± 0%    21.7ns ± 0%   -5.73%  (p=0.000 n=20+20)
NumberIsValidRegexp-12     641ns ± 0%     642ns ± 0%   +0.15%  (p=0.003 n=19+19)
EncoderEncode-12          56.9ns ± 0%    55.0ns ± 1%   -3.32%  (p=0.012 n=2+17)

name                    old speed      new speed      delta
CodeEncoder-12          1.01GB/s ± 1%  1.02GB/s ± 1%   +0.71%  (p=0.000 n=18+20)
CodeMarshal-12           913MB/s ± 2%   917MB/s ± 1%   +0.49%  (p=0.038 n=18+17)
CodeDecoder-12           227MB/s ± 2%   229MB/s ± 1%     ~     (p=0.110 n=20+18)
UnicodeDecoder-12       34.1MB/s ± 0%  33.1MB/s ± 0%   -2.73%  (p=0.000 n=19+19)
CodeUnmarshal-12         201MB/s ± 3%   203MB/s ± 3%     ~     (p=0.151 n=20+20)

name                    old alloc/op   new alloc/op   delta
Issue10335-12               320B ± 0%      184B ± 0%  -42.50%  (p=0.000 n=20+20)
Unmapped-12                 568B ± 0%      216B ± 0%  -61.97%  (p=0.000 n=20+20)
EncoderEncode-12           0.00B          0.00B          ~     (all equal)

name                    old allocs/op  new allocs/op  delta
Issue10335-12               4.00 ± 0%      3.00 ± 0%  -25.00%  (p=0.000 n=20+20)
Unmapped-12                 18.0 ± 0%       4.0 ± 0%  -77.78%  (p=0.000 n=20+20)
EncoderEncode-12            0.00           0.00          ~     (all equal)

Fixes #17914
Updates #20693
Updates #10335

Change-Id: I0459a52febb8b79c9a2991e69ed2614cf8740429
Reviewed-on: https://go-review.googlesource.com/47152
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
@golang golang locked and limited conversation to collaborators Jun 29, 2018