Description: encoding/json: add "overflow" struct tag option
Fixes issue 6213.
Patch Set 1 #
Total comments: 6
Patch Set 2 : diff -r c98067b47ab0 https://code.google.com/p/go #
Total messages: 16
Hello golang-dev@googlegroups.com, I'd like you to review this change to https://code.google.com/p/go
https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/encode.go
File src/pkg/encoding/json/encode.go (right):

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/encode.g...
src/pkg/encoding/json/encode.go:255: panic("endOverflow called before startOverflow")
what if e.Len() is zero at startOverflow? is that possible? i think it is if the first field is unrecognized.

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/example_...
File src/pkg/encoding/json/example_test.go (right):

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/example_...
src/pkg/encoding/json/example_test.go:83: "Animal": "rabbit"
did you know bimmler's address was rabbit!bimmler?
https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/decode.go
File src/pkg/encoding/json/decode.go (right):

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/decode.g...
src/pkg/encoding/json/decode.go:64: // When unmarshalling into struct values, a struct field of map type with the
unmarshaling
https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/decode.go
File src/pkg/encoding/json/decode.go (right):

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/decode.g...
src/pkg/encoding/json/decode.go:64: // When unmarshalling into struct values, a struct field of map type with the
On 2013/08/29 05:17:16, r wrote:
> unmarshaling
Done.

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/encode.go
File src/pkg/encoding/json/encode.go (right):

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/encode.g...
src/pkg/encoding/json/encode.go:255: panic("endOverflow called before startOverflow")
On 2013/08/29 05:16:51, r wrote:
> what if e.Len() is zero at startOverflow? is that possible? i think it is if the
> first field is unrecognized.
nope, there's always an opening curly brace

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/example_...
File src/pkg/encoding/json/example_test.go (right):

https://codereview.appspot.com/13324045/diff/1/src/pkg/encoding/json/example_...
src/pkg/encoding/json/example_test.go:83: "Animal": "rabbit"
On 2013/08/29 05:16:51, r wrote:
> did you know bimmler's address was rabbit!bimmler?
yes :-)
On 2013/08/29 04:46:55, adg wrote:
> Hello golang-dev@googlegroups.com,
>
> I'd like you to review this change to
> https://code.google.com/p/go

NOT LGTM

benchmark                      old ns/op    new ns/op    delta
BenchmarkCodeEncoder            43353324     46277117   +6.74%
BenchmarkCodeMarshal            46923347     50917386   +8.51%
BenchmarkCodeDecoder           188968676    305382293  +61.60%
BenchmarkCodeUnmarshal         184834994    302217744  +63.51%
BenchmarkCodeUnmarshalReuse    175378432    292963951  +67.05%
BenchmarkUnmarshalString            1186         1113   -6.16%
BenchmarkUnmarshalFloat64           1018          994   -2.36%
BenchmarkUnmarshalInt64              914          954   +4.38%
BenchmarkSkipValue               3414183      2936396  -13.99%
BenchmarkEncoderEncode              1614         1611   -0.19%

benchmark                      old MB/s     new MB/s     speedup
BenchmarkCodeEncoder              44.76        41.93       0.94x
BenchmarkCodeMarshal              41.35        38.11       0.92x
BenchmarkCodeDecoder              10.27         6.35       0.62x
BenchmarkCodeUnmarshal            10.50         6.42       0.61x
BenchmarkSkipValue                81.95        95.28       1.16x

benchmark                      old allocs   new allocs   delta
BenchmarkCodeEncoder                    1            1    0.00%
BenchmarkCodeMarshal                   17           17    0.00%
BenchmarkCodeDecoder               180415       182079    0.92%
BenchmarkCodeUnmarshal             195430       195439    0.00%
BenchmarkCodeUnmarshalReuse        180382       182063    0.93%
BenchmarkUnmarshalString                2            2    0.00%
BenchmarkUnmarshalFloat64               2            2    0.00%
BenchmarkUnmarshalInt64                 2            2    0.00%
BenchmarkSkipValue                     32           32    0.00%
BenchmarkEncoderEncode                  2            2    0.00%

benchmark                      old bytes    new bytes    delta
BenchmarkCodeEncoder                   12           12    0.00%
BenchmarkCodeMarshal              4578344      4578360    0.00%
BenchmarkCodeDecoder              3952156      5100454   29.05%
BenchmarkCodeUnmarshal            4283600      4285227    0.04%
BenchmarkCodeUnmarshalReuse       2944364      3094734    5.11%
BenchmarkUnmarshalString              276          276    0.00%
BenchmarkUnmarshalFloat64             268          268    0.00%
BenchmarkUnmarshalInt64               268          268    0.00%
BenchmarkSkipValue                   2291         2291    0.00%
BenchmarkEncoderEncode                221          237    7.24%

Performance is severely affected by this CL, reducing encoding speed by 7-8% and almost halving decoding speed (which is already dog slow). Also, memory consumption for decoding is increased by 30%. These are unacceptable tradeoffs, especially when the use cases this patch addresses are very limited. If you're interested in JSON keys that might or might not be there, you're better off adding all the fields to your struct. In the rare case that the JSON you're parsing has arbitrary keys that you can't know beforehand, you can always decode into a map[string]interface{} and inspect the keys later.

On the other hand, if the performance issues were to be fixed, I think the name "overflow" is a poor choice, since most people would think it's related to numeric overflows (e.g. decoding 257 into a uint8). The "inline" tag used by bson is much better, since it means that the field's subfields should be treated as if they were directly included in the container struct. This makes sense both when encoding (the map keys are interpreted as if they were fields in the container struct) and when decoding (a key that doesn't match a struct field would be added to the map).

Regards,
Alberto
Thanks for running the benchmarks. I agree that this patch is a non-starter with this kind of performance degradation.

I suspect the issues may be caused by the refactoring I did, which added more function calls to the critical path. I'll take a look at optimizing this. In theory it should have no sizable effect on performance.

On 30 August 2013 09:58, <alberto.garcia.hierro@gmail.com> wrote:
> NOT LGTM
>
> [benchmark results and discussion quoted in full in the previous message]
On 2013/08/30 00:02:10, adg wrote:
> Thanks for running the benchmarks. I agree that this patch is a non-starter
> with this kind of performance degradation.
>
> I suspect the issues may be caused by the refactoring I did, which added
> more function calls to the critical path.

I checked your code and that's very likely to be the cause. We have an internal high performance fork of encoding/json, and one of the things we did was remove as many branches and function calls as possible from the main loop in (*decodeState).object (we ended up with two loops, one for maps and one for structs). I don't remember the exact numbers, but it gave us a nice speed-up.

On the other hand, wouldn't it be better to implement this more generally, so it would work for non-maps too? I think you might be able to implement it by just modifying typeFields() and adding a special case when decoding structs if there's a map tagged with the "overflow" option.

Regards,
Alberto
On 30 August 2013 10:31, <alberto.garcia.hierro@gmail.com> wrote:
> On the other hand, wouldn't it be better to implement this more
> generally, so it would work for non-maps too? I think you might be able
> to implement it by just modifying typeFields() and adding a special case
> when decoding structs if there's a map tagged with the "overflow"
> option.

I had done that at first, but the result was more complicated than the current change, and I can't think of a possible use for an overflow struct when embedding works just as well.
On 30 August 2013 10:31, <alberto.garcia.hierro@gmail.com> wrote:
> I checked your code and that's very likely to be the cause. We have an
> internal high performance fork of encoding/json and one of the things we
> did was removing as many branches and function calls as possible from
> the main loop in (*decodeState).object (we ended up with two loops, one
> for maps and one for structs). I don't remember the exact numbers, but
> it gave us a nice speed-up.

I'd be interested to see those changes. Want to send a CL? (For discussion, if not submission. :)
On 2013/08/30 00:34:21, adg wrote:
> I'd be interested to see those changes. Want to send a CL? (For discussion,
> if not submission. :)

Not right now, sorry. Our decoder currently uses a bunch of tricks depending on the unsafe package and lacks "safe" fallbacks for environments which can't use package unsafe (although we plan to add them before releasing this code). It also needs a bit more testing and cleaning up before it can be published. We will probably release it in the coming months along with our web framework.

Furthermore, when we tried to submit some of our changes for inclusion, Rob and Russ made it very clear that they found our approach to optimizing encoding/json inappropriate for the standard library, where maintainability is preferred over performance. Also, our code has diverged so much from encoding/json that it's basically a reimplementation of its public interface, so I'm not sure it could be of any help.

I can, however, tell you what we did at a high level, in case you find it helpful. Off the top of my head, it was:

- Don't check the JSON for validity beforehand, but rather check it as it's being parsed, without altering the public function semantics. This sped up parsing valid JSON considerably, while slowing down parsing invalid JSON. I'd say 99.999% of the JSON our apps parse is valid, so optimizing for the common case (valid JSON) made more sense to us (people who reviewed the CL I sent with this change didn't agree with that).

- Rewrite the scanner to be event based, rather than byte based. The current scanner casts every byte it's fed to int, causing a lot of unneeded MOVZX instructions or equivalent (we were told that this didn't make a difference big enough to warrant a code change; we beg to differ, because a lot of small performance improvements eventually add up to a significant one), and requires a function call for every character in the input. Function calls are very slow in Go (compared to C), so we reduced them by making the scanner return a constant indicating the type of the next event (beginString, beginNumber, beginObject, endObject, comma, etc.) and then having specific functions for each event (e.g. scanNull, scanTrue, scanStringBytes, scanComma...).

- Change as many functions as possible to receive a *reflect.Value rather than a reflect.Value. Passing a pointer translates to a LEAQ and a MOVQ, while passing a reflect.Value generates 6 MOVQ (3 from the stack and another 3 to the stack). Functions in decode.go call each other many times while decoding JSON, and those extra trips to memory quickly add up.

- Remove as many function calls as possible from the hot code paths. This meant manually duplicating a bit of code, but that's OK for us.

- Remove as many branches as possible from hot code paths (e.g. the current code in object() checks whether the value is a map or a struct on every iteration rather than before beginning the loop, and array() does much the same with reflect.Array and reflect.Slice).

- Use assembly to look for the end of strings, while also checking whether they have to be escaped (and if so, from where in the string escaping needs to start), iterating over them just once. unquote() is very expensive, so the more often you can avoid it, the better (and, truth be told, most strings in JSON don't need to be unquoted: most people use only ASCII for object keys, which represent a high percentage of the strings in a JSON payload).

- Use "unsafe" conversions from []byte to string when appropriate. This avoids a lot of copying and reduces garbage significantly, and the conversions are not really unsafe, since the scanner holds a reference to the underlying []byte, which prevents any invalid memory access. The worst thing that might happen is that the JSON won't be correctly decoded if the underlying []byte is modified while hashing or comparing the "unsafe" string, but if you're changing the JSON data at the same time you're parsing it, all bets are off (we were also told that this wasn't OK for the standard library).

- Cache JSON key to struct field lookups in decodeState, which allows lockless access to the cache. strings.EqualFold is slower than the bad guy's horse.

These changes gave us around 5x-6x more performance while decoding, but keep in mind that there are still a few tests that don't pass, so there might be some small performance decreases when we fix those corner cases.

Regards,
Alberto
NOT LGTM

Too late, and unclear that it's worthwhile anyway. If you need custom stuff, there is always json.RawMessage.
On 4 September 2013 07:13, <rsc@golang.org> wrote:
> NOT LGTM
>
> Too late, and unclear that it's worthwhile anyway. If you need custom
> stuff, there is always json.RawMessage.

Many people have re-implemented this same functionality over and over again. Sure, it's a convenience that can be worked around, but it's also something that people want. If it's too late for 1.2 that's fine, but I don't think this feature should be dismissed out of hand.
On Tue, Sep 3, 2013 at 8:20 PM, Andrew Gerrand <adg@golang.org> wrote:
> Many people have re-implemented this same functionality over and over
> again. Sure, it's a convenience that can be worked around, but it's also
> something that people want.

I don't believe that's a useful mode of argument. Generics are also something that people want.

> If it's too late for 1.2 that's fine, but I don't think this feature
> should be dismissed out of hand.

Okay, let's postpone this discussion until 1.2 is out. You're basically asking for protocol buffers' unrecognized fields support. I am not convinced.

Russ
R=rsc@golang.org (assigned by rsc@google.com)
Replacing golang-dev with golang-codereviews.