proposal: encoding/json: more performant implementation with JIT and SIMD #53178

Open
AsterDY opened this issue Jun 1, 2022 · 6 comments

Comments

@AsterDY

AsterDY commented Jun 1, 2022

Since JSON is so popular today, encoding/json is heavily used in all kinds of Go-based applications. However, we found that it consumes a lot of computing resources in practice, due to its poor performance and inefficient APIs. This is the main reason we developed sonic, a high-performance JSON library with dedicated APIs:

  • It is compatible with the existing APIs of encoding/json (some default behaviors differ slightly for performance reasons, but all of them can be adjusted through options).
  • It provides flexible and efficient APIs for dynamic operations on raw JSON (Get()/Set()/Unset()/Add()), which is a very common need in practice.
  • It is generally 2x to 4x faster than encoding/json on the standard APIs (see the benchmarks), and the dynamic APIs are more efficient still.

One topic I want to discuss here is whether sonic can be used as the underlying implementation of the standard library. If not, can at least some of our optimizations (JIT, SIMD, lazy loading) be absorbed into the standard library?
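
To make the "dynamic operations on raw JSON" need concrete, here is a minimal sketch of reading a single field with encoding/json alone. The helper name getField is hypothetical and is not part of encoding/json or sonic; it only defers decoding of the values it does not need by keeping them as json.RawMessage.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// getField is a hypothetical helper (not an encoding/json or sonic API).
// It returns the raw bytes of one top-level field and leaves every other
// field undecoded as json.RawMessage.
func getField(doc []byte, key string) (json.RawMessage, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(doc, &fields); err != nil {
		return nil, err
	}
	raw, ok := fields[key]
	if !ok {
		return nil, fmt.Errorf("field %q not found", key)
	}
	return raw, nil
}

func main() {
	doc := []byte(`{"id": 7, "user": {"name": "gopher"}, "tags": ["a", "b"]}`)

	raw, err := getField(doc, "user")
	if err != nil {
		panic(err)
	}

	// Only the field we asked for is decoded into a Go value.
	var user struct {
		Name string `json:"name"`
	}
	if err := json.Unmarshal(raw, &user); err != nil {
		panic(err)
	}
	fmt.Println(user.Name)
}
```

Even in this form, encoding/json still scans and validates the whole document and allocates a map entry per field, which is the kind of cost a dedicated Get-style API aims to avoid.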

@gopherbot gopherbot added this to the Proposal milestone Jun 1, 2022
@D1CED

D1CED commented Jun 1, 2022

Some members of the Go team are currently working on a successor to encoding/json at https://github.com/go-json-experiment/json. Its README has a section comparing the performance of selected JSON packages, including sonic.

@ianlancetaylor
Contributor

CC @dsnet @mvdan

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Jun 2, 2022
@dsnet
Member

dsnet commented Jun 2, 2022

Hi @AsterDY, thanks for your proposal.

The approaches that sonic takes are fascinating. It's nearly impossible to beat the performance of specialized code generated by a JIT since it can circumvent all of the overhead of Go reflection. However, JITs have several disadvantages that make them unsuitable for use in the Go standard library:

  1. JITs are notoriously difficult to implement in a way that is free of the worst kind of bugs (e.g., memory corruption). The set of possible outputs from a JIT is practically infinite, and it is difficult to ensure that the generated code handles memory accesses correctly in all cases.
  2. The JIT has to be linked into the Go program, resulting in fairly bloated binaries. For example, a simple program that just links in sonic.Marshal and sonic.Unmarshal compiles to 5.5MiB, while an equivalent program that just links in json.Marshal and json.Unmarshal is only 1.5MiB.
  3. The JIT implementation needs to handle various architectures. I noticed that sonic only has an implementation for amd64, which I imagine was quite a bit of work already to implement. It will be quite a bit of work to add support for arm64, mips64, or riscv64 and so forth.
  4. The JIT probably generates code that makes assumptions about how the Go runtime operates. This is going to be a continual maintenance burden making sure that it remains compatible with any changes that occur in the Go runtime.
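
As an illustration of the reflection overhead mentioned at the top of this comment, the sketch below encodes the same struct two ways: once by walking it with reflect at run time, and once with a hand-written function specialized for the type. This is not how encoding/json or sonic is implemented; Point, appendReflect, and appendPoint are hypothetical names, and the specialized function merely stands in for the kind of code a JIT would emit at run time.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
)

// Point is a hypothetical example type.
type Point struct {
	X int
	Y int
}

// appendReflect encodes a struct whose fields are all signed integers by
// inspecting it with reflection on every call. It panics on other field
// kinds; a real generic encoder would dispatch on the kind instead.
func appendReflect(dst []byte, v any) []byte {
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	dst = append(dst, '{')
	for i := 0; i < rv.NumField(); i++ {
		if i > 0 {
			dst = append(dst, ',')
		}
		dst = strconv.AppendQuote(dst, rt.Field(i).Name)
		dst = append(dst, ':')
		dst = strconv.AppendInt(dst, rv.Field(i).Int(), 10)
	}
	return append(dst, '}')
}

// appendPoint is specialized for Point: field names and types are known
// ahead of time, so no reflection is needed.
func appendPoint(dst []byte, p Point) []byte {
	dst = append(dst, `{"X":`...)
	dst = strconv.AppendInt(dst, int64(p.X), 10)
	dst = append(dst, `,"Y":`...)
	dst = strconv.AppendInt(dst, int64(p.Y), 10)
	return append(dst, '}')
}

func main() {
	p := Point{X: 1, Y: -2}
	fmt.Println(string(appendReflect(nil, p)))
	fmt.Println(string(appendPoint(nil, p)))
}
```

Both functions produce the same bytes. The difference is that the specialized version has to exist for every type, which is why it is either written by hand, generated ahead of time, or emitted by a JIT.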

These are significant disadvantages that I believe make a JIT unsuitable as the default approach. JITs make sense if performance is the greatest (and almost singular) priority. However, most people use Go because they want the type and memory safety that the language provides. JSON performance is important, but it would be catastrophic for many users if the JSON implementation were a means through which attackers could achieve remote code execution by sending the server a maliciously crafted JSON input that triggers memory corruption in the code produced by the JIT. As much as possible, we want to lean on code that avoids the use of unsafe.

That said, use of assembly (and SIMD instructions) can be appropriate if it 1) provides a significant performance improvement, and 2) is relatively concise and well tested. For this reason, packages under crypto and hash have assembly implementations.

Based on my work on JSON, I would say that there are two notable areas where performance optimizations (done in assembly or otherwise) will help encoding/json:

  • On some benchmarks, strconv.ParseFloat and strconv.AppendFloat can occupy up to 50% of CPU cycles.
  • On other benchmarks, JSON string encoding/decoding can occupy up to 25% of CPU cycles, where up to 15% of this is spent in utf8.DecodeRune. In general, the utf8 package could be made faster and also more inlineable.

Most of these areas of optimization are actually not in the encoding/json package.
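
One common way to reduce time spent in UTF-8 decoding is an ASCII fast path that only calls into the utf8 package for multi-byte sequences. The sketch below applies it to a simplified piece of JSON string encoding; firstSpecial is a hypothetical helper, and it deliberately ignores any additional escaping rules a real encoder may apply.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstSpecial returns the index of the first byte in s that a JSON string
// encoder could not copy verbatim: a control character, '"', '\\', or the
// start of an invalid UTF-8 sequence. It returns len(s) if there is none.
func firstSpecial(s string) int {
	for i := 0; i < len(s); {
		c := s[i]
		if c < utf8.RuneSelf {
			// Fast path: an ASCII byte needs no call into the utf8 package.
			if c < 0x20 || c == '"' || c == '\\' {
				return i
			}
			i++
			continue
		}
		// Slow path: decode one multi-byte sequence.
		r, size := utf8.DecodeRuneInString(s[i:])
		if r == utf8.RuneError && size == 1 {
			return i // invalid UTF-8
		}
		i += size
	}
	return len(s)
}

func main() {
	for _, s := range []string{"hello", `a"b`, "héllo", "a\xffb"} {
		fmt.Printf("%q -> %d\n", s, firstSpecial(s))
	}
}
```

For mostly-ASCII input, the loop never leaves the fast path, so the whole string can be copied to the output in one step once firstSpecial returns len(s).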

@AsterDY
Author

AsterDY commented Jun 2, 2022

Thanks for the reply. As you mentioned, sonic has its own compatibility problems: our JIT relies on assumptions about how the Go runtime operates, and our SIMD functions only support the amd64 architecture at present. This is why I would like the official team to join us and work together: export specific runtime APIs for JIT, enhance the Go compiler to translate non-Plan 9 assembly, and so on. Taking a long-term view, if Go supported JIT natively, many components that rely on runtime interpretation (such as reflect, GraphQL, and regex) could benefit from it.
PS:
Our team has also released another JIT-based library, 'frugal', a dynamic Thrift library. It has a more flexible JIT architecture and better performance. We hope it can inspire you in building a JIT ecosystem for Go.

@dsnet
Member

dsnet commented Jun 2, 2022

Merging a JIT into the Go toolchain itself will help alleviate the 4th disadvantage, but it doesn't directly address the 1st, 2nd, or 3rd disadvantages. The 1st disadvantage is the most significant and is alleviated if there is a well-staffed team to maintain the JIT and battle test it for correctness. This is way beyond my ability to adequately comment on, but I suspect any attempt to staff a team to manage a JIT will take away from efforts spent on the Go compiler itself. Personally, I would rather see the Go compiler get better.

@mvdan
Member

mvdan commented Jun 2, 2022

I think it's worth looking at https://go.dev/blog/survey2021-results#prioritization for context. It looks like many Go developers care more about maintainability (diagnosing bugs), reliability, and security than they do about CPU usage. Binary size and build speed also appear further down the list, so it seems to me that using a JIT in encoding/json would be a very hard tradeoff to sell. We'd make the library faster at the cost of all the other factors I mentioned, which goes against what the survey tells us.

There's also https://go.dev/doc/faq#x_in_std; nothing prevents a JIT-based JSON encoder and decoder from living outside of the standard library. If people want to squeeze every bit of performance at the cost of some other factor (user experience, safety, maintainability, binary size, etc), typically the answer is outside of the standard library. Take a look at https://github.com/valyala/fasthttp for example, which is up to ten times faster than net/http, but likely does not make sense as part of the standard library.

@seankhliao seankhliao changed the title from "proposal: encoding/json: more performant implementation of standard library" to "proposal: encoding/json: more performant implementation with JIT and SIMD" Jul 29, 2022
6 participants