-
Notifications
You must be signed in to change notification settings - Fork 17.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
syscall/js: performance considerations #32591
Comments
It would be also good to benchmark with Firefox and see the results. IIUC, you are just benchmarking DOM manipulation. And since DOM manipulation anyways happens outside wasm, it is just about the price of context jump from wasm land to browser land and back. In that case, I wonder if it is even within the control of syscall/js and not the underlying wasm engine. Would be also good to benchmark equivalent code using Rust and C and compare the benchmarks. I think that may be a better apples-apples comparison just to compare syscall/js performance with other languages. |
As @agnivade said, probably worth trying Firefox. V8 is known to have some performance problems with the Wasm code generated by the Go compiler. |
Agreed. I'll do this later and share results.
Yes. When I said
Agreed, that would be good and more representative of the actual WebAssembly <-> JS call overhead. Doing that would give us more information. I won't have a chance to do this, but if someone else can, it'd be helpful. |
I've tried the benchmark again with recent development versions of 3 browsers:
The results are pretty consistent across the 3 browsers in that doing lots of DOM queries via WebAssembly was about 10x slower than with pure JavaScript. |
Could you share the code to take the benchmark to output the values [s/op]? |
Thanks for the tests @dmitshur. I would have thought that after https://hacks.mozilla.org/2018/10/calls-between-javascript-and-webassembly-are-finally-fast-%F0%9F%8E%89/, the DOM access overhead would have reduced in Firefox. And interesting that Safari is much faster for DOM access than Firefox. The tests with Rust/C should give us a better idea on what exactly can be improved from Go side. If anybody can post results for that, that'll be great. |
@hajimehoshi Sure. I've updated the source code in the original post. |
Change https://golang.org/cl/183457 mentions this issue: |
@martisch suggested that I add a "real-world" example that demonstrates the performance hit of webassembly compared to running natively. A good example is the "gophers" demo from Gio (gioui.org). With modules enabled and using Go 1.13 (tip), you can build and run the demo with two commands:
Then, open a browser and open http://localhost:8080. The target frame time is ~16.7ms (60 Hz), but on my macbook pro it almost never hit the target. Running the example natively,
it easily hits the 60 Hz target. In both Chrome and Firefox the builtin profiler is a great way to see what takes up the time. I've attached a screenshot of a single frame from Chrome's "Performance" tab. The frame time is 24ms.
CC @cherrymui who recently optimized wasm. |
Thanks @eliasnaur. I have sent CL 183457 which should alleviate the DOM overhead to some extent. Would you be able to try that and check if it helps at all ? Just a note that the CL only optimizes DOM overhead, so if your app is heavy on computations in the wasm land itself, it might not help very much. Regarding profiles, yes the Chrome profiler is a great tool. The |
Thanks @agnivade, wasmbrowsertest is definitely useful for running benchmark and standalone tests on wasm. However, the full drawing and rendering to a window doesn't lend itself to that model yet. Fortunately, I figured out how to bring back function names: the gioui.org/cmd/gio command passed -ldflags=-w -s which as a side effect stripped the function names from browser debuggers. I've removed the flags which didn't save much space anyway. Finally I updated my comment to add the |
Re: function names, it is a cold cache phenomenon as far as I understood. For the first time, it comes up as wasm-function, and then on all consecutive reloads, the names show up. Although, it is hard to reproduce. See the bug I filed. Anyways, I see some syscall/js.ValueCall in the profile. So my CL shouldTM be able to help. Feel free to give it a try whenever you have a chance. |
I tested with your CL 183457 which seems to help: the frame times are lower and more consistent. This is an example for a 17ms frame (the above profile had frame times above 20ms): However, the CPU usage still seems too high. According to the profile, almost 10ms of CPU time is spent building the vector shape for the frame timer in the top right corner. The text layout code is definitely CPU heavy and unoptimized, but 10ms seems excessive. To verify the profile, I cut out the rendering of the statistics label and redid the profile: Firefox also misses the frame target: It looks like CPU heavy code is faster in Firefox, whereas DOM calls are slower. Perhaps DOM calls are only slower because Firefox' WebGL implementation is slower. In summary, it looks like the demo is CPU bound, leading to the claim that Go generates inefficient webassembly. I'll work on preparing a benchmark that can run in wasmbrowsertest and that skips all rendering/DOM calls. |
Great stuff ! I think we are getting somewhere. Yes, the wasm code generation can use some love. I have a couple of CLs which apply some rewrite optimizations which were there in amd64 but absent in wasm, which should go in when the tree opens. But it would be great if you can prepare a standalone benchmark. That would allow us to compare the generated code with amd64 and see if there are some obvious places for improvement. |
I split the UI update from its rendering and added a benchmark. To see the difference, I ran:
So more than 10 times slower on wasm compared to native code, at least on my 2014 MBP. |
I investigated the profiles and started looking at the GOSSAFUNC output of some hot functions. The amd64 code showed lots of (MUL/DIV)SS. However, the wasm code showed something interesting, there were lots of
And in fact, several times, code like this was generated -
This means all 32 bit FP values are being promoted to 64bit, then worked on, and then again demoted to 32 bit before being written back to memory. A quick look into WasmOps.go revealed that 32 bit FP instructions were missing. And then I understood why. It is because all the FP registers (F0-F15) are treated as 64 bit registers. Now here is where my speculation begins. Since Go SSA works with only registers, these virtual registers were created to work with SSA. But in the generated code, all references to registers are rewritten to local.(get|set|tee). So theoretically it should be possible to construct another set of 32bit registers and add 32 bit FP instructions which just deal with them, and avoid this 32-64 jump. @neelance / @cherrymui - Is this analysis correct ? If so, how would you recommend to extend the F0-F15 register set to include 32 bit registers too. I have a local CL where I have already added the 32 bit instructions. Now I just need to fix these local.(get|set|tee) to work with 32 bit values. |
The Yes, it is possible to add registers for 32 bit floats, but I'm not sure how much this would affect performance, because I guess that CPUs are not faster on 32 bit floats than on 64 bit floats (might be wrong). |
I think you are referring to this
I actually found another code path in
Sure, if there is no perf boost, then there is no use. But I would like to try and check the benchmarks. What is the right way to add 32 bit registers ? Just add F16-F32 ? Or is there another way ? |
Is the
This is not easy to describe in a few words... |
@agnivade does your CL contain fixes to 32-bit integral instructions? it also amazed me to see such 32-64-32 int/fp convertions. |
I have not sent any CL yet. And no, I have not looked into 32bit integral instructions. |
Currently, every call to mem() incurs a new DataView object. This was necessary because the wasm linear memory could grow at any time. Now, whenever the memory grows, we make a call to the front-end. This allows us to reuse the existing DataView object and create a new one only when the memory actually grows. This gives us a boost in performance during DOM operations, while incurring an extra trip to front-end when memory grows. However, since the GrowMemory calls are meant to decrease over the runtime of an application, this is a good tradeoff in the long run. The benchmarks have been tested inside a browser (Google Chrome 75.0.3770.90 (Official Build) (64-bit)). It is hard to get stable nos. for DOM operations since the jumps make the timing very unreliable. But overall, it shows a clear gain. name old time/op new time/op delta DOM 135µs ±26% 84µs ±10% -37.22% (p=0.000 n=10+9) Go1 benchmarks do not show any noticeable degradation: name old time/op new time/op delta BinaryTree17 22.5s ± 0% 22.5s ± 0% ~ (p=0.743 n=8+9) Fannkuch11 15.1s ± 0% 15.1s ± 0% +0.17% (p=0.000 n=9+9) FmtFprintfEmpty 324ns ± 1% 303ns ± 0% -6.64% (p=0.000 n=9+10) FmtFprintfString 535ns ± 1% 515ns ± 0% -3.85% (p=0.000 n=10+10) FmtFprintfInt 609ns ± 0% 589ns ± 0% -3.28% (p=0.000 n=10+10) FmtFprintfIntInt 938ns ± 0% 920ns ± 0% -1.92% (p=0.000 n=9+10) FmtFprintfPrefixedInt 950ns ± 0% 924ns ± 0% -2.72% (p=0.000 n=10+9) FmtFprintfFloat 1.41µs ± 1% 1.43µs ± 0% +1.01% (p=0.000 n=10+10) FmtManyArgs 3.66µs ± 1% 3.46µs ± 0% -5.43% (p=0.000 n=9+10) GobDecode 38.8ms ± 1% 37.8ms ± 0% -2.50% (p=0.000 n=10+8) GobEncode 26.3ms ± 1% 26.3ms ± 0% ~ (p=0.853 n=10+10) Gzip 1.16s ± 1% 1.16s ± 0% -0.37% (p=0.008 n=10+9) Gunzip 210ms ± 0% 208ms ± 1% -1.01% (p=0.000 n=10+10) JSONEncode 48.0ms ± 0% 48.1ms ± 1% +0.29% (p=0.019 n=9+9) JSONDecode 348ms ± 1% 326ms ± 1% -6.34% (p=0.000 n=10+10) Mandelbrot200 6.62ms ± 0% 6.64ms ± 0% +0.37% (p=0.000 n=7+9) GoParse 23.9ms ± 1% 24.7ms ± 1% +2.98% (p=0.000 n=9+9) RegexpMatchEasy0_32 555ns ± 0% 561ns ± 0% +1.10% (p=0.000 n=8+10) RegexpMatchEasy0_1K 3.94µs ± 1% 3.94µs ± 0% ~ (p=0.906 n=9+8) RegexpMatchEasy1_32 516ns ± 0% 524ns ± 0% +1.51% (p=0.000 n=9+10) RegexpMatchEasy1_1K 4.39µs ± 1% 4.40µs ± 1% ~ (p=0.171 n=10+10) RegexpMatchMedium_32 25.1ns ± 0% 25.5ns ± 0% +1.51% (p=0.000 n=9+8) RegexpMatchMedium_1K 196µs ± 0% 203µs ± 1% +3.23% (p=0.000 n=9+10) RegexpMatchHard_32 11.2µs ± 1% 11.6µs ± 1% +3.62% (p=0.000 n=10+10) RegexpMatchHard_1K 334µs ± 1% 348µs ± 1% +4.21% (p=0.000 n=9+10) Revcomp 2.39s ± 0% 2.41s ± 0% +0.78% (p=0.000 n=8+9) Template 385ms ± 1% 336ms ± 0% -12.61% (p=0.000 n=10+9) TimeParse 2.18µs ± 1% 2.18µs ± 1% ~ (p=0.424 n=10+10) TimeFormat 2.28µs ± 1% 2.22µs ± 1% -2.30% (p=0.000 n=10+10) name old speed new speed delta GobDecode 19.8MB/s ± 1% 20.3MB/s ± 0% +2.56% (p=0.000 n=10+8) GobEncode 29.1MB/s ± 1% 29.2MB/s ± 0% ~ (p=0.810 n=10+10) Gzip 16.7MB/s ± 1% 16.8MB/s ± 0% +0.37% (p=0.007 n=10+9) Gunzip 92.2MB/s ± 0% 93.2MB/s ± 1% +1.03% (p=0.000 n=10+10) JSONEncode 40.4MB/s ± 0% 40.3MB/s ± 1% -0.28% (p=0.025 n=9+9) JSONDecode 5.58MB/s ± 1% 5.96MB/s ± 1% +6.80% (p=0.000 n=10+10) GoParse 2.42MB/s ± 0% 2.35MB/s ± 1% -2.83% (p=0.000 n=8+9) RegexpMatchEasy0_32 57.7MB/s ± 0% 57.0MB/s ± 0% -1.09% (p=0.000 n=8+10) RegexpMatchEasy0_1K 260MB/s ± 1% 260MB/s ± 0% ~ (p=0.963 n=9+8) RegexpMatchEasy1_32 62.1MB/s ± 0% 61.1MB/s ± 0% -1.53% (p=0.000 n=10+10) RegexpMatchEasy1_1K 233MB/s ± 1% 233MB/s ± 1% ~ (p=0.190 n=10+10) RegexpMatchMedium_32 39.8MB/s ± 0% 39.1MB/s ± 1% -1.74% (p=0.000 n=9+10) RegexpMatchMedium_1K 5.21MB/s ± 0% 5.05MB/s ± 1% -3.09% (p=0.000 n=9+10) RegexpMatchHard_32 2.86MB/s ± 1% 2.76MB/s ± 1% -3.43% (p=0.000 n=10+10) RegexpMatchHard_1K 3.06MB/s ± 1% 2.94MB/s ± 1% -4.06% (p=0.000 n=9+10) Revcomp 106MB/s ± 0% 105MB/s ± 0% -0.77% (p=0.000 n=8+9) Template 5.04MB/s ± 1% 5.77MB/s ± 0% +14.48% (p=0.000 n=10+9) Updates #32591 Change-Id: Id567e14a788e359248b2129ef1cf0adc8cc4ab7f Reviewed-on: https://go-review.googlesource.com/c/go/+/183457 Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Richard Musiol <neelance@gmail.com>
Currently, every call to mem() incurs a new DataView object. This was necessary because the wasm linear memory could grow at any time. Now, whenever the memory grows, we make a call to the front-end. This allows us to reuse the existing DataView object and create a new one only when the memory actually grows. This gives us a boost in performance during DOM operations, while incurring an extra trip to front-end when memory grows. However, since the GrowMemory calls are meant to decrease over the runtime of an application, this is a good tradeoff in the long run. The benchmarks have been tested inside a browser (Google Chrome 75.0.3770.90 (Official Build) (64-bit)). It is hard to get stable nos. for DOM operations since the jumps make the timing very unreliable. But overall, it shows a clear gain. name old time/op new time/op delta DOM 135µs ±26% 84µs ±10% -37.22% (p=0.000 n=10+9) Go1 benchmarks do not show any noticeable degradation: name old time/op new time/op delta BinaryTree17 22.5s ± 0% 22.5s ± 0% ~ (p=0.743 n=8+9) Fannkuch11 15.1s ± 0% 15.1s ± 0% +0.17% (p=0.000 n=9+9) FmtFprintfEmpty 324ns ± 1% 303ns ± 0% -6.64% (p=0.000 n=9+10) FmtFprintfString 535ns ± 1% 515ns ± 0% -3.85% (p=0.000 n=10+10) FmtFprintfInt 609ns ± 0% 589ns ± 0% -3.28% (p=0.000 n=10+10) FmtFprintfIntInt 938ns ± 0% 920ns ± 0% -1.92% (p=0.000 n=9+10) FmtFprintfPrefixedInt 950ns ± 0% 924ns ± 0% -2.72% (p=0.000 n=10+9) FmtFprintfFloat 1.41µs ± 1% 1.43µs ± 0% +1.01% (p=0.000 n=10+10) FmtManyArgs 3.66µs ± 1% 3.46µs ± 0% -5.43% (p=0.000 n=9+10) GobDecode 38.8ms ± 1% 37.8ms ± 0% -2.50% (p=0.000 n=10+8) GobEncode 26.3ms ± 1% 26.3ms ± 0% ~ (p=0.853 n=10+10) Gzip 1.16s ± 1% 1.16s ± 0% -0.37% (p=0.008 n=10+9) Gunzip 210ms ± 0% 208ms ± 1% -1.01% (p=0.000 n=10+10) JSONEncode 48.0ms ± 0% 48.1ms ± 1% +0.29% (p=0.019 n=9+9) JSONDecode 348ms ± 1% 326ms ± 1% -6.34% (p=0.000 n=10+10) Mandelbrot200 6.62ms ± 0% 6.64ms ± 0% +0.37% (p=0.000 n=7+9) GoParse 23.9ms ± 1% 24.7ms ± 1% +2.98% (p=0.000 n=9+9) RegexpMatchEasy0_32 555ns ± 0% 561ns ± 0% +1.10% (p=0.000 n=8+10) RegexpMatchEasy0_1K 3.94µs ± 1% 3.94µs ± 0% ~ (p=0.906 n=9+8) RegexpMatchEasy1_32 516ns ± 0% 524ns ± 0% +1.51% (p=0.000 n=9+10) RegexpMatchEasy1_1K 4.39µs ± 1% 4.40µs ± 1% ~ (p=0.171 n=10+10) RegexpMatchMedium_32 25.1ns ± 0% 25.5ns ± 0% +1.51% (p=0.000 n=9+8) RegexpMatchMedium_1K 196µs ± 0% 203µs ± 1% +3.23% (p=0.000 n=9+10) RegexpMatchHard_32 11.2µs ± 1% 11.6µs ± 1% +3.62% (p=0.000 n=10+10) RegexpMatchHard_1K 334µs ± 1% 348µs ± 1% +4.21% (p=0.000 n=9+10) Revcomp 2.39s ± 0% 2.41s ± 0% +0.78% (p=0.000 n=8+9) Template 385ms ± 1% 336ms ± 0% -12.61% (p=0.000 n=10+9) TimeParse 2.18µs ± 1% 2.18µs ± 1% ~ (p=0.424 n=10+10) TimeFormat 2.28µs ± 1% 2.22µs ± 1% -2.30% (p=0.000 n=10+10) name old speed new speed delta GobDecode 19.8MB/s ± 1% 20.3MB/s ± 0% +2.56% (p=0.000 n=10+8) GobEncode 29.1MB/s ± 1% 29.2MB/s ± 0% ~ (p=0.810 n=10+10) Gzip 16.7MB/s ± 1% 16.8MB/s ± 0% +0.37% (p=0.007 n=10+9) Gunzip 92.2MB/s ± 0% 93.2MB/s ± 1% +1.03% (p=0.000 n=10+10) JSONEncode 40.4MB/s ± 0% 40.3MB/s ± 1% -0.28% (p=0.025 n=9+9) JSONDecode 5.58MB/s ± 1% 5.96MB/s ± 1% +6.80% (p=0.000 n=10+10) GoParse 2.42MB/s ± 0% 2.35MB/s ± 1% -2.83% (p=0.000 n=8+9) RegexpMatchEasy0_32 57.7MB/s ± 0% 57.0MB/s ± 0% -1.09% (p=0.000 n=8+10) RegexpMatchEasy0_1K 260MB/s ± 1% 260MB/s ± 0% ~ (p=0.963 n=9+8) RegexpMatchEasy1_32 62.1MB/s ± 0% 61.1MB/s ± 0% -1.53% (p=0.000 n=10+10) RegexpMatchEasy1_1K 233MB/s ± 1% 233MB/s ± 1% ~ (p=0.190 n=10+10) RegexpMatchMedium_32 39.8MB/s ± 0% 39.1MB/s ± 1% -1.74% (p=0.000 n=9+10) RegexpMatchMedium_1K 5.21MB/s ± 0% 5.05MB/s ± 1% -3.09% (p=0.000 n=9+10) RegexpMatchHard_32 2.86MB/s ± 1% 2.76MB/s ± 1% -3.43% (p=0.000 n=10+10) RegexpMatchHard_1K 3.06MB/s ± 1% 2.94MB/s ± 1% -4.06% (p=0.000 n=9+10) Revcomp 106MB/s ± 0% 105MB/s ± 0% -0.77% (p=0.000 n=8+9) Template 5.04MB/s ± 1% 5.77MB/s ± 0% +14.48% (p=0.000 n=10+9) Updates golang#32591 Change-Id: Id567e14a788e359248b2129ef1cf0adc8cc4ab7f Reviewed-on: https://go-review.googlesource.com/c/go/+/183457 Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Richard Musiol <neelance@gmail.com>
Currently, every call to mem() incurs a new DataView object. This was necessary because the wasm linear memory could grow at any time. Now, whenever the memory grows, we make a call to the front-end. This allows us to reuse the existing DataView object and create a new one only when the memory actually grows. This gives us a boost in performance during DOM operations, while incurring an extra trip to front-end when memory grows. However, since the GrowMemory calls are meant to decrease over the runtime of an application, this is a good tradeoff in the long run. The benchmarks have been tested inside a browser (Google Chrome 75.0.3770.90 (Official Build) (64-bit)). It is hard to get stable nos. for DOM operations since the jumps make the timing very unreliable. But overall, it shows a clear gain. name old time/op new time/op delta DOM 135µs ±26% 84µs ±10% -37.22% (p=0.000 n=10+9) Go1 benchmarks do not show any noticeable degradation: name old time/op new time/op delta BinaryTree17 22.5s ± 0% 22.5s ± 0% ~ (p=0.743 n=8+9) Fannkuch11 15.1s ± 0% 15.1s ± 0% +0.17% (p=0.000 n=9+9) FmtFprintfEmpty 324ns ± 1% 303ns ± 0% -6.64% (p=0.000 n=9+10) FmtFprintfString 535ns ± 1% 515ns ± 0% -3.85% (p=0.000 n=10+10) FmtFprintfInt 609ns ± 0% 589ns ± 0% -3.28% (p=0.000 n=10+10) FmtFprintfIntInt 938ns ± 0% 920ns ± 0% -1.92% (p=0.000 n=9+10) FmtFprintfPrefixedInt 950ns ± 0% 924ns ± 0% -2.72% (p=0.000 n=10+9) FmtFprintfFloat 1.41µs ± 1% 1.43µs ± 0% +1.01% (p=0.000 n=10+10) FmtManyArgs 3.66µs ± 1% 3.46µs ± 0% -5.43% (p=0.000 n=9+10) GobDecode 38.8ms ± 1% 37.8ms ± 0% -2.50% (p=0.000 n=10+8) GobEncode 26.3ms ± 1% 26.3ms ± 0% ~ (p=0.853 n=10+10) Gzip 1.16s ± 1% 1.16s ± 0% -0.37% (p=0.008 n=10+9) Gunzip 210ms ± 0% 208ms ± 1% -1.01% (p=0.000 n=10+10) JSONEncode 48.0ms ± 0% 48.1ms ± 1% +0.29% (p=0.019 n=9+9) JSONDecode 348ms ± 1% 326ms ± 1% -6.34% (p=0.000 n=10+10) Mandelbrot200 6.62ms ± 0% 6.64ms ± 0% +0.37% (p=0.000 n=7+9) GoParse 23.9ms ± 1% 24.7ms ± 1% +2.98% (p=0.000 n=9+9) RegexpMatchEasy0_32 555ns ± 0% 561ns ± 0% +1.10% (p=0.000 n=8+10) RegexpMatchEasy0_1K 3.94µs ± 1% 3.94µs ± 0% ~ (p=0.906 n=9+8) RegexpMatchEasy1_32 516ns ± 0% 524ns ± 0% +1.51% (p=0.000 n=9+10) RegexpMatchEasy1_1K 4.39µs ± 1% 4.40µs ± 1% ~ (p=0.171 n=10+10) RegexpMatchMedium_32 25.1ns ± 0% 25.5ns ± 0% +1.51% (p=0.000 n=9+8) RegexpMatchMedium_1K 196µs ± 0% 203µs ± 1% +3.23% (p=0.000 n=9+10) RegexpMatchHard_32 11.2µs ± 1% 11.6µs ± 1% +3.62% (p=0.000 n=10+10) RegexpMatchHard_1K 334µs ± 1% 348µs ± 1% +4.21% (p=0.000 n=9+10) Revcomp 2.39s ± 0% 2.41s ± 0% +0.78% (p=0.000 n=8+9) Template 385ms ± 1% 336ms ± 0% -12.61% (p=0.000 n=10+9) TimeParse 2.18µs ± 1% 2.18µs ± 1% ~ (p=0.424 n=10+10) TimeFormat 2.28µs ± 1% 2.22µs ± 1% -2.30% (p=0.000 n=10+10) name old speed new speed delta GobDecode 19.8MB/s ± 1% 20.3MB/s ± 0% +2.56% (p=0.000 n=10+8) GobEncode 29.1MB/s ± 1% 29.2MB/s ± 0% ~ (p=0.810 n=10+10) Gzip 16.7MB/s ± 1% 16.8MB/s ± 0% +0.37% (p=0.007 n=10+9) Gunzip 92.2MB/s ± 0% 93.2MB/s ± 1% +1.03% (p=0.000 n=10+10) JSONEncode 40.4MB/s ± 0% 40.3MB/s ± 1% -0.28% (p=0.025 n=9+9) JSONDecode 5.58MB/s ± 1% 5.96MB/s ± 1% +6.80% (p=0.000 n=10+10) GoParse 2.42MB/s ± 0% 2.35MB/s ± 1% -2.83% (p=0.000 n=8+9) RegexpMatchEasy0_32 57.7MB/s ± 0% 57.0MB/s ± 0% -1.09% (p=0.000 n=8+10) RegexpMatchEasy0_1K 260MB/s ± 1% 260MB/s ± 0% ~ (p=0.963 n=9+8) RegexpMatchEasy1_32 62.1MB/s ± 0% 61.1MB/s ± 0% -1.53% (p=0.000 n=10+10) RegexpMatchEasy1_1K 233MB/s ± 1% 233MB/s ± 1% ~ (p=0.190 n=10+10) RegexpMatchMedium_32 39.8MB/s ± 0% 39.1MB/s ± 1% -1.74% (p=0.000 n=9+10) RegexpMatchMedium_1K 5.21MB/s ± 0% 5.05MB/s ± 1% -3.09% (p=0.000 n=9+10) RegexpMatchHard_32 2.86MB/s ± 1% 2.76MB/s ± 1% -3.43% (p=0.000 n=10+10) RegexpMatchHard_1K 3.06MB/s ± 1% 2.94MB/s ± 1% -4.06% (p=0.000 n=9+10) Revcomp 106MB/s ± 0% 105MB/s ± 0% -0.77% (p=0.000 n=8+9) Template 5.04MB/s ± 1% 5.77MB/s ± 0% +14.48% (p=0.000 n=10+9) Updates golang#32591 Change-Id: Id567e14a788e359248b2129ef1cf0adc8cc4ab7f Reviewed-on: https://go-review.googlesource.com/c/go/+/183457 Run-TryBot: Agniva De Sarker <agniva.quicksilver@gmail.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Richard Musiol <neelance@gmail.com>
In my experience, a large portion of the time spent in trivial As of May 2021, in cases like function calls, it might be significantly faster to cache the result of |
I was porting some frontend Go code to be compiled to WebAssembly instead of GopherJS, and noticed the performance was noticeably reduced. The Go code in question makes a lot of DOM manipulation calls and queries, so I decided to benchmark the performance of making calls from WebAssembly to the JavaScript APIs via
syscall/js
.I found it's approximately 10x slower than native JavaScript.
Results of running a benchmark in Chrome 75.0.3770.80 on macOS 10.14.5:
Here's the benchmark code I used, written to be self-contained:
Source Code
main.go
wasm.go
gopherjs.go
I know
syscall/js
is documented as "Its current scope is only to allow tests to run, but not yet to provide a comprehensive API for users", but I wanted to open this issue to discuss the future. Performance is important for Go applications that need to make a lot of calls into the JavaScript world.What is the current state of
syscall/js
performance, and are there known opportunities to improve it?/cc @neelance @cherrymui @hajimehoshi
The text was updated successfully, but these errors were encountered: