crypto/aes: add dedicated asm version of AES, AES-GCM for arm64 #18498

matt2909 · 2017-01-03T10:00:51Z

Add a dedicated asm version of AES, AES-GCM for arm64 - utilizing ARMv8-A crypto extension when available.

It should be noted that an asm accelerated version of this algorithm, utilizing AES-NI when available, exists for amd64.

matt2909 · 2017-01-03T19:07:04Z

A partial implementation seems to have been developed under changelist:
https://go-review.googlesource.com/#/c/32579/

yonderblue · 2017-02-23T05:19:20Z

Any chance to be targeted for 1.9?

bradfitz · 2017-02-23T06:51:03Z

@cherrymui, is this something you could review?

cherrymui · 2017-02-24T01:48:22Z

I am not familiar with the algorithm. I may be able to review in terms of whether the assembly version and the Go version are equivalent. Probably not so fast though.

vielmetti · 2017-03-27T06:02:36Z

There's related work here https://github.com/minio/sha256-simd/ and this open issue there for upstream support: minio/sha256-simd#7

matt2909 · 2017-03-27T12:40:44Z

I agree it's "related" in that they both accelerate crypto things, and they are both nice to haves, but sha256 acceleration should be filed as a separate issue.


ALTree · 2017-10-25T19:23:23Z

A partial fix here: https://go-review.googlesource.com/c/go/+/64490

gopherbot · 2017-11-22T02:11:14Z

Change https://golang.org/cl/64490 mentions this issue: crypto/aes: optimize arm64 AES implementation

gopherbot · 2017-11-22T02:21:12Z

Change https://golang.org/cl/77810 mentions this issue: crypto/aes: implement AES-GCM mode

gopherbot · 2018-03-26T09:09:15Z

Change https://golang.org/cl/102460 mentions this issue: crypto/aes: implement AES-GCM mode(interleave of CTR and GHASH ) for arm64

gopherbot · 2018-04-15T23:28:12Z

Change https://golang.org/cl/107298 mentions this issue: crypto/aes: implement AES-GCM AEAD for arm64

vielmetti · 2018-06-26T22:16:14Z

On arm64, Packet Type 2A / c1.large.arm Cavium ThunderX:

ed@ed-2a-bcc-llvm:~$ go test crypto/cipher -bench GCM

goos: linux
goarch: arm64
pkg: crypto/cipher
BenchmarkAESGCMSeal1K-96           20000             68788 ns/op          14.89 MB/s
BenchmarkAESGCMOpen1K-96           20000             68697 ns/op          14.91 MB/s
BenchmarkAESGCMSign8K-96           10000            182114 ns/op          44.98 MB/s
BenchmarkAESGCMSeal8K-96            3000            536359 ns/op          15.27 MB/s
BenchmarkAESGCMOpen8K-96            3000            537432 ns/op          15.24 MB/s
PASS
ok      crypto/cipher   9.404s
ed@ed-2a-bcc-llvm:~$ 
ed@ed-2a-bcc-llvm:~$ go version
go version go1.10.2 linux/arm64
ed@ed-2a-bcc-llvm:~$ ~/go/bin/go1.11beta1 test crypto/cipher -bench GCM
goos: linux
goarch: arm64
pkg: crypto/cipher
BenchmarkAESGCMSeal1K-96           50000             37520 ns/op          27.29 MB/s
BenchmarkAESGCMOpen1K-96           50000             37550 ns/op          27.27 MB/s
BenchmarkAESGCMSign8K-96           10000            172278 ns/op          47.55 MB/s
BenchmarkAESGCMSeal8K-96            5000            289794 ns/op          28.27 MB/s
BenchmarkAESGCMOpen8K-96            5000            288511 ns/op          28.39 MB/s
PASS
ok      crypto/cipher   9.274s

1.11beta1 is substantially faster than 1.10.2.
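To put a number on "substantially", the reported figures can be cross-checked with a little arithmetic. The sketch below assumes the 1K benchmarks process 1024 bytes per operation and that MB/s means 10^6 bytes per second, which is consistent with the output above; the function names `mbPerSec` and `speedup` are hypothetical.

```go
package main

import "fmt"

// mbPerSec converts a benchmark result into MB/s (10^6 bytes per second).
// bytes/ns equals 10^9 bytes/s, which is 10^3 MB/s, hence the factor 1e3.
func mbPerSec(bytesPerOp int, nsPerOp float64) float64 {
	return float64(bytesPerOp) / nsPerOp * 1e3
}

// speedup returns how many times faster the new result is than the old one.
func speedup(oldNsPerOp, newNsPerOp float64) float64 {
	return oldNsPerOp / newNsPerOp
}

func main() {
	// ns/op values copied from the two benchmark runs above.
	fmt.Printf("Seal1K go1.10.2:  %.2f MB/s\n", mbPerSec(1024, 68788))
	fmt.Printf("Seal1K 1.11beta1: %.2f MB/s\n", mbPerSec(1024, 37520))
	fmt.Printf("Seal1K speedup:   %.2fx\n", speedup(68788, 37520))
	fmt.Printf("Seal8K speedup:   %.2fx\n", speedup(536359, 289794))
}
```

On these numbers Seal and Open are a bit under twice as fast on this machine, while Sign8K changes only slightly (182114 ns/op to 172278 ns/op).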

jared2501 · 2018-07-19T06:51:54Z

So excited to see this merge! https://go-review.googlesource.com/c/go/+/107298

Use the dedicated AES* and PMULL* instructions to accelerate AES-GCM

name              old time/op    new time/op    delta
AESGCMSeal1K-46   12.1µs ± 0%    0.9µs ± 0%     -92.66%  (p=0.000 n=9+10)
AESGCMOpen1K-46   12.1µs ± 0%    0.9µs ± 0%     -92.43%  (p=0.000 n=10+10)
AESGCMSign8K-46   58.6µs ± 0%    2.1µs ± 0%     -96.41%  (p=0.000 n=9+8)
AESGCMSeal8K-46   92.8µs ± 0%    5.7µs ± 0%     -93.86%  (p=0.000 n=9+9)
AESGCMOpen8K-46   92.9µs ± 0%    5.7µs ± 0%     -93.84%  (p=0.000 n=8+9)

name              old speed      new speed        delta
AESGCMSeal1K-46   84.7MB/s ± 0%  1153.4MB/s ± 0%  +1262.21%  (p=0.000 n=9+10)
AESGCMOpen1K-46   84.4MB/s ± 0%  1115.2MB/s ± 0%  +1220.53%  (p=0.000 n=10+10)
AESGCMSign8K-46   140MB/s ± 0%   3894MB/s ± 0%    +2687.50%  (p=0.000 n=9+10)
AESGCMSeal8K-46   88.2MB/s ± 0%  1437.5MB/s ± 0%  +1529.30%  (p=0.000 n=9+9)
AESGCMOpen8K-46   88.2MB/s ± 0%  1430.5MB/s ± 0%  +1522.01%  (p=0.000 n=8+9)

This change mirrors the current amd64 implementation, and provides optimal performance on a range of arm64 processors including Centriq 2400 and Apple A12. By and large it is implicitly tested by the robustness of the already existing amd64 implementation.

The implementation interleaves GHASH with CTR mode to achieve the highest possible throughput, it also aggregates GHASH with a factor of 8, to decrease the cost of the reduction step.

Even though there is a significant amount of assembly, the code reuses the go code for the amd64 implementation, so there is little additional go code.

Since AES-GCM is critical for performance of all web servers, this change is required to level the playfield for arm64 CPUs, where amd64 currently enjoys an unfair advantage.

Ideally both amd64 and arm64 codepaths could be replaced by hypothetical AES and CLMUL intrinsics, with a few additional vector instructions.

Fixes #18498
Fixes #19840

Change-Id: Icc57b868cd1f67ac695c1ac163a8e215f74c7910
Reviewed-on: https://go-review.googlesource.com/107298
Run-TryBot: Vlad Krasnov <vlad@cloudflare.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>

jared2501 · 2018-07-20T06:49:32Z

Thanks so much for the hard work on this!


matt2909 mentioned this issue Jan 3, 2017

crypto/aes: add dedicated asm version of AES-GCM for Power/ARM64 #12408

Closed

bradfitz added this to the Unplanned milestone Jan 3, 2017

bradfitz added the Performance label Jan 3, 2017

ncw mentioned this issue Jan 10, 2017

Crypto Performance on ARMv8 rclone/rclone#1013

Closed

bradfitz modified the milestones: Go1.9, Unplanned Feb 23, 2017

yonderblue mentioned this issue Feb 24, 2017

Portable gocryptfs rfjakob/gocryptfs#79

Closed

vielmetti mentioned this issue Mar 27, 2017

Go 1.7 implemented AVX2 support should we do ARM64 support upstream? minio/sha256-simd#7

Closed

ALTree modified the milestones: Go1.10, Go1.9 Jun 3, 2017

bdarnell mentioned this issue Aug 6, 2017

crypto/tls: Export TLS default cipher suites #21167

Closed

matt2909 mentioned this issue Nov 21, 2017

crypto/aes: linux/arm64 Go 1.9 performance is +20X slower than OpenSSL #22808

Closed

rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017

xiaokangwang mentioned this issue Jan 14, 2018

Vmess: Use aes-128-gcm by default on arm64 platform v2ray/v2ray-core#812

Closed

losfair mentioned this issue Feb 17, 2018

extreamly high cpu consumption on arm device v2ray/v2ray-core#866

Closed

gopherbot closed this as completed in 917e726 Mar 6, 2018

bassosimone mentioned this issue Jun 18, 2019

ndt7: experiment with larger messages m-lab/ndt-server#130

Closed

golang locked and limited conversation to collaborators Jul 20, 2019

gopherbot added the FrozenDueToAge label Jul 20, 2019
