cmd/go: add GOMIPS32, GOMIPS64 ISA levels (iii, r1, r2, r5, r6) #60072

Open
HeliC829 opened this issue May 9, 2023 · 21 comments

Comments

@HeliC829
Contributor

HeliC829 commented May 9, 2023

Currently GOMIPS64 accepts hardfloat (the default) and softfloat.

Go currently supports MIPS III or higher. I have submitted two CLs, CL 485635 and CL 485595, which bring a small performance improvement on MIPS64. However, the instructions they use are only available from R1 onward, not in MIPS III.

So if we want to get more performance out of mips64, we need to support more ISA levels.

We would like GOMIPS to accept r2/r5 as well.

I tried introducing some instructions from MIPS R2. The following data shows the test results and the performance improvement we could get by supporting a newer ISA level on mips64x.

goos: linux
goarch: mips64le
pkg: crypto/tls
                                                 │    oldtls     │               newtls                │
                                                 │    sec/op     │    sec/op     vs base               │
CertCache/0-4                                       5.839m ±  6%   6.417m ±  8%   +9.91% (p=0.001 n=8)
CertCache/1-4                                       6.277m ±  6%   6.246m ±  7%        ~ (p=0.721 n=8)
CertCache/2-4                                       6.119m ± 14%   6.305m ±  7%   +3.04% (p=0.050 n=8)
CertCache/3-4                                       6.115m ± 10%   6.542m ± 11%   +6.98% (p=0.038 n=8)
HandshakeServer/RSA-4                               6.293m ±  1%   6.214m ±  0%   -1.26% (p=0.002 n=8)
HandshakeServer/ECDHE-P256-RSA/TLSv13-4             11.57m ±  0%   11.34m ±  1%   -1.98% (p=0.010 n=8)
HandshakeServer/ECDHE-P256-RSA/TLSv12-4             10.89m ±  0%   10.79m ±  0%   -0.88% (p=0.000 n=8)
HandshakeServer/ECDHE-P256-ECDSA-P256/TLSv13-4      7.247m ±  1%   7.008m ±  1%   -3.29% (p=0.007 n=8)
HandshakeServer/ECDHE-P256-ECDSA-P256/TLSv12-4      6.592m ±  0%   6.496m ±  0%   -1.46% (p=0.000 n=8)
HandshakeServer/ECDHE-X25519-ECDSA-P256/TLSv13-4    5.356m ±  3%   5.172m ±  3%   -3.45% (p=0.015 n=8)
HandshakeServer/ECDHE-X25519-ECDSA-P256/TLSv12-4    4.686m ±  1%   4.566m ±  0%   -2.56% (p=0.000 n=8)
HandshakeServer/ECDHE-P521-ECDSA-P521/TLSv13-4      220.2m ±  0%   217.4m ±  0%   -1.26% (p=0.000 n=8)
HandshakeServer/ECDHE-P521-ECDSA-P521/TLSv12-4      219.6m ±  0%   216.9m ±  0%   -1.25% (p=0.000 n=8)
Throughput/MaxPacket/1MB/TLSv12-4                   519.1m ±  1%   148.1m ±  2%  -71.47% (p=0.000 n=8)
Throughput/MaxPacket/1MB/TLSv13-4                   537.9m ±  0%   164.5m ±  1%  -69.42% (p=0.000 n=8)
Throughput/MaxPacket/2MB/TLSv12-4                  1028.5m ±  0%   279.3m ±  1%  -72.85% (p=0.000 n=8)
Throughput/MaxPacket/2MB/TLSv13-4                  1063.3m ±  0%   313.1m ±  1%  -70.56% (p=0.000 n=8)
Throughput/MaxPacket/4MB/TLSv12-4                  2036.4m ±  0%   552.1m ±  1%  -72.89% (p=0.000 n=8)
Throughput/MaxPacket/4MB/TLSv13-4                  2106.4m ±  0%   614.5m ±  1%  -70.83% (p=0.000 n=8)
Throughput/MaxPacket/8MB/TLSv12-4                    4.064 ±  0%    1.080 ±  4%  -73.43% (p=0.000 n=8)
Throughput/MaxPacket/8MB/TLSv13-4                    4.198 ±  0%    1.212 ±  7%  -71.12% (p=0.000 n=8)
Throughput/MaxPacket/16MB/TLSv12-4                   8.115 ±  1%    2.202 ±  7%  -72.87% (p=0.000 n=8)
Throughput/MaxPacket/16MB/TLSv13-4                   8.383 ±  0%    2.403 ±  1%  -71.33% (p=0.000 n=8)
Throughput/MaxPacket/32MB/TLSv12-4                  16.198 ±  0%    4.283 ±  0%  -73.56% (p=0.000 n=8)
Throughput/MaxPacket/32MB/TLSv13-4                  16.763 ±  0%    4.792 ±  1%  -71.42% (p=0.000 n=8)
Throughput/MaxPacket/64MB/TLSv12-4                  32.388 ±  0%    8.603 ±  2%  -73.44% (p=0.000 n=8)
Throughput/MaxPacket/64MB/TLSv13-4                  33.502 ±  0%    9.636 ±  1%  -71.24% (p=0.000 n=8)
Throughput/DynamicPacket/1MB/TLSv12-4               514.2m ±  1%   146.3m ±  1%  -71.55% (p=0.000 n=8)
Throughput/DynamicPacket/1MB/TLSv13-4               531.9m ±  1%   162.4m ±  2%  -69.47% (p=0.000 n=8)
Throughput/DynamicPacket/2MB/TLSv12-4              1019.8m ±  3%   279.2m ±  3%  -72.62% (p=0.000 n=8)
Throughput/DynamicPacket/2MB/TLSv13-4              1056.9m ±  0%   311.2m ±  1%  -70.56% (p=0.000 n=8)
Throughput/DynamicPacket/4MB/TLSv12-4              2031.2m ±  1%   547.3m ±  1%  -73.06% (p=0.000 n=8)
Throughput/DynamicPacket/4MB/TLSv13-4              2102.5m ±  0%   608.2m ±  1%  -71.07% (p=0.000 n=8)
Throughput/DynamicPacket/8MB/TLSv12-4                4.053 ±  0%    1.082 ±  1%  -73.31% (p=0.000 n=8)
Throughput/DynamicPacket/8MB/TLSv13-4                4.193 ±  0%    1.216 ±  1%  -70.99% (p=0.000 n=8)
Throughput/DynamicPacket/16MB/TLSv12-4               8.104 ±  1%    2.151 ±  2%  -73.46% (p=0.000 n=8)
Throughput/DynamicPacket/16MB/TLSv13-4               8.388 ±  0%    2.406 ±  1%  -71.32% (p=0.000 n=8)
Throughput/DynamicPacket/32MB/TLSv12-4              16.202 ±  0%    4.287 ±  1%  -73.54% (p=0.000 n=8)
Throughput/DynamicPacket/32MB/TLSv13-4              16.761 ±  0%    4.869 ±  2%  -70.95% (p=0.000 n=8)
Throughput/DynamicPacket/64MB/TLSv12-4              32.394 ±  0%    8.589 ±  2%  -73.49% (p=0.000 n=8)
Throughput/DynamicPacket/64MB/TLSv13-4              33.500 ±  0%    9.610 ±  3%  -71.31% (p=0.000 n=8)
Latency/MaxPacket/200kbps/TLSv12-4                  719.9m ±  0%   712.3m ±  0%   -1.06% (p=0.000 n=8)
Latency/MaxPacket/200kbps/TLSv13-4                  722.7m ±  0%   714.5m ±  0%   -1.13% (p=0.000 n=8)
Latency/MaxPacket/500kbps/TLSv12-4                  303.9m ±  0%   296.1m ±  0%   -2.57% (p=0.000 n=8)
Latency/MaxPacket/500kbps/TLSv13-4                  304.5m ±  0%   296.3m ±  0%   -2.68% (p=0.000 n=8)
Latency/MaxPacket/1000kbps/TLSv12-4                 165.5m ±  0%   157.5m ±  0%   -4.85% (p=0.000 n=8)
Latency/MaxPacket/1000kbps/TLSv13-4                 165.0m ±  0%   156.6m ±  0%   -5.07% (p=0.000 n=8)
Latency/MaxPacket/2000kbps/TLSv12-4                 96.05m ±  0%   88.17m ±  0%   -8.21% (p=0.000 n=8)
Latency/MaxPacket/2000kbps/TLSv13-4                 95.48m ±  0%   87.23m ±  0%   -8.65% (p=0.000 n=8)
Latency/MaxPacket/5000kbps/TLSv12-4                 54.42m ±  1%   46.43m ±  0%  -14.68% (p=0.000 n=8)
Latency/MaxPacket/5000kbps/TLSv13-4                 54.75m ±  0%   46.36m ±  0%  -15.33% (p=0.000 n=8)
Latency/DynamicPacket/200kbps/TLSv12-4              152.4m ±  0%   149.2m ±  0%   -2.13% (p=0.000 n=8)
Latency/DynamicPacket/200kbps/TLSv13-4              153.8m ±  0%   151.6m ±  0%   -1.48% (p=0.000 n=8)
Latency/DynamicPacket/500kbps/TLSv12-4              73.47m ±  0%   69.92m ±  1%   -4.84% (p=0.000 n=8)
Latency/DynamicPacket/500kbps/TLSv13-4              72.63m ±  1%   70.06m ±  0%   -3.54% (p=0.000 n=8)
Latency/DynamicPacket/1000kbps/TLSv12-4             47.15m ±  0%   43.59m ±  0%   -7.55% (p=0.000 n=8)
Latency/DynamicPacket/1000kbps/TLSv13-4             45.26m ±  1%   42.60m ±  1%   -5.88% (p=0.000 n=8)
Latency/DynamicPacket/2000kbps/TLSv12-4             33.88m ±  0%   30.25m ±  0%  -10.70% (p=0.000 n=8)
Latency/DynamicPacket/2000kbps/TLSv13-4             31.90m ±  1%   29.36m ±  0%   -7.96% (p=0.000 n=8)
Latency/DynamicPacket/5000kbps/TLSv12-4             25.60m ±  0%   21.99m ±  1%  -14.12% (p=0.000 n=8)
Latency/DynamicPacket/5000kbps/TLSv13-4             24.41m ±  0%   21.93m ±  1%  -10.19% (p=0.000 n=8)
geomean                                             346.1m         188.9m        -45.43%

                                       │    oldtls     │                newtls                 │
                                       │      B/s      │      B/s       vs base                │
Throughput/MaxPacket/1MB/TLSv12-4        1.926Mi ±  1%   6.752Mi ± 13%  +250.50% (p=0.000 n=8)
Throughput/MaxPacket/1MB/TLSv13-4        1.860Mi ±  1%   6.080Mi ±  4%  +226.92% (p=0.000 n=8)
Throughput/MaxPacket/2MB/TLSv12-4        1.945Mi ±  0%   7.162Mi ±  3%  +268.14% (p=0.000 n=8)
Throughput/MaxPacket/2MB/TLSv13-4        1.884Mi ±  4%   6.390Mi ± 21%  +239.24% (p=0.000 n=8)
Throughput/MaxPacket/4MB/TLSv12-4        1.965Mi ±  2%   7.243Mi ±  4%  +268.69% (p=0.000 n=8)
Throughput/MaxPacket/4MB/TLSv13-4        1.898Mi ±  1%   6.509Mi ±  5%  +242.96% (p=0.000 n=8)
Throughput/MaxPacket/8MB/TLSv12-4        1.969Mi ±  0%   7.405Mi ±  6%  +276.03% (p=0.000 n=8)
Throughput/MaxPacket/8MB/TLSv13-4        1.907Mi ±  0%   6.599Mi ± 18%  +246.00% (p=0.000 n=8)
Throughput/MaxPacket/16MB/TLSv12-4       1.974Mi ±  1%   7.262Mi ±  8%  +267.87% (p=0.000 n=8)
Throughput/MaxPacket/16MB/TLSv13-4       1.907Mi ±  4%   6.657Mi ±  2%  +249.00% (p=0.000 n=8)
Throughput/MaxPacket/32MB/TLSv12-4       1.974Mi ±  2%   7.467Mi ±  3%  +278.26% (p=0.000 n=8)
Throughput/MaxPacket/32MB/TLSv13-4       1.907Mi ±  1%   6.680Mi ±  1%  +250.25% (p=0.000 n=8)
Throughput/MaxPacket/64MB/TLSv12-4       1.974Mi ±  2%   7.439Mi ±  3%  +276.81% (p=0.000 n=8)
Throughput/MaxPacket/64MB/TLSv13-4       1.912Mi ±  1%   6.642Mi ±  2%  +247.38% (p=0.000 n=8)
Throughput/DynamicPacket/1MB/TLSv12-4    1.945Mi ± 12%   6.838Mi ±  8%  +251.47% (p=0.000 n=8)
Throughput/DynamicPacket/1MB/TLSv13-4    1.879Mi ±  1%   6.156Mi ±  3%  +227.66% (p=0.000 n=8)
Throughput/DynamicPacket/2MB/TLSv12-4    1.965Mi ± 11%   7.167Mi ± 16%  +264.81% (p=0.000 n=8)
Throughput/DynamicPacket/2MB/TLSv13-4    1.893Mi ±  1%   6.428Mi ±  2%  +239.55% (p=0.000 n=8)
Throughput/DynamicPacket/4MB/TLSv12-4    1.969Mi ±  1%   7.310Mi ±  6%  +271.19% (p=0.000 n=8)
Throughput/DynamicPacket/4MB/TLSv13-4    1.903Mi ±  0%   6.576Mi ±  2%  +245.61% (p=0.000 n=8)
Throughput/DynamicPacket/8MB/TLSv12-4    1.974Mi ±  1%   7.396Mi ± 10%  +274.64% (p=0.000 n=8)
Throughput/DynamicPacket/8MB/TLSv13-4    1.907Mi ±  1%   6.576Mi ±  3%  +244.75% (p=0.000 n=8)
Throughput/DynamicPacket/16MB/TLSv12-4   1.974Mi ±  3%   7.439Mi ±  3%  +276.81% (p=0.000 n=8)
Throughput/DynamicPacket/16MB/TLSv13-4   1.907Mi ±  1%   6.647Mi ±  4%  +248.50% (p=0.000 n=8)
Throughput/DynamicPacket/32MB/TLSv12-4   1.974Mi ±  0%   7.463Mi ± 10%  +278.02% (p=0.000 n=8)
Throughput/DynamicPacket/32MB/TLSv13-4   1.907Mi ±  1%   6.576Mi ±  2%  +244.75% (p=0.000 n=8)
Throughput/DynamicPacket/64MB/TLSv12-4   1.974Mi ±  0%   7.448Mi ±  3%  +277.29% (p=0.000 n=8)
Throughput/DynamicPacket/64MB/TLSv13-4   1.912Mi ±  1%   6.661Mi ±  4%  +248.38% (p=0.000 n=8)
geomean                                  1.931Mi         6.878Mi        +256.13%
goos: linux
goarch: mips64le
pkg: crypto/md5
                      │    oldmd5    │               newmd5               │
                      │    sec/op    │   sec/op     vs base               │
Hash8Bytes-4             2.712µ ± 0%   2.514µ ± 0%   -7.28% (p=0.000 n=8)
Hash64-4                 3.387µ ± 0%   2.999µ ± 0%  -11.46% (p=0.000 n=8)
Hash128-4                4.115µ ± 0%   3.527µ ± 0%  -14.30% (p=0.000 n=8)
Hash256-4                5.569µ ± 0%   4.583µ ± 0%  -17.71% (p=0.000 n=8)
Hash512-4                8.492µ ± 0%   6.709µ ± 0%  -21.00% (p=0.000 n=8)
Hash1K-4                 14.31µ ± 0%   10.94µ ± 0%  -23.57% (p=0.000 n=8)
Hash8K-4                 95.82µ ± 0%   70.18µ ± 0%  -26.76% (p=0.000 n=8)
Hash1M-4                11.933m ± 0%   8.674m ± 0%  -27.31% (p=0.000 n=8)
Hash8M-4                 95.45m ± 0%   69.40m ± 0%  -27.29% (p=0.000 n=8)
Hash8BytesUnaligned-4    2.784µ ± 0%   2.588µ ± 0%   -7.04% (p=0.000 n=8)
Hash1KUnaligned-4        14.31µ ± 0%   10.95µ ± 0%  -23.48% (p=0.000 n=8)
Hash8KUnaligned-4        95.76µ ± 0%   70.23µ ± 0%  -26.66% (p=0.000 n=8)
geomean                  38.51µ        30.88µ       -19.82%

                      │    oldmd5     │                newmd5                │
                      │      B/s      │      B/s       vs base               │
Hash8Bytes-4            2.813Mi ±  0%    3.033Mi ± 0%   +7.80% (p=0.000 n=8)
Hash64-4                18.02Mi ±  0%    20.35Mi ± 0%  +12.91% (p=0.000 n=8)
Hash128-4               29.66Mi ±  0%    34.61Mi ± 0%  +16.69% (p=0.000 n=8)
Hash256-4               43.85Mi ±  0%    53.27Mi ± 0%  +21.50% (p=0.000 n=8)
Hash512-4               57.50Mi ±  0%    72.78Mi ± 0%  +26.59% (p=0.000 n=8)
Hash1K-4                68.25Mi ±  0%    89.30Mi ± 0%  +30.84% (p=0.000 n=8)
Hash8K-4                81.53Mi ±  0%   111.33Mi ± 0%  +36.54% (p=0.000 n=8)
Hash1M-4                83.80Mi ± 28%   115.29Mi ± 0%  +37.58% (p=0.000 n=8)
Hash8M-4                83.82Mi ±  0%   115.27Mi ± 0%  +37.52% (p=0.000 n=8)
Hash8BytesUnaligned-4   2.737Mi ±  0%    2.947Mi ± 0%   +7.67% (p=0.000 n=8)
Hash1KUnaligned-4       68.24Mi ±  0%    89.19Mi ± 0%  +30.69% (p=0.000 n=8)
Hash8KUnaligned-4       81.59Mi ±  0%   111.24Mi ± 0%  +36.34% (p=0.000 n=8)
geomean                 33.84Mi          42.21Mi       +24.72%
goos: linux
goarch: mips64le
pkg: crypto/sha1
                   │   oldsha1   │              newsha1               │
                   │   sec/op    │   sec/op     vs base               │
Hash8Bytes/New-4     5.341µ ± 0%   4.863µ ± 0%   -8.95% (p=0.000 n=8)
Hash8Bytes/Sum-4     5.456µ ± 0%   4.983µ ± 0%   -8.68% (p=0.000 n=8)
Hash320Bytes/New-4   16.69µ ± 0%   13.85µ ± 0%  -17.00% (p=0.000 n=8)
Hash320Bytes/Sum-4   16.81µ ± 0%   13.97µ ± 0%  -16.92% (p=0.000 n=8)
Hash1K/New-4         42.90µ ± 0%   34.81µ ± 0%  -18.87% (p=0.000 n=8)
Hash1K/Sum-4         43.02µ ± 0%   34.94µ ± 0%  -18.80% (p=0.000 n=8)
Hash8K/New-4         309.6µ ± 0%   248.3µ ± 0%  -19.78% (p=0.000 n=8)
Hash8K/Sum-4         309.5µ ± 0%   248.5µ ± 0%  -19.71% (p=0.000 n=8)
geomean              33.11µ        27.75µ       -16.20%

                   │   oldsha1    │               newsha1               │
                   │     B/s      │     B/s       vs base               │
Hash8Bytes/New-4     1.431Mi ± 1%   1.574Mi ± 1%  +10.00% (p=0.000 n=8)
Hash8Bytes/Sum-4     1.402Mi ± 1%   1.535Mi ± 1%   +9.52% (p=0.000 n=8)
Hash320Bytes/New-4   18.29Mi ± 0%   22.04Mi ± 0%  +20.49% (p=0.000 n=8)
Hash320Bytes/Sum-4   18.15Mi ± 0%   21.85Mi ± 0%  +20.39% (p=0.000 n=8)
Hash1K/New-4         22.76Mi ± 1%   28.06Mi ± 0%  +23.25% (p=0.000 n=8)
Hash1K/Sum-4         22.70Mi ± 0%   27.95Mi ± 0%  +23.13% (p=0.000 n=8)
Hash8K/New-4         25.24Mi ± 0%   31.46Mi ± 0%  +24.64% (p=0.000 n=8)
Hash8K/Sum-4         25.24Mi ± 0%   31.44Mi ± 0%  +24.54% (p=0.000 n=8)
geomean              11.03Mi        13.16Mi       +19.35%
goos: linux
goarch: mips64le
pkg: math/bits
                  │   oldbits    │              newbits               │
                  │    sec/op    │   sec/op     vs base               │
LeadingZeros-4      20.505n ± 1%   6.780n ± 0%  -66.93% (p=0.000 n=8)
LeadingZeros8-4     10.040n ± 0%   9.039n ± 0%   -9.98% (p=0.000 n=8)
LeadingZeros16-4    19.085n ± 0%   9.038n ± 0%  -52.64% (p=0.000 n=8)
LeadingZeros32-4     24.13n ± 0%   10.55n ± 0%  -56.28% (p=0.000 n=8)
LeadingZeros64-4    19.660n ± 0%   6.776n ± 0%  -65.54% (p=0.000 n=8)
TrailingZeros-4     13.055n ± 0%   9.037n ± 0%  -30.77% (p=0.000 n=8)
TrailingZeros8-4     7.364n ± 0%   7.364n ± 0%        ~ (p=0.449 n=8)
TrailingZeros16-4    17.07n ± 0%   10.05n ± 0%  -41.14% (p=0.000 n=8)
TrailingZeros32-4   17.405n ± 0%   8.534n ± 0%  -50.97% (p=0.000 n=8)
TrailingZeros64-4   13.050n ± 0%   9.037n ± 0%  -30.75% (p=0.000 n=8)
OnesCount-4          21.09n ± 0%   21.10n ± 0%        ~ (p=0.054 n=8)
OnesCount8-4         6.024n ± 0%   6.024n ± 0%        ~ (p=0.533 n=8)
OnesCount16-4        13.05n ± 0%   13.05n ± 0%        ~ (p=1.000 n=8)
OnesCount32-4        20.08n ± 0%   20.08n ± 0%        ~ (p=0.367 n=8)
OnesCount64-4        23.10n ± 0%   23.11n ± 0%        ~ (p=0.407 n=8)
RotateLeft-4         9.037n ± 0%   4.418n ± 0%  -51.11% (p=0.000 n=8)
RotateLeft8-4        9.537n ± 0%   9.208n ± 0%   -3.45% (p=0.000 n=8)
RotateLeft16-4       9.208n ± 0%   9.375n ± 0%   +1.82% (p=0.000 n=8)
RotateLeft32-4      10.380n ± 0%   4.021n ± 0%  -61.26% (p=0.000 n=8)
RotateLeft64-4       8.034n ± 0%   4.016n ± 0%  -50.01% (p=0.000 n=8)
Reverse-4            62.26n ± 0%   18.08n ± 0%  -70.96% (p=0.000 n=8)
Reverse8-4           5.020n ± 0%   5.021n ± 0%        ~ (p=1.000 n=8)
Reverse16-4          9.036n ± 0%   9.039n ± 0%        ~ (p=0.098 n=8)
Reverse32-4          29.13n ± 0%   23.11n ± 0%  -20.68% (p=0.000 n=8)
Reverse64-4          27.50n ± 0%   21.10n ± 0%  -23.27% (p=0.000 n=8)
ReverseBytes-4      13.970n ± 1%   3.044n ± 1%  -78.21% (p=0.000 n=8)
ReverseBytes16-4     4.297n ± 1%   4.329n ± 1%   +0.74% (p=0.050 n=8)
ReverseBytes32-4    12.050n ± 0%   5.021n ± 0%  -58.34% (p=0.000 n=8)
ReverseBytes64-4    17.220n ± 2%   3.030n ± 0%  -82.40% (p=0.000 n=8)
Add-4                8.178n ± 0%   8.188n ± 0%        ~ (p=0.661 n=8)
Add32-4              8.284n ± 0%   8.285n ± 0%        ~ (p=0.292 n=8)
Add64-4              7.890n ± 1%   7.876n ± 0%        ~ (p=0.522 n=8)
Add64multiple-4      17.08n ± 0%   17.08n ± 0%        ~ (p=0.297 n=8)
Sub-4                9.543n ± 0%   9.540n ± 0%        ~ (p=0.312 n=8)
Sub32-4              13.07n ± 0%   13.05n ± 0%   -0.08% (p=0.011 n=8)
Sub64-4              10.30n ± 0%   10.29n ± 0%        ~ (p=0.080 n=8)
Sub64multiple-4      19.09n ± 0%   19.08n ± 0%   -0.05% (p=0.008 n=8)
Mul-4                5.100n ± 0%   5.097n ± 0%        ~ (p=0.338 n=8)
Mul32-4              7.371n ± 0%   7.363n ± 0%   -0.11% (p=0.000 n=8)
Mul64-4              5.242n ± 0%   5.020n ± 0%   -4.24% (p=0.000 n=8)
Div-4                133.6n ± 0%   118.4n ± 0%  -11.38% (p=0.000 n=8)
Div32-4              15.65n ± 1%   15.41n ± 0%   -1.53% (p=0.000 n=8)
Div64-4              132.7n ± 0%   117.3n ± 1%  -11.53% (p=0.000 n=8)
geomean              13.85n        9.917n       -28.41%
goos: linux
goarch: mips64le
pkg: crypto/sha256
                    │  oldsha256  │             newsha256              │
                    │   sec/op    │   sec/op     vs base               │
Hash8Bytes/New-4      6.689µ ± 0%   6.094µ ± 0%   -8.89% (p=0.000 n=8)
Hash8Bytes/Sum224-4   7.106µ ± 0%   6.507µ ± 0%   -8.43% (p=0.000 n=8)
Hash8Bytes/Sum256-4   7.217µ ± 0%   6.623µ ± 0%   -8.24% (p=0.000 n=8)
Hash1K/New-4          62.66µ ± 0%   52.35µ ± 0%  -16.45% (p=0.000 n=8)
Hash1K/Sum224-4       62.91µ ± 0%   52.75µ ± 0%  -16.16% (p=0.000 n=8)
Hash1K/Sum256-4       63.03µ ± 0%   52.86µ ± 0%  -16.14% (p=0.000 n=8)
Hash8K/New-4          450.8µ ± 0%   373.5µ ± 0%  -17.15% (p=0.000 n=8)
Hash8K/Sum224-4       451.0µ ± 0%   373.9µ ± 0%  -17.10% (p=0.000 n=8)
Hash8K/Sum256-4       451.5µ ± 0%   374.0µ ± 0%  -17.16% (p=0.000 n=8)
geomean               58.34µ        50.14µ       -14.05%

                    │  oldsha256   │              newsha256               │
                    │     B/s      │      B/s       vs base               │
Hash8Bytes/New-4      1.144Mi ± 1%   1.249Mi ±  0%   +9.17% (p=0.000 n=8)
Hash8Bytes/Sum224-4   1.078Mi ± 1%   1.173Mi ±  0%   +8.85% (p=0.000 n=8)
Hash8Bytes/Sum256-4   1.059Mi ± 1%   1.154Mi ±  0%   +9.01% (p=0.000 n=8)
Hash1K/New-4          15.58Mi ± 0%   18.65Mi ± 12%  +19.71% (p=0.000 n=8)
Hash1K/Sum224-4       15.53Mi ± 1%   18.51Mi ±  0%  +19.23% (p=0.000 n=8)
Hash1K/Sum256-4       15.49Mi ± 0%   18.47Mi ±  0%  +19.24% (p=0.000 n=8)
Hash8K/New-4          17.33Mi ± 0%   20.91Mi ±  0%  +20.69% (p=0.000 n=8)
Hash8K/Sum224-4       17.32Mi ± 0%   20.90Mi ±  0%  +20.65% (p=0.000 n=8)
Hash8K/Sum256-4       17.30Mi ± 0%   20.89Mi ±  0%  +20.73% (p=0.000 n=8)
geomean               6.649Mi        7.729Mi        +16.24%
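
As a reading aid for the tables above: the sec/op and B/s columns describe the same measurements, so a relative time change d corresponds to a throughput change of 1/(1+d) - 1. The sketch below is only an illustration of that arithmetic; the helper name `throughputGain` is made up here, and the three deltas are copied from the sec/op tables. The results land close to the reported B/s deltas, with small differences that come from rounding in the printed percentages.

```go
package main

import "fmt"

// throughputGain converts a relative change in time per operation
// (for example -0.7147 for "-71.47%") into the relative change in
// throughput for the same amount of work.
// Hypothetical helper, not part of benchstat or the Go tree.
func throughputGain(timeDelta float64) float64 {
	return 1/(1+timeDelta) - 1
}

func main() {
	// sec/op deltas copied from the benchmark tables above.
	rows := []struct {
		name      string
		timeDelta float64
	}{
		{"crypto/tls Throughput/MaxPacket/1MB/TLSv12", -0.7147},
		{"crypto/md5 Hash1M", -0.2731},
		{"crypto/sha1 Hash8K/New", -0.1978},
	}
	for _, r := range rows {
		fmt.Printf("%-45s %+.2f%% time -> %+.2f%% throughput\n",
			r.name, 100*r.timeDelta, 100*throughputGain(r.timeDelta))
	}
}
```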
@gopherbot gopherbot added this to the Proposal milestone May 9, 2023
@HeliC829
Contributor Author

HeliC829 commented May 9, 2023

cc @cherrymui

@randall77
Contributor

See my comment over at #59415 (comment)

@ianlancetaylor
Contributor

@randall77 I think that your comment has been addressed: the proposal here is permitting setting GOMIPS64 to direct the compiler to generate a few special purpose instructions.

@HeliC829 The GOMIPS64 variable already exists, of course. I think that you are suggesting that we permit a comma-separated list of options in GOMIPS64. The options can be

  • either hardfloat (default) or softfloat
  • one of r1 (default), r2, r3, r5, r6

I added r1 because there has to be a way to specify the default. I added the others because compilers support them. I don't know what happened to r4.

Do you have any reference to what the different ISA levels mean? I couldn't find one.

@HeliC829
Contributor Author

Do you have any reference to what the different ISA levels mean? I couldn't find one.

Here is a reference for the MIPS ISA levels, on page 24 of 148:
https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00083-2B-MIPS64INT-AFP-06.01.pdf

Go currently supports MIPS III on MIPS64. Note that MIPS III is different from MIPS R1, so we could consider 3 as the default level.

To deal with the letters in the ISA level names, I think we can use the mips_isa enum levels defined in GCC in the rules.

@ianlancetaylor
Contributor

I know that the situation is very confusing, but it doesn't seem ideal to treat 3 as the default level while also permitting r1. Can we come up with a list of strings that makes sense today and also for the future?

@HeliC829
Contributor Author

OK, so shall we use the Roman numeral iii to mean the default level, MIPS III? The values and their corresponding ISA levels would be as follows:

iii: MIPS III (default, also the current MIPS64 ISA level)
r1: MIPS R1
r2: MIPS R2
r5: MIPS R5
r6: MIPS R6

@rsc
Contributor

rsc commented Jun 7, 2023

From the doc linked above:

[Screenshot from the linked document, taken 2023-06-07; image not preserved]

It sounds like GOMIPS64 is a comma-separated list of choices: hardfloat, softfloat, iii, r1, r2, r5, r6.
Probably we should define them all: iii, iv, v, r1, r2, r3, r5, r6. We may not use them today but they'll be defined.

Do I have that right?
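
To make the comma-separated form concrete, here is a minimal sketch of how such a value could be parsed. It is not the code from internal/buildcfg or from any of the CLs mentioned in this thread; the type `mipsConfig`, the function `parseGOMIPS64`, and the decision to reject repeated options are all assumptions made for illustration. The defaults (hardfloat, iii) follow the discussion above.

```go
package main

import (
	"fmt"
	"strings"
)

// mipsConfig is a hypothetical holder for a parsed GOMIPS64 value.
type mipsConfig struct {
	Float string // "hardfloat" or "softfloat"
	ISA   string // one of the level names in isaLevels
}

// isaLevels is the full set of names suggested in the comment above.
var isaLevels = map[string]bool{
	"iii": true, "iv": true, "v": true,
	"r1": true, "r2": true, "r3": true, "r5": true, "r6": true,
}

// parseGOMIPS64 parses a comma-separated option list such as
// "softfloat,r2". An empty string yields the defaults. Rejecting
// duplicates is an assumption of this sketch.
func parseGOMIPS64(s string) (mipsConfig, error) {
	cfg := mipsConfig{Float: "hardfloat", ISA: "iii"}
	if s == "" {
		return cfg, nil
	}
	seenFloat, seenISA := false, false
	for _, opt := range strings.Split(s, ",") {
		switch {
		case opt == "hardfloat" || opt == "softfloat":
			if seenFloat {
				return mipsConfig{}, fmt.Errorf("multiple float modes in %q", s)
			}
			cfg.Float, seenFloat = opt, true
		case isaLevels[opt]:
			if seenISA {
				return mipsConfig{}, fmt.Errorf("multiple ISA levels in %q", s)
			}
			cfg.ISA, seenISA = opt, true
		default:
			return mipsConfig{}, fmt.Errorf("unknown option %q in %q", opt, s)
		}
	}
	return cfg, nil
}

func main() {
	for _, v := range []string{"", "softfloat", "r2", "softfloat,r5", "r4", "r2,r6"} {
		cfg, err := parseGOMIPS64(v)
		if err != nil {
			fmt.Printf("GOMIPS64=%q: error: %v\n", v, err)
			continue
		}
		fmt.Printf("GOMIPS64=%q: float=%s isa=%s\n", v, cfg.Float, cfg.ISA)
	}
}
```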

@rsc
Contributor

rsc commented Jun 7, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc changed the title from "proposal: MIPS64: pass ISA level with GOMIPS64" to "cmd/go: add GOMIPS32, GOMIPS64 ISA levels (i, ii, iii, iv, v, r1, r2, r3, r5, r6)" Jun 7, 2023
@Rongronggg9
Member

It sounds like GOMIPS64 is a comma-separated list of choices: hardfloat, softfloat, iii, r1, r2, r5, r6.

Right.

Probably we should define them all: iii, iv, v, r1, r2, r3, r5, r6. We may not use them today but they'll be defined.

Just FYI: in practice there is no MIPS IV hardware running a Linux distribution, and there is no MIPS V hardware implementation at all. Besides, in user space the differences between III, IV and V are tiny. R3 is a significant release, but it only adds privileged instructions and has no visible user-space changes compared to R2. Thus, as a minimum requirement, we think that defining only iii, r1, r2, r5 and r6 should be enough. It is fine to define the other ISA levels as reserved, of course, if there is demand for that.

@rsc
Contributor

rsc commented Jun 14, 2023

In practice since we don't emit code that cares about the difference, GOMIPS32=iii and GOMIPS32=iv and GOMIPS32=v will all mean the same thing, but they exist(ed) and it's easy to include them, so we might as well recognize the full set.

@rsc
Contributor

rsc commented Jun 14, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@Rongronggg9
Member

Rongronggg9 commented Jun 14, 2023

Based on the discussion above, this proposal seems like a likely accept.

Exciting news! Thanks for your review.

GOMIPS32=iii and GOMIPS32=iv and GOMIPS32=v

Did you mean GOMIPS64?

they exist(ed) and it's easy to include them, so we might as well recognize the full set.

Let me summarize:

ISA level | GOMIPS32                             | GOMIPS64
i         | defined, ?                           | N/A
ii        | defined, ?                           | N/A
iii       | N/A                                  | valid, implemented (current default)
iv        | N/A                                  | valid, equivalent to iii
v         | N/A                                  | valid, equivalent to iii
r1        | valid, implemented (current default) | valid, ? [1]
r2        | valid, to be implemented             | valid, to be implemented
r3        | valid, equivalent to r2              | valid, equivalent to r2
r4 [2]    | N/A                                  | N/A
r5        | valid, to be implemented             | valid, to be implemented
r6        | valid, to be implemented             | valid, to be implemented

Footnotes

  1. I think we can make GOMIPS64=r1 equivalent to GOMIPS64=iii for the time being, or separate the r1-compatible optimizations from the GOMIPS64=r2 patch set if that is not too complex.

  2. Does not exist.
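
The equivalences in the summary table above could be captured as a lookup from each accepted name to the level whose code generation it shares. This is only a sketch under assumptions: the names `canonical32`, `canonical64` and `canonicalLevel` are invented here, GOMIPS64=r1 is mapped to iii following the "for the time being" option in footnote 1, and the GOMIPS32 levels i and ii are left out because the table marks them with a question mark.

```go
package main

import "fmt"

// canonical64 maps each GOMIPS64 level name to the level it would be
// treated as, following the summary table. r1 -> iii is the interim
// option from footnote 1, not a settled decision.
var canonical64 = map[string]string{
	"iii": "iii",
	"iv":  "iii",
	"v":   "iii",
	"r1":  "iii",
	"r2":  "r2",
	"r3":  "r2",
	"r5":  "r5",
	"r6":  "r6",
}

// canonical32 does the same for GOMIPS32. Levels i and ii are omitted
// because their status in the table is an open question.
var canonical32 = map[string]string{
	"r1": "r1",
	"r2": "r2",
	"r3": "r2",
	"r5": "r5",
	"r6": "r6",
}

// canonicalLevel returns the canonical level for a name, and false if
// the name is not valid for the given word size (32 or 64).
func canonicalLevel(wordSize int, level string) (string, bool) {
	var table map[string]string
	switch wordSize {
	case 32:
		table = canonical32
	case 64:
		table = canonical64
	default:
		return "", false
	}
	c, ok := table[level]
	return c, ok
}

func main() {
	for _, name := range []string{"iii", "v", "r1", "r3", "r4", "r6"} {
		c32, ok32 := canonicalLevel(32, name)
		c64, ok64 := canonicalLevel(64, name)
		fmt.Printf("%-4s GOMIPS32: %q (valid=%v)  GOMIPS64: %q (valid=%v)\n",
			name, c32, ok32, c64, ok64)
	}
}
```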

@gopherbot

Change https://go.dev/cl/493816 mentions this issue: cmd/internal/obj/mips: add REBH/REBHV/REHVV instructions

@gopherbot

Change https://go.dev/cl/485595 mentions this issue: math/bits: optimize BitLens64/32 on mips64x

@HeliC829
Contributor Author

Excited news! Thanks for your review.

GOMIPS32=iii and GOMIPS32=iv and GOMIPS32=v

Did you mean GOMIPS64?

they exist(ed) and it's easy to include them, so we might as well recognize the full set.

Let me summarize:

That's a very good summary. Besides, each newer ISA level is a superset of the previous one, except for R6 (R6 removed and adjusted some outdated instructions because of changes in microarchitecture design).
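
The superset remark can be turned into a simple compatibility check. The sketch below is an illustration only: `rank` and `canRunOn` are hypothetical names, and treating R6 as compatible with nothing but itself is a deliberately conservative reading of "R6 removed and adjusted some outdated instructions", not a statement about which specific programs would break.

```go
package main

import "fmt"

// rank orders the levels from the proposal title, oldest first.
var rank = map[string]int{"iii": 0, "r1": 1, "r2": 2, "r5": 3, "r6": 4}

// canRunOn reports whether code built for ISA level built can be
// assumed to run on a CPU implementing level cpu, using only the
// superset rule described above. R6 is handled conservatively: it is
// considered compatible only with itself.
func canRunOn(built, cpu string) (bool, error) {
	b, ok := rank[built]
	if !ok {
		return false, fmt.Errorf("unknown ISA level %q", built)
	}
	c, ok := rank[cpu]
	if !ok {
		return false, fmt.Errorf("unknown ISA level %q", cpu)
	}
	if built == cpu {
		return true, nil
	}
	if built == "r6" || cpu == "r6" {
		return false, nil
	}
	return b <= c, nil
}

func main() {
	pairs := [][2]string{{"iii", "r2"}, {"r2", "r5"}, {"r5", "r2"}, {"r2", "r6"}, {"r6", "r6"}}
	for _, p := range pairs {
		ok, err := canRunOn(p[0], p[1])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("built for %-3s on %-3s CPU: %v\n", p[0], p[1], ok)
	}
}
```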

@rsc
Contributor

rsc commented Jun 21, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@gopherbot

Change https://go.dev/cl/508095 mentions this issue: internal/buildcfg: add support for accepting different MIPS ISA level on mips64

@HeliC829 HeliC829 changed the title from "cmd/go: add GOMIPS32, GOMIPS64 ISA levels (i, ii, iii, iv, v, r1, r2, r3, r5, r6)" to "cmd/go: add GOMIPS32, GOMIPS64 ISA levels (iii, r1, r2, r5, r6)" Jul 12, 2023
@HeliC829
Contributor Author

Could someone take a look at CL 508095, so that I can resume work on CL 485635 and CL 485595?

gopherbot pushed a commit that referenced this issue Aug 3, 2023
Add support for WSBH/DSBH/DSHD instructions, which are introduced in mips{32,64}r2.

WSBH reverses bytes within the halfwords of a 32-bit word, DSBH reverses bytes within the halfwords of a 64-bit doubleword, and DSHD reverses the halfwords within a doubleword. These instructions can be used to optimize byte swaps.

Ref: The MIPS64 Instruction Set, Revision 5.04: https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00087-2B-MIPS64BIS-AFP-05.04.pdf

Updates #60072

Change-Id: I31c043150fe8ac03027f413ef4cb2f3e435775e1
Reviewed-on: https://go-review.googlesource.com/c/go/+/493816
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@google.com>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Joel Sing <joel@sing.id.au>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
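
To illustrate why the instructions in the commit message above help with byte swaps, the following sketch models their effect in plain Go. The functions `wsbh`, `dsbh` and `dshd` are software models written for this example, not assembler mnemonics or compiler intrinsics, and the sketch ignores details such as sign extension of 32-bit results on MIPS64. It shows one common composition: a 32-bit byte swap as WSBH followed by a 16-bit rotate, and a 64-bit byte swap as DSBH followed by DSHD, checked against math/bits.

```go
package main

import (
	"fmt"
	"math/bits"
)

// wsbh models WSBH: swap the two bytes inside each 16-bit halfword
// of a 32-bit word.
func wsbh(x uint32) uint32 {
	return (x&0x00FF00FF)<<8 | (x&0xFF00FF00)>>8
}

// dsbh models DSBH: swap the two bytes inside each 16-bit halfword
// of a 64-bit doubleword.
func dsbh(x uint64) uint64 {
	return (x&0x00FF00FF00FF00FF)<<8 | (x&0xFF00FF00FF00FF00)>>8
}

// dshd models DSHD: reverse the order of the four 16-bit halfwords
// in a 64-bit doubleword.
func dshd(x uint64) uint64 {
	return x<<48 | (x&0xFFFF0000)<<16 | (x>>16)&0xFFFF0000 | x>>48
}

func main() {
	x32 := uint32(0x11223344)
	swapped32 := bits.RotateLeft32(wsbh(x32), 16)
	fmt.Printf("%#08x -> %#08x (matches bits.ReverseBytes32: %v)\n",
		x32, swapped32, swapped32 == bits.ReverseBytes32(x32))

	x64 := uint64(0x0102030405060708)
	swapped64 := dshd(dsbh(x64))
	fmt.Printf("%#016x -> %#016x (matches bits.ReverseBytes64: %v)\n",
		x64, swapped64, swapped64 == bits.ReverseBytes64(x64))
}
```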
@gopherbot

Change https://go.dev/cl/515475 mentions this issue: cmd/internal/obj/mips: add SEB/SEH instructions

gopherbot pushed a commit that referenced this issue Aug 8, 2023
Add support for SEB/SEH instructions, which are introduced in mips32r2.

SEB/SEH can be used to sign-extend byte/halfword in registers directly without passing through memory.

Ref: The MIPS32 Instruction Set, Revision 5.04: https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00086-2B-MIPS32BIS-AFP-05.04.pdf

Updates #60072

Change-Id: I33175ae9d943ead5983ac004bd2a158039046d65
Reviewed-on: https://go-review.googlesource.com/c/go/+/515475
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Joel Sing <joel@sing.id.au>
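
For readers unfamiliar with SEB/SEH from the commit message above, this sketch models what the two instructions compute, using Go conversions. The functions `seb` and `seh` are illustrative models only. The shift-based variants are included purely as a cross-check of the arithmetic and are not a claim about what the Go compiler emitted before this change.

```go
package main

import "fmt"

// seb models SEB: take the low 8 bits of x and sign-extend them
// to 32 bits.
func seb(x uint32) uint32 {
	return uint32(int32(int8(x)))
}

// seh models SEH: take the low 16 bits of x and sign-extend them
// to 32 bits.
func seh(x uint32) uint32 {
	return uint32(int32(int16(x)))
}

// sebShift and sehShift compute the same values with a left shift
// followed by an arithmetic right shift.
func sebShift(x uint32) uint32 { return uint32(int32(x<<24) >> 24) }
func sehShift(x uint32) uint32 { return uint32(int32(x<<16) >> 16) }

func main() {
	for _, x := range []uint32{0x7F, 0x80, 0x17F, 0x8000, 0x12345} {
		fmt.Printf("x=%#07x seb=%#010x seh=%#010x (shift forms agree: %v)\n",
			x, seb(x), seh(x), seb(x) == sebShift(x) && seh(x) == sehShift(x))
	}
}
```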
clktmr added a commit to clktmr/go that referenced this issue Sep 30, 2023
For GOARCH=mips the Go compiler will use the newer MIPS32-r1 ISA,
whereas for GOARCH=mips64 it will use the MIPS-III ISA, which is the
highest N64 supports.

See golang#60072
@HeliC829
Contributor Author

@cherrymui Hi, PTAL on CL 508095, thanks.

@gopherbot

Change https://go.dev/cl/578175 mentions this issue: cmd/go: add GOMIPS32, GOMIPS64 ISA levels

Projects
Status: Accepted

6 participants