cmd/compile: Building a tool chain using PGO does not achieve the 2-7% improvement claimed by blogs #63407

Closed
qiulaidongfeng opened this issue Oct 6, 2023 · 2 comments
Labels
WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@qiulaidongfeng
Contributor

What version of Go are you using (go version)?

$ go version
tip

Does this issue reproduce with the latest release?

yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

What did you do?

I wrote CL 533015 to build the toolchain with PGO, but compilebench showed no meaningful performance difference.

                         │   old.txt    │              new.txt               │
                         │    sec/op    │   sec/op     vs base               │
Template                    228.6m ± 2%   231.4m ± 2%       ~ (p=0.792 n=50)
Unicode                    100.62m ± 3%   97.97m ± 2%       ~ (p=0.124 n=50)
GoTypes                      1.654 ± 2%    1.646 ± 2%       ~ (p=0.683 n=50)
Compiler                    88.48m ± 2%   88.55m ± 1%       ~ (p=0.850 n=50)
SSA                          11.82 ± 1%    11.88 ± 1%  +0.49% (p=0.031 n=50)
Flate                       126.0m ± 2%   127.0m ± 2%       ~ (p=0.424 n=50)
GoParser                    259.1m ± 2%   257.8m ± 1%       ~ (p=0.327 n=50)
Tar                         202.1m ± 6%   201.2m ± 3%       ~ (p=0.557 n=50)
XML                         263.6m ± 5%   260.6m ± 1%       ~ (p=0.276 n=50)
LinkCompiler                533.2m ± 3%   529.5m ± 3%       ~ (p=0.212 n=50)
ExternalLinkCompiler         1.538 ± 1%    1.537 ± 2%       ~ (p=0.321 n=50)
LinkWithoutDebugCompiler    251.2m ± 3%   252.3m ± 1%       ~ (p=0.359 n=50)
geomean                     394.6m        393.7m       -0.24%

                         │   old.txt   │              new.txt               │
                         │ user-sec/op │ user-sec/op  vs base               │
Template                   341.1m ± 1%   345.0m ± 2%       ~ (p=0.207 n=50)
Unicode                    110.4m ± 4%   108.0m ± 3%       ~ (p=0.151 n=50)
GoTypes                     2.569 ± 1%    2.596 ± 1%  +1.02% (p=0.002 n=50)
Compiler                   105.4m ± 5%   105.0m ± 2%       ~ (p=0.734 n=50)
SSA                         18.70 ± 0%    19.06 ± 1%  +1.89% (p=0.000 n=50)
Flate                      185.9m ± 2%   182.9m ± 3%  -1.64% (p=0.043 n=50)
GoParser                   403.9m ± 1%   402.5m ± 1%       ~ (p=0.448 n=50)
Tar                        290.8m ± 2%   293.3m ± 2%       ~ (p=0.562 n=50)
XML                        406.9m ± 1%   401.0m ± 1%  -1.45% (p=0.041 n=50)
LinkCompiler               736.0m ± 1%   737.0m ± 1%       ~ (p=0.975 n=50)
ExternalLinkCompiler        1.562 ± 1%    1.549 ± 1%       ~ (p=0.052 n=50)
LinkWithoutDebugCompiler   266.6m ± 3%   267.7m ± 2%       ~ (p=0.628 n=50)
geomean                    532.5m        531.8m       -0.12%

          │   old.txt    │           new.txt           │
          │  text-bytes  │  text-bytes   vs base       │
HelloSize   791.0Ki ± 0%   790.7Ki ± 0%  -0.03% (n=50)

          │   old.txt    │               new.txt               │
          │  data-bytes  │  data-bytes   vs base               │
HelloSize   14.31Ki ± 0%   14.33Ki ± 0%  +0.11% (p=0.000 n=50)

          │   old.txt    │             new.txt              │
          │  bss-bytes   │  bss-bytes    vs base            │
HelloSize   199.0Ki ± 0%   199.0Ki ± 0%  ~ (p=1.000 n=50) ¹
¹ all samples are equal

          │   old.txt    │           new.txt           │
          │  exe-bytes   │  exe-bytes    vs base       │
HelloSize   1.235Mi ± 0%   1.235Mi ± 0%  -0.00% (n=50)

What did you expect to see?

I expected PGO to improve the performance of the toolchain, as the blog claims.

Quoting the original text:
In Go 1.21, workloads typically get between 2% and 7% CPU usage improvements from enabling PGO.

What did you see instead?

compilebench measured no significant performance improvement.

@cherrymui
Member

The compiler should already be built with PGO, as -pgo=auto is the default with go build and go install. So your CL doesn't seem to make a difference. Could you verify that? Thanks.

@cherrymui added the WaitingForInfo label Oct 6, 2023
@cherrymui
Member

To be clear, your CL seems to apply the compiler's profile, cmd/compile/default.pgo, to all binaries in cmd, including the go command, the linker, the assembler, etc. Since the profile was collected for the compiler, applying it to other binaries may have a positive effect, a negative effect, or none. As the compiler binary is essentially unchanged, and compilebench mainly measures compiler speed, it is expected that there is no difference.

@cherrymui closed this as not planned Oct 6, 2023