
crypto/ecdsa, crypto/elliptic: changes to benchmarks dramatically improve results on ppc64le with no change in implementation #23137

Closed
laboger opened this issue Dec 14, 2017 · 5 comments

laboger (Contributor) commented Dec 14, 2017

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go 1.10beta1

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

ppc64le Ubuntu

What did you do?

Looking at benchmark results from some packages to find changes in performance.

cd crypto/ecdsa
go test -c
./ecdsa.test -test.bench=.

What did you expect to see?

Consistent or small improvements as compared to results from previous runs, since no specific improvements were made for ppc64le in either of these packages.

What did you see instead?

Dramatic improvements for the benchmark results in crypto/ecdsa and crypto/elliptic starting with CL 80757 even though there were no changes that should affect ppc64le performance. This CL changed the benchmarks and the amd64 implementation but nothing specific to ppc64le.

These results are misleading: no change was made that should improve these crypto packages on ppc64le, yet benchstat shows a big improvement.

I didn't look at the test changes in depth to understand why the results are different; possibly it is because the benchmarks are now run in parallel when they weren't before.

If the plan is to add parallelized benchmarks, then there should still be some that are run exactly as before to compare against; a sketch of what that could look like follows.
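For illustration only (the Serial/Parallel names are mine, not from the CL), a minimal sketch against the crypto/ecdsa API: keep the original loop-style benchmark under one name and add the b.RunParallel variant under another, so benchstat never pairs the parallel numbers with the old serial ones.

package ecdsa_test

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"testing"
)

// Serial variant: the pre-CL shape, one goroutine running b.N iterations.
func BenchmarkSignP256Serial(b *testing.B) {
	priv, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	hashed := []byte("testing")
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		ecdsa.Sign(rand.Reader, priv, hashed)
	}
}

// Parallel variant: the post-CL shape, GOMAXPROCS goroutines
// sharing the b.N iterations.
func BenchmarkSignP256Parallel(b *testing.B) {
	priv, _ := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	hashed := []byte("testing")
	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			ecdsa.Sign(rand.Reader, priv, hashed)
		}
	})
}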

Before and after this CL on ppc64le:

benchstat -alpha 2 ecdsa.new.out ecdsa.out
name              old time/op  new time/op  delta
SignP256-16        477µs ± 0%    35µs ± 0%  -92.75%  (p=1.000 n=1+1)
SignP384-16       8.80ms ± 0%  2.01ms ± 0%  -77.16%  (p=1.000 n=1+1)
VerifyP256-16     1.61ms ± 0%  0.11ms ± 0%  -93.13%  (p=1.000 n=1+1)
KeyGeneration-16   406µs ± 0%    26µs ± 0%  -93.56%  (p=1.000 n=1+1)

benchstat -alpha 2 elliptic.new.out elliptic.out 
name               old time/op  new time/op  delta
BaseMult-16        2.08ms ± 0%  0.14ms ± 0%  -93.25%  (p=1.000 n=1+1)
BaseMultP256-16     404µs ± 0%    25µs ± 0%  -93.73%  (p=1.000 n=1+1)
ScalarMultP256-16  1.04ms ± 0%  0.07ms ± 0%  -93.69%  (p=1.000 n=1+1)
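
(An aside on the n=1+1 above: with a single run per side, benchstat cannot compute significance, which is why -alpha 2 is needed to print the delta at all. With several runs per side the p-values become meaningful, e.g. — the old.txt/new.txt names here are just placeholders:

go test -run=NONE -bench=. -count=10 > old.txt
go test -run=NONE -bench=. -count=10 > new.txt
benchstat old.txt new.txt

where the second set of runs is collected after switching toolchains.)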
bradfitz (Contributor) commented

Yeah, I'm not sure why that CL added parallelization. Seems unrelated.

/cc @ianlancetaylor @TocarIP

TocarIP (Contributor) commented Dec 14, 2017

The CL was inspired by https://blog.cloudflare.com/go-dont-collect-my-garbage/, which complained about the performance of parallelized P-256, so it adds parallelization to better replicate that issue. go test has -parallel and -cpu flags if you want to run a single-threaded version; see the example below. We already run a lot of benchmarks in parallel after e.g. CLs 36724, 36831, and 36810, so it is already impossible to compare benchmarks across Go versions. That being said, I'm also OK with reverting these benchmarks to single-threaded.
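
For example, with the test binary built as in the original report, -test.cpu=1 sets GOMAXPROCS to 1, so the b.RunParallel benchmarks run on a single goroutine:

go test -c
./ecdsa.test -test.bench=. -test.cpu=1

(That approximates, but is not guaranteed to exactly match, the old single-threaded numbers.)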

laboger (Contributor, Author) commented Dec 14, 2017

If changing parallelism in benchmarks has been done in the past, then it doesn't make sense to change just this one back. I had not seen this kind of difference before, so it surprised me.

Is this package benchmark policy documented somewhere, so that it is known that doing release to release benchmark comparisons is not always meaningful?

agl (Contributor) commented Dec 31, 2017

The change to the benchmarks seems reasonably motivated: it uncovers issues that the previous benchmarks didn't. I don't believe we need complete consistency for benchmarks across releases, so it's OK that the numbers on other arches changed here.

agl closed this as completed Dec 31, 2017
laboger (Contributor, Author) commented Jan 2, 2018

I was under the impression that these benchmarks are used to track performance regressions, but if the benchmarks can be changed at any time in ways that affect their results, then they can't be relied on for that purpose. I understand why you'd want to provide an improved benchmark, but if you do, it should be given a different name IMO, so it is clear that it shouldn't be compared with anything from an earlier commit.

golang locked and limited conversation to collaborators Jan 2, 2019