Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/crypto/openpgp: streaming encryption allocates excessively #26578

Closed
twmb opened this issue Jul 24, 2018 · 3 comments
Closed

x/crypto/openpgp: streaming encryption allocates excessively #26578

twmb opened this issue Jul 24, 2018 · 3 comments

Comments

@twmb
Copy link
Contributor

twmb commented Jul 24, 2018

x/crypto/openpgp.Encrypt wraps the to-be-encrypted io.Writer with a packet.seMDCWriter. This io.WriteCloser encrypts data on every write using a cipher.StreamWriter.

Every write to a cipher.StreamWriter allocates a slice as long as the input write. This is fine if we are allocating a bulk of data at once, but if we are streaming encryption over time, these allocations are unnecessary. Worse, if we are streaming through a different writer (i.e., if we stream through gzip compression), that writer may cut the writes even smaller.

We cannot add a reusable buffer to cipher.StreamWriter because all of its fields are exported. Because cipher.StreamWriter contains nearly no code in its methods, it is small enough to copy the struct and its methods into the packet package and add a reusable buffer to it. This cuts allocations significantly for streaming input and is basically a no-op for bulk encryption.

The following diff would accomplish what I am describing:

diff --git a/openpgp/packet/symmetrically_encrypted.go b/openpgp/packet/symmetrically_encrypted.go
index 6126030..03e0aa7 100644
--- a/openpgp/packet/symmetrically_encrypted.go
+++ b/openpgp/packet/symmetrically_encrypted.go
@@ -280,7 +280,7 @@ func SerializeSymmetricallyEncrypted(w io.Writer, c CipherFunction, key []byte,
        if err != nil {
                return
        }
-       plaintext := cipher.StreamWriter{S: s, W: ciphertext}
+       plaintext := &streamWriter{S: s, W: ciphertext}
 
        h := sha1.New()
        h.Write(iv)
@@ -288,3 +288,31 @@ func SerializeSymmetricallyEncrypted(w io.Writer, c CipherFunction, key []byte,
        contents = &seMDCWriter{w: plaintext, h: h}
        return
 }
+
+type streamWriter struct {
+       buf []byte
+       s   cipher.Stream
+       w   io.Writer
+}
+
+func (w *streamWriter) Write(src []byte) (n int, err error) {
+       if cap(w.buf) < len(src) {
+               w.buf = make([]byte, len(src))
+       }
+       w.buf = w.buf[:len(src)]
+       w.s.XORKeyStream(w.buf, src)
+       n, err = w.w.Write(w.buf)
+       if n != len(src) && err == nil { // should never happen
+               err = io.ErrShortWrite
+       }
+       return
+}
+
+func (w streamWriter) Close() error {
+       if c, ok := w.w.(io.Closer); ok {
+               return c.Close()
+       }
+       return nil
+}

If this is an OK idea, I would like to issue this patch. For some numbers, applying this patch to a project results in the following bench diff (names changed):

name                   old time/op    new time/op    delta
Foo/gpg=true,gzip=0-4    12.3ms ± 2%    12.1ms ± 4%     ~     (p=0.105 n=10+10)
Foo/gpg=true,gzip=1-4    18.6ms ± 1%    18.1ms ± 6%   -2.69%  (p=0.029 n=10+10)
Foo/gpg=true,gzip=9-4    46.0ms ± 1%    46.5ms ± 4%     ~     (p=0.497 n=9+10)

name                   old speed      new speed      delta
Foo/gpg=true,gzip=0-4  85.9MB/s ± 2%  87.1MB/s ± 3%     ~     (p=0.165 n=10+10)
Foo/gpg=true,gzip=1-4  56.8MB/s ± 1%  58.2MB/s ± 6%   +2.57%  (p=0.026 n=9+10)
Foo/gpg=true,gzip=9-4  22.9MB/s ± 1%  22.7MB/s ± 4%     ~     (p=0.484 n=9+10)

name                   old alloc/op   new alloc/op   delta
Foo/gpg=true,gzip=0-4    2.41MB ± 0%    1.35MB ± 0%  -43.79%  (p=0.000 n=10+9)
Foo/gpg=true,gzip=1-4    1.95MB ± 0%    1.14MB ± 0%  -41.40%  (p=0.000 n=10+10)
Foo/gpg=true,gzip=9-4    1.99MB ± 0%    1.19MB ± 0%  -39.98%  (p=0.000 n=10+10)

name                   old allocs/op  new allocs/op  delta
Foo/gpg=true,gzip=0-4       232 ± 2%        88 ± 0%  -62.15%  (p=0.000 n=10+9)
Foo/gpg=true,gzip=1-4     26.5k ± 1%      0.1k ± 0%  -99.75%  (p=0.000 n=10+9)
Foo/gpg=true,gzip=9-4     26.1k ± 0%      0.1k ± 1%  -99.76%  (p=0.000 n=10+10)
@gopherbot gopherbot added this to the Unreleased milestone Jul 24, 2018
@meirf
Copy link
Contributor

meirf commented Jul 25, 2018

It doesn't look like cipher.StreamWriter is used anywhere else in x/crypto so if this code is added, it seems okay to not (yet) worry about sharing it.

@twmb it's not super important since the benefit is obvious but figured I'd ask: is that benchmark you're using open source? (Might want to include some benchmark if you end up sending a proper CL.)

/cc @FiloSottile

@twmb
Copy link
Contributor Author

twmb commented Jul 25, 2018

Writing this benchmark was pretty tricky since apparently the input buffer size and the character set in that buffer matters a lot. The allocations surprisingly drop significantly when the input is binary data or when the input is <=32KiB. When the input is small enough, the gzip wrapped case doesn't even allocate, which is pretty surprising to me since this still goes through the same allocating path. I'm not sure what's up there. Edit: also, changing the code that inspired this idea to use small writes doesn't drop the allocations, so I think something with escape analysis and smaller writes may be going on in the benchmark below (hard to tell).

Anyway, a recreating benchmark:

package foo

import (
	"compress/gzip"
	"fmt"
	"io"
	"io/ioutil"
	"math/rand"
	"strings"
	"testing"

	"golang.org/x/crypto/openpgp"
)

func BenchmarkGzipdGPG(b *testing.B) {
	const pubkey = `-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1

mI0EW1gSrwEEAOsrmrgduD7GqoakTKUPkihRW5vtL5RFstHTnqs+5IpPeloUiDxH
VB2DJ+o2JrfMOgkdcqgVp6kJ03wBrk68A2VZVZ/ZXVlmMGNwXWx9ShhnG/JcOnAt
FOsd57UiimJymzKjU5vmuNeGlleFdhDgHIVSyBVkjjSQZwYxEUYFTGprABEBAAG0
MUZvbyBCYXIgKHBhc3N3b3JkOiBleGFtcGxlKSA8Zm9vLmJhckBleGFtcGxlLmNv
bT6IuAQTAQIAIgUCW1gSrwIbAwYLCQgHAwIGFQgCCQoLBBYCAwECHgECF4AACgkQ
eoAmE/vUvGd/bAP+J6SJT72qeUU0LabJWAu/bru7QRimaFOD9AwSthrw43gbVaJ5
kRvN9yanO0p6VThvBRN7uzUjPbvbLB/FuGR9xh55fHTlf37HA7aa2LKfdcxN4n/f
aWJAGjkjY99LQw+2GgWaWBXxKh+1joAmocYdPWKV6QctNllBuGBleMz5Xui4jQRb
WBKvAQQA5xQFvS8MWUhG8CCck0UE6CGD0i8CENZssontiX0Tf4a4BrCCqmgDi/uN
8HwYsyS3R4sLNqbgMOqbohJCBNcoIEGQkTmWTxFjpYcSLdLZKg0YDzqVpcPk30wj
rLjsWsa/vwoRSMEZ8iZ/SR4/7JjWxQKLE1BU37cchsarK/LL41kAEQEAAYifBBgB
AgAJBQJbWBKvAhsMAAoJEHqAJhP71LxnL9UD/jnvWJ5f72NclLtTBZgfVMyALH0j
6cL2TVjQZ880nNM5M8cj3HP2FpuyCnhG1LRLCTAfPcbVBk029uMNBABb1oBZg+c6
QNZ6gFrw6wLGOo84+HILzhEC+QoqZ0822Td7HgpKkNIfdjxju/LjWrlHWByPMoO/
Y+GGXnYV6NqnwX6j
=X61Q
-----END PGP PUBLIC KEY BLOCK-----`
	const charset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
	buf := make([]byte, 32<<10+1)
	for i := 0; i < len(buf); i++ {
		buf[i] = charset[rand.Intn(len(charset))]
	}

	pubEntities, err := openpgp.ReadArmoredKeyRing(strings.NewReader(pubkey))
	if err != nil {
		b.Fatalf("unable to read pubkey: %v", err)
	}
	wc, err := openpgp.Encrypt(ioutil.Discard, pubEntities, nil, nil, nil)
	if err != nil {
		b.Fatalf("unable to encrypt writer: %v", err)
	}

	for _, level := range []int{0, 1, 9} {
		b.Run(fmt.Sprintf("gzip=%d", level), func(b *testing.B) {
			b.SetBytes(int64(len(buf)))

			w := io.Writer(wc)

			if level != 0 {
				zipr, err := gzip.NewWriterLevel(wc, level)
				if err != nil {
					b.Fatalf("unable to wrap with gzip: %v", err)
				}
				w = zipr
			}

			b.ResetTimer()
			for i := 0; i < b.N; i++ {
				if _, err := w.Write(buf); err != nil {
					b.Fatalf("unable to write: %v", err)
				}
			}
		})
	}
}

@twmb
Copy link
Contributor Author

twmb commented Jun 4, 2019

Closing due to prior discussion of openpgp being removed from x/crypto.

@twmb twmb closed this as completed Jun 4, 2019
@golang golang locked and limited conversation to collaborators Jun 3, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants