Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: crypto/{sha256, md5, ...}: bound work in non-preemptible assembly loops #64417

Open
nvanbenschoten opened this issue Nov 28, 2023 · 1 comment
Labels
Proposal Proposal-Crypto Proposal related to crypto packages or other security issues
Milestone

Comments

@nvanbenschoten
Copy link
Contributor

Proposal Details

As discussed in #24543 (comment), assembly code is non-preemptible. Non-preemptible goroutines can delay stop-the-world pauses (after all other goroutines have been paused), meaning that they can directly contribute to workload tail latency if timed poorly. This makes libraries which perform unbounded amounts of work in assembly loops risky for use in latency sensitive applications.

One way to mitigate this risk would be to bound the amount of work that any single call into an assembly routine performs on a case-by-case basis.

The crypto libraries provide a handful of examples where this technique could be applied. For example, in crypto/sha256, the following diff limits the non-preemptible section of a sha256 hash to somewhere around 250µs:

diff --git a/src/crypto/sha256/sha256.go b/src/crypto/sha256/sha256.go
index 2deafbc9fc..567a3c81f9 100644
--- a/src/crypto/sha256/sha256.go
+++ b/src/crypto/sha256/sha256.go
@@ -28,6 +28,10 @@ const Size224 = 28
 // The blocksize of SHA256 and SHA224 in bytes.
 const BlockSize = 64
 
+// The maximum number of iterations performed by a single assembly loop.
+const maxAsmIters = 1024
+const maxAsmSize = maxAsmIters * BlockSize // 64KiB
+
 const (
 	chunk     = 64
 	init0     = 0x6A09E667
@@ -191,6 +195,11 @@ func (d *digest) Write(p []byte) (nn int, err error) {
 	}
 	if len(p) >= chunk {
 		n := len(p) &^ (chunk - 1)
+		for n > maxAsmSize {
+			block(d, p[:maxAsmSize])
+			p = p[maxAsmSize:]
+			n -= maxAsmSize
+		}
 		block(d, p[:n])
 		p = p[n:]
 	}

We've seen in CockroachDB that this change combined with an almost identical one in crypto/md5 halves database p99 latency during backup uploads to AWS S3.

Alternative 1

One simple alternative is to document this risk and push the responsibility to bound the amount of non-preemptible to the users of these libraries.

However, that's not a great solution, as the direct callers of these hashing libraries are often buried deep inside of other libraries which applications cannot control (examples in the aws sdk: here, here, and here).

Alternative 2

The Safe-points everywhere for non-cooperative goroutine preemption design proposed a method to annotate safe preemption points in assembly code.

By default, the runtime cannot safely preempt assembly code since it won't know what registers contain pointers. As a follow-on to the work on safe-points everywhere, we should audit assembly in the standard library for non-preemptible loops and annotate them with register maps. In most cases this should be trivial since most assembly never constructs a pointer that isn't shadowed by an argument, so it can simply claim there are no pointers in registers. We should also document in the Go assembly guide how to do this for user code.

To my knowledge, this has never been implemented. If it was, it would provide an alternative to this proposal, though it would still require changes around each relevant use of assembly.


cc. @aclements you've thought about this problem more than anyone else. Does this seem like a reasonable proposal?

@gopherbot gopherbot added this to the Proposal milestone Nov 28, 2023
@Jorropo
Copy link
Member

Jorropo commented Nov 28, 2023

Your code could be simplified but it sounds reasonable.
A third alternative could be to have assembly check some magic preemption boolean and return early if it is preempted.
Something along the lines of:

func block(state *[8]uint32, buf []byte) (hashed uint)

// ...
 	for h := p[:len(p) &^ (chunk - 1)]; len(h) >= chunk; h = h[block(d, h):] {}

@ianlancetaylor ianlancetaylor added the Proposal-Crypto Proposal related to crypto packages or other security issues label Dec 2, 2023
craig bot pushed a commit to cockroachdb/cockroach that referenced this issue Feb 15, 2024
118605: build: add go runtime patch to bound non-preemptible work r=nvanbenschoten,rickystewart,rsevinsky-cr a=nicktrav

There are two commits in this patch set, and are best reviewed individually.

---

The first is a purely mechanical change to reapply our existing Go runtime patches to the upstream, and then port the diff back into Cockroach. This shuffles around the ordering of the various patches, making subsequent patching simpler.

The second patch is cribbed from golang/go#64417, and is the material change, touching #115192.

Epic: None.

Co-authored-by: Nick Travers <travers@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Proposal Proposal-Crypto Proposal related to crypto packages or other security issues
Projects
Status: Incoming
Development

No branches or pull requests

4 participants