crypto/rand: Please reconsider GRND_NONBLOCK, or at least provide a mode without it #11833

tv42 · 2015-07-23T01:06:21Z

I've read #8520 and https://codereview.appspot.com/123260044 . While I can appreciate @agl's "might be a big surprise to Go programs" argument, I feel that crypto/rand falling back to /dev/urandom if GetRandom(..., GRND_NONBLOCK) returns EAGAIN completely misses out the primary point of the syscall.

I find the tradeoff of going from "application that used to get non-entropy and trusting it for crypto" to "application hangs until entropy is available, which is a very brief delay on modern hardware and only happens at boot, or even just first boot on a well set up system" to be very much desirable.

This particularly hurts OS installs and all kinds of appliances, e.g. wifi access points and such, that would typically generate TLS keys on first boot. That is exactly the time when they don't yet have much entropy, and exactly the time where the current implementation of Go's crypto/rand will let them down.

Would you at least entertain a rand.WaitForEntropy() method or such, after which crypto/rand can guarantee it's operating in safe mode? (Without the getrandom syscall, that might mean that a 1-byte read from /dev/random succeeded once.) Because the current crypto/rand really isn't safe too close to boot time.

On a related note, in the code review at https://codereview.appspot.com/123260044 @agl says

I think that GRND_NONBLOCK should be passed here so that, if the pool isn't
ready, we'll fall back on /dev/urandom until the syscall will work.

but the implementation does not do that; it stays with /dev/urandom.

One extra complication from the above is that one important use case for the getrandom syscall was that you no longer need to populate /dev/urandom in a chroot etc -- now it's racy whether it's needed or not, depending on how fast after boot the program started.

The text was updated successfully, but these errors were encountered:

jeffallen · 2015-07-23T13:28:00Z

To my view, the Go stdlib is delivering what it sets out to do here (simplicity, small surface area, reliability). Your use case is legitimate, but rare in the grand scheme of all Go programs.

Nothing is stopping you from writing your own io.Reader that is guaranteed to use syscall.GetRandom. You could even use "func init() { r := &myGuaranteedSyscallRandReader{}; r.waitForEntropy(); rand.Reader = r; } to set crypto/rand.Reader at process init time.

Come to think of it, it might even be possible to write an init() function that calls syscall.GetRandom without GRND_NONBLOCK. But then you need to study the rules about init function order carefully to assure yourself that your init function happens before crypto/rand.init().

Works as designed, IMHO.

tv42 · 2015-07-23T21:16:55Z

@jeffallen Nothing is stopping me from reimplementing from scratch all of the crypto goodness already in Go. The point of it being there is that it's there, by default, and known to be of high quality.

agl · 2015-07-24T17:01:05Z

If we dropped the NONBLOCK flag then the semantics of crypto/rand would be "maybe it blocks, maybe it doesn't—depends on your kernel". It will be many years until this new system call has filtered down to all the LTS releases etc and those semantics seem bad.

It's also not clear that we could provide a WaitForEntropy call even if we wanted to. Without the new system call, can user-space find that out? There's the RNDGETENTCNT ioctl, but that returns the number of bits in the entropy pool. But that's different from whether it's seeded or not. One can read from /dev/random and drain the pool to zero, but then a zero from RNDGETENTCNT doesn't mean that it's not seeded.

We can't read a byte from /dev/random to check it because, if every Go process did that, they would start blocking, and blocking unnecessarily.

So, on the (many) systems where we can't figure out the return value for WaitForEntropy, do we block forever, panic, or just return true?

tv42 · 2015-07-24T17:24:13Z

@agl Maybe the /dev/urandom fallback stays unreliable, but on newer kernels it would sit & wait for the entropy pool to initialize once. And that'll often happen via the bootup process feeding back some entropy saved from the last run. That way, the Go process would only ever block if it was run very very close to boot, and then only on relatively new platforms; no LTS worries there.

jeffallen · 2015-07-24T17:44:16Z

Go can be (and is) used on embedded platforms, where it is not practical to
save entropy across boots.

It is important that Go does the best thing possible out of the box and
also makes it possible for people like you to do precisely what you need in
specialized cases. The current implementation does that.

Can you please experiment with the ideas I gave you and see if you can
write a package that lets you ensure you get random data they way you would
like? If Go is preventing that, it's a defect. If Go's out of box isn't
what you want, it's not a defect, it's an "opportunity". :)

On Fri, Jul 24, 2015 at 7:24 PM, Tv notifications@github.com wrote:

@agl https://github.com/agl Maybe the /dev/urandom fallback stays
unreliable, but on newer kernels it would sit & wait for the entropy pool
to initialize once. And that'll often happen via the bootup process
feeding back some entropy saved from the last run. That way, the Go process
would only ever block if it was run very very close to boot, and then only
on relatively new platforms; no LTS worries there.

—
Reply to this email directly or view it on GitHub
#11833 (comment).

tv42 · 2015-07-24T17:56:50Z

@jeffallen What I'm arguing is that "crypto you may or may not be able to trust" is not "the best thing possible", and is a very bad default.

agl · 2015-08-02T16:21:56Z

tv42: having the /dev/urandom fallback never block, and the getrandom call sometimes block, was what I had in mind when I said "maybe it blocks, maybe it doesn't—depends on your kernel".

I do have sympathy for your point here. Wouldn't it be nice if the kernel always had good entropy and didn't start init until the pool was initialised?

But that's not the world that we live in. I do think that code that does the "right" thing and reads from /dev/random when generating keys has lead to multiple cases of manufacturers shipping, say, a fixed SSH host key for all devices etc just to get the thing to boot. If you have an embedded system that runs a Go server then, if crypto/rand blocks for the pool to be initialised the system might deadlock: without the Go server running and causing activity, the system may never collect enough entropy to unblock the server.

If your platform has a hardware entropy source to prevent this, then blocking was a unnecessary in the first place.

So I'm going to close this issue, for now because:

Having the behaviour depend on kernel version is unpleasant. When getrandom is significantly more common in a few years, this concern would be ameliorated.
The cases where blocking is helpful seem limited. If you need it (i.e. tiny system, probably just one main server process) then blocking might well deadlock the system prompting developers to do worse things to "solve" the problem. In larger systems, it's very likely to be superfluous.

There is a lot of uncertainty in the judgments above. This could be revised in light of more experience but I'm going to leave it for now.

takeyourhatoff · 2015-08-10T16:18:24Z

I disagree that deadlocking the system is a worse solution than providing random numbers that are not actually random.

One of the failure cases of getrandom(2) mentioned here is there not being enough entropy available. Currently, Go's response to this is to fallback to /dev/urandom (the type of behaviour mentioned in the getrandom(2) commit message which the introduction of getrandom(2) is designed to avoid), knowing that the bytes it receives are not going to be unpredictable.

crypto/rand returning predictable bytes is a bug, has always been a bug and does not follow the principle of least surprise. Nowhere can I see specified that it does not block. Nowhere can I see it specified that it returns predictable bytes instead of an error when random bytes are no longer available.

In my opinion, either returning an error when random bytes are not available or blocking till they are are both acceptable solutions with my preference slightly leaning towards returning an error. You can consider Read not returning an error in that case previously as a bug.

ianlancetaylor added this to the Go1.6 milestone Jul 23, 2015

agl closed this as completed Aug 2, 2015

golang locked and limited conversation to collaborators Aug 9, 2016

gopherbot added the FrozenDueToAge label Aug 9, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

crypto/rand: Please reconsider GRND_NONBLOCK, or at least provide a mode without it #11833

crypto/rand: Please reconsider GRND_NONBLOCK, or at least provide a mode without it #11833

tv42 commented Jul 23, 2015

jeffallen commented Jul 23, 2015

tv42 commented Jul 23, 2015

agl commented Jul 24, 2015

tv42 commented Jul 24, 2015

jeffallen commented Jul 24, 2015

tv42 commented Jul 24, 2015

agl commented Aug 2, 2015

takeyourhatoff commented Aug 10, 2015

crypto/rand: Please reconsider GRND_NONBLOCK, or at least provide a mode without it #11833

crypto/rand: Please reconsider GRND_NONBLOCK, or at least provide a mode without it #11833

Comments

tv42 commented Jul 23, 2015

jeffallen commented Jul 23, 2015

tv42 commented Jul 23, 2015

agl commented Jul 24, 2015

tv42 commented Jul 24, 2015

jeffallen commented Jul 24, 2015

tv42 commented Jul 24, 2015

agl commented Aug 2, 2015

takeyourhatoff commented Aug 10, 2015