proposal: use PCG Source in math/rand for Go 2 #21835

robpike · 2017-09-11T11:00:59Z

The original math/rand package had a number of flaws. Some of the flaws in the API have been addressed, such as the addition of the Source64 interface, but the most important issue remains: the default Source is poor.

The Source has several problems.

It has no clear provenance (it came from Plan 9, but its history before that is unknown).
It is a linear feedback shift register, which is known to have flaws uncovered by various empirical tests.
It is difficult and expensive to seed.
It is enormous! The shift register itself comprises 607 64-bit words, a total of 4856 bytes.

This last point is not fatal if there is only one instance, but it couples poorly with another fundamental and perhaps unfixable problem: bad behavior when accessed concurrently. It must be protected with a mutex, but if it were much smaller (and cheaper to seed) it would be practical instead to have each client in the address space access a separate instance, seeded uniquely.

To put it another way, with a separate Source for each goroutine, no locking would be required, but the current implementation is impractically large for such an approach.

The proposal here is in several parts.

First, we create an alternate Source implementation. Our choice is the Permuted Congruential Generator (PCG) designed and thoroughly tested by O'Neill in:

PCG: A Family of Simple Fast Space-Efficient Statistically Good
Algorithms for Random Number Generation
Melissa E. O’Neill, Harvey Mudd College
http://www.pcg-random.org/pdf/toms-oneill-pcg-family-v1.02.pdf

A 128-bit PCG generator producing a 64-bit output is only two words of memory but provably has a period of 2**128, is cheap to seed, and efficient. (A 64-bit generator would be noticeably cheaper, but a period of only 2**64 is deemed too small for modern hardware). It is practical and reasonable for every goroutine to have a local Source (a simple seeding algorithm is to use the address of the Source as the seed).

An implementation of this generator, specifically PCG XSL RR 128/64 (LCG), is checked in to golang.org/x/exp/rand as its default Source.

Second, we work on the compiler, possibly through math/bits, to make it a little more efficient. The current implementation, in pure Go, is somewhat slower than math/rand, but could be comparably fast or even faster given compiler support for the 128-bit multiply inside.

The third part is the critical piece.

We propose to make this generator the default one in math/rand for Go 2. The Go 1 compatibility decree could allow it to happen today, but we have chosen in the past to take the guarantee more literally than is expressed by also requiring that the generated stream of bits will never change. This decision was made when a fix was done, but enough tests in production were discovered that depended on the exact stream that it was deemed simpler to lock down the implementation rather than to fix all the tests.

However, with Go 2 on the horizon, there is enough time and understanding and process to offer an opportunity to use rolling out an improved, API-compatible version of math/rand as an early test case for Go 2. Changing math/rand to use PCG is strictly incompatible, but not in a way that the Go 1 decree actually prohibits. A soft breakage at this level would be an excellent step on the way to understanding what it takes to roll out a breaking change for Go 2.

In essence, the proposal is to use the fixing of math/rand as a test case for introducing breaking changes in Go 2. And a much better random number generator would result, too: the breaking change is worth making.

The precise sequence of steps to roll it out needs discussion, but one possible outline is like this:

Add PCG as a secondary source to math/rand, but not as the default.
Make it available under an alternate name; PCGSource is the obvious.
Provide an alias for the current source, say LFSRSource.
Optional: Provide a gofix that updates existing code to use LFSRSource rather than Source. Since Source is usually hidden, accessed through Rand, this may be ineffectual unless we also provide PCG functions explicitly for Int32, etc. Or perhaps provide LFSR versions and rotate towards those in legacy code.
Make announcements of what's to come.
At some point, flip the alias so Source points to PCGSource and LFSRSource is no longer the default.
In parallel with this work, make math/bits support its operations more efficiently through the compiler.
Just prior to Go 2, delete LFSRSource.

At some point along the way, Source64 should become the default, under a similar sequence of events. Minor cleanups could occur too, tidying up the API and fixing some old-style names like Intn not IntN.

There is much detail that remains to be worked out; this proposal is only an outline.

The text was updated successfully, but these errors were encountered:

odeke-em · 2017-09-11T21:03:37Z

@MichaelTJones perhaps you might be interested in this issue, as you already have an implementation of Melissa O'Neill's PCG generator at https://github.com/MichaelTJones/pcg.

MichaelTJones · 2017-09-12T00:01:46Z

I would very much support this change. In fact my careful PCG implementation is because of the weaknesses mentioned here. I long ago switched to this code and will not use the present library PRNG. In addition to being technically excellent, a new generator can be created in 2.72ns, makes new 64-bit values in 5.56ns, and has small state. Examine the tests in my implementation for details.

dongweigogo · 2017-09-12T11:44:08Z

why not bring it to Go 1

mvdan · 2017-09-12T12:07:24Z

@dongweigogo please read the proposal text carefully, it is explained.

btracey · 2017-10-24T21:26:58Z

To put it another way, with a separate Source for each goroutine, no locking would be required, but the current implementation is impractically large for such an approach.

For some random generators, generating a bunch of new streams is worse statistically for once stream. Section 4.3.2 of the paper suggests that this is not a problem with PCG if the seeding is done properly. Just wanted to comment for anyone curious.

MichaelTJones · 2017-10-24T23:46:59Z

yes that's how i do it. BenchmarkNew32-8 2000000000 0.67 ns/op BenchmarkRandom32-8 1000000000 3.23 ns/op BenchmarkBounded32-8 100000000 15.0 ns/op BenchmarkNew64-8 300000000 6.64 ns/op BenchmarkRandom64-8 100000000 15.8 ns/op BenchmarkBounded64-8 50000000 45.3 ns/op A new one is 0.67ns for 32 bits or 0.67ns for 64. Easy to have one or more per concurrent function.

…

On Tue, Oct 24, 2017 at 2:28 PM, Brendan Tracey ***@***.***> wrote: To put it another way, with a separate Source for each goroutine, no locking would be required, but the current implementation is impractically large for such an approach. For some random generators, generating a bunch of new streams is worse statistically for once stream. Section 4.3.2 of the paper suggests that this is not a problem with PCG if the seeding is done properly. Just wanted to comment for anyone curious. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#21835 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AHgypZGI0OER3trgpt7Eq23dW_V8-q-Yks5svlZwgaJpZM4PS_P2> .

-- Michael T. Jones michael.jones@gmail.com

lemire · 2018-01-10T02:11:57Z

@MichaelTJones Notice how the bounded functions are significantly slower, that's because of the divisions. You can follow the new implementation at go/src/math/rand/rand.go and avoid divisions (with high probability) and get something faster...

MichaelTJones/pcg#2

Once Go gets something like a full 64-bit multiplication (outputting both the high and low 64-bit values), then we will be able to similarly speed up Bounded64.

lemire · 2018-01-10T02:31:38Z

@robpike The implementation of the bounded RNGs at https://github.com/golang/exp/blob/master/rand/rand.go implies one or two 64-bit division per call, assuming no magical tricks from the compiler. It costs dozens of cycles. The divisions probably cost more than the generation of the random number itself! The 64-bit division is a monster on x64 processors. Even the 32-bit division is surprisingly expensive. It is not very different on ARM processors.

A better alternative is to do a multiply followed by a shift...
https://github.com/golang/go/blob/master/src/math/rand/rand.go#L138-L160

MichaelTJones · 2018-01-10T02:44:54Z

I am reading the linked paper very carefully... (in the Long Beach airport)

…

On Tue, Jan 9, 2018 at 6:32 PM, Daniel Lemire ***@***.***> wrote: @robpike <https://github.com/robpike> The implementation of the bounded RNGs at https://github.com/golang/exp/blob/master/rand/rand.go implies one or two 64-bit division per call, assuming no magical tricks from the compiler. It costs dozens of cycles. The divisions probably cost more than the generation of the random number itself! The 64-bit division is a monster on x64 processors. Even the 32-bit division is surprisingly expensive. It is not very different on ARM processors. A better alternative is to do a multiply followed by a shift... https://github.com/golang/go/blob/master/src/math/rand/rand.go#L138-L160 — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#21835 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AHgypSZtKQ956CZ_eHuTCYIW2o4QwGh2ks5tJCFTgaJpZM4PS_P2> .

-- Michael T. Jones michael.jones@gmail.com

josharian · 2018-01-11T22:10:22Z

There are at least three parts of math/rand: the source, the conversion/distribution routines, and the concurrency strategy. I think that it is worth revisiting all three for Go 2 and that all three should be considered together.

This proposal as written addresses the source.

The conversion routines changes include #16213 and a handful of other math/rand decisions that were made to preserve existing output, e.g. #8013, #6721, #12290, #16124, #8731, and probably others.

For the concurrency strategy, see #20387 and #21393.

MichaelTJones · 2018-01-11T23:55:29Z

Excellent comment and very helpful. Thank you for the work in crafting this hub of information.

…

On Thu, Jan 11, 2018 at 2:11 PM, Josh Bleecher Snyder < ***@***.***> wrote: There are at least three parts of math/rand: the source, the conversion/distribution routines, and the concurrency strategy. I think that it is worth revisiting all three for Go 2 and that all three should be considered together. This proposal as written addresses the source. The conversion routines changes include #16213 <#16213> and a handful of other math/rand decisions that were made to preserve existing output, e.g. #8013 <#8013>, #6721 <#6721>, #12290 <#12290>, #16124 <#16124>, #8731 <#8731>, and probably others. For the concurrency strategy, see #20387 <#20387> and #21393 <#21393>. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#21835 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AHgypSD4yXN6VKEdC04UL9_nK266zT0vks5tJoclgaJpZM4PS_P2> .

-- Michael T. Jones michael.jones@gmail.com

josharian · 2018-05-02T14:02:12Z

If get per-P storage in Go 2, or if we are willing to let math/rand have privileged runtime access, I think we should consider adding a global Source that:

stores state in a P
is lock-free
is explicitly documented not to have Any reproducibility
is seeded at startup or on first use
panics if Seed is called

This would allow people not to have to manually thread PRNG state around through goroutines, which seems to be a significant complaint.

Maybe it should even be the default Source, and have people opt in to reproducibility by making their own Source.

josharian · 2018-05-02T14:02:44Z

cc @cespare for that last comment

…e default source PCG is suppose to be higher quality RNG: http://www.pcg-random.org/ See related: golang/go#21835 Xoshiro is also suppose to be a high quality RNG: http://prng.di.unimi.it/ PCG and Xoshiro seem to think each is better than the other: * http://www.pcg-random.org/posts/a-quick-look-at-xoshiro256.html * http://www.pcg-random.org/posts/xoshiro-repeat-flaws.html * http://pcg.di.unimi.it/pcg.php PCGSource perf on Go 1.13 is worse though... $ benchcmp old.txt new.txt benchmark old ns/op new ns/op delta Benchmark/default1-8 9.74 10.0 +2.67% Benchmark/default4-8 15.7 15.1 -3.82% Benchmark/default8-8 22.1 20.6 -6.79% Benchmark/default16-8 37.2 33.4 -10.22% Benchmark/default32-8 67.1 59.4 -11.48% Benchmark/default64-8 122 116 -4.92% Benchmark/default128-8 234 214 -8.55% Benchmark/dhrand1-8 18.0 17.7 -1.67% Benchmark/dhrand4-8 47.2 47.5 +0.64% Benchmark/dhrand8-8 83.0 90.5 +9.04% Benchmark/dhrand16-8 141 156 +10.64% Benchmark/dhrand32-8 277 287 +3.61% Benchmark/dhrand64-8 524 542 +3.44% Benchmark/dhrand128-8 976 1083 +10.96% BenchmarkModule/1-8 24.8 23.0 -7.26% BenchmarkModule/4-8 25.9 25.9 +0.00% BenchmarkModule/8-8 30.2 31.4 +3.97% BenchmarkModule/16-8 43.3 43.4 +0.23% BenchmarkModule/32-8 67.4 65.6 -2.67% BenchmarkModule/64-8 124 128 +3.23% BenchmarkModule/128-8 212 231 +8.96% benchmark old allocs new allocs delta Benchmark/default1-8 0 0 +0.00% Benchmark/default4-8 0 0 +0.00% Benchmark/default8-8 0 0 +0.00% Benchmark/default16-8 0 0 +0.00% Benchmark/default32-8 0 0 +0.00% Benchmark/default64-8 0 0 +0.00% Benchmark/default128-8 0 0 +0.00% Benchmark/dhrand1-8 0 0 +0.00% Benchmark/dhrand4-8 0 0 +0.00% Benchmark/dhrand8-8 0 0 +0.00% Benchmark/dhrand16-8 0 0 +0.00% Benchmark/dhrand32-8 0 0 +0.00% Benchmark/dhrand64-8 0 0 +0.00% Benchmark/dhrand128-8 0 0 +0.00% BenchmarkModule/1-8 0 0 +0.00% BenchmarkModule/4-8 0 0 +0.00% BenchmarkModule/8-8 0 0 +0.00% BenchmarkModule/16-8 0 0 +0.00% BenchmarkModule/32-8 0 0 +0.00% BenchmarkModule/64-8 0 0 +0.00% BenchmarkModule/128-8 0 0 +0.00% benchmark old bytes new bytes delta Benchmark/default1-8 0 0 +0.00% Benchmark/default4-8 0 0 +0.00% Benchmark/default8-8 0 0 +0.00% Benchmark/default16-8 0 0 +0.00% Benchmark/default32-8 0 0 +0.00% Benchmark/default64-8 0 0 +0.00% Benchmark/default128-8 0 0 +0.00% Benchmark/dhrand1-8 0 0 +0.00% Benchmark/dhrand4-8 0 0 +0.00% Benchmark/dhrand8-8 0 0 +0.00% Benchmark/dhrand16-8 0 0 +0.00% Benchmark/dhrand32-8 0 0 +0.00% Benchmark/dhrand64-8 0 0 +0.00% Benchmark/dhrand128-8 0 0 +0.00% BenchmarkModule/1-8 0 0 +0.00% BenchmarkModule/4-8 0 0 +0.00% BenchmarkModule/8-8 0 0 +0.00% BenchmarkModule/16-8 0 0 +0.00% BenchmarkModule/32-8 0 0 +0.00% BenchmarkModule/64-8 0 0 +0.00% BenchmarkModule/128-8 0 0 +0.00%

vigna · 2020-10-06T14:29:16Z

I have stumbled in the code for PCGSource (we decided to teach Go as first language to undergrads at my university), and in case you are considering it for inclusion in the standard library, I think it would IMHO a good idea to add to the code some documentation about the perils of seeding the generator with states that share a large number of lower bits. For example:

package main
import . "fmt"
import . "github.com/golang/exp/rand"

func main() {
   var pcg0, pcg1 PCGSource

   pcg0.UnmarshalBinary([]byte { 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 });
   pcg1.UnmarshalBinary([]byte { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 });

   for i := 0; i < 1000000000; i++ {
      pcg0.Uint64()
      pcg1.Uint64()
   }

   for i := 0; i < 10; i++ { Printf("0x%x\t0x%x\n", pcg0.Uint64(), pcg1.Uint64()) }
}

This program creates two PCGSource instances whose initial states differ only in the upper bit by unmarshaling; then, it advances the generators a billion times, and then prints ten outputs:

0xd8e3ee0662fca74c	0x62fca74cd8e2ee06
0x2e769d76dc6e4f1c	0xdc4e4f1c2e769d76
0x73a027a3ae1b700f	0xae1b700f53a027a3
0x746dd81a74fe2b99	0x74de2b99746dd81a
0xc0fb5795c97c77e7	0xc97c77e7e0fb5795
0xd2a86c8b49b7c14d	0x49b7c14d52a86c8b
0xce996d79a596d046	0xa596d446ce996d79
0xa5df4766d500990b	0xd500994ba5df4766
0x12681951f83f039c	0xf83f039c52681951
0x4616238466319291	0x663192914616238c

It is visible even to the naked eye that the outputs are the same, modulo a 16-bit rotation and single-bit change, whereas we would expect to obtain, maybe after a few iterations, different and uncorrelated streams.

The problem lies in the basic flaw of LCGs with power-of-2 modulus: if you create two LCGs with the same multiplier and additive constant, and start them from states with the same k lower bits, this will be true forever—the next-state operations all propagate bit changes to the left, never to the right. So the two generators above, even starting from different states, will have forever the same 127 lower bits.

This is an extreme example—127 lower bits in common. But you can still experience a strong correlation between streams generate by different seeds with less low bits in common, even if you cannot see the correlation as easily. For example, the following program uses the standard seeding interface to create two different generators, and then write their output to standard output in binary:

package main
import . "os"
import eb "encoding/binary"
import . "github.com/golang/exp/rand"

func main() {
   var pcg0, pcg1 PCGSource

   pcg0.Seed(0);
   pcg1.Seed(uint64(1) << 63);

   for i := 0; i < 1000000000; i++ {
      pcg0.Uint64()
      pcg1.Uint64()
   }

   b := make([]byte, 8)
   for {
      eb.LittleEndian.PutUint64(b, pcg0.Uint64())
      Stdout.Write(b)
      eb.LittleEndian.PutUint64(b, pcg1.Uint64())
      Stdout.Write(b)
   }
}

Since we are using two different seeds, we expect the two associated streams, which are non-overlapping, to be uncorrelated. In particular, if we interleave the streams the resulting stream should still look random. But if we pipe the output of the program in PractRand, we obtain after few seconds of testing

rng=RNG_stdin, seed=unknown
length= 32 megabytes (2^25 bytes), time= 9.2 seconds

  Test Name                         Raw       Processed     Evaluation
  BCFN(0+1,13-3,T)                  R= +17.9  p =  3.0e-8   very suspicious
  BCFN(0+2,13-4,T)                  R= +30.8  p =  2.0e-13    FAIL
  ...and 935 test result(s) without anomalies

All classical PRNGs (MCGs with prime modulus, F₂-linear, (C)MWC, etc.) mix wholly their state even after the smallest state change (e.g., one bit). Some are very slow (e.g., the Mersenne Twister with 19937 bits takes a few million iterations), but sooner or later the state change will propagate to all bits. At that point, they will pass the interleaving test above. All PRNGs based on cryptographic primitives will, too, as the smallest change in state generates a random perturbation of the output.

This is not true for PCG because of the very weak base LCG.

For the same reason, I suggest that in case the "jump" functionality of PCG is made available in the future, the documentation should state clearly that jumping by a large power of 2 is very dangerous, as it will yield a new state with a large number of lower bits in common with the starting state, leading to the consequences above.

nsajko · 2020-11-12T04:42:05Z

Shouldn't there be a discussion regarding the other potential alternative Source implementations? And regarding the requirements that such a replacement will have to fulfill?

Currently math/rand is explicitly meant for applications that are not "security sensitive", but wouldn't it be in the spirit of Go to provide safe defaults? Standard library users will not always know or figure out in time that their application requires resistance to attacks on the PRNG. I think that this applies in the case of an algorithm that uses randomization being exposed on the public Internet, for example.

Randen is an alternative to PCG that seems to pass the relevant statistical tests (as PCG likewise does), but should also provide resistance against prediction of future inputs and empirical indistinguishability from randomness while the generator state is not compromised, and also it ensures that past outputs can't be reconstructed even after the state is compromised.

The Randen paper also shows that, given hardware acceleration of AES, Randen can be faster than PCG in real world applications like shuffling and sampling.

The Randen paper on Arxiv (it includes a kind of overview of other available "modern" PRNGs): https://arxiv.org/abs/1810.02227

The Randen Git repository: https://github.com/google/randen

However, considering Randen raises the question of whether the alternate Go PRNG should, given the same seed, generate the same pseudorandom bits across all supported computer architectures; because presumably one wouldn't want to use Randen on, e.g., the old ARMs or Pentiums (because of the lack of AES acceleration instructions). I guess PCG is good enough to be a default for the low-powered machines?

nsajko · 2020-11-21T00:56:36Z

There seems to be much discussion on the PCG family on the Numpy Github, a lot of which is probably relevant to this issue:
numpy/numpy#13635
numpy/numpy#16313

It's a bit much to consume, but I noticed at least these important facts:

The author of PCG published a new member of the family on her Github in 2019. There was no blog post and I think the only announcement was on one of the linked Numpy Github issues, but the DXSM variants are supposed to be faster and have even better statistical qualities according to the relevant empirical tests, as well as being immune to some of the peculiarities of the earlier PCG schemes. Here's the commit: imneme/pcg-cpp@871d049
Prof Vigna (appears a couple posts above here), a rival non-crypto PRNG designer (a small but "lively" field, it seems), proposes some supposed improvements applicable to PCG, even though he doesn't recommend using generators like PCG.

Another thing that may be of import is how nontrivial or "challenging" predictability is explicitly a design goal for the PCG family. AFAIK, "challenging" just means (in the context of PCG) that there are/were no publicly known efficient attacks, but without a stronger kind of guarantee, such as applicable to a CSPRNG or to Randen, making the claim imprecise in such a way that it is somewhat deceptive, coming from an expert. This "challenging" stuff seems to come at the cost of trade-offs at the expense of speed or statistical quality, even though it is of dubious utility (considering that no real guarantee is given, and that a relatively efficient attack has recently been demonstrated).

Plazmaz · 2020-12-04T19:48:25Z

Sebastiano Vigna wrote a pretty scathing review of some of the claims made in O'Neill's paper:
http://pcg.di.unimi.it/pcg.php
Edit: apologies for stirring up something that's seemingly already been discussed to death here!

zephyrtronium · 2020-12-04T21:15:54Z

Dr. Vigna's criticisms are mostly about (extraordinarily) highly correlated sequences given nearby initial states. This is a very important concern for people performing intensive mathematical, scientific, or statistical work. However, based on discussion here and in #26263, that does not seem to be the Go team's target audience for math/rand. In particular, such people are more likely to understand the reasons for choosing or avoiding a particular class of PRNG.

Instead, the goals for a new math/rand generator seem to be speed, reasonable distribution quality, and a state size small enough that many goroutines can use independent generators. PCG fits these traits[n1][n2], as do many other generators, including several mentioned here.

Now, since this proposal is for PCG, suggesting other generators here does not make progress. Instead, we should be looking at how people currently use math/rand – keeping in mind that it is already a weak generator – to verify that the problems proven with PCG are, in a sense, beside the point. Another idea is to look at how many projects implement or import specific PRNG algorithms such as PCG, xoshiro, Mersenne Twister, &c., or collections like gonum/mathext/prng that provide several PRNGs, to see the reasons people choose those algorithms over math/rand.

[n1] The issue with speed on 32-bit platforms mentioned in #21835 (comment) does seem like an important concern, especially as Wasm becomes more prominent.

[n2] Notably, the idea mentioned in the proposal to seed different PCGs with the generators' respective addresses, or having many goroutines each seed with the current nanosecond time, may exacerbate its problems with highly correlated states. On the other hand, a high-quality seeding algorithm might avoid that.

lemire · 2020-12-04T21:28:08Z

@zephyrtronium I was under the impression that wasm did have 64-bit integer support universally today? Is this not the case? Or do you mean to refer to something else with the remark "especially as Wasm becomes more prominent". Or is there a particular issue with how Go generate wasm outputs?

cc @jfbastien

imneme · 2020-12-04T22:00:13Z

Sebastiano Vigna wrote a pretty scathing review of some of the claims made in O'Neill's paper:
http://pcg.di.unimi.it/pcg.php

It's worth noting that I responded to his critiques here.

zephyrtronium · 2020-12-04T22:03:25Z

@zephyrtronium I was under the impression that wasm did have 64-bit integer support universally today? Is this not the case? Or do you mean to refer to something else with the remark "especially as Wasm becomes more prominent". Or is there a particular issue with how Go generate wasm outputs?

So it does, my mistake. I must have misread something about Wasm at some point.

imneme · 2020-12-04T22:54:50Z

Also, as observed by some of the commentators here, more recently Vigna has raised concerns about self-similarity in PCG. If we interleave outputs from carefully chosen seedings of PCG, although the outputs are superficially different, the closer scrutiny applied by sensitive PRNG tests will reveal correlations of some kind indicating that the generator exhibits self-similarity between distant parts of its output.

I think this kind of self-similarity is interesting (and well known for the LCGs that underlie PCG), but Vigna's apparent grave concern about this issue was rather undercut when, on the NumPy mailing list, Tyge Løvset pointed out that similar issues affected Vigna's own most-recent PRNG, xoshift** (comment, code).

If users follow good practice for seeding PRNGs (which they should for a wide variety of other reasons), these kinds of issues won't arise (or rather, they are vanishingly unlikely). An alternative is to scramble the output more aggressively, but the trade-off is execution time. PCG's newest output permutation, DXSM, does represent an attempt to scramble better at no added cost on many platforms, but it is still making a trade off in this space.

Plazmaz · 2020-12-05T00:21:00Z

I wasn't sure of the impact of the claims made, but the general tone was harsh enough it made it more difficult to take his critiques seriously, although I felt there was some interesting points mentioned. I'm happy you took the time to address them, and thank you for providing that information here. His post was certainly at best unintentionally misleading.

jfbastien · 2021-01-28T22:27:20Z

@zephyrtronium I was under the impression that wasm did have 64-bit integer support universally today? Is this not the case? Or do you mean to refer to something else with the remark "especially as Wasm becomes more prominent". Or is there a particular issue with how Go generate wasm outputs?

@lemire sorry for missing this, I don't look at github notifications often.

Yes, wasm supports 64-bit integers universally as a first-class type. The initial discussion in 2016 hedged a bit in case there were JS-based implementations in some browsers, but now all relevant browsers support wasm natively (that's 92% of all users on the web). Non-browser implementations of wasm also support 64-bit integers natively.

Maybe the confusion is around "32-bit platforms"? wasm is currently ILP32, thought there's a proposal for wasm64 with 64-bit address spaces. Limiting the address space to 32 bits doesn't limit the use of 64-bit integers.

trafalmuffti · 2021-04-02T17:05:41Z

There is no mention of a multiply-with-carry generator in these discussions. Unlike many of these other algorithms, it is listed on wikipedia and is simply a LCG with less constants.

If the multiply-with-carry generator had a lag of at least 2 (according to the parlance used), it would be parallelizable by typical cpu dispatch units as there would be less dependencies.

Hopefully whatever the generator used, it will allow for 64-bit seeds or at least frequent reseeding.

Randen is an alternative to PCG that seems to pass the relevant statistical tests (as PCG likewise does), but should also provide resistance against prediction of future inputs and empirical indistinguishability from randomness while the generator state is not compromised, and also it ensures that past outputs can't be reconstructed even after the state is compromised.

This AES-NI accelerated RNG is faster and has performance nearly that of a CPU's memory transfer limit: dragontamer/aesrand

Adding an extra AES round might make it cryptographic.

rkern · 2021-05-05T18:38:12Z

Hi everyone! I'm a numpy developer with a particular focus on our PRNG capabilities. For some background, the older PCG64 with the XSL-RR output function is our current default PRNG for the past couple of years. The next release will have an optional implementation with the stronger DXSM output function. We expect we will make it the default some undetermined time down the road.

I have written up a what I hope to be an accessible explanation of the consequences so you don't have to reconstruct everything from those long digressing threads. My prose style is prone to long, complicated sentences, so please ask if anything needs to be clarified; that will help me edit this down for my own audience. I hope to provide a point of view as a PRNG library maintainer that is helpfully complementary to those of our PRNG algorithm authors. To be clear about where I'm writing from, numpy is an anchor library for scientific computing and machine learning, so we have a particular risk profile that is driving our decisions. The extreme conditions that would make this weakness non-negligibly likely are not really a practical use case right now, but you can see it on the horizon, at least for my audience. This is likely a different risk profile for you, but I try to give the information needed to evaluate things for yourself rather than boil everything down to a universal safe/unsafe declaration.

If I were still asked my opinion for what you should do, if you do decide to use a PCG algorithm, I would recommend using the DXSM variant over the XSL-RR. While the risks of XSL-RR are rare outside of extreme use cases and usually inconsequential, I think they are still worth documenting. And it's a pain to document and a pain for the user to work through the math to determine that yes, they are safe using it. DXSM provides an extra safety margin that renders those computations (and my article) unnecessary. I can think about the DXSM variant with a mental model that is a lot less complicated than one I would now use for XSL-RR. As a library maintainer and documenter, that's golden.

lemire · 2021-05-05T19:12:46Z

@rkern

My prose style is prone to long, complicated sentences,...

You sir have no idea what long and complicated sentences are... :-)

imneme · 2021-05-05T19:16:07Z

Thanks for the write up! Nicely done.

(It might be too tangential to include, but the only thing I'd add is that detectable self-similarity is not unique to LCGs (and PCG); it's findable in a number of non-cryptographic PRNGs.)

rsc · 2023-06-13T02:01:06Z

See #60751 for a GitHub Discussion that I hope will lead to a math/rand/v2. Closing as duplicate of that.

robpike added v2 A language change or incompatible library change Proposal labels Sep 11, 2017

gopherbot added this to the Proposal milestone Sep 11, 2017

robpike self-assigned this Sep 11, 2017

robpike changed the title ~~proposal: fix math/rand for Go 2~~ proposal: use PCG source in math/rand for Go 2 Sep 11, 2017

robpike changed the title ~~proposal: use PCG source in math/rand for Go 2~~ proposal: use PCG Source in math/rand for Go 2 Sep 11, 2017

golang deleted a comment from jimmyfrasche Oct 1, 2017

golang deleted a comment from mvdan Oct 1, 2017

robpike mentioned this issue Oct 23, 2017

all: consider generalising the random source parameter type gonum/gonum#228

Closed

This was referenced Jan 3, 2018

math/rand: remove statistical correlation in ziggurat algorithms #8731

Closed

proposal: math/rand: speed up Int31n with multiply/shift instead of modulo #16213

Closed

josharian mentioned this issue Jan 11, 2018

proposal: spec: add support for int128 and uint128 #9455

Open

tpudlik mentioned this issue Feb 25, 2018

proposal: math/rand: Expose constructor for lockedSource #24121

Closed

ianlancetaylor added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Mar 20, 2018

adamdecaf mentioned this issue Apr 10, 2018

proposal: Go 2: move math/rand to x/math/rand #24800

Closed

wking mentioned this issue May 24, 2018

proposal: crypto/rand: Add an implementation for math/rand's Source #25531

Closed

josharian mentioned this issue Jul 7, 2018

proposal: math/rand: rework for Go 2 #26263

Closed

mpx mentioned this issue Nov 28, 2019

proposal: x/exp/rand: add support for PRNG state serialisation and deserialisation #35886

Closed

robpike mentioned this issue Dec 13, 2019

math/rand: documentation on rng.go is lacking important context and information #36133

Open

DeedleFake mentioned this issue Nov 12, 2020

sirdsc: switch to proper random number generator DeedleFake/sirdsc#8

Closed

psampaz mentioned this issue Jan 10, 2021

Add Random(), RandomSeed() and RandomGaussian() go-p5/p5#26

Merged

hwh33 mentioned this issue Feb 19, 2021

Harry/9 getlantern/tlsmasq#12

Merged

1 task

charris mentioned this issue May 4, 2021

ENH: Add PCG64DXSM BitGenerator numpy/numpy#18906

Merged

robpike mentioned this issue Dec 1, 2021

math/rand: reconsider lock-free source #49892

Closed

rsc unassigned robpike Jun 23, 2022

rsc mentioned this issue Sep 6, 2022

math/rand: seed global generator randomly #54880

Closed

bakkot mentioned this issue Dec 4, 2022

Thoughts on choice of algorithm tc39/proposal-seeded-random#19

Open

rsc closed this as not planned Won't fix, can't repro, duplicate, stale Jun 13, 2023

rsc mentioned this issue Aug 2, 2023

math/rand/v2: revised API for math/rand #61716

Closed

sukunrt mentioned this issue Feb 20, 2024

webrtc: wait for FIN_ACK before closing data channels libp2p/go-libp2p#2615

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

proposal: use PCG Source in math/rand for Go 2 #21835

proposal: use PCG Source in math/rand for Go 2 #21835

robpike commented Sep 11, 2017 •

edited

odeke-em commented Sep 11, 2017

MichaelTJones commented Sep 12, 2017

dongweigogo commented Sep 12, 2017

mvdan commented Sep 12, 2017

btracey commented Oct 24, 2017

MichaelTJones commented Oct 24, 2017 via email

lemire commented Jan 10, 2018

lemire commented Jan 10, 2018

MichaelTJones commented Jan 10, 2018 via email

josharian commented Jan 11, 2018

MichaelTJones commented Jan 11, 2018 via email

josharian commented May 2, 2018

josharian commented May 2, 2018

vigna commented Oct 6, 2020

nsajko commented Nov 12, 2020 •

edited

nsajko commented Nov 21, 2020 •

edited

Plazmaz commented Dec 4, 2020 •

edited

zephyrtronium commented Dec 4, 2020

lemire commented Dec 4, 2020

imneme commented Dec 4, 2020

zephyrtronium commented Dec 4, 2020

imneme commented Dec 4, 2020

Plazmaz commented Dec 5, 2020 •

edited

jfbastien commented Jan 28, 2021 •

edited

trafalmuffti commented Apr 2, 2021 •

edited

rkern commented May 5, 2021

lemire commented May 5, 2021

imneme commented May 5, 2021

rsc commented Jun 13, 2023

proposal: use PCG Source in math/rand for Go 2 #21835

proposal: use PCG Source in math/rand for Go 2 #21835

Comments

robpike commented Sep 11, 2017 • edited

odeke-em commented Sep 11, 2017

MichaelTJones commented Sep 12, 2017

dongweigogo commented Sep 12, 2017

mvdan commented Sep 12, 2017

btracey commented Oct 24, 2017

MichaelTJones commented Oct 24, 2017 via email

lemire commented Jan 10, 2018

lemire commented Jan 10, 2018

MichaelTJones commented Jan 10, 2018 via email

josharian commented Jan 11, 2018

MichaelTJones commented Jan 11, 2018 via email

josharian commented May 2, 2018

josharian commented May 2, 2018

vigna commented Oct 6, 2020

nsajko commented Nov 12, 2020 • edited

nsajko commented Nov 21, 2020 • edited

Plazmaz commented Dec 4, 2020 • edited

zephyrtronium commented Dec 4, 2020

lemire commented Dec 4, 2020

imneme commented Dec 4, 2020

zephyrtronium commented Dec 4, 2020

imneme commented Dec 4, 2020

Plazmaz commented Dec 5, 2020 • edited

jfbastien commented Jan 28, 2021 • edited

trafalmuffti commented Apr 2, 2021 • edited

rkern commented May 5, 2021

lemire commented May 5, 2021

imneme commented May 5, 2021

rsc commented Jun 13, 2023

robpike commented Sep 11, 2017 •

edited

nsajko commented Nov 12, 2020 •

edited

nsajko commented Nov 21, 2020 •

edited

Plazmaz commented Dec 4, 2020 •

edited

Plazmaz commented Dec 5, 2020 •

edited

jfbastien commented Jan 28, 2021 •

edited

trafalmuffti commented Apr 2, 2021 •

edited