math/big: extend ProbablyPrime with Baillie-PSW test #13229

Closed
jfcg opened this issue Nov 13, 2015 · 65 comments

jfcg commented Nov 13, 2015

BPSW is the current popular way for probabilistic primality testing.

More detailed info about BPSW is at https://en.m.wikipedia.org/wiki/Baillie–PSW_primality_test

A patch and initial discussion is at https://groups.google.com/forum/m/#!topic/golang-dev/AOAbwvCwgwo

minux (Member) commented Nov 13, 2015 via email

jfcg (Author) commented Nov 13, 2015

We can use BPSW for n=0 and reserve n<0 for any future improvement.

I agree that there should not be multiple APIs for primality testing. In the function doc we clearly inform users of what is used for each n and what the guarantees are.
For a composite, the failure rate of a single MR round is at most 1/4. For BPSW:

BPSW has been available since 1980; people have tried every available prime & composite on the planet and it has stood the test of time. It is a sound, well-studied algorithm; notice it is an extension of MR. No serious library/language/software uses only MR for testing. Everybody supplements MR with a Lucas test (jacobi=-1).
They "complement" each other. Blindly rejecting this without reading the relevant literature is a deep ignorance & shame for the Go project.

This is probabilistic primality testing, as the name suggests. We don't claim to have a primality proof in the first place. That does not mean we can provide lousy algorithms to users. On the contrary, BPSW is the state of the art in primality testing.
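
For concreteness, here is a minimal sketch of how the proposed overload would be called if the patch were applied. The n > 0 behaviour is the existing API; the n = 0 meaning is only what is being proposed in this issue:

```go
package main

import (
	"fmt"
	"math/big"
)

func main() {
	x := big.NewInt(1000000007) // a known prime

	// Existing API: 20 rounds of Miller-Rabin.
	fmt.Println(x.ProbablyPrime(20))

	// Proposed: n = 0 would select the Baillie-PSW test instead of
	// Miller-Rabin rounds (not the stdlib behaviour at the time of this issue).
	fmt.Println(x.ProbablyPrime(0))
}
```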

cznic (Contributor) commented Nov 13, 2015

We can use BPSW for n=0 and reserve n<0 for any future improvement.

I agree with @minux that the existing function should be left alone.

BPSW has been available since 1980; people have tried every available prime & composite on the planet and it has stood the test of time.

This is mathematically irrelevant. The lack of proof is what matters. I object to putting an unproven, possibly failing primality-checking algorithm into the stdlib.

jfcg (Author) commented Nov 13, 2015

Have you read "The danger of relying only on Fermat tests" at https://en.m.wikipedia.org/wiki/Baillie–PSW_primality_test ??

ALTree (Member) commented Nov 13, 2015

is a deep ignorance & shame for the Go project

Please be civil. There's no shame in wanting to understand advantages and disadvantages of a proposal before implementing it. You failed to provide some information.

without reading the relevant literature

you did not provide "the relevant literature" (wikipedia is not literature). I understand that you may expect a reviewer to go on wikipedia and then doing a literature search, but this is really something that you should help with as part of your proposal.

I have a few comments. First of all:

[Java's isPrime for bigInts] does one or more Miller-Rabin tests with random bases. If n, the number being tested, has 100 bits or more, this method also does a non-strong Lucas test

So what they do is MR for bits < 100 and MR + Lucas for bits >= 100. But this is not what you're doing in your patch. First of all there's no 100-bit threshold: you make the caller choose what the method does - i.e. n > 0 means MR and n <= 0 means calling sprp() && slprp() (as a side note, the function names you have chosen really do not help understanding your code). This is confusing. Java's behaviour is nice: the method configures itself. This would not be the case in Go (at least not with your proposal).

When should the user call isProbablyPrime with n == 0 ? And when not? You have to explain this (since we'll have to explain it to our users if we are to implement this).

Maple's isPrime function requires no configuration. Mathematica's PrimeQ function requires no configuration.

Also Java uses a non-strong Lucas test. Your patch seems to test for strong-Lucas primality. Care to explain why?

Second:

GNU Multiple Precision Arithmetic Library's mpz_probab_prime_p function uses a Miller-Rabin test, but does not use a Lucas test.

GMPL is a widely used bignum library and what they do is MR. I'm not saying we should use pure MR just because GMPL does that, but the fact that they do suggests that a pure MR primality test is not a horrible crime. Go's big package is much, much simpler and smaller than GMPL.

Third: you have been asked several times to explain what guarantees (if any) the test gives with regard to the probability of false positives or false negatives. The current MR implementation is pretty clear about that (the usual 4^-k bound). What about BPSW?
In your patch you only write

returns true only if x is a probable prime

maybe it's me, but this means everything and nothing.

Fourth: performance. Is this faster or slower? You only did a single benchmark with n = 100 on MR, but almost nobody will call MR with parameter 100. n = 20 is a typical value.

jfcg (Author) commented Nov 13, 2015

Alberto,
If you really cared, you would have read my patch and see the two relevant literature references as comments.
Also, yes I would appreciate if a reviewer god damn reads the .ucking wiki article. I did not expect them to do a literature research on the subject!
Just read the comments made to this issue and the original thread. Most commenters really did not care to read the wiki or the patch.

bits<100 is just a made-up limit; I haven't read or come up with any "bit limit" on when Lucas should be used. In any case, real applications will care about primes of 200+ bits. For example EdDSA uses 250+ bit primes.

sprp: strong probable prime (standard term in number theory)
slprp: strong lucas probable prime (standard term in number theory)
Really easy definitions, you would quickly recognize them if you cared to read my patch, the wiki or the papers ;)

As you know, Go has a compatibility promise. I did not want to clutter the API, just extend the current API with how Sage/Pari/GP do it. Notice that there is no loss of MR(n) functionality.
That's why we need 'configuration'.
Users just use whatever they want. n=0 for BPSW, n>0 if they want MR(n).

If you cared to read some basic info about primality testing, you would not ask why we use the strong versions, but I'll explain in any case: strong tests are better detectors of primes/composites. Also the existing MR code is itself a strong probable prime test ;)

GMP is the only serious lib that does not employ Lucas tests; I don't know why. Why don't you ask them and let us know.

BPSW is:

  • either necessary/sufficient (notice the math term;) for primality
  • or it has an extremely, extremely low error rate (MR's is 1/4). Nobody has been able to provide a composite that fools this test since 1980! You can construct composites that fool MR(n=64), for example.

Nobody knows which yet.

You should use BPSW, if not, you should use MR(n=40+) for a serious application. BPSW is faster than the latter case.

Probable prime means:

  • either prime
  • very rare composite that fooled your prp test.

That is the mathematically correct term for what all these functions produce.

I hope I covered all your questions, let me know if I missed anything.

ALTree (Member) commented Nov 13, 2015

First of all: I did not study the patch carefully because we are not doing code review. Understanding if your proposed extension is fine should not require one to study (or even open) a code patch. And certainly not one with functions named sprp and slprp.

sprp: strong probable prime (standard term in number theory)
slprp: strong lucas probable prime (standard term in number theory)
Really easy definitions, you would quickly recognize them if you cared to read my patch, the wiki or the papers ;)

This is a non-argument. I quickly glanced at the diff in ProbablyPrime, and what I found is

+ return x.abs.sprp(natTwo, nm1, q, k) && x.abs.slprp(b == 2)

compare with Java SDK implementation

return passesMillerRabin(rounds, random) && passesLucasLehmer();

As you know, Go has a compatibility promise. I did not want to clutter the API, just extend the current API with how Sage/Pari/GP do it. Notice that there is no loss of MR(n) functionality.
That's why we need 'configuration'.
Users just use whatever they want. n=0 for BPSW, n>0 if they want MR(n).

Yes, I understand that. And yet everyone who has read your proposal said that the proposed API is ugly. Maybe a new function would be better (as Minux said). There's no agreement, not even on the signature of the function, and yet you expect someone to carefully study the whole patch.

If you cared to read some basic info about primality testing, you would not ask why we use the strong versions, but I'll explain in any case: strong tests are better detectors of primes/composites. Also the existing MR code is itself a strong probable prime test ;)

I know the difference between strong and weak tests, thank you. My point was that you can't both write "we should do it this way because this is how everyone is doing it" and avoid being questioned about remarkable differences between your implementation and other languages/libraries/platforms implementations.

I look at Java's implementation and I see weak-Lucas, I look at GMPL and they don't even do Lucas. Mathematica and PARI/GP do exactly what you're doing in your patch. The sage reference returns 404. Perl and Python implement Lucas and BPSW in external libraries.

To sum the evidence up:

  • Mathematica and PARI/GP do BPSW out of the box for isPrime
  • Java does MR + weak-Lucas after a certain threshold
  • GMPL does MR only
  • Other languages provide Lucas and BPSW in external libraries

You should use BPSW, if not, you should use MR(n=40+) for a serious application. BPSW is faster than the latter case.

How much faster for n=40 ?

Probable prime means:
either prime
very rare composite that fooled your prp test.
That is the mathematically correct term for what all these functions produce.

I was specifically asking if we do have a bound. You write "very rare", I deduce that a bound is not known. That's fine.

IMHO the first thing to do is decide how the API will look if this goes in (new function? n <= 0 trick? Out-of-the-box Lucas in isProbablyPrime?).

rsc added the Proposal label Nov 13, 2015
rsc added this to the Proposal milestone Nov 13, 2015
rsc changed the title from "math/big: Extending ProbablyPrime(n) with Baillie-PSW" to "proposal: math/big: extend ProbablyPrime with Baillie-PSW test" Nov 13, 2015
rsc (Contributor) commented Nov 13, 2015

The docs are very clear:

// ProbablyPrime performs n Miller-Rabin tests to check whether x is prime.
// If x is prime, it returns true.
// If x is not prime, it returns false with probability at least 1 - ¼ⁿ.
//
// It is not suitable for judging primes that an adversary may have crafted
// to fool this test.

While we could start overloading non-positive values of n, that will be confusing to read and also confusing when you compile code expecting the new meaning of (say) 0 but instead it panics saying 0 is not a valid argument.

There are two possible ways forward for adding other primality test algorithms:

  1. If a test makes sense to do all the time or in some limited context (such as small numbers), just do it, in addition to the currently-guaranteed Miller-Rabin tests.
  2. Add a new method.

There have been a few things suggested here, including BPSW and various Lucas variants. I would like to see, for each test being proposed, a crisp statement containing:

  • the proposed new test
  • the class of numbers being tested for which the test makes sense over the current implementation, and
  • the cost compared to one Miller-Rabin iteration.

rsc (Contributor) commented Nov 13, 2015

@jfcg, you wrote:

Blindly rejecting this without reading the relevant literature is a deep ignorance & shame for the Go project.

to which @ALTree reasonably replied:

Please be civil. There's no shame in wanting to understand advantages and disadvantages of a proposal before implementing it. You failed to provide some information. ... you did not provide "the relevant literature" (wikipedia is not literature). I understand that you may expect a reviewer to go on wikipedia and then doing a literature search, but this is really something that you should help with as part of your proposal.

To which you replied:

If you really cared, you would have read my patch and see the two relevant literature references as comments. Also, yes I would appreciate if a reviewer god damn reads the .ucking wiki article.

This is not really a productive way to have a conversation. You're not doing anything to help your case.

jfcg (Author) commented Nov 13, 2015

Hi,
I directly time each call to x.ProbablyPrime(n) with my patch on a Core i5-4210M laptop with Ubuntu 14.04, 3.19 kernel.
For starters, I got these numbers, separated by type:

// composite fermat/mersenne numbers, strong pseudoprimes to base 2
// 2^256+1, 2^512+1, 2^257-1, 2^509-1
n= 40
257 bits took 207.514µs
513 bits took 584.49µs
257 bits took 69.83µs
509 bits took 207.916µs
total: 1.06975ms

n= 0
257 bits took 911.742µs
513 bits took 1.25268ms
257 bits took 295.564µs
509 bits took 837.612µs
total: 3.297598ms

// 5 primes with 256 256 697 730 995 bits
n= 40
256 bits took 1.745521ms
256 bits took 1.719863ms
697 bits took 18.613694ms
730 bits took 21.783088ms
995 bits took 46.10569ms
total: 89.967856ms

n= 0
256 bits took 517.818µs
256 bits took 509.545µs
697 bits took 3.270944ms
730 bits took 4.458831ms
995 bits took 7.606551ms
total: 16.363689ms

// pairwise products of 5 primes
n= 40
512 bits took 229.017µs
953 bits took 959.979µs
986 bits took 1.142523ms
1251 bits took 2.032602ms
953 bits took 946.264µs
986 bits took 1.110485ms
1250 bits took 2.087854ms
1427 bits took 3.505977ms
1691 bits took 4.834519ms
1725 bits took 4.44336ms
total: 21.29258ms

n= 0
512 bits took 496.858µs
953 bits took 1.881229ms
986 bits took 2.041718ms
1251 bits took 4.105593ms
953 bits took 1.907105ms
986 bits took 2.037402ms
1250 bits took 3.454044ms
1427 bits took 5.377545ms
1691 bits took 7.636516ms
1725 bits took 7.461523ms
total: 36.399533ms

I was wrong about composites: MR(n=40) works faster for them. I think that sounds right because for a prime MR needs to try all 40 bases, while a composite has a chance of being eliminated at each base.

I think MR benefits from the well-implemented expNN, while lucasMod is a bare-minimum implementation.

There are constant costs as well. I'll make polynomial fits to these data & come back :)
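
A testing-package benchmark along these lines would make the timings reproducible (sketch only; the 256-bit hex constant is just a placeholder input, not one of the numbers timed above):

```go
package big_test

import (
	"math/big"
	"testing"
)

// candidate is a placeholder ~256-bit odd integer; substitute the primes,
// pseudoprimes and semiprimes timed in the comment above.
var candidate, _ = new(big.Int).SetString(
	"c90102faa48f18b5eac1f76bb440af8a4579b5e0f2b1bd87059f25f594871ea3", 16)

func BenchmarkProbablyPrimeMR40(b *testing.B) {
	for i := 0; i < b.N; i++ {
		candidate.ProbablyPrime(40)
	}
}

// With the proposed patch applied, the BPSW variant would simply call
// candidate.ProbablyPrime(0) in an analogous benchmark.
```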

minux (Member) commented Nov 14, 2015 via email

rsc (Contributor) commented Nov 14, 2015

Let's take cryptography explicitly off the table. This is about math/big.ProbablyPrime, not crypto/rand.Prime. There are many other concerns for the generation of large random primes in a cryptographic context, and in fact in that case only a few rounds of Miller-Rabin are required. See Menezes et al., Handbook of Applied Cryptography, 1997, pp. 145-149. If you can't find that, see Appendix F of FIPS 186-4, which has some of the same information.

rsc (Contributor) commented Nov 14, 2015

@jfcg, we're not going to put BPSW in just for performance reasons. In my list above, by "the class of numbers being tested for which the test makes sense over the current implementation" I meant a non-performance (that is, correctness) rationale.

jfcg (Author) commented Nov 14, 2015

So far with my limited computational capabilities, I only found 206981=263*787 which passes MR(n=2) but fails MR(n=3) & BPSW. It is a strong pseudoprime to 34320 bases.

jfcg (Author) commented Nov 14, 2015

just got:
556421=431 * 1291 passes MR(n=2), strong pseudoprime to 92448 bases
587861=443 * 1327 passes MR(n=3), strong pseudoprime to 97680 bases

danaj commented Nov 14, 2015

@jfcg, below 2^64, there are almost 32 million base-2 strong pseudoprimes. Of those, 1.5 million are also base-3 strong pseudoprimes. Of those, 131,157 are also base-5 strong pseudoprimes. 16,826 will pass M-R with bases 2,3,5 and 7. No number below 2^64 will pass a properly written BPSW test. Hence it is deterministic below that size. [edit: see the Galway/Feitsma database, also see Pseudoprime Statistics, Tables, and Data]

Strong vs. standard Lucas test: the 1980 paper rightly points out that there is no reason not to use the strong test. The pseudoprimes are not only a subset but the performance is the same or better. Why Java chose to ignore that is beyond me, but a clue may be gained by their calling it a "Lucas-Lehmer" test, which is a completely different test. This is not written by someone familiar with the literature.

Re performance, on primes, a good implementation of BPSW should take 2.5 to 3.5 times the time of a single M-R test. A practical primality test will include some trial division to very quickly weed out composites. For under 2^64 this is probably just to 20-100, but for larger numbers it is better on average to look much further. I use primes to 53 for 32-bit and to 89 for 64-bit, but that's a particular C implementation. The first M-R test will cut out most of the remaining composites. Using base-2 strong pseudoprimes as the test set is of course the worst case for BPSW as you guarantee it must run the Lucas test (which is ~1.5 - 2.5 times the cost of an M-R test). Anyway, I cannot discuss this implementation, but BPSW ought to be much faster than M-R with 40 bases.

For more on performance, under 64-bit the fastest solutions (that I am aware of) involve tuning small numbers, tuning the trial division pre-tests, then using hashed Miller-Rabin for 32-bit, and either hashed-Miller-Rabin or BPSW above that. For native integers they also use M-R and Lucas tests using Montgomery math and snippets of assembly, but I assume that is far beyond the scope of this.

I have a BPSW implementation for Perl6 using MoarVM. It is 4-20x faster than the current one for large numbers (10^50 - 10^8000). It also removes some known counterexamples (they don't use random bases). This is not directly comparable (the code in question uses C+libtommath and lives in the VM internals), but gives some idea.

Most of the literature is linked on the Wikipedia page. The paper by Baillie and Wagstaff (October 1980) is IMO the definitive paper (the July 1980 paper cites it, but got published first). Nicely's site is nice but has a few misleading comments (e.g. the trial division step is not part of BPSW, and his comments on the extra-strong Lucas test are not right). I would also recommend Pinch 1993 which is one reason all the math programs switched to BPSW in the 90s. Looking at Nicely's GMP pseudoprimes page is also interesting.

The main problem I've had trying to discuss BPSW with people who don't know the field is the "what is the exact error bound!" question. The second is a lack of understanding of the difference between fixed and random bases, but that's not an issue here. For error bounds, the best we can do with BPSW as a guarantee is 1/4 * 4/15 (M-R 2 plus strong Lucas) or 1/4 * 1/8 (M-R 2 plus ES Lucas). It should be obvious, given that we know there are no counterexamples below 2^64 and haven't seen one above, that this error bound is ludicrously conservative. However there isn't a strong statement we can give. Strong pseudoprimes and strong Lucas pseudoprimes are less dense as the input size increases, and the tests are anti-correlated so we would expect them to be hard to find once past the small numbers.

If one is very uptight about correctness, I would suggest separating out these cases:

  • below 2^64, do whatever is best for performance and gives completely correct results. This is where a lot of users will spend time calling you in loops. BPSW is one way to get correct results, so is deterministic M-R, so is hashed deterministic M-R.
  • See Sorenson and Webster, Sep 2015 for deterministic M-R to 33170 44064 67988 73859 61981 (over 2^81). You can call it directly, or first call BPSW and then, if that is true and the input is below the threshold, call the 12 extra M-R tests to get a deterministic correct answer.
  • BPSW plus a few random-base M-R tests. BPSW gives you the result that anything that passed your test would have also passed a number of other programs. The extra random-base M-R tests will give you extra warm fuzzies, not take too long if there are only a few (e.g. 2 to 5), and defeat the nefarious NSA agents who are holding onto the secret (your variant) BPSW counterexamples.

jfcg (Author) commented Nov 15, 2015

Hi @danaj,
Thanks a lot for your comprehensive comments. I'll read those papers right away.

The most interesting composites I have found so far are:
3673744903, 3281593591, 2385076987, 2738053141, 2009621503, 1502682721, 255866131, 117987841, 587861 which pass the current MR(n=3) test. These all come from a brute search.

The random bases are currently generated by a pseudo-RNG seeded by n[0], and mostly determined by BitLength(n-3). Chances are that powerful attacks can be designed to generate strong pseudoprimes to the multiple random bases that will be sequentially generated by the current process, and hence pass the test. My crude guess :)

For sprp test I have an idea:
Testing n for base a, n-1 = q * 2^k with odd q, we check
a^q = 1 or -1 (mod n)
or
a^(q * 2^i) = -1 (mod n), for some 0 < i < k

If I choose a with jacobi(a,n)=-1, then since q is odd, a^q is not a square mod n;
therefore we can't have a^q = 1 (mod n), and so we tighten the sprp test. Let's check:
n, # of a, # of a with (a/n)=-1
3481 56 0
3721 58 0
4087 16 8
4331 48 24
4453 52 36
4489 64 0
4819 16 8
4891 16 8
5041 68 0
5293 16 0
5329 70 0
5429 4 4
5767 16 8
5917 52 36
5963 240 120
6161 148 0
6241 76 0
6283 16 8
6497 20 0
6499 16 8
6649 52 0
6889 80 0
6893 4 4
6901 16 0
7081 196 144
7171 48 24
7303 16 8
7373 4 4
7519 16 8
7663 16 8
7747 16 8
7921 86 0
7957 484 324
7991 48 24
8023 96 48
8137 16 0
8249 20 16
8357 4 4
8479 16 8
8509 16 0
8611 16 8
8633 20 16
8989 4 4
9017 96 0
9089 4 0
9211 448 224
9271 160 80
9301 48 0
9313 16 0
9409 94 0
9577 52 0
9701 4 4
9797 4 4
9943 16 8
9991 16 8

As you can see, choosing the base with (a/n)=-1 is a good idea. I think it is better than choosing constant or random bases. I can even reuse the jacobi calculation for both the sprp & slprp tests. What do you guys think?
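
Checking the (a/n)=-1 condition is cheap with the Jacobi symbol already exported by math/big. A minimal illustration of the base search (not the patch's code; note that for a perfect square n no such base exists, so a real implementation needs a square check first):

```go
package main

import (
	"fmt"
	"math/big"
)

// firstNonResidueBase returns the smallest base a >= 2 with Jacobi(a, n) == -1,
// or 0 if none is found below limit. n must be odd and > 2; for a perfect
// square n no such base exists, which is why the search is bounded.
func firstNonResidueBase(n *big.Int, limit int64) int64 {
	a := new(big.Int)
	for i := int64(2); i < limit; i++ {
		a.SetInt64(i)
		if big.Jacobi(a, n) == -1 {
			return i
		}
	}
	return 0
}

func main() {
	fmt.Println(firstNonResidueBase(big.NewInt(4087), 100)) // 4087 = 61*67, a row from the table above
	fmt.Println(firstNonResidueBase(big.NewInt(3481), 100)) // 3481 = 59^2: no base gives -1
}
```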

danaj commented Nov 16, 2015

Re using PRNG bases: that is basically what GMP does. The link I gave to Nicely's GMP pseudoprimes page discusses it and shows a number of cases where we can make it fail with what seems like a reasonable number of tests. It's deterministic, so one can search for numbers that pass many bases.

The idea of choosing parameters where (D|n)=-1 is discussed in Baillie/Wagstaff 1980. It's used in most Frobenius style tests, as it gives far better results.

I will reiterate that below 2^64, this should be a different discussion centered around the best performance vs. effort / maintainability. There is no good reason to be doing a probable prime test, as there are well known methods for fast deterministic correct results.

Above 64-bit, there are a variety of methods with different tradeoffs. My take (which is certainly debatable) is that BPSW (M-R base 2 followed by a strong, AES, or ES Lucas test) is the best starting point. There are decent arguments for that being adequate by itself, but I can see why some might want additional tests, whether they be a Frobenius variant, additional fixed M-R (such as Mathematica adding a base 3 test), additional deterministic M-R, or additional non-deterministic (however far down the rathole one wants to chase that) M-R. There will always be a small group that desires a proof for any input, but that has serious downsides for everyone else.

It's not a horrible sin to use just M-R, but it's a little disappointing. GAP, Maple, Mathematica, Maxima, PARI/GP, and Sage all use some variant of BPSW, and as mathematics programs that have been around for a while they obviously care about correctness. At a computational cost of ~3 M-R tests it is also quite efficient.

jfcg (Author) commented Nov 16, 2015

Yes, classical BPSW = MR(base=2) && Lucas( (D/n)=-1 )

For the MR step, choosing (base/n)=-1 is a tighter test, so I want to use the D of the Lucas test as the MR base, that is MR(base=D) && Lucas( (D/n)=-1 ),
reusing the result of the jacobi calculation. What do you think?
Notice that the discriminant of the sequence a^(n-1)-1 is always (a-1)^2, so MR always has (discriminant/n)=1 and stays "independent" of the Lucas test with (D/n)=-1.

jfcg (Author) commented Nov 17, 2015

I am thinking of keeping the initial deterministic check up to 2^64. What do you think ?

What is the list of strong pseudoprimes to bases 2,3,5,7,11,13,17,19,23 that are less than 2^64 ?
1st one is 3825123056546413051 (~2^62), are there others ?

danaj commented Nov 17, 2015

Re using a different base for BPSW, the biggest issue is that we have a complete list of all base-2 strong pseudoprimes to 2^64. That lets us quickly check actual results for different solutions. Without base 2, we're limited to ~2^48 given a few months of computation. Also, everyone else has been using base 2 -- I'm not sure you want to start deviating. I think you should also use the standard methods for selecting the Lucas parameters -- Selfridge method for strong test, Baillie method if extra strong (Pari/GP does something different, so I suppose that's ok also since it's been used a lot).

3825123056546413051 is the only counterexample for bases 2-23 below 2^64.

There are more efficient bases at the best known SPRP bases page. You have to use some care using the smaller sets since the base will typically be larger than the input (read the remarks). Naive solution -- just use the 7-base solution which is two fewer than {2,3,5,7,11,13,17,19,23} and doesn't need a check for 3825123056546413051. Only 3 bases needed for 32-bit inputs. If you have BPSW, then it ought to be more efficient than the 7-base or your first-9-prime-bases method, unless your users are always testing base-2 pseudoprimes.

ALTree (Member) commented Nov 17, 2015

(just a reminder that we're in the freeze phase of the go development cycle, so if you decide to send a CL be aware that it won't be merged for go1.6)

jfcg (Author) commented Nov 17, 2015

This is the Baillie method, right?

P = 3; Q = 1;
while (jacobi(P*P-4,n) != -1)
    P++;

Method B on p. 1401 of Baillie & Wagstaff (1980) requires P to be odd. Why do we allow even P?
Both methods A & B require D = 1 mod 4.

So are the bases 2, 325, 9375, 28178, 450775, 9780504, 1795265022 sufficient for all inputs < 2^64 ?

danaj commented Nov 17, 2015

Yes, the Feitsma/Galway list of base-2 SPSPs can be used to verify there are no counterexamples under 2^64 for those 7 bases.

For the strong test, method A (Selfridge) is typically used. I'm not aware of anyone using method B.

Yes, that's what Baillie called Method C in 2013 on OEIS. Using it in combination with the extra strong test yields the known set of ES Lucas pseudoprimes (which have no overlap with base 2 strong pseudoprimes under 2^64, and are anti-correlated so we expect overlaps to be rare). I like the extra strong Lucas test, as it is computationally faster, has a smaller known error bound, and has a smaller set of pseudoprimes making overlaps less likely. But the strong Lucas test is fine and arguably gets more use.
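
To make that concrete, here is a rough sketch of a deterministic test below 2^64 built on those 7 bases (illustration only, not the patch; as noted above, inputs that happen to divide one of the bases need separate handling, as does trial division for small factors):

```go
package main

import (
	"fmt"
	"math/big"
)

// bases64 is the 7-base set discussed above; per the Feitsma/Galway base-2
// pseudoprime data it has no counterexample below 2^64.
var bases64 = []int64{2, 325, 9375, 28178, 450775, 9780504, 1795265022}

// sprp reports whether odd n > 2 is a strong probable prime to base a.
func sprp(n, a *big.Int) bool {
	one := big.NewInt(1)
	nm1 := new(big.Int).Sub(n, one)

	// Write n-1 = q * 2^k with q odd.
	q, k := new(big.Int).Set(nm1), 0
	for q.Bit(0) == 0 {
		q.Rsh(q, 1)
		k++
	}

	x := new(big.Int).Exp(a, q, n)
	if x.Cmp(one) == 0 || x.Cmp(nm1) == 0 {
		return true
	}
	for i := 1; i < k; i++ {
		x.Mul(x, x).Mod(x, n)
		if x.Cmp(nm1) == 0 {
			return true
		}
	}
	return false
}

// isPrime64 sketches the deterministic check for odd n below 2^64, assumed to
// have already passed trial division; the caveat above applies, so inputs
// that divide one of the bases need separate handling.
func isPrime64(n *big.Int) bool {
	for _, b := range bases64 {
		if !sprp(n, big.NewInt(b)) {
			return false
		}
	}
	return true
}

func main() {
	n, _ := new(big.Int).SetString("3825123056546413051", 10) // strong pseudoprime to bases 2..23
	fmt.Println(isPrime64(n)) // false: the 7-base set catches it
	p, _ := new(big.Int).SetString("18446744073709551557", 10) // largest prime below 2^64
	fmt.Println(isPrime64(p)) // true
}
```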

jfcg (Author) commented Nov 17, 2015

Yes, we use the ES Lucas.
Method B "is" what Pari/GP does: put Q=1 (extra strong Lucas), then D=P^2-4 must be in the set 5, 9, 13, 17, ..., so P must be odd and start from 3.
Both methods A & B require D = 1 mod 4 for some reason. Do you know why? Do you have a link to a page explaining why C allows even P?

danaj commented Nov 17, 2015

Pari/GP sets Q=1, P = 3,5,7,... That is neither method A nor B from Baillie/Wagstaff. It's method C skipping evens.

Method B would choose the first of these to give (D|n)=-1 and gcd(n,2PQD)=1:
D=5 P=3 Q=1
D=9 P=5 Q=4
D=13 P=5 Q=3
D=17 P=5 Q=2
D=21 P=5 Q=1
D=25 P=7 Q=6
...

The parameter selection used by Pari/GP is fairly similar to Baillie's, just using P += 2 so only odd values of P are selected. Other than that, it's basically the same (Q=1, D=P^2-4). Pari/GP also doesn't do the full extra strong test.

I don't believe Baillie wrote a paper with method C. The only thing I have to go on is his OEIS entry. I came up with the same idea at about the same time (it's rather obvious), but his name pulls far more weight than mine :).

I think good methods are:

  • strong Lucas plus method A (Selfridge)
  • almost extra strong with method C (or the odds-only-P variant -- so this test is the Pari/GP variant; the method C search is sketched below)
  • extra strong with method C (odds-only-P if you'd like)
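
For reference, a minimal sketch of the Method C parameter search in Go, using math/big's Jacobi (illustration only; Q is fixed at 1, and a perfect-square n must be ruled out first since no P will ever give (D/n) = -1 for it):

```go
package main

import (
	"fmt"
	"math/big"
)

// lucasParamC searches P = 3, 4, 5, ... for the first P with
// Jacobi(P*P-4, n) == -1, with Q fixed at 1 (the "Method C" selection
// discussed above). n must be odd; the search is bounded so that a
// perfect square n simply reports failure instead of looping forever.
func lucasParamC(n *big.Int, maxP int64) (int64, bool) {
	d := new(big.Int)
	for p := int64(3); p <= maxP; p++ {
		d.SetInt64(p*p - 4)
		if big.Jacobi(d, n) == -1 {
			return p, true
		}
	}
	return 0, false
}

func main() {
	fmt.Println(lucasParamC(big.NewInt(104729), 1000)) // a prime; a small P is found
	fmt.Println(lucasParamC(big.NewInt(3481), 1000))   // 59^2: no P qualifies
}
```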

jfcg (Author) commented Nov 17, 2015

Yes, we are saying the same thing :D
Method B constrained to Q=1 (for applying the ES Lucas) is what Pari does. C also allows even P.

The "almost" variant just skips the U=0 check, right? So it is easier to calculate.

jfcg (Author) commented Nov 24, 2015

Well, the JL vs. MR-round comparison was for that, but just for convenience (on Windows 7):

For generating 1000 random odd ints I am getting:
// 501 to 512 bits, 5 of them are primes
MR(n=1) took 66 ms
BPSW took 69 ms

// 1013 to 1024 bits, 4 of them are primes
MR(n=1) took 338 ms
BPSW took 359 ms

// 5 primes with 256 256 697 730 995 bits
MR(n=1) total 3 ms
BPSW total 15 ms

// pairwise products of 5 primes
MR(n=1) total 25 ms
BPSW total 25 ms

Note that n=1 is not suitable for any of these tasks.

danaj commented Nov 24, 2015

Measuring the speed of the methods is often done in a unit of the time taken for a single M-R test (occasionally this unit is called a 'Selfridge', but that's rare). Hence the desire for the cost comparison. I think more fine-grain timing is needed, without the cost to generate the random numbers. Perhaps measure the Lucas test separately and use more of them.

All things being equal, an AES Lucas test costs less than 2 M-Rs, so the whole test is something like 2.5 M-Rs. But all things aren't equal here, with the math library using Montgomery math for modular exponentiation, and the whole M-R path being performance tuned for crypto needs. To get similar performance the Lucas test would have to be done in Montgomery math as well (fortunately there is no need to convert anything back out of Monty form). That's presumably why this is taking closer to 7 M-Rs using the earlier timing data you did 3 days ago for 5 primes:

MR(n=40) took 88ms => 1 MR takes 2.2ms
BPSW took 14ms => BPSW costs 6.4 M-Rs

That isn't way off base. Your comparison with single M-R a few hours ago was 5x, but I think more timing resolution is needed.

The big questions seem to be:

  • Does this belong in (1) math/big/ProbablyPrime, (2) math/big/SomethingElse, or (3) an external library?
  • If (1) above, should it be using n=0? Can we have a no-argument form?
  • Is BPSW rigorous enough compared to M-R (n=whatever)?

None of these are particular to your implementation. That has a bunch of other questions and comments best left to review.

Is the argument for ProbablyPrime always required? Callers are expected to understand the tradeoffs of different arguments? Assuming there isn't a no-argument form, then (in my opinion) n=0 makes sense. It's the option used by people who don't want to micro-manage the implementation and just want an efficient solid answer. As for the last, BPSW is far better than 7 Miller-Rabin tests. It also gives us deterministic results for all 64-bit inputs (admittedly this can be achieved using set M-R bases). Is it better than 20 tests? IMO yes but we're getting closer, and now the performance tradeoff is getting important.

jfcg (Author) commented Nov 24, 2015

Yes, windoze timers have much less resolution. The n that gives me equal times (on my Ubuntu laptop) is 5 (for generating 1000 random odd ints):

// 501 to 512 bits, 5 of them are primes
MR(n=5) took 60.024664 ms
BPSW took 60.833635 ms

// 1013 to 1024 bits, 4 of them are primes
MR(n=5) took 299.454722 ms
BPSW took 298.373894 ms

For 512-bit numbers, exactly 261 of them pass the basic tests. So 256+5*5=281 MR rounds took 60 ms.
For BPSW, 261 base-2 tests + 5 JL took 60 ms.
So a JL > 4 MR rounds (about ~5-7 MR rounds; the base-2 test is slightly faster than base-x now, hence the greater-than sign).

For 1024-bit numbers, exactly 251 of them pass the basic tests. 247+4*5=267 MR rounds took 299 ms.
For BPSW, 251 base-2 tests + 4 JL took 299 ms.
So again a JL > 4 MR rounds, about ~5-7 MR rounds.

I think we have done enough timing measurements and have a pretty good idea of which component of each test takes how long.
Since the Lucas test is executed only when the input is a strong probable prime to base 2, we have a pretty good average run time for BPSW. It is strictly faster than an MR-only test when rounds > 5.
Any decent application, even one sure of random input, has to use rounds >= 40 for MR-only, in my opinion.

BPSW could become faster if we spent time to Montgomery-ify the lucasMod function, but I think the effort is not worth it. We would increase its speed only for strong pseudoprimes to base 2, which are very rare.

I think it is time to decide on how to proceed further. What do you think of the signature issue ?

rsc (Contributor) commented Nov 25, 2015

I am not interested in the signature issue until I understand the performance characteristics. Please bear with me.

Can you update the CL you sent out with Go benchmarks (in the sense of package testing's benchmark support) demonstrating the timings (for MR(n=1) and BPSW)? I'd like to run them on a few different machines.

If it's the case that, as suggested above by the preliminary MR(n=1) numbers, the cost of the BPSW test is roughly the same as n=1, then one possible proposal for the API is to change ProbablyPrime to do what it has always done (n rounds of MR) followed by a BPSW test. Then old code would automatically improve in accuracy at minimal performance cost. But again I'd like to see the numbers more clearly and try them on a few different machines first.

Thanks.

jfcg (Author) commented Nov 25, 2015

It gave me two new changes instead of updating the first:
https://go-review.googlesource.com/17222
https://go-review.googlesource.com/17223

gopherbot commented

CL https://golang.org/cl/17222 mentions this issue.

rsc (Contributor) commented Nov 25, 2015 via email

danaj commented Nov 25, 2015

@rsc Timing on other machines would be nice to see.

There are also unrelated changes. The way trial division is done is quite different in the patch. The code on master (in nat.go) does a mod of either 29# or 53# then simple modulos. The patch replaces this with a much more convoluted method. I think there needs to be some benchmarking justification for this less obvious code.

Assume for the moment that these are equal in performance and results. I still find it very hard to figure out what the cost of the Lucas test is because the benchmark results mix multiple kinds of inputs. Assume for the moment that the Lucas test is 5x a M-R test.

For most composites we should get similar performance. A single M-R test, whether BPSW's base 2 or the old random base, will cull a huge number of them.

For the set of composites that are base-2 strong pseudoprimes with no small divisors, the patch would run about 3x slower (a cost of 1+5 vs. 2 or 3).

For primes, the cost for the old method is whatever 'n' is set to, while the patch would be 1+5. Since for normal use 'n' would have to be significantly larger than 6, we would see a net gain.

Your signature idea is worth considering. n=0 means BPSW, n=1+ means BPSW plus that many additional random base M-R tests. If the input is less than 2^64 we could ignore the argument. Values above 5 seem excessive but that's up to the user I guess. For performance sake the steps probably should be:

  1. M-R base 2 (because jfcg says it is a little faster than random bases)
  2. if x > 2^64, run n random-base M-R tests
  3. run the Lucas test.

If n=0 or if x is 64-bit, then we just have BPSW. If we have a larger n, then this ordering has the least cost for composites (any ordering would be identical time for primes). This is actually what I did on my Perl 6 branch (but Perl6 has default arguments so the user can just say x.is-prime without having to specify a number).

danaj commented Nov 25, 2015

More questions about goals / requirements:

Performance for inputs less than 2^32: reasonable, very fast, or fastest possible?

Performance for inputs less than 2^64: reasonable, very fast, or fastest possible?

The above two are just a performance vs. code tradeoff. There are multiple ways to give deterministic solutions -- they vary in the amount of code and tweaking needed. As a simple example, below some threshold, trial division will be faster than other methods, so we could add some code to do that and get slightly faster results for small inputs. My guess is that "reasonable" is the answer for both, which BPSW and deterministic M-R both give. Maybe a few small optimizations (e.g. return after mods if x < 3481, and perhaps deterministic M-R for small inputs since the Lucas test is rather expensive).

For inputs between 2^64 and 2^81: would we like to implement a deterministic solution -- definitive non-probabilistic results at the cost of 12-13 M-R tests? This could be done by either ignoring their argument or applying only if they asked for at least that many tests. This starts getting into the distinction between them just calling x.millerRabin(n) and x.probablePrime(n) -- the former lets them run some M-R tests, while the latter is asking us to determine if x is prime as best we can. The code here is very simple.

jfcg (Author) commented Nov 25, 2015

does a mod of either 29# or 53# then simple modulos. The patch replaces this with a much more convoluted method. I think there needs to be some benchmarking justification for this less obvious code.

Yes; originally the basic tests could return different results on 32-bit and 64-bit platforms due to a shorter list of primes on 32 bits. I think bringing them in line matters because later functions can now safely make assumptions about what basicPrime() (consistently) does; for example, you can always assume the input's prime divisors are > 53.

jacobiP() needs a little help when recognizing squares. Since we already calculate "input mod smallPrime", why not also check whether this modulo is a quadratic non-residue? If so, the input can't be a square. This extra info is very cheap to get, and it helps jacobiP() a lot.

  1. M-R base 2 (because jfcg says it is a little faster than random bases)
  2. if x > 2^64, run n random-base M-R tests
  3. run the Lucas test.

I like this idea. Random bases can be chosen from [3, (x-3)/2].

For small inputs I think this is good enough. They will take very short, hard-to-measure times to test. We would slow down the significantly more common use case (testing numbers of 200+ bits) very slightly for the sake of numbers less than 32/64 bits; I don't think it is worth it.

jfcg (Author) commented Dec 1, 2015

Any stats on other machines @rsc ?

rsc (Contributor) commented Dec 1, 2015 via email

jfcg (Author) commented Mar 3, 2016

https://go-review.googlesource.com/20170
lucas_test.go times the Lucas test with base=2
The latest test is:

  • basic division test (see the sketch below)
  • n strong probable prime tests (Miller Rabin)
  • strong Lucas probable prime test
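
As a reference point for the first bullet, a trial-division step along these lines (editor's sketch only; divBySmallSketch and the exact prime bound are illustrative, echoing the "prime divisors > 53" assumption discussed earlier):

```go
package main

import (
	"fmt"
	"math/big"
)

// smallPrimes are the trial-division primes up to 53, matching the
// "input's prime divisors > 53" assumption mentioned earlier in the thread.
var smallPrimes = []int64{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53}

// divBySmallSketch reports whether n is a composite with a prime factor <= 53.
// It is an illustration of the "basic division test" step, not the patch's code.
func divBySmallSketch(n *big.Int) bool {
	r := new(big.Int)
	for _, p := range smallPrimes {
		bp := big.NewInt(p)
		if r.Mod(n, bp).Sign() == 0 {
			return n.Cmp(bp) != 0 // divisible by p and not equal to p => composite
		}
	}
	return false
}

func main() {
	fmt.Println(divBySmallSketch(big.NewInt(589)))    // true: 589 = 19 * 31
	fmt.Println(divBySmallSketch(big.NewInt(104729))) // false: passes on to the probabilistic steps
}
```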

jfcg (Author) commented Mar 6, 2016

added lucas benchmark to commit msg:
Lucas test = ~5x sprp(base=2) test

jfcg (Author) commented Mar 17, 2016

Hi @rsc, did you have a chance to inspect the patch and test on other machines ?

adg (Contributor) commented Aug 15, 2016

Accepted pending a reviewer with domain expertise. @agl ?

adg modified the milestones: Go1.8, Proposal Aug 15, 2016
rsc (Contributor) commented Oct 7, 2016

I am reviewing this still. Sorry for the delay. I've spent the past few days looking at it.

rsc changed the title from "proposal: math/big: extend ProbablyPrime with Baillie-PSW test" to "math/big: extend ProbablyPrime with Baillie-PSW test" Oct 11, 2016
rsc self-assigned this Oct 11, 2016
gopherbot commented

CL https://golang.org/cl/30770 mentions this issue.

jfcg (Author) commented Oct 11, 2016

Hi,
I added my comments there.
Russ chose to implement the whole test in a single function instead of reusable functions (like lucasMod(), divBySmall(), basicPrime() in my patch). I prefer mine, but you guys have the commit power ;)

rsc removed the Proposal label Nov 21, 2016
golang locked and limited conversation to collaborators Nov 22, 2017
rsc removed their assignment Jun 23, 2022