math/big: large fibonacci calculation is faster in GMP #30943

Borthralla · 2019-03-20T02:36:17Z

What version of Go are you using (`go version`)?

go1.12.1 windows/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (`go env`)?

go env Output

set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\Philip\AppData\Local\go-build
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOOS=windows
set GOPATH=C:\Users\Philip\go
set GOPROXY=
set GORACE=
set GOROOT=C:\Go
set GOTMPDIR=
set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
set GCCGO=gccgo
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\Philip\AppData\Local\Temp\go-build593709294=/tmp/go-build -gno-record-gcc-switches

What did you do?

I wrote a Fibonacci calculator in Go using the math/big library. It was ~instant in run time up until n=10^7, after which it was extremely slow. I then used https://github.com/ncw/gmp to replace math/big with a gmp wrapper. No code was changed.

What did you expect to see?

A small performance benefit, maybe 2x or 3x.

What did you see instead?

With the same set of code, gmp quickly calculates n=10^8 in 5 seconds, while the normal math/big library takes more than 5 minutes. Gmp is even able to calculate n=10^9 after 1 minute.

The text was updated successfully, but these errors were encountered:

Borthralla · 2019-03-20T02:46:03Z

Here is the fibonacci implementation:

func fib(n int) *big.Int {

	left := big.NewInt(1)
	right := big.NewInt(1)

	helper1 := big.NewInt(0)
	helper2 := big.NewInt(0)


	bin := fmt.Sprintf("%b", n)
	for i := 1; i < len(bin); i++ {

		helper1.Mul(big.NewInt(2), right)
		helper1.Sub(helper1, left)
		helper1.Mul(left, helper1)

		helper2.Mul(right, right)
		left.Mul(left, left)
		helper2.Add(helper2, left)

		if (bin[i] == '0') {
			left.Set(helper1)
			right.Set(helper2)
		} else {
			left.Set(helper2)
			right.Add(helper1, helper2)
		}
	}

	return left
}

Using math/big n=3*10^7:

     flat  flat%   sum%        cum   cum%
    31.93s 54.18% 54.18%     31.93s 54.18%  math/big.mulAddVWW
    23.17s 39.32% 93.50%     23.17s 39.32%  math/big.subVV
     2.34s  3.97% 97.47%      2.34s  3.97%  math/big.addMulVVW
     0.31s  0.53% 98.00%      0.31s  0.53%  runtime.cgocall
     0.15s  0.25% 98.25%      1.79s  3.04%  math/big.basicSqr
     0.14s  0.24% 98.49%     55.11s 93.52%  math/big.nat.divLarge
     0.13s  0.22% 98.71%      0.99s  1.68%  math/big.basicMul
     0.02s 0.034% 98.74%      0.38s  0.64%  internal/poll.(*FD).writeConsole
     0.01s 0.017% 98.76%      1.19s  2.02%  math/big.karatsuba
         0     0% 98.76%     55.71s 94.54%  fmt.(*pp).doPrintf

Using https://github.com/ncw/gmp at n=3*10^7 (same code):

     flat  flat%   sum%        cum   cum%
     1.07s 89.92% 89.92%      1.07s 89.92%  runtime.cgocall
     0.04s  3.36% 93.28%      0.04s  3.36%  runtime.memmove
     0.02s  1.68% 94.96%      0.02s  1.68%  runtime.stdcall2
     0.02s  1.68% 96.64%      0.02s  1.68%  unicode/utf16.Encode
     0.01s  0.84% 97.48%      0.01s  0.84%  indexbytebody
     0.01s  0.84% 98.32%      0.01s  0.84%  runtime.makemap_small
     0.01s  0.84% 99.16%      0.01s  0.84%  runtime.stdcall1
     0.01s  0.84%   100%      0.01s  0.84%  runtime.stdcall3
         0     0%   100%      0.53s 44.54%  fmt.(*pp).doPrintf
         0     0%   100%      0.53s 44.54%  fmt.(*pp).handleMethods

Using https://github.com/ncw/gmp at n=10^9 (same code):

flat  flat%   sum%        cum   cum%
    50.01s 91.81% 91.81%     50.01s 91.81%  runtime.cgocall
     1.14s  2.09% 93.90%      1.14s  2.09%  runtime.memmove
     0.54s  0.99% 94.90%     10.37s 19.04%  internal/poll.(*FD).writeConsole
     0.50s  0.92% 95.81%      0.50s  0.92%  runtime.procyield
     0.48s  0.88% 96.70%      0.60s  1.10%  unicode/utf16.Encode
     0.34s  0.62% 97.32%      0.34s  0.62%  runtime.memclrNoHeapPointers
     0.29s  0.53% 97.85%      0.29s  0.53%  unicode/utf8.DecodeRune
     0.01s 0.018% 97.87%      1.17s  2.15%  runtime.systemstack
         0     0% 97.87%     34.33s 63.03%  fmt.(*pp).doPrintf
         0     0% 97.87%     34.33s 63.03%  fmt.(*pp).handleMethods

agnivade · 2019-03-20T09:11:02Z

gmp uses CGo. math/big is pure Go.

And according to their site https://gmplib.org/

GMP is carefully designed to be as fast as possible, both for small operands and for huge operands. The speed is achieved by using fullwords as the basic arithmetic type, by using fast algorithms, with highly optimised assembly code for the most common inner loops for a lot of CPUs, and by a general emphasis on speed.

It is expected that highly optimized C code will be faster than Go.

crvv · 2019-03-20T09:26:17Z

0.14s  0.24% 98.49%     55.11s 93.52%  math/big.nat.divLarge

divLarge is slow.
Please see #21960.

crvv · 2019-03-20T10:40:11Z

The code doesn't use divLarge, so the pprof result posted previously includes (*big.Int).String().

Without (*big.Int).String(), for n = 1e8, math/big uses 32 s and gmp uses 1.6 s on my machine.

The time complexity of math/big is about n**(log(3)/log(2)), which is the complexity of karatsuba algorithm.
For gmp, it is is about n**(log(2.3)/log(2))

The pprof result of math/big is

      flat  flat%   sum%        cum   cum%
    17.27s 59.80% 59.80%     17.27s 59.80%  math/big.addMulVVW
     1.88s  6.51% 66.31%      1.88s  6.51%  runtime.memmove
     1.87s  6.48% 72.78%      1.87s  6.48%  math/big.subVV
     1.51s  5.23% 78.01%      1.51s  5.23%  runtime.madvise
     1.38s  4.78% 82.79%     12.85s 44.49%  math/big.basicSqr
     1.34s  4.64% 87.43%      1.34s  4.64%  math/big.addVV
     1.09s  3.77% 91.20%      9.13s 31.61%  math/big.basicMul
     0.74s  2.56% 93.77%      0.74s  2.56%  runtime.memclrNoHeapPointers
     0.31s  1.07% 94.84%      0.31s  1.07%  math/big.shlVU
     0.24s  0.83% 95.67%      0.69s  2.39%  runtime.stkbucket

thombrem · 2019-03-20T11:49:57Z

Though I am not yet a Go developer, I am tempted to chime in, since I have implemented this in Python3.

This is not how I would find the nth Fibonacci Number. Instead use the direct formula using the Golden Ration (Phi) to achieve results in O(1) time!

I am uploading the code to a Public repository soon, will share link once done.

Thx
Milind

thombrem · 2019-03-20T12:03:13Z

I believe the correct term to Google would be Binet's Formula... and here it is..

F(n) = round( Phi^n / √5 )

This is yet another reason to know good math and not just code!

Regards
Milind

Borthralla · 2019-03-20T12:14:56Z

I believe the correct term to Google would be Binet's Formula... and here it is..

F(n) = round( Phi^n / √5 )

This is yet another reason to know good math and not just code!

Regards
Milind

While the fibonacci implementation is not the focus of this thread, it is clear that you have some fundamental misunderstandings about floating point as well as big integer arithmetic. Floating point numbers are only o(1) multiplication because they have finite precision. Big Integers, on the other hand, have arbitrary precision. In particular, the size we are referring to in this thread are in the millions of digits.
At the level of precision necessary to get an exact answer, you would need extremely precise floating point values which would take much longer to multiply. Additionally, assuming you are using exponentiation by squaring and not naive exponentiation, it is the same number of multiplications (log(n)) as in the function I posted.
My implementation is about as fast as the native gmp implementation of fibonacci, which means that it doesn't really get much faster.
If you're in doubt, try plugging in n=10000000 to that function and see if you get the correct 2-million digit answer.

thombrem · 2019-03-20T13:14:29Z

@Borthralla Thanks for the insight! I do have a misunderstanding about the difference between floats and Big Integers. I will read this up in a day or two.

On a side note, this is yet another reason for me to be on this forum, thanks @Borthralla.

AND Exponentiation does take O(n) you are right yet again!

However, also note that there may be a great benefit in writing this in a single line of code as well, esp for (smaller) integer numbers.

Borthralla · 2019-03-20T14:09:50Z

The code doesn't use divLarge, so the pprof result posted previously includes (*big.Int).String().

Without (*big.Int).String(), for n = 1e8, math/big uses 32 s and gmp uses 1.6 s on my machine.

The time complexity of math/big is about n**(log(3)/log(2)), which is the complexity of karatsuba algorithm.
For gmp, it is is about n**(log(2.3)/log(2))

The pprof result of math/big is
      flat  flat%   sum%        cum   cum%
    17.27s 59.80% 59.80%     17.27s 59.80%  math/big.addMulVVW
     1.88s  6.51% 66.31%      1.88s  6.51%  runtime.memmove
     1.87s  6.48% 72.78%      1.87s  6.48%  math/big.subVV
     1.51s  5.23% 78.01%      1.51s  5.23%  runtime.madvise
     1.38s  4.78% 82.79%     12.85s 44.49%  math/big.basicSqr
     1.34s  4.64% 87.43%      1.34s  4.64%  math/big.addVV
     1.09s  3.77% 91.20%      9.13s 31.61%  math/big.basicMul
     0.74s  2.56% 93.77%      0.74s  2.56%  runtime.memclrNoHeapPointers
     0.31s  1.07% 94.84%      0.31s  1.07%  math/big.shlVU
     0.24s  0.83% 95.67%      0.69s  2.39%  runtime.stkbucket

@crvv Thanks for the explanation. Are there plans to support faster operations for extremely large integers in math/big similar to gmp? Is gmp's method prohibitively complex?

josharian · 2019-03-20T14:15:13Z

cc @griesemer @bmkessler

Do you know offhand whether gmp uses karatsuba or a different algorithm for this?

ALTree · 2019-03-20T14:29:05Z

@josharian

Definitely not. gmp stops using karatsuba pretty early:

NxN limb multiplications and squares are done using one of seven algorithms, as the size N increases.

Algorithm	Threshold
Basecase	(none)
Karatsuba	MUL_TOOM22_THRESHOLD
Toom-3	MUL_TOOM33_THRESHOLD
Toom-4	MUL_TOOM44_THRESHOLD
Toom-6.5	MUL_TOOM6H_THRESHOLD
Toom-8.5	MUL_TOOM8H_THRESHOLD
FFT	MUL_FFT_THRESHOLD

Source: https://gmplib.org/manual/Multiplication-Algorithms.html#Multiplication-Algorithms

ALTree · 2019-03-20T14:34:42Z

The thresholds are calibrated by a calibration procedure, but for reference MUL_TOOM33_THRESHOLD (the point where they stop using karatsuba) is pretty low. Like in the order of a 100 limbs (= machine words): https://gmplib.org/repo/gmp/file/tip/mpn/x86/coreisbr/gmp-mparam.h#l56

ericlagergren · 2019-03-20T14:38:29Z

@josharian GMP uses schoolbook -> Toom-Cook -> Schönhage-Strassen for multiplication, depending on the input size.

Borthralla · 2019-03-20T18:22:23Z

Does math/big include an implementation of Schönhage-Strassen?

ericlagergren · 2019-03-20T18:25:19Z

No, it uses karatsuba for large numbers.

bmkessler · 2019-03-20T18:30:57Z

For faster asymptotic multiplication, @remyoudompheng has https://github.com/remyoudompheng/bigfft which is much faster than math/big and only ~2x slower than gmp for very large numbers. Maybe he would like to contribute that implementation to math/big.

Also, for printing large integers @crvv noted above see #20906

griesemer · 2019-03-25T17:17:16Z

Most observations have already been made here. To summarize:

We know that certain low-level ops could be faster (e.g., addMulVVW)
math/big only implements Karatsuba. It should use the various Toom versions and FFT for very large numbers.
Division is slow (for output).

One should probably start with the Tooms.

Borthralla · 2019-03-25T23:50:51Z

Good new, guys! They found an nlog(n) multiplication algorithm, we should just use that!
https://hal.archives-ouvertes.fr/hal-02070778/document
(Mostly joking, but the algorithm is real and it just came out today)

gopherbot · 2019-04-14T06:32:20Z

Change https://golang.org/cl/172018 mentions this issue: math/big: implement recursive algorithm for division

thombrem · 2019-04-16T06:59:12Z

I can think of a O(log(log(n))) best case time complexity algorithm for computing exponents, which means Binet's formula might be doable for fibonacci(n).

Sorry to bring this up again
~Milind

The current division algorithm produces one word of result at a time, using 2-word division to compute the top word and mulAddVWW to compute the remainder. The top word may need to be adjusted by 1 or 2 units. The recursive version, based on Burnikel, Ziegler, "Fast Recursive Division", uses the same principles, but in a multi-word setting, so that multiplication benefits from the Karatsuba algorithm (and possibly later improvements). benchmark old ns/op new ns/op delta BenchmarkDiv/20/10-4 38.2 38.3 +0.26% BenchmarkDiv/40/20-4 38.7 38.5 -0.52% BenchmarkDiv/100/50-4 62.5 62.6 +0.16% BenchmarkDiv/200/100-4 238 259 +8.82% BenchmarkDiv/400/200-4 311 338 +8.68% BenchmarkDiv/1000/500-4 604 649 +7.45% BenchmarkDiv/2000/1000-4 1214 1278 +5.27% BenchmarkDiv/20000/10000-4 38279 36510 -4.62% BenchmarkDiv/200000/100000-4 3022057 1359615 -55.01% BenchmarkDiv/2000000/1000000-4 310827664 54012939 -82.62% BenchmarkDiv/20000000/10000000-4 33272829421 1965401359 -94.09% BenchmarkString/10/Base10-4 158 156 -1.27% BenchmarkString/100/Base10-4 797 792 -0.63% BenchmarkString/1000/Base10-4 3677 3814 +3.73% BenchmarkString/10000/Base10-4 16633 17116 +2.90% BenchmarkString/100000/Base10-4 5779029 1793808 -68.96% BenchmarkString/1000000/Base10-4 889840820 85524031 -90.39% BenchmarkString/10000000/Base10-4 134338236860 4935657026 -96.33% Fixes #21960 Updates #30943 Change-Id: I134c6f81a47870c688ca95b6081eb9211def15a2 Reviewed-on: https://go-review.googlesource.com/c/go/+/172018 Reviewed-by: Robert Griesemer <gri@golang.org> Run-TryBot: Robert Griesemer <gri@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>

gopherbot · 2020-10-29T03:52:51Z

Change https://golang.org/cl/266201 mentions this issue: math/big: implement Schönhage–Strassen fft algorithm

ALTree added the Performance label Mar 21, 2019

julieqiu changed the title ~~math/big: Large Fibonacci Calculation is Tremendously Faster in GMP~~ math/big: large fibonacci calculation is faster in GMP May 28, 2019

julieqiu added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 28, 2019

bcmills added this to the Unplanned milestone May 28, 2019

SparrowLii mentioned this issue Oct 29, 2020

math/big: Implement fft algorithm #42267

Open

thehowl mentioned this issue Apr 14, 2023

feat: type conversion, opInc, opDec for bigint and bigdec type gnolang/gno#676

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

math/big: large fibonacci calculation is faster in GMP #30943

math/big: large fibonacci calculation is faster in GMP #30943

Borthralla commented Mar 20, 2019

Borthralla commented Mar 20, 2019 •

edited

agnivade commented Mar 20, 2019

crvv commented Mar 20, 2019

crvv commented Mar 20, 2019

thombrem commented Mar 20, 2019

thombrem commented Mar 20, 2019

Borthralla commented Mar 20, 2019

thombrem commented Mar 20, 2019

Borthralla commented Mar 20, 2019

josharian commented Mar 20, 2019

ALTree commented Mar 20, 2019

ALTree commented Mar 20, 2019 •

edited

ericlagergren commented Mar 20, 2019 •

edited

Borthralla commented Mar 20, 2019

ericlagergren commented Mar 20, 2019

bmkessler commented Mar 20, 2019

griesemer commented Mar 25, 2019

Borthralla commented Mar 25, 2019 •

edited

gopherbot commented Apr 14, 2019

thombrem commented Apr 16, 2019

gopherbot commented Oct 29, 2020

math/big: large fibonacci calculation is faster in GMP #30943

math/big: large fibonacci calculation is faster in GMP #30943

Comments

Borthralla commented Mar 20, 2019

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

Borthralla commented Mar 20, 2019 • edited

agnivade commented Mar 20, 2019

crvv commented Mar 20, 2019

crvv commented Mar 20, 2019

thombrem commented Mar 20, 2019

thombrem commented Mar 20, 2019

Borthralla commented Mar 20, 2019

thombrem commented Mar 20, 2019

Borthralla commented Mar 20, 2019

josharian commented Mar 20, 2019

ALTree commented Mar 20, 2019

ALTree commented Mar 20, 2019 • edited

ericlagergren commented Mar 20, 2019 • edited

Borthralla commented Mar 20, 2019

ericlagergren commented Mar 20, 2019

bmkessler commented Mar 20, 2019

griesemer commented Mar 25, 2019

Borthralla commented Mar 25, 2019 • edited

gopherbot commented Apr 14, 2019

thombrem commented Apr 16, 2019

gopherbot commented Oct 29, 2020

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?

Borthralla commented Mar 20, 2019 •

edited

ALTree commented Mar 20, 2019 •

edited

ericlagergren commented Mar 20, 2019 •

edited

Borthralla commented Mar 25, 2019 •

edited