Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

math/big: zero divide in ProbablyPrime (garbage collection bug?) #4869

Closed
gopherbot opened this issue Feb 21, 2013 · 30 comments
Closed

math/big: zero divide in ProbablyPrime (garbage collection bug?) #4869

gopherbot opened this issue Feb 21, 2013 · 30 comments
Milestone

Comments

@gopherbot
Copy link

by huseby@lindenlab.com:

What steps will reproduce the problem?
If possible, include a link to a program on play.golang.org.
1. Run this on linux_i386: http://play.golang.org/p/05udxnXbLH
2. Wait for it to have a divide by zero error (only happens very rarely)

What is the expected output?
The rsa.GenerateKey() shouldn't cause a panic.


What do you see instead?
panic: runtime error: integer divide by zero
[signal 0x8 code=0x1 addr=0x80a6898 pc=0x80a6898]

goroutine 10 [running]:
math/big.divWW(0xe49cf0e9, 0x7b5ae8c8, 0x0, 0x188fcac8, 0x17, ...)
    /home/huseby/Projects/go/src/pkg/math/big/arith_386.s:21 +0x8
math/big.nat.divLarge(0x18821870, 0x10, 0x24, 0x188fcaa0, unexpected fault address 0x21,
...0xeef97277
)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:564 +0x811
math/big.nat.div(0x18821870, 0x10, 0x24, 0x188fcaa0, 0x20, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:519 +0x26f
math/big.nat.expNNWindowed(0x18821900, 0x20, 0x24, 0x1885f500, 0x10, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1358 +0x923
math/big.nat.expNN(0x1885f550, 0x10, 0x14, 0x1885f500, 0x10, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1257 +0x3f7
math/big.nat.probablyPrime(0x188ad4b0, 0x10, 0x14, 0x14, 0x6, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1442 +0x573
math/big.(*Int).ProbablyPrime(0x1892e0a0, 0x14, 0x35, 0x0)
    /home/huseby/Projects/go/src/pkg/math/big/int.go:726 +0x42
crypto/rand.Prime(0x1873d120, 0x1873d140, 0x200, 0x1892e0a0, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rand/util.go:97 +0x337
crypto/rsa.GenerateMultiPrimeKey(0x1873d120, 0x1873d140, 0x2, 0x400, 0x18863480, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:154 +0x192
crypto/rsa.GenerateKey(0x1873d120, 0x1873d140, 0x400, 0x0, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:127 +0x51
main.CreateCert(0x1872c540, 0x0)
    /home/huseby/Projects/onion_finder/of.go:35 +0x4f
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155


Which compiler are you using (5g, 6g, 8g, gccgo)?

8g

Which operating system are you using?

Ubuntu 12.04 32-bit

Which version are you using?  (run 'go version')

$ go version
go version devel +8d71734a0cb0 Tue Feb 05 09:54:01 2013 -0800 linux/386

Please provide any additional information below.

This is a dual, quad-core Intel(R) Xeon(R) CPU X5492  @ 3.40GHz machine.  I was running
with GOMAXPROCS=10 and --ncpu 9.  It takes many, many iterations to get the panic.

Attachments:

  1. of (2192416 bytes)
@gopherbot
Copy link
Author

Comment 1 by huseby@lindenlab.com:

I attached the executable.

@rsc
Copy link
Contributor

rsc commented Feb 22, 2013

Comment 2:

There really is a zero on the stack as the argument to divWW (it's not a divide by -1,
for example), but that word is the most significant word of the divisor, which is never
zero. If the divisor itself were zero it would have len(v) == 0, so that v[len(v)-1]
would have panicked. Worse, in the context of the called code the divisor is the modulus
passed to expNN, which remains constant throughout the execution of expNNWindowed. And
yet the code died in the div at line 1358:
  1339      for i := len(y) - 1; i >= 0; i-- {
  1340          yi := y[i]
  1341          for j := 0; j < _W; j += n {
  1342              if i != len(y)-1 || j != 0 {
  1343                  // Unrolled loop for significant performance
  1344                  // gain.  Use go test -bench=".*" in crypto/rsa
  1345                  // to check performance before making changes.
  1346                  zz = zz.mul(z, z)
  1347                  zz, z = z, zz
  1348                  zz, r = zz.div(r, z, m)
  1349                  z, r = r, z
  1350  
  1351                  zz = zz.mul(z, z)
  1352                  zz, z = z, zz
  1353                  zz, r = zz.div(r, z, m)
  1354                  z, r = r, z
  1355  
  1356                  zz = zz.mul(z, z)
  1357                  zz, z = z, zz
  1358                  zz, r = zz.div(r, z, m)
  1359                  z, r = r, z
  1360  
  1361                  zz = zz.mul(z, z)
  1362                  zz, z = z, zz
  1363                  zz, r = zz.div(r, z, m)
  1364                  z, r = r, z
  1365              }
  1366  
  1367              zz = zz.mul(z, powers[yi>>(_W-n)])
  1368              zz, z = z, zz
  1369              zz, r = zz.div(r, z, m)
  1370              z, r = r, z
  1371  
  1372              yi <<= n
  1373          }
  1374      }
If m is a denormalized zero at 1358, why is not a denormalized zero at 1353 or 1348?
My hypothesis is that this is a garbage collection bug, that the underlying array
backing m has been collected and zeroed prematurely. That would resolve all the
mysteries.
It's easy to believe there are garbage collection bugs at tip, sadly. The garbage
collector has been having some work done recently and I've seen a couple strange crashes
I haven't tracked down yet. If you need stability my suggestion would be to stay at the
release version.
How many times have you seen this crash? How long does it typically take to happen?

Labels changed: added priority-later, go1.1, removed priority-triage, go1.1maybe.

Status changed to Accepted.

@gopherbot
Copy link
Author

Comment 3 by huseby@lindenlab.com:

This crash usually takes anywhere between 5 minutes and an hour to happen.  I've never
seen this exe run longer than an hour before failing with this exact crash.

@rsc
Copy link
Contributor

rsc commented Feb 27, 2013

Comment 4:

The repository tip now has at least one important garbage collection fix,
as well as some other structural garbage collection changes. I'd be very
interested to hear whether you find that the crash still happens if you
rebuild at tip.
Thanks.
Russ

@gopherbot
Copy link
Author

Comment 5 by huseby@lindenlab.com:

I updated to version:
go version devel +e0b23d56fd6a Wed Feb 27 21:17:53 2013 +0200 linux/386
And now it's failing with:
panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x1 pc=0x80a7bbf]
I'm not sure which callstack it was so here's all of them:
goroutine 3 [running]:
math/big.nat.setBytes(0x1, 0x10, 0x14, 0x18411880, 0x40, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1494 +0xbf
math/big.(*Int).SetBytes(0x184f4b60, 0x18411880, 0x40, 0x40, 0x40, ...)
    /home/huseby/Projects/go/src/pkg/math/big/int.go:567 +0x48
crypto/rand.Prime(0x1834b400, 0x1834b020, 0x200, 0x184f4b60, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rand/util.go:69 +0x22c
crypto/rsa.GenerateMultiPrimeKey(0x1834b400, 0x1834b020, 0x2, 0x400, 0x1840fde0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:168 +0x1af
crypto/rsa.GenerateKey(0x1834b400, 0x1834b020, 0x400, 0x0, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:127 +0x51
main.CreateCert(0x18335510)
    /home/huseby/Projects/onion_finder/of.go:35 +0x4f
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 1 [chan receive]:
main.main()
    /home/huseby/Projects/onion_finder/of.go:128 +0x19f
goroutine 0 [syscall]:
goroutine 4 [running]:
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 5 [running]:
math/big.nat.divLarge(0x184bb750, 0x10, 0x24, 0x184bb7e0, 0x21, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:584 +0xbf4
math/big.nat.div(0x18461b40, 0x10, 0x24, 0x18501630, 0x20, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:519 +0x26f
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 6 [running]:
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 7 [running]:
math/big.nat.divLarge(0x1839bab0, 0x10, 0x24, 0x1839bab0, 0x21, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:582 +0xa36
math/big.nat.div(0x1839bab0, 0x10, 0x24, 0x1839bab0, 0x20, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:519 +0x26f
math/big.nat.expNNWindowed(0x1839bab0, 0x20, 0x24, 0x1840e780, 0x10, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1363 +0xa4e
math/big.nat.expNN(0x1840e7d0, 0x10, 0x14, 0x1840e780, 0x10, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1257 +0x3f7
math/big.nat.probablyPrime(0x184f1be0, 0x10, 0x14, 0x14, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1442 +0x573
math/big.(*Int).ProbablyPrime(0x184d9270, 0x14, 0x35)
    /home/huseby/Projects/go/src/pkg/math/big/int.go:726 +0x42
crypto/rand.Prime(0x1834b400, 0x1834b020, 0x200, 0x184d9270, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rand/util.go:97 +0x337
crypto/rsa.GenerateMultiPrimeKey(0x1834b400, 0x1834b020, 0x2, 0x400, 0x1840f4e0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:168 +0x1af
crypto/rsa.GenerateKey(0x1834b400, 0x1834b020, 0x400, 0x0, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:127 +0x51
main.CreateCert(0x18335510)
    /home/huseby/Projects/onion_finder/of.go:35 +0x4f
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 8 [running]:
math/big.nat.divLarge(0x183c2e40, 0x10, 0x15, 0x183c2e40, 0x21, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:541 +0x2bc
math/big.nat.div(0x184f23f0, 0x10, 0x24, 0x184f2360, 0x10, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:519 +0x26f
math/big.nat.expNNWindowed(0x18493140, 0x20, 0x25, 0x1841deb0, 0x10, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1353 +0x7f8
math/big.nat.expNN(0x1841df00, 0x10, 0x14, 0x1841deb0, 0x10, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1257 +0x3f7
math/big.nat.probablyPrime(0x18402370, 0x10, 0x14, 0x14, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:1442 +0x573
math/big.(*Int).ProbablyPrime(0x184d9130, 0x14, 0x35)
    /home/huseby/Projects/go/src/pkg/math/big/int.go:726 +0x42
crypto/rand.Prime(0x1834b400, 0x1834b020, 0x200, 0x184d9130, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rand/util.go:97 +0x337
crypto/rsa.GenerateMultiPrimeKey(0x1834b400, 0x1834b020, 0x2, 0x400, 0x183adea0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:168 +0x1af
crypto/rsa.GenerateKey(0x1834b400, 0x1834b020, 0x400, 0x0, 0x0, ...)
    /home/huseby/Projects/go/src/pkg/crypto/rsa/rsa.go:127 +0x51
main.CreateCert(0x18335510)
    /home/huseby/Projects/onion_finder/of.go:35 +0x4f
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 9 [running]:
math/big.nat.divLarge(0x183ae870, 0x10, 0x24, 0x183ae870, 0x21, ...)
    /home/huseby/Projects/go/src/pkg/math/big/nat.go:596 +0xf55
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 10 [running]:
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 11 [running]:
created by main.main
    /home/huseby/Projects/onion_finder/of.go:121 +0x155
goroutine 12 [chan receive]:
main.HashCert(0xbfd854ca, 0x6, 0x18335510, 0x183354e0)
    /home/huseby/Projects/onion_finder/of.go:54 +0xff
created by main.main
    /home/huseby/Projects/onion_finder/of.go:125 +0x189

@gopherbot
Copy link
Author

Comment 6 by huseby@lindenlab.com:

I'll re-run it with a single producer and a single consumer goroutine and wait for it to
fail.

@gopherbot
Copy link
Author

Comment 7:

I created a patch that turns off some code in the garbage collector:
https://golang.org/cl/7557044/
The idea is to gradually extend the patch until the cause of the issue is pinpointed.
I am unable to reproduce the failure on my notebook. Please apply
https://golang.org/cl/7557044/ to Go tip on a computer on which the issue is
reproducible and report the results.

@cznic
Copy link
Contributor

cznic commented Mar 7, 2013

Comment 8:

Why it isn't possible to (slowly but simply) bisect this heisenbug? (On a machine which
can reproduce it.)

@alberts
Copy link
Contributor

alberts commented Mar 8, 2013

Comment 9:

I'm going to see if I can reproduce this on the machine that has been doing the
linux/amd64 stress tests. I've also set up a linux/386 stress test to see if anything
falls out.

@alberts
Copy link
Contributor

alberts commented Mar 8, 2013

Comment 10:

Can you reproduce this with GOGCTRACE=1?
If so, how many collections before it typically blows up?

@alberts
Copy link
Contributor

alberts commented Mar 8, 2013

Comment 11:

I have managed to reproduce this with:
go build mal.go && file ./mal && GOGCTRACE=1 GOMAXPROCS=10 ./mal -ncpu=9
gc329284(8): 0+1+0 ms, 1 -> 0 MB 11545 -> 581 (3746016176-3746015595) objects, 14(497)
handoff, 47(697) steal, 327/166/9 yields
SIGSEGV: segmentation violation
PC=0x8065892
Segmentation fault
I've also seen something suspect while running the time unit tests (nil pointer
dereference). Investigating to see if they are related.

@alberts
Copy link
Contributor

alberts commented Mar 8, 2013

Comment 12:

issue #5007 describes some more linux/386 issues. this might be related to that.

@gopherbot
Copy link
Author

Comment 13:

Would it be possible to publish the source code of mal.go?

@alberts
Copy link
Contributor

alberts commented Mar 8, 2013

Comment 14:

mal.go is http://play.golang.org/p/05udxnXbLH

@gopherbot
Copy link
Author

Comment 15 by huseby@lindenlab.com:

I applied the patch (https://golang.org/cl/7557044/) and re-ran the code.  This
time it failed with:
unexpected fault address 0x4a983dc9
fatal error: fault
[signal 0xb code=0x1 addr=0x4a983dc9 pc=0x805178b]
several goroutines:
panic: runtime error: invalid memory address or nil pointer dereference
I'll try GOGCTRACE=1 next using the latest tip without the patch.

@gopherbot
Copy link
Author

Comment 16 by huseby@lindenlab.com:

Running the latest tip, without the patch above, and running it with GOGCTRACE=1, I get
the crash.  These are the last few lines of output before the crash:
gc60339(8): 0+0+0 ms, 1 -> 0 MB 10721 -> 683 (128080926-128080243) objects, 2(10)
handoff, 32(333) steal, 283/138/6 yields
gc60340(8): 0+0+0 ms, 0 -> 0 MB 684 -> 682 (128080928-128080246) objects, 12(491)
handoff, 42(827) steal, 315/131/4 yields
gc60341(8): 0+0+0 ms, 0 -> 0 MB 686 -> 664 (128080934-128080270) objects, 7(433)
handoff, 42(702) steal, 309/148/9 yields
gc60342(8): 0+0+0 ms, 0 -> 0 MB 670 -> 665 (128080942-128080277) objects, 12(474)
handoff, 29(648) steal, 318/154/7 yields
gc60343(8): 0+0+0 ms, 0 -> 0 MB 675 -> 665 (128080953-128080288) objects, 3(20) handoff,
34(744) steal, 278/106/4 yields
gc60344(8): 0+0+0 ms, 0 -> 0 MB 709 -> 652 (128080997-128080345) objects, 16(499)
handoff, 31(559) steal, 365/186/9 yields
unexpected fault address 0xe7f7d752
fatal error: fault
[signal 0xb code=0x1 addr=0xe7f7d752 pc=0x805178b]

@gopherbot
Copy link
Author

Comment 17 by huseby@lindenlab.com:

BTW, running this using a single producer and single consumer thread doesn't cause the
panic.  I had my test app successfully run for days without a panic when I run it with
-ncpu=1.

@dvyukov
Copy link
Member

dvyukov commented Mar 8, 2013

Comment 18:

crash on addr=1 looks like GC's "needs to be zeroed" mark

@dvyukov
Copy link
Member

dvyukov commented Mar 8, 2013

Comment 19:

I've just hit the following error on mal.go ($ GOMAXPROCS=17 ./mal -ncpu=16 on
linux/386):
failed MSpanList_Insert 0xf716b0e8 0xf716b0ac 0xf7597f80
fatal error: MSpanList_Insert
goroutine 1 [chan receive]:
main.main()
    /tmp/mal.go:128 +0x19f

@alberts
Copy link
Contributor

alberts commented Mar 9, 2013

Comment 20:

Here's one from a recent net test failure. Might be insightful.
go test -v -cpu 1,2,4 -timeout=30m std
=== RUN TestSelfConnect-4
scanstack inconsistent: g1269#0 sp=0xf4d79e84 not in [0xffffff01,0x1]
fatal error: scanstack
goroutine 150 [running]:
[fp=0x18557250] runtime.throw(0x82a453d)
    /build/go.386/go/src/pkg/runtime/panic.c:465 +0x66
[fp=0x18557284] addstackroots(0x18441000)
    /build/go.386/go/src/pkg/runtime/mgc0.c:1340 +0xf4
[fp=0x185572ac] addroots()
    /build/go.386/go/src/pkg/runtime/mgc0.c:1413 +0x1f3
[fp=0x185573dc] gc(0xf4e35fa4)
    /build/go.386/go/src/pkg/runtime/mgc0.c:1850 +0x1f0
----- stack segment boundary -----
[fp=0xf4e35fb0] runtime.gc(0x1)
    /build/go.386/go/src/pkg/runtime/mgc0.c:1788 +0xea
[fp=0xf4e35fdc] runfinq()
    /build/go.386/go/src/pkg/runtime/mgc0.c:2056 +0x166
[fp=0xf4e35fe0] runtime.goexit()
    /build/go.386/go/src/pkg/runtime/proc.c:1160
created by runtime.gc
    /build/go.386/go/src/pkg/runtime/mgc0.c:1749
go version devel +d533352b414d Sat Mar 09 05:39:15 2013 +0100 linux/386

@dvyukov
Copy link
Member

dvyukov commented Mar 9, 2013

Comment 21:

This is really weird... do you have stack trace for goroutine 1269?

@dvyukov
Copy link
Member

dvyukov commented Mar 9, 2013

Comment 22:

Here is one more:
[signal 0xb code=0x1 addr=0xb572fde4 pc=0x805157b]
  #0: 0x8057d59 runtime.startpanic src/pkg/runtime/panic.c:406
  #1: 0x8058025 runtime.throw src/pkg/runtime/panic.c:481
  #2: 0x8060fe7 runtime.sigpanic src/pkg/runtime/thread_linux.c:223
  #3: 0x805157b runtime.MCentral_AllocList src/pkg/runtime/mcentral.c:64
  #4: 0x80511b7 runtime.MCache_Alloc src/pkg/runtime/mcache.c:24
  #5: 0x8062291 runtime.mallocgc src/pkg/runtime/zmalloc_linux_386.c:47
  #6: 0x805ea30 makeslice1 src/pkg/runtime/slice.c:61
  #7: 0x805e88d runtime.makeslice src/pkg/runtime/slice.c:36
  #8: 0x809ff7f math/big.nat.make src/pkg/math/big/nat.go:76
  #9: 0x80a1c1f math/big.nat.mul src/pkg/math/big/nat.go:400
  #10: 0x80a73bc math/big.nat.expNNWindowed src/pkg/math/big/nat.go:1332
  #11: 0x80a6b47 math/big.nat.expNN src/pkg/math/big/nat.go:1257
  #12: 0x80a81c3 math/big.nat.probablyPrime src/pkg/math/big/nat.go:1442
  #13: 0x809e782 math/big.(*Int).ProbablyPrime src/pkg/math/big/int.go:726
  #14: 0x80667b7 crypto/rand.Prime src/pkg/crypto/rand/util.go:97
  #15: 0x80671bf crypto/rsa.GenerateMultiPrimeKey src/pkg/crypto/rsa/rsa.go:168
  #16: 0x8066ff1 crypto/rsa.GenerateKey src/pkg/crypto/rsa/rsa.go:127
  #17: 0x8048c4f main.CreateCert /tmp/mal.go:35
  #18: 0x805b110 runtime.goexit src/pkg/runtime/proc.c:1160

@alberts
Copy link
Contributor

alberts commented Mar 9, 2013

Comment 23:

goroutine 1269 [chan send]:
net.runDatagramPacketConnServer(0x183c08a0, 0x81aea28, 0x3, 0x81b3ee8, 0xb, ...)
    /build/go.386/go/src/pkg/net/server_test.go:385 +0x3b0
created by net.TestTimeoutUDP
    /build/go.386/go/src/pkg/net/timeout_test.go:244 +0x156

@dvyukov
Copy link
Member

dvyukov commented Mar 9, 2013

Comment 24:

Malloc/free deadlocks seem to be reincarnation of the same problem:
fatal error: malloc/free - deadlock 
 #0: 0x8057d59 runtime.startpanic src/pkg/runtime/panic.c:406
  #1: 0x8058025 runtime.throw src/pkg/runtime/panic.c:481
  #2: 0x8062217 runtime.mallocgc src/pkg/runtime/zmalloc_linux_386.c:33
  #3: 0x8063453 runtime.mal src/pkg/runtime/zmalloc_linux_386.c:611
  #4: 0x804fd97 copyin src/pkg/runtime/iface.c:152
  #5: 0x804ff4e runtime.convT2E src/pkg/runtime/iface.c:220
  #6: 0x8049c99 runtime.newErrorString src/pkg/runtime/error.go:74
  #7: 0x80580f1 runtime.panicstring src/pkg/runtime/panic.c:497
  #8: 0x806105a runtime.sigpanic src/pkg/runtime/thread_linux.c:216
  #9: 0x80511fc runtime.MCache_Alloc src/pkg/runtime/mcache.c:33
  #10: 0x8062291 runtime.mallocgc src/pkg/runtime/zmalloc_linux_386.c:47
  #11: 0x805ea30 makeslice1 src/pkg/runtime/slice.c:61
  #12: 0x805e88d runtime.makeslice src/pkg/runtime/slice.c:36
  #13: 0x80a2e2c math/big.nat.divLarge src/pkg/math/big/nat.go:541
  #14: 0x80a2b1f math/big.nat.div src/pkg/math/big/nat.go:519
  #15: 0x80a7958 math/big.nat.expNNWindowed src/pkg/math/big/nat.go:1353
  #16: 0x80a6b47 math/big.nat.expNN src/pkg/math/big/nat.go:1257
  #17: 0x80a81c3 math/big.nat.probablyPrime src/pkg/math/big/nat.go:1442
  #18: 0x809e782 math/big.(*Int).ProbablyPrime src/pkg/math/big/int.go:726
  #19: 0x80667b7 crypto/rand.Prime src/pkg/crypto/rand/util.go:97
  #20: 0x80671bf crypto/rsa.GenerateMultiPrimeKey src/pkg/crypto/rsa/rsa.go:168
  #21: 0x8066ff1 crypto/rsa.GenerateKey src/pkg/crypto/rsa/rsa.go:127
  #22: 0x8048c4f main.CreateCert /tmp/mal.go:35
  #23: 0x805b110 runtime.goexit src/pkg/runtime/proc.c:1160

@alberts
Copy link
Contributor

alberts commented Mar 10, 2013

Comment 25:

Here's one that might be easier to reproduce
go test image/jpeg
go version devel +16e0e01c2e9b Sat Mar 09 18:14:00 2013 -0800 linux/386
GOMAXPROCS=50
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x19 pc=0x80728dd]
goroutine 5 [running]:
testing.func·004()
    /build/go.386/go/src/pkg/testing/testing.go:341 +0xf7
image/jpeg.(*decoder).ensureNBits(0x1841e000, 0x1, 0x810c2c0, 0x46)
    /build/go.386/go/src/pkg/image/jpeg/huffman.go:38 +0x3d
image/jpeg.(*decoder).decodeHuffman(0x1841e000, 0x1841e078, 0x0, 0x46, 0x0, ...)
    /build/go.386/go/src/pkg/image/jpeg/huffman.go:173 +0xa1
image/jpeg.(*decoder).processSOS(0x1841e000, 0xa, 0x18423498, 0x2)
    /build/go.386/go/src/pkg/image/jpeg/scan.go:196 +0x14a7
image/jpeg.(*decoder).decode(0x1841e000, 0x18335860, 0x183001d0, 0xf7292c00, 0xc, ...)
    /build/go.386/go/src/pkg/image/jpeg/reader.go:296 +0x51e
image/jpeg.Decode(0x18335860, 0x183001d0, 0x183001d0, 0x0, 0x0, ...)
    /build/go.386/go/src/pkg/image/jpeg/reader.go:320 +0x4a
image/jpeg.decodeFile(0x1835d450, 0x2e, 0x0, 0x0, 0x0, ...)
    /build/go.386/go/src/pkg/image/jpeg/reader_test.go:78 +0xba
image/jpeg.TestDecodeProgressive(0x1837f180)
    /build/go.386/go/src/pkg/image/jpeg/reader_test.go:35 +0x278
testing.tRunner(0x1837f180, 0x81fa3ec)
    /build/go.386/go/src/pkg/testing/testing.go:346 +0x87
created by testing.RunTests
    /build/go.386/go/src/pkg/testing/testing.go:426 +0x684
goroutine 1 [chan receive]:
testing.RunTests(0x817100c, 0x81fa3e0, 0x5, 0x5, 0x1, ...)
    /build/go.386/go/src/pkg/testing/testing.go:427 +0x69f
testing.Main(0x817100c, 0x81fa3e0, 0x5, 0x5, 0x81fa420, ...)
    /build/go.386/go/src/pkg/testing/testing.go:358 +0x69
main.main()
    image/jpeg/_test/_testmain.go:61 +0x81
FAIL    image/jpeg  0.075s

@dvyukov
Copy link
Member

dvyukov commented Mar 10, 2013

Comment 26:

Just 2 days of debugging to discover that it's... drumroll... unaligned 64-bit atomic...
again!
Episodically GC sweeps some spans twice, and episodically it leads to double free. It
perfectly explains all the crashes we see, a lot of crashes in freelists, crashes only
on 386 and only with GOMAXPROCS>1.

Owner changed to @dvyukov.

@gopherbot
Copy link
Author

Comment 27:

So the problem is somewhere in parfor?

@dvyukov
Copy link
Member

dvyukov commented Mar 10, 2013

Comment 28:

yup

@dvyukov
Copy link
Member

dvyukov commented Mar 10, 2013

Comment 29:

mailed https://golang.org/cl/7534044/

@dvyukov
Copy link
Member

dvyukov commented Mar 10, 2013

Comment 30:

This issue was closed by revision 433824d.

Status changed to Fixed.

@rsc rsc added this to the Go1.1 milestone Apr 14, 2015
@rsc rsc removed the go1.1 label Apr 14, 2015
@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants