SIGSEGV in bsdthread_create instead of useful error #549

Closed
jackpal opened this issue Jan 19, 2010 · 13 comments

@jackpal
Contributor

jackpal commented Jan 19, 2010

I only saw this once. (i.e. it doesn't reproduce often.)

The app had about 130 active goroutines when the error occurred.

What steps will reproduce the problem?
1. build and run Taipei-Torrent app, let it run for a few minutes

What is the expected output? What do you see instead?

Saw a SIGSEGV. (See attached for the program output)


What is your $GOOS?  $GOARCH?

OSX Leopard AMD 64


Which revision are you using?  (hg identify)

$ hg identify
c3169cad2f47+ tip


Please provide any additional information below.

The source of the app is at

http://github.com/jackpal/Taipei-Torrent/tree/d1b2ecd5db95229b5ef16d84e5765307ce898a19

Attachments:

  1. gocrash.txt (2141077 bytes)
@jackpal
Contributor Author

jackpal commented Jan 19, 2010

Comment 1:

I just saw a second crash, with essentially the same symptoms.
In fact, the program always either hangs or crashes within 1 to 10 minutes of starting. I originally thought that the hang was unrelated to the crash, but now I think they may be two symptoms of the same bug.
If you would like to try to reproduce the bug, I have attached the source code, makefiles, and a torrent file (for the Ubuntu 10.4 alpha 1 ISO image, which may not remain an active torrent for much longer, but it's what I used to reproduce the bug).
Repro steps:
1. Unzip the attached directory somewhere
2. bash make.bash
3. bash test.bash
4. Wait 10 minutes to see if the crash or hang reproduces.
(You may need to adjust the test.bash script depending on whether you are testing on a network that has UPnP enabled.)

Attachments:

  1. Taipei-Torrent.zip (387610 bytes)

@jackpal
Contributor Author

jackpal commented Jan 19, 2010

Comment 2:

Ah, one more build step: after unzipping Taipei-Torrent.zip, and before running:
cd Taipei-Torrent/testData
mkdir downloads

@rsc
Contributor

rsc commented Jan 19, 2010

Comment 3:

The error message should be better.  bsdthread_create called notok (not ok) because it got an error from the system call.  Which error?  The register dump says AX = 0xc, so ENOMEM.  In other words, you've run the system out of threads.
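
To illustrate the errno reading above, here is a minimal sketch in present-day Go that maps a raw error number to its name. It assumes a platform such as Darwin or Linux, where ENOMEM is 12 (0xc); describeErrno is a hypothetical helper, not part of the runtime or of the fix for this issue.

```go
package main

import (
	"fmt"
	"syscall"
)

// describeErrno (hypothetical helper) turns a raw errno value, such as the
// 0xc seen in AX, into a readable description on the current platform.
func describeErrno(code uintptr) string {
	errno := syscall.Errno(code)
	if errno == syscall.ENOMEM {
		return "ENOMEM: " + errno.Error()
	}
	return errno.Error()
}

func main() {
	// Assumes ENOMEM == 12 on this platform (true on Darwin and Linux).
	fmt.Println(describeErrno(0xc))
}
```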
Your gocrash.txt has 2640 goroutines in total (!). 
; grep -c '^goroutine' gocrash.txt
2640
;
That's really cool - I've never seen that many goroutines in a real program before.
That wouldn't be a problem except that 2558 of them are either running or blocked in system calls, due to the many time.Tickers you've created:
; egrep -c '^goroutine [0-9]+ \[[23]\]:' gocrash.txt
2558
;
The system has decided that's enough OS threads for one program.  ;-)
We need to make time.Tick a little better but you'd still have a memory leak from creating all the tickers.  If you need a tick only temporarily, you should use time.NewTicker() and then call t.Stop() when you're done.  I've sent a CL to make t.Stop a little more robust but even in its current form it's worth using.
I'll leave this issue open to track fixing the error message.
Have fun!

Owner changed to r...@golang.org.

Status changed to Accepted.

@jackpal
Contributor Author

jackpal commented Jan 19, 2010

Comment 4:

Thanks for the explanation of what's going on.
There should "only" be about 150 goroutines -- 40 clients * 3 per peer + some overhead.
One of the 3 goroutines per client is for ticks -- I am trying to do a "wait for either a message to be sent or 120 seconds to expire". I will write my own resettable ticker to handle this case.

@rsc
Contributor

rsc commented Jan 19, 2010

Comment 5:

There's already a resettable ticker, time.Ticker.
The problem is that it seems like when a peer goes
away you need to explicitly release the ticker
used for that peer.  You can do that with t.Stop().
Russ

@jackpal
Contributor Author

jackpal commented Jan 19, 2010

Comment 6:

time.Ticker doesn't seem to do what I want. I just want a 120 second timeout on a select {} statement.
But I worked around it by keeping track of when the last message was sent for each peer, and periodically checking every peer to see if I need to send a keepalive message. So now I have just one Ticker for all 40 peers, and the ticker is kept around rather than thrown away on each select call.

@hoisie
Contributor

hoisie commented Jan 19, 2010

Comment 7:

Sounds like another example where select(timeout) would be useful :)
http://groups.google.com/group/golang-nuts/browse_thread/thread/b7c0177cfce15937/7f37b1b754bc9aa8

@gopherbot
Contributor

Comment 8 by stephenm@golang.org:

Issue #636 has been merged into this issue.

@rsc
Contributor

rsc commented Apr 28, 2010

Comment 9:

Simple demo of crash
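package main

import (
    "os"
    "runtime"
    "strconv"
    "time"
)

func main() {
    // Number of goroutines to start, taken from the first argument.
    n, _ := strconv.Atoi(os.Args[1])
    for i := 0; i < n; i++ {
        // Start a goroutine that sleeps for 1e10 ns (10 seconds).
        go time.Sleep(1e10)
        // Yield so the new goroutine gets a chance to run.
        runtime.Gosched()
    }
}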

@rsc
Contributor

rsc commented Apr 29, 2010

Comment 10:

This issue was closed by revision 718da33.

Status changed to Fixed.

@gopherbot
Contributor

Comment 11 by konrad.meyer:

This isn't just a darwin issue -- this commit doesn't appear to touch anything
outside of darwin stuff. Is this fixed for linux, bsd, etc?

@rsc
Contributor

rsc commented Apr 29, 2010

Comment 12:

If not, feel free to file a bug for those systems.
This one was about OS X.

@gopherbot
Contributor

Comment 13 by konrad.meyer:

I did, but I marked it as a duplicate of this one. I'll try it out and re-open it if
it's still broken.

@jackpal jackpal added the fixed label Apr 29, 2010
@golang golang locked and limited conversation to collaborators Jun 24, 2016
@rsc rsc removed their assignment Jun 22, 2022
This issue was closed.