runtime: segmentation fault with Go 1.7.1 #17162

Closed
mark-rushakoff opened this issue Sep 19, 2016 · 2 comments

@mark-rushakoff
Contributor

What version of Go are you using (go version)?

go version go1.7.1 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN="/Users/michaeldesa/go/bin"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/michaeldesa/go"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.7.1/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.7.1/libexec/pkg/tool/darwin_amd64"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/gs/ppbkd_gd0njbysrt1mxkg7vw0000gn/T/go-build172969900=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"

We saw the same panic both when running a Darwin build and when running on Linux with a build cross-compiled from Darwin.

What did you do?

This setup process is slightly involved. I tried to capture all the relevant details on the first pass here, but I'm eager to get this fixed, so I'll try to respond quickly if there's any trouble reproducing the issue.

If you don't want to go through building and running InfluxDB, we have a 206MB gzipped core dump from linux/amd64, running with GOTRACEBACK=crash, and the corresponding binary, built from influxdata/influxdb@7e515cf.

If you want to go through building from source to reproduce the crash:

go get github.com/sparrc/gdm # dependency management tool we use
go get -d github.com/influxdata/influxdb
cd $GOPATH/src/github.com/influxdata/influxdb
$GOPATH/bin/gdm restore # Configures dependencies into $GOPATH
go run cmd/influxd/main.go
# Alternatively, cross compiled with:
# GOBIN="" GOOS=linux GOARCH=amd64 go install ./...

On OSX, we've seen the crash with and without SSA enabled. We didn't try linux without SSA.

In a separate shell, run this load generator against the influxdb instance. It reliably (>80% of the time?) panics within the first iteration of the tool, which completes in around 30 seconds.

What did you expect to see?

No panic. The offending part of the stack trace looks like it may be susceptible to a nil pointer dereference, but Go should prevent a segfault, right?

What did you see instead?

Panic starting with:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x5fa88f]

goroutine 5490 [running]:
panic(0x9b6480, 0xc42000a090)
    /usr/local/Cellar/go/1.7.1/libexec/src/runtime/panic.go:500 +0x1a1 fp=0xc47a811b18 sp=0xc47a811a88
runtime.panicmem()
    /usr/local/Cellar/go/1.7.1/libexec/src/runtime/panic.go:62 +0x6d fp=0xc47a811b48 sp=0xc47a811b18
runtime.sigpanic()
    /usr/local/Cellar/go/1.7.1/libexec/src/runtime/sigpanic_unix.go:24 +0x214 fp=0xc47a811ba0 sp=0xc47a811b48
github.com/influxdata/influxdb/tsdb.(*Measurement).AppendSeriesKeysByID(0xc420332960, 0xc47ad44c80, 0x0, 0x1, 0xc420331120, 0x1, 0x1, 0x0, 0x0, 0x0)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/tsdb/meta.go:600 +0xff fp=0xc47a811c18 sp=0xc47a811ba0
github.com/influxdata/influxdb/tsdb.(*seriesIterator).nextKeys(0xc47ab56180, 0xc47a811ce8, 0x44233b)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/tsdb/shard.go:1050 +0x138 fp=0xc47a811c88 sp=0xc47a811c18
github.com/influxdata/influxdb/tsdb.(*seriesIterator).Next(0xc47ab56180, 0xc4a7c31810, 0x1, 0x1)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/tsdb/shard.go:1007 +0x1f6 fp=0xc47a811cf8 sp=0xc47a811c88
github.com/influxdata/influxdb/influxql.(*floatSortedMergeIterator).pop(0xc477b0f0e0, 0x0, 0xc420320c40, 0x10)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:388 +0x3d9 fp=0xc47a811dc0 sp=0xc47a811cf8
github.com/influxdata/influxdb/influxql.(*floatSortedMergeIterator).Next(0xc477b0f0e0, 0x0, 0xc47a811f10, 0x68784e)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:350 +0x2b fp=0xc47a811df0 sp=0xc47a811dc0
github.com/influxdata/influxdb/influxql.(*floatInterruptIterator).Next(0xc47ab52c80, 0x0, 0x0, 0x0)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:761 +0x52 fp=0xc47a811e20 sp=0xc47a811df0
github.com/influxdata/influxdb/influxql.(*floatFastDedupeIterator).Next(0xc47ab52ca0, 0x0, 0x0, 0x0)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/influxql/iterator.go:1191 +0x48 fp=0xc47a811ef0 sp=0xc47a811e20
github.com/influxdata/influxdb/influxql.(*bufFloatIterator).Next(0xc47ab52cc0, 0x1, 0x1, 0xdc2860)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:96 +0x3c fp=0xc47a811f20 sp=0xc47a811ef0
github.com/influxdata/influxdb/influxql.(*floatAuxIterator).stream(0xc47a4aa4e0)
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:880 +0x38 fp=0xc47a811fa8 sp=0xc47a811f20
runtime.goexit()
    /usr/local/Cellar/go/1.7.1/libexec/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc47a811fb0 sp=0xc47a811fa8
created by github.com/influxdata/influxdb/influxql.(*floatAuxIterator).Start
    /Users/michaeldesa/go/src/github.com/influxdata/influxdb/influxql/iterator.gen.go:844 +0x3f

Full panic attached: panic.txt

@bradfitz
Contributor

> No panic. The offending part of the stack trace looks like it may be susceptible to a nil pointer dereference, but Go should prevent a segfault, right?

No. Not right.

I see no evidence that this isn't a bug in InfluxDB.

Do you have reason to believe otherwise?

@mark-rushakoff
Contributor Author

Skimming through InfluxDB's closed issues (against Go 1.6) with "nil pointer", they seem to all follow this form:

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0x0 pc=0x886334]

It was the line [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x5fa88f] that caused me to believe we ran into a bug with Go. I've just confirmed in a separate test that this format is the normal format in Go 1.7.

Closing this issue now and assuming it's a bug on our end.
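
For readers comparing the two signal lines above, here is a minimal Go sketch of how a nil pointer dereference surfaces as a recoverable runtime panic instead of killing the process outright. The `measurement` type and the `lookup`, `safeLookup`, and `fieldOffset` functions are hypothetical names for illustration, not InfluxDB's code. The struct layout is chosen so the field being read sits at offset 0x20; that this is what `addr=0x20` means in the trace above is only an assumption about how a nonzero fault address can arise, not something confirmed in this thread.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// measurement is a hypothetical stand-in, NOT InfluxDB's tsdb.Measurement.
// Four 8-byte fields precede seriesByID, so that field sits at offset 0x20.
type measurement struct {
	a, b, c, d uint64
	seriesByID map[uint64]string
}

// fieldOffset reports where seriesByID lives inside the struct.
func fieldOffset() uintptr {
	return unsafe.Offsetof(measurement{}.seriesByID)
}

// lookup reads a field through m. If m is nil, the read targets
// address 0 + offset, and the runtime turns the fault into a panic.
func lookup(m *measurement, id uint64) string {
	return m.seriesByID[id]
}

// safeLookup recovers the runtime panic and returns it as an error.
// Panics that are not runtime errors are re-raised unchanged.
func safeLookup(m *measurement, id uint64) (s string, err error) {
	defer func() {
		if r := recover(); r != nil {
			re, ok := r.(runtime.Error)
			if !ok {
				panic(r)
			}
			err = re
		}
	}()
	return lookup(m, id), nil
}

func main() {
	fmt.Printf("field offset: %#x\n", fieldOffset())

	// A nil receiver triggers the panic; the program keeps running.
	_, err := safeLookup(nil, 1)
	fmt.Println("recovered:", err)
}
```

Without the `recover`, the same dereference would end the program with the `panic: runtime error: invalid memory address or nil pointer dereference` header followed by a goroutine trace, much like the report above. Recovering here only demonstrates that the fault is an ordinary Go panic; in real code the usual fix is to find out why the pointer is nil.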

golang locked and limited conversation to collaborators on Sep 20, 2017