Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/net/trace: NewEventLog() occasionally panics with slice bounds out of range #35792

Closed
ThomasJannaud opened this issue Nov 23, 2019 · 7 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@ThomasJannaud
Copy link

What version of Go are you using (go version)?

go1.12.11 on a debian docker image (I cannot access go version directly, it is running on a system in prod)

Does this issue reproduce with the latest release?

I cannot tell either I don't have control on this. I can run my code locally with a more recent version but it is hard to reproduce the panic I describe.
Our service is running on prod with a pretty high qps and I see only a handful of panics a day.

What operating system and processor architecture are you using (go env)?

debian docker image

What did you do?

Using grpc library upfront (if it matters), using a custom connection

https://play.golang.org/p/kRC1ZwTeN9J

What did you see instead?

panic: runtime error: slice bounds out of range at ...vendor/golang.org/x/net/trace/events.go:140
from the line 39: NewEventLog... in the playground link

which corresponds to the line el.stack = el.stack[:n] below (from x/net/trace/events.go:140)

func NewEventLog(family, title string) EventLog {
	el := newEventLog()
	el.ref()
	el.Family, el.Title = family, title
	el.Start = time.Now()
	el.events = make([]logEntry, 0, maxEventsPerLog)
	el.stack = make([]uintptr, 32)
	n := runtime.Callers(2, el.stack)
	el.stack = el.stack[:n]

	getEventFamily(family).add(el)
	return el
}

I have no idea how a panic can happen here. newEventLog() seems to return a very "threadsafe" result.

@toothrot toothrot changed the title panic in trace/events: NewEventLog() net/trace: NewEventLog() occasionally panics with slice bounds out of range Nov 25, 2019
@toothrot toothrot changed the title net/trace: NewEventLog() occasionally panics with slice bounds out of range x/net/trace: NewEventLog() occasionally panics with slice bounds out of range Nov 25, 2019
@gopherbot gopherbot added this to the Unreleased milestone Nov 25, 2019
@toothrot
Copy link
Contributor

/cc @mikioh as the owner

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 25, 2019
@ThomasJannaud
Copy link
Author

@toothrot hi, thanks for following up, however it seems that mikioh doesn't have much recent activity on github, so I wondered if you had some thoughts or other owners or pointers around this (I can try to do more investigation on my side too but I'm not sure where to start) , thanks!

@toothrot
Copy link
Contributor

toothrot commented Dec 2, 2019

Hey @ThomasJannaud,

I know that the relevant code was written by @Sajmani, but that @bradfitz @ianlancetaylor may also be able to provide more context here.

I'm not sure where the hard-coded 32 comes from, or why runtime.Callers may (apparently) return an n larger than the slice passed in, unless there is an off-by-one error here.

@bradfitz
Copy link
Contributor

bradfitz commented Dec 2, 2019

I can't reproduce this or see how it can happen.

I didn't look at the source to older versions of Go's runtime or x/net/trace, though.

@toothrot
Copy link
Contributor

toothrot commented Dec 2, 2019

@bradfitz Is it possible that runtime.Callers could be returning the wrong value? I don't understand how else this panic could happen.

@bradfitz
Copy link
Contributor

bradfitz commented Dec 2, 2019

@toothrot, maybe. I looked at master's version of Callers/gentraceback and didn't see anything but I didn't look at old versions of Go.

/cc @cherrymui @aclements

@ThomasJannaud
Copy link
Author

ThomasJannaud commented Dec 5, 2019

after investigation, we were calling Finish() multiple times and this sometimes led to confuse the reference counting (another way to say it: it would end up putting back the event log in the "free" queue when there was still a reference to it)

Sorry for the false alarm and thank you guys for having looked

@golang golang locked and limited conversation to collaborators Dec 4, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants