Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

text/template: provide full stacktrace when the function call panics #59400

Open
wxiaoguang opened this issue Apr 3, 2023 · 24 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@wxiaoguang
Copy link

At the moment, the code is:

// safeCall runs fun.Call(args), and returns the resulting value and error, if
// any. If the call panics, the panic value is returned as an error.
func safeCall(fun reflect.Value, args []reflect.Value) (val reflect.Value, err error) {
	defer func() {
		if r := recover(); r != nil {
			if e, ok := r.(error); ok {
				err = e
			} else {
				err = fmt.Errorf("%v", r)
			}
		}
	}()
	ret := fun.Call(args)
	if len(ret) == 2 && !ret[1].IsNil() {
		return ret[0], ret[1].Interface().(error)
	}
	return ret[0], nil
}

If the function in template panics: {{.TheUnstableFunc}} , users and developers only know that it panics, but there is no stracktrace.

In a complex system, the template functions could also be complex, without stacktrace, it's difficult to locate the root problem.

It's really helpful to provide a full stacktrace when the template function panics.

We have encounters a related bug today, the TheUnstableFunc has concurrency bug, template package only reports runtime error: slice bounds out of range [2:1], it costs a lot of time for debugging the bug.

Thank you.

@gopherbot gopherbot added this to the Proposal milestone Apr 3, 2023
@seankhliao
Copy link
Member

cc @robpike

@ianlancetaylor
Copy link
Contributor

I don't see any reason for this to be a proposal, so taking it out of the proposal process.

Can you show us a complete, standalone example that demonstrates the problem? I don't understand it from the description. Thanks.

@ianlancetaylor ianlancetaylor changed the title proposal: text/template: provide full stacktrace when the function call panics text/template: provide full stacktrace when the function call panics Apr 4, 2023
@ianlancetaylor ianlancetaylor added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed Proposal labels Apr 4, 2023
@ianlancetaylor ianlancetaylor modified the milestones: Proposal, Unplanned Apr 4, 2023
@wxiaoguang
Copy link
Author

wxiaoguang commented Apr 4, 2023

We have encounters a related bug today, the TheUnstableFunc has concurrency bug, template package only reports runtime error: slice bounds out of range [2:1], it costs a lot of time for debugging the bug.

The background is: we have a very complex function to be called in template: {{.TheComplexFunc}} , in TheComplexFunc, there are a lot of function calls including TheBuggyFunc, which has a concurrency problem, like this (just for example, to show it is complex):

func (t *Type) TheComplexFunc() {
    func1()...
    func2()...
    while ... {
         if (func3()) {
            TheBuggyFunc()
         } else {
             func4()
         }
    }
    func5()
}

Then we call the TheComplexFunc in template:

html code ...
{{$t.TheComplexFunc ....}}
html code ...

Since the TheBuggyFunc has concurrency bug, on high concurrent requests, it panics with slice bounds out of range [2:1].

But in error log, we only see: executing "my_template" at <$t.TheComplexFunc>: error calling TheComplexFunc: runtime error: slice bounds out of range [2:1]

TheComplexFunc is quite complex, it's unclear where the panic happens. I know that there could be some methods to help to locate the problem, but if we can see the full stacktrace in error:

executing "my_template" at <$t.TheComplexFunc>: error calling TheComplexFunc: runtime error: slice bounds out of range [2:1]

/framework.go:123 cases.String()
/my-code.go:123 TheComplexFunc()
/my-code.go:456 TheComplexFunc()
/template.go:... execute()

We can find the problem in first time.

Related issue: go-gitea/gitea#23872

@ianlancetaylor
Copy link
Contributor

It's impossible to be sure without an example, but I think you are referring to https://go.googlesource.com/go/+/refs/heads/master/src/text/template/funcs.go#355 . When a template calls a function, and the function panics, the template turns that into an error. That means that the default stack trace will not appear.

I don't think that we can change that behavior today. That would be too likely to break existing working code.

@wxiaoguang
Copy link
Author

wxiaoguang commented Apr 4, 2023

I don't think that we can change that behavior today. That would be too likely to break existing working code.

To avoid breaking, how about allowing developers to use a self-defined panic handler?

func safeCall(fun reflect.Value, args []reflect.Value) (val reflect.Value, err error) {
	defer func() {
		if r := recover(); r != nil {
			rErr := t.panicHandler(r) // <------ HERE
			if e, ok := rErr.(error); ok {
				err = e
			} else {
				err = fmt.Errorf("%v", rErr)
			}
		}
	}()
	...
}


t.SetPanicHandler(func (v any) error {
	// we have a chance to handle the panic
})

t.Execute(...)

@wxiaoguang
Copy link
Author

Hi @ianlancetaylor , what do you think about this proposal?

@robpike
Copy link
Contributor

robpike commented Apr 6, 2023

I would rather not make this change for a number of reasons, compatibility being one. But to find your bug you can either fork the package to add the traceback code (easy), delete the defer block (even easier), or just use a debugger (maybe easy, depending on the environment), which will in effect breakpoint when the trap happens before the template package catches it.

In other words, it is your responsibility to find the bug, not the library's. And really, it should be very easy to do.

@wxiaoguang
Copy link
Author

compatibility being one

Sorry but I didn't see any compatibility problem here, we just need to add a new function, nothing breaks.

If there is any compatibility problem, please let me know.

But to find your bug you can either ...........

If a developer could catch the panic stacktrace in first time, why they should do so much unnecessary work? Especially for a production rare crash. A clear stacktrace saves everyone's time. It's really helpful.

it is your responsibility to find the bug

It is developer's responsibility to find bugs, but it's framework / language's responsibility to help developers to find bug, instead of hiding important clues / just returning an unclear error message.

I would say the template package do wrong here, because it doesn't follow Golang's philosophy: use panic stracktrace to locate the incorrect code.

And really, it should be very easy to do.

In many cases, not that easy. If framework could help, it is easier.


Could you please re-consider about this proposal? Thank you very much.

@ianlancetaylor
Copy link
Contributor

This issue is related to the fact that fmt.Printf and friends catch panics in String and GoString methods, and the way that net/http catches panics in handlers. The behavior in the fmt, net/http, and text/template packages is normally good. Occasionally it impedes debugging. If we're going to change anything--and I'm not sure we are--perhaps we should consider some way to change it for all of them.

I don't think SetPanicHandler is a good choice as that injects too much flexibility. The choices are "recover panic" (as we do today) and "don't recover panic."

For example, one could consider setting the environment variable GODEBUG=recoverpanic=0.

@robpike
Copy link
Contributor

robpike commented Apr 7, 2023

Rather than change the libraries by requiring them to examine an environment variable, which will lead to a cascade of clunky changes, we could instead - as @ianlancetaylor may be saying above - just change the runtime to disable recovery of traps and panics (separately; panic is often used programmatically and it may just break things to disable it completely, while a trap caused by an array index out of bounds or a floating-point error is almost universally a bug).

@wxiaoguang
Copy link
Author

wxiaoguang commented Apr 7, 2023

and the way that net/http catches panics in handlers. The behavior in the fmt, net/http, and text/template packages is normally good. Occasionally it impedes debugging.

This problem is different from net/http:

  1. We can wrap every handler for net/http, we can easily catch all panics in our code. So I believe the net/http is very friendly to developers and we do not need to change it for panic handling.
  2. But we can't wrap every function for text/template, because some are called on a struct directly like {{.MyVar.BadFunnc}}.

For example, one could consider setting the environment variable GODEBUG=recoverpanic=0.

I do not think recoverpanic=0 is feasible. It is just a whack-a-mole game, and it might not catch the panic I described above.

For example, many large projects like Gitea, it uses quite a lot of 3rd packages, and it self sometimes also uses recover() to catch some well-defined panics.

In a busy production system, before the race panic occurs, a lot of other unrelated panics would occur ahead, I can't imagine the recoverpanic=0 could help to catch the real race panic.

@ianlancetaylor
Copy link
Contributor

As @robpike says we can just disable recovering a runtime panic, while still recovering an explicit call to panic. That may break a few packages out there but it seems unlikely to affect many. It should give a stack trace for the case described in the original comment, as "slice range out of bounds" is a runtime panic.

To be clear, I'm not sure whether this is a good idea.

@wxiaoguang
Copy link
Author

just disable recovering a runtime panic, while still recovering an explicit call to panic.

I see, if runtime panics like slice range out of bounds / nil pointer dereference / etc could be exposed, it's definitely better than before.

@robpike
Copy link
Contributor

robpike commented Apr 7, 2023

@ianlancetaylor If it breaks something, it's would only be under the flag, where almost by definition it is OK, compatibility-wise.

If we do anything, this seems the best. Programming each library to handle panics differently sometimes would be a mess.

@wxiaoguang
Copy link
Author

Hello, is there some progress? We meet new runtime panic - error again go-gitea/gitea#24342 (comment) , it's really difficult to debug, developers have to write a lot of temp "recover" code.

@andreyvit
Copy link

Just to add a data point, I have printstack snippet in my editor that expands into:

	defer func() {
		if e := recover(); e != nil {
			log.Printf("** PANIC: %v", e)
			debug.PrintStack()
			panic(e)
		}
	}()

and it's there solely for use with template functions, and every time it's a nuisance and takes a bunch of guesswork to debug, and every time I wish template didn't ever recover from panics.

I'd welcome an option to disable recover in safeCall for text/template specifically. I never want those panics to be recovered from, not even in production, so would just disable recovery permanently.

Like others noted, other panic recoveries (in net/http, for example) are not a problem, this really is specific to templates.

@wxiaoguang
Copy link
Author

Ping, could the proposal be considered and accepted?

@ianlancetaylor
Copy link
Contributor

I don't think we know exactly what to do. Do you want to try writing a trial implementation?

@wxiaoguang
Copy link
Author

wxiaoguang commented Aug 7, 2023

Some approaches:

  1. Don't introduce any new option, make safeCall panic again if the err is runtime error:
    • not ideal enough, I think it's better to keep consistent "panic" behavior
  2. Add an option: PanicHandler, make safeCall calls the PanicHandler first when the panic occurs
  3. Make safeCall return type ErrorWithStack struct { ... } when panic occurs
    • it could also be controlled by an option if necessary
  4. Add an option: NoPanicRecover or Option("onpanic=nop"), do not recover any panic in safeCall.

(I can see that there were already some options like "missingkey=zero", so adding a new option for the panic behavior seems not that difficult or breaking)

Which one do you prefer? Or could there be more possible approaches?

@robpike
Copy link
Contributor

robpike commented Aug 7, 2023

I still prefer doing nothing. If you want a traceback, edit safeCall to not recover. The existing behavior is right almost always.

@wxiaoguang
Copy link
Author

wxiaoguang commented Aug 7, 2023

I still prefer doing nothing. If you want a traceback, edit safeCall to not recover. The existing behavior is right almost always.

Sorry but I didn't get the point. How could "the existing behavior (hide the panic stacktrace)" be "right"? It causes some template function calls hard to debug their panics. There are also other reports like #59400 (comment) .

edit safeCall to not recover.

Do you mean every developer who uses Golang template should learn to build Go compiler/packages from source and use their home-made toolchain to debug the panics? I do not think it is friendly to developers and the eco-system.

@wxiaoguang
Copy link
Author

wxiaoguang commented Aug 14, 2023

I don't think we know exactly what to do. Do you want to try writing a trial implementation?

@ianlancetaylor what do you think about these possible approaches?

IMO the option 4 (reuse the option function: Option("onpanic=nop")) seems the best approach?

@ianlancetaylor
Copy link
Contributor

I think that some version of option 3 seems workable. If safeCall recovers a panic, it would return an error that includes a stack trace at the point of the panic. I think that would be reasonably backward compatible.

@robpike
Copy link
Contributor

robpike commented Aug 22, 2023

I feel it's a bad precedent to start including true stack traces in errors. If you need to do something - and it still seems to me that you don't - an opt-in option that disables error recover feels right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants