Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: errors: add (stack)trace at error annotation #60873

Open
flibustenet opened this issue Jun 19, 2023 · 26 comments
Open

proposal: errors: add (stack)trace at error annotation #60873

flibustenet opened this issue Jun 19, 2023 · 26 comments
Labels
error-handling Language & library change proposals that are about error handling. Proposal
Milestone

Comments

@flibustenet
Copy link

This proposal try to address the lack of traceback in errors but keep the way we handle errors as value annotated where the error occur.

Errors are values, a string or a custom error.
To retrieve the stack of errors we need to follow the chain of strings, each annotation should be written carefully to retrieve easily from where the error comes. If we just return err one time we loose the path.

Because of this we can embed all the stack in a custom error at the higher lever and print it at the end with the hope that we didn't loose the path with a return fmt.Errorf("... %v"). Like that we can just return err everywhere, we'll just use the full traceback to see what's happen like in most of the others languages.
Often we'll have both, a huge traceback and a chain of annotations in the end and full of return err everywhere. This is not fun and not idiomatic in Go.

Sometimes we just need an annotation, in a lib for example, and sometimes it's interesting to have the method and the exact line in an application. In a web server we generaly need only the stack after the handler and before the lib but not in the server and middlewares.

My proposal is to can add easily the trace and just this trace when we annotate the error. Something like fmt.Errorf("@trace I call foo: %v", err)

The result is a minimal traceback like that:
https://go.dev/play/p/VJSXkbOX7J_c

> main.main() prog.go:44
 f1 call f2: 
> main.f1() prog.go:13
 f2 call f3: 
> main.f2() prog.go:22
 simple error with no trace

I believe it follow the Go way of handling errors.
The error is still a value, a value with a trace as annotation (not to do for an API).
We decide at each error if we need the trace or not and not for the whole chain.
We encourage to annotate.
Explicit.
Easy to implement.

It's easy to experiment, just replace fmt.Errorf with a function like this:

func Errorf(s string, vals ...any) error {
	pc, file, line, ok := runtime.Caller(2)
	if ok {
		info := fmt.Sprintf("\n> %s() %s:%d\n",
			runtime.FuncForPC(pc).Name(), filepath.Base(file), line)
		s = strings.Replace(s, "@trace", info, 1)
	}
	return fmt.Errorf(s, vals...)
}

Hope it give at least inspiration.

@gopherbot gopherbot added this to the Proposal milestone Jun 19, 2023
@ianlancetaylor
Copy link
Contributor

Note that you can already call a function that records a stack trace for an error, such as https://pkg.go.dev/github.com/go-errors/errors. Personally I think that if we want to add this kind of functionality into the standard library, we should do it using a different function, not using an annotation in the string passed to fmt.Errorf.

@ianlancetaylor ianlancetaylor changed the title proposal: Add (stack)trace at error annotation proposal: errors: add (stack)trace at error annotation Jun 19, 2023
@flibustenet
Copy link
Author

@ianlancetaylor I'm a aware of ways to add a full traceback and I writted about this, my english is not good.
This as few caveats:
The traceback is full even if you don't need all the stack.
The annotations are all at the end and not near the line.
Returning a new error in the flow loose the trace.
It encourage to just return err every where else.
It mimick a panic. and confuse beginners.

@earthboundkid
Copy link
Contributor

There are a number of third party packages with traces. The unique thing the standard library can do is to standardize how the traces are exposed. Maybe an interface in runtime like

type ErrorTrace interface{
  error
  Trace() []Frames
  }

@ianlancetaylor ianlancetaylor added the error-handling Language & library change proposals that are about error handling. label Jun 20, 2023
@ianlancetaylor
Copy link
Contributor

@flibustenet Thanks, I think I see what you are saying now. But I don't see why this has to be in the standard library. It can be a function defined in a separate package.

@flibustenet
Copy link
Author

@carlmjohnson my point was to embed only one Frame and not all the stack.
@ianlancetaylor thanks to try to understand my english ! I thought about this because of the way wrapping was resolved with a simple addon to fmt.Errorf with %w, i suggest the same kind of things for trace.
I don't make this proposal because I need it, of course it's nothing more easy to do it alone (and it works !) but I like to imagine for Go something more original than legacy traceback.

@earthboundkid
Copy link
Contributor

earthboundkid commented Jun 20, 2023

I have errorx.Trace which adds a single trace line like this:

func traceExample(ok bool) (err error) {
	defer errorx.Trace(&err)
	if !ok {
		// returns "problem in github.com/carlmjohnson/errorx_test.traceExample (trace_example_test.go:13): oh no!"
		return errors.New("oh no!") 
	}
	return nil
}

I think it's more interesting to think about what can the standard library do that user code cannot do. Otherwise, I can just use my own code. :-)

@flibustenet
Copy link
Author

@carlmjohnson fine to see similar ideas (I didn't search much).

@neild
Copy link
Contributor

neild commented Jun 20, 2023

The original set of error proposals for 1.13 included adding stack frames to errors created with errors.New and fmt.Errorf. That change was implemented, but ultimately rolled back. As I recall, the concerns were startup overhead (var ErrFoo = errors.New("...") calls at init time got a lot more expensive), and reflect.DeepEqual no longer reporting errors with the same text as identical.

This proposal avoids these issues by making stack frames opt-in; you need to ask for one. The downside is that you don't get stack frames for errors unless the creator asks for them. Either this means most errors don't have stack frames, or we need to update the entire Go ecosystem to start asking for stack frames in errors.

Updating the entire Go ecosystem seems infeasible.

I don't know what the right approach to getting stacks into errors is. Clearly this would be useful; we dropped it from 1.13 because we couldn't make it work right, not because the goal was a bad idea. I think any successful proposal here will need to figure out how to get stacks into existing errors without rewriting the world, and without running into the issues the 1.13 implementation did.

@jaloren
Copy link

jaloren commented Jun 21, 2023

@ianlancetaylor you raise an important point. Why put this in the standard library when this can be written in third party code, especially when there are so many conflicting opinions on how error annotations should be handled?

IMHO, the answer is that in some cases, having a single way to approach a problem in the standard library with a canonical interface is a great help to the go community. The go community has high trust in the standard library in terms of quality and the backwards compatibility guarantee. And for common needs it also removes one less third party dependency i need to pull in or have a discussion about with my team. Finally, it provides a contract that allows third party libraries to opt into so an application can handle cross cutting concerns in a consistent way.

Consider the following examples:

  • we are adding structured logging to the std library soon. The proposal process was grueling. I think all told there was well over 600 comments. Structured logging with log levels is something that anyone could write and.....we did. I think at one point there were over 10 popular logging libraries out there.
  • the context library
  • errors.Join

I've been writing go since 2016. Error annotation comes up constantly on my team both writing go code, reading it, and most importantly debugging it. I am sure everyone has had a story of getting an error printed to standard out that repeats the same error message multiple times (hello improper wrapping), being forced to grep through the go source files, and them making an educated guess which error was the source of the problem.

@jaloren
Copy link

jaloren commented Jun 21, 2023

@neild I remember the discussion at the time and was disappointed about the loss of the stacktraces though I understood why. The problem as you describe it does seem intractable. So I'd like to reframe it.

I would argue that the reason developers don't annotate errors is that it's not ergonomic or easy to do so. Developers are busy (or lazy) and we will tend to go with the path of least resistance. I'd wager that if there was an ergonomic way in the standard library to annotate and format errors (e.g for logging), then developers would start widely adopting them. In particular:

  • bring back the formatting interface from 1.13 or design new ones
  • also add convenience function in errors such as errors.Source or errors.Stack (bikeshed the names by all means) and also a parallel one for fmt.Errorf
  • a go rewrite rule that replaces errors.New and fmt.Errorf with the error annotation variant
  • come up with a story about how to make this nicely interoperate with slog

@ianlancetaylor
Copy link
Contributor

  1. Errors serve multiple purposes. Some errors are intended for the users of a program. Some are intended for the developers. Errors intended for users should rarely contain stack traces. Therefore, a rewrite rule to add error annotations would be a bad idea in general.
  2. A package like context is appropriate for the standard library because it is only useful if everybody agrees on how to handle task cancellation. Different packages by different people need to be able to express cancellation in a coherent way, and using a single standard library implementation makes that easier. That argument does not apply to error stack traces.
  3. A package like log/slog is appropriate for the standard library because different packages need to log data, and it's simpler for larger programs if all their packages can log in similar ways. The log/slog package works with multiple different logging frontends and backends to coordinate this. That argument does not apply to error stack traces.
  4. The function errors.Join was added to the standard library because we observed that many different people were writing essentially the same functionality, but they weren't able to easily make it work with the standard library errors.As and errors.Is functions. See the discussion in errors: add support for wrapping multiple errors #53435. That argument does not apply to error stack traces.
  5. I agree that repetitive error messages is a problem due to inconsistent wrapping by different packages, but as far as I can tell this proposal doesn't solve that.

@flibustenet
Copy link
Author

I believe errors are values and values doesn't need traceback by default, for example we use errors for EOF, we should not add a traceback to EOF. I use error for http redirect, i would not want that returning an error is slower or consume anythings more. Like %w %v it's something that we decide conscientiously at each error because it's maybe part of the API.

@earthboundkid
Copy link
Contributor

Yes. I was thinking about my http client library. It marks 4xx and 5xx responses as an “error” but I don’t think anyone wants a traceback from my library. Whatever solution we have should be opt-in at application boundaries.

@neild
Copy link
Contributor

neild commented Jun 21, 2023

Errors have three audiences:

  • Users, who want a description of what failed. This is the string returned by the Error method.
  • Programs, which make control flow decisions based on errors. We have a number of effective patterns for this case. (Sentinels, type assertions, Is and As.)
  • Programmers, who want to debug an unexpected failure.

This last case is not well served today, and is the case where stack information would be useful. When debugging a failure, you want to know exactly where that failure happened, and as much surrounding context as possible.

I think the approach we tried in 1.13, where stack information is always present but only displayed on request, is still the right one. The only time you should need stack information in an error is when something unexpected has happened and you want to understand what. Adding stacks only at the places you expect to need them doesn't work well, because it's precisely the times that something unexpected happened that you need the stack.

If we could record and save the caller's stack frame for every errors.New and fmt.Errorf call at no cost and without breaking any programs, and display that information only when requested, this would be a clear aid in debugging unexpected failures. The problem is that there will always be a cost (but perhaps we can reduce it to a manageable level), and we don't have a good idea of how to do this without breaking existing programs.

@flibustenet
Copy link
Author

If we save the stack at errors.New or fmt.Errorf we'll loose it if we use a %v or any err.Error() on the flow right ?

@robpike https://commandcenter.blogspot.com/2017/12/error-handling-in-upspin.html was very instructive for me.

In contrast, a stack trace-like error is worse in both respects.
For those cases where stack traces would be helpful, we allow the errors package to be built with the "debug" tag, which enables them. This works fine, but it's worth noting that we have almost never used this feature. Instead the default behavior of the package serves well enough that the overhead and ugliness of stack traces are obviated.

In practice, I added full stack with actionable flag and noticed the same thing.
Are there consensus about the necessity of a full traceback on errors ?

@earthboundkid
Copy link
Contributor

If there were GO_TRACEBACK=1 environment variable to turn on traceback recordings, I'm not sure if I would turn it on in prod or not. On the one hand, I don't think the performance hit would matter to me. On the other hand, I don't usually need a traceback to know where something went wrong because my errors are already decorated.

In any event, it would need to be an interface, so custom error types could choose to opt-into it too.

@jaloren
Copy link

jaloren commented Jun 22, 2023

@neild during the 1.13 errors proposal, I don’t recall discussion of making the stacktraces opt in through an env variable or some other configuration. Was this already considered and rejected ?

Speaking personally, I’d enthusiastically opt in.

@neild
Copy link
Contributor

neild commented Jun 22, 2023

I don't recall whether we discussed that or not, but I don't think we'd want to add something that is off by default. Either it's good enough to be always there, or it isn't good enough to be part of the standard library.

@inifares23lab
Copy link

inifares23lab commented Jul 31, 2023

hi
I created a package which is a mostly a proxy or a revisited implementation of the standard errors + few extra functions for adding a stack trace:
github.com/inifares23lab/errors

I was wondering if someting similar to that package would be viable for optionally enabling stack traces in the standard library, example:

  1. leave all current functionality as is in standard errors
  2. leave functionality as is for fmt.Errorf
  3. implement new struct { errorString string; location string; innerErr error}
  4. the new struct is used only with new functions i.e.:
  • errors.TraceNew(s string)
  • fmt.TraceErrorf(format, str string) with "%w" behaving as it is now
  1. implement errors.Stack(err error) which returns the stacktrace in some way (maybe []map[string]string??)
  2. Error() string returns all of the chain like it is now (with the location if not empty)
  3. Outer() string || Last() string || ... returns only the last error with the location where valued
  4. update formatting output statements like log.Printf(), fmt.Printf().. to behave exactly as they are now and add the "%w" formatter to print the outer error (see point 7)

In my head this should:

  • leave all existing functionality behaving the same
  • optionally enable stacktraces (in the hands of the developers)
  • optionally print stack or just the outer error for maybe extra clarity (in the hands of the developers)
  • include existing errors wrapped with fmt.Errorf in the stack (without the location of course)
  • should only have a minimal memory and runtime overhead and only if using the new functions

please correct me / help me where I am wrong or where you spot possible improvement to the idea

@ianlancetaylor
Copy link
Contributor

Thanks. Our typical approach for a case like this would be to see how many programs adopt this package, or one like it. We would base a decision on whether to add to the standard library based on that.

@mitar
Copy link
Contributor

mitar commented Aug 6, 2023

As somebody who wrote an errors package which in my mind is like v2 of github.com/pkg/errors, addressing many issues found there while keeping backwards compatibility to the maximum, I would strongly suggest that if there is a standard stack tracer interface defined, it should be:

type stackTracer interface {
	StackTrace() []uintptr
}

In particular, the return type should be []uintptr (name is less important), as obtained from callers. This way only very limited information is stored until it is needed (and you can map callers then to frames, print that or whatever). The important thing is that one should not have to import any 3rd party package to get stack information out (what github.com/pkg/errors requires).

I am not too convinced that stack traces should be recorded as part of std lib, but maybe they could. It does becomes tricky through do you and when do you record additional stack traces when wrapping an existing error. Do you record it every time you wrap? In gitlab.com/tozd/go/errors I opted to not record it every time wrapping happens, but only when an explicit "cause" wrapping is done, so when you define a new error and record an existing error as an cause. Similarly, what do you do when you are joining multiple errors? (In my package, I record stack trace when joining always.)

One thing worth standardizing would also be JSON serialization of errors and associated (possible) stack traces, especially because now those errors can wrap other errors and joined errors and so on. So the tree of errors can become large.

And if we talk about structured logging, gitlab.com/tozd/go/errors also defines an interface for adding structured data to errors:

type detailer interface {
	Details() map[string]interface{}
}

You can then modify that map in-place and add additional data. I have found it very useful to attach debugging data instead of pushing that all into the error string. One advantage is also that error string is then a constant, easy to google search, translate, while details is what changes between program runs. Maybe interface like this could be standardized as well. (Details consist of multiple possible layers of details, as wrapped errors can have their own details.)

@inifares23lab
Copy link

inifares23lab commented Sep 10, 2023

Hi

following this conversation and a bit of playing with the new log/slog library I wrote github.com/golang-exp/errors-poc
This is just to play with this design idea it don't think it will be developed to a production ready state (at least not by me), although I might continue improving/playing with the design

I thought:

  • Why not have leveled errors to enable, disable some kind of stacktrace dynamically?
  • Use same syntax for standard errors and errors with stacktraces
    • this means that existing code does not need to be redacted (apart from configuration, which could be minimal changes) to enable stacktraces (imagine enabling stack traces for a huge codebase)
    • defaults to current errors behaviour so nothing changes if left unconfigured

@jimwei

This comment was marked as spam.

@fsaintjacques
Copy link

I find it really irritating that I have to sprinkle fmt.Errorf("...: %w", err) and/or an equivalent of pkg/errors.New all around the callsites that may propagate an error for the sake of either capturing a stack trace or a pseudo-stack trace via string prefixing. On top of this, it means that every libraries/dependencies that I use must ideally use the same pattern or else stack traces will always only contain frames from my package and not the one from third parties.

I'd just wish to set GO_INJECT_ERR_STACKTRACE=true in go build and voila, all usage of fmt.Errorf and errors.New (ignoring statically defined exportable variables) auto-magically attaches a stack trace if none were provided.

@ianlancetaylor
Copy link
Contributor

@fsaintjacques Thanks, based on that idea I've opened https://go.dev/issue/63358.

@mitar
Copy link
Contributor

mitar commented Oct 3, 2023

all usage of fmt.Errorf and errors.New (ignoring statically defined exportable variables) auto-magically attaches a stack trace if none were provided.

Yea, but what about other libraries which are not using standard errors package anymore, because they already want to attach stack traces? It will not enable it for them.

I think the critical piece is that we should standardize what the interface of errors which have recorded a stack trace looks like. I suggest:

type stackTracer interface {
	StackTrace() []uintptr
}

Then, we can discuss it adding it to standard errors with a flag or not. But without a standard way to obtain a stack trace, interoperability will suffer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
error-handling Language & library change proposals that are about error handling. Proposal
Projects
Status: Incoming
Development

No branches or pull requests

10 participants