Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: errors: add All and AllAs iterators #66455

Open
rogpeppe opened this issue Mar 21, 2024 · 25 comments
Open

proposal: errors: add All and AllAs iterators #66455

rogpeppe opened this issue Mar 21, 2024 · 25 comments
Labels
Milestone

Comments

@rogpeppe
Copy link
Contributor

rogpeppe commented Mar 21, 2024

Proposal Details

The original error inspection proposal contained the following paragraph:

We recognize that there are situations that Is and As don’t handle
well. Sometimes callers want to perform multiple checks against the
same error, like comparing against more than one sentinel value.
Although these can be handled by multiple calls to Is or As, each call
walks the chain separately, which could be wasteful. Sometimes, a
package will provide a function to retrieve information from an
unexported error type, as in this old version of gRPC's status.Code
function
.
Is and As cannot help here at all. For cases like these, programs
can traverse the error chain directly.

Traversing the error chain multiple times to call Is is wasteful,
and it's not entirely trivial to do it oneself. In fact, the way to
iterate through errors has changed recently with the advent of
multiple error wrapping.
If many people had been traversing the error chain directly, their
code would now be insufficient.

If we're checking for some particular code in some error type, it's
tempting (but wrong) to write something like this:

var errWithCode *errorWithCode
if errors.As(err, &errWithCode) {
	switch err.Code {
	case CodeFoo:
	case CodeBar:
	...
	}
}

The above code is wrong because there might be several errors in the
error tree of type *errorWithCode, but we will only ever see the
first one. It would be possible to abuse the Is method to consider
only the Code field when comparing errorWithCode types, but that
seems like an abuse: Is is really intended for identical errors,
not errors that one might sometimes wish to consider equivalent.

With the advent of iterators, we now have a natural way to design an
API that efficiently provides access to all the nested errors without
requiring creation of an intermediate slice.

I propose the following two additions to the errors package:

package errors
import "iter"

// Iter returns an iterator that iterates over all the non-nil errors in
// err's tree, including err itself. See [As] for the definition of the
// tree.
func Iter(err error) iter.Iter[error]

// IterAs is like [Iter] except that the iterator produces
// only items that match type T.
func IterAs[T any](err error) iter.Iter[T]

Technically only IterAs is necessary, because IterAs[error] is entirely equivalent to Iter, but Iter is more efficient and IterAs is easily implemented in terms of it.

Both Is and As are easily and efficiently implemented in terms of the above API.

I consider IterAs to be worthwhile because it's convenient and type-safe to use, and it hides the not-entirely-trivial interface check behind the API.

The flawed code above could now be written as follows, correctly this time, and slightly shorter than the original:

for err := range errors.IterAs[*errorWithCode] {
   switch err.Code {
   case CodeFoo:
   case CodeBar:
   ...
   }
}

I've pushed a straw-man implementation at https://go-review.googlesource.com/c/go/+/573357

@gopherbot gopherbot added this to the Proposal milestone Mar 21, 2024
@ianlancetaylor
Copy link
Contributor

CC @neild

@neild
Copy link
Contributor

neild commented Mar 21, 2024

Iterating the error tree became substantially more complicated when we introduced multiple-wrapping. Prior to then, you could just call errors.Unwrap in a loop, which isn't so bad. I seem to recall that we considered adding a tree-iterator, but decided to wait until we figured out what the iterator API was going to look like.

We've got an iterator API now, so it seems reasonable to add an error-tree iteration function.

Perhaps errors.All and errors.AllAs[T] might be more in line with the naming we've been using for iterator functions?

As a higher-level comment, however, I believe that any error semantics where you need to iterate over the tree for everything matching a code are too confusing. I's reasonable for the errors package to export functions for tree-walking. It isn't reasonable for any other package to expect its callers to use tree-walking functions to examine its errors.

@earthboundkid
Copy link
Contributor

See #65428

@earthboundkid
Copy link
Contributor

Adding AllAs[E error](error) iter.Seq[E] for cases where you want to capture all instance of type E seems reasonable to me. I don’t think the other helpers are as obviously necessary.

@rogpeppe
Copy link
Contributor Author

Perhaps errors.All and errors.AllAs[T] might be more in line with the naming we've been using for iterator functions?

Yup, I was thinking exactly that last night. I've updated my PoC CL accordingly.

As a higher-level comment, however, I believe that any error semantics where you need to iterate over the tree for everything matching a code are too confusing. I's reasonable for the errors package to export functions for tree-walking. It isn't reasonable for any other package to expect its callers to use tree-walking functions to examine its errors.

I'm not so sure. Consider a function F that can return an error with a code as I described. I don't think that's too unreasonable.
Then consider a function G that runs a collection of functions concurrently and returns all the errors as a single error. That seems OK too to me. Say I'm using G to run multiple calls to F. In that case I think it's perfectly reasonable to use errors.All to iterate over all the errors looking for any that has a given code.

As another concrete example, and part of the motivation for creating this proposal, the Open Container Initiative (OCI) protocol specifies that errors are returned like this:

{
        "errors": [
            {
                "code": "<error identifier, see below>",
                "message": "<message describing condition>",
                "detail": "<unstructured>"
            },
            ...
        ]
}

That is, it's always possible for an API call to return multiple errors. To me it seems natural to represent this as a single error containing an error for each entry in errors. But then there's currently no easy way to say "give me any error with a given code, including its detail information".

So although I'd like to agree with your sentiment, I think that there are legitimate cases where it's OK, or even necessary, to expect clients to use the tree-walking function directly.

@rogpeppe
Copy link
Contributor Author

@earthboundkid Apologies for duplicating your proposal! Happy to close this as a dupe if you like, although it's a little different so perhaps worth keeping for a bit until a decision is made?

Adding AllAs[E error](error) iter.Seq[E] for cases where you want to capture all instance of type E seems reasonable to me. I don’t think the other helpers are as obviously necessary.

When you say "other helpers", I think there's only one, right? errors.All.

Although it's true that that is technically unnecessary (as I pointed out, errors.AllAs[error] is exactly equivalent to errors.All), I think that it's still worth including errors.All. For one, it's the foundation of all the other functions (Is, As and AllAs). Within the errors package there is currently unnecessary duplication of the tree-walking logic which can be removed by using All. As such, I'd suggest that it's the simplest and most obvious primitive available, and hence worth providing as part of the errors API.

@gopherbot
Copy link

Change https://go.dev/cl/573357 mentions this issue: errors: implement All and AllAs

@earthboundkid
Copy link
Contributor

In the other issue there was some skepticism about how common the need for users to walk the tree themselves is. I think making all a package private function and only exporting AllAs would probably be fine for most users, but OTOH, it probably doesn't hurt to export All either.

@rogpeppe
Copy link
Contributor Author

In the other issue there was some skepticism about how common the need for users to walk the tree themselves is.

If there wasn't a good use case for users to walk the tree themselves, we wouldn't be proposing adding AllAs either :)

@rogpeppe
Copy link
Contributor Author

One other thing that's perhaps worth pointing out: I tried a generic version of As with this implementation:

func AsType[T any](err error) (T, bool) {
	for err := range All(err) {
		if x, ok := err.(T); ok {
			return x, true
		}
		if err, ok := err.(interface{ As(any) bool }); ok {
			var x T
			if err.As(&x) {
				return x, true
			}
		}
	}
	return *new(T), false
}

It turned out quite a bit more efficient than the regular As:

BenchmarkIs-8       	15840567	        67.78 ns/op	      24 B/op	       1 allocs/op
BenchmarkAs-8       	 5445650	       219.6 ns/op	      40 B/op	       2 allocs/op
BenchmarkAsType-8   	16187889	        72.72 ns/op	      24 B/op	       1 allocs/op

I also tried implementing AsType atop AllAs:

func AsType[T any](err error) (T, bool) {
	for x := range AllAs[T](err) {
		return x, true
	}
	return *new(T), false
}

Unfortunately the performance was considerably worse:

BenchmarkIs-8       	17058822	        68.82 ns/op	      24 B/op	       1 allocs/op
BenchmarkAs-8       	 5518171	       218.3 ns/op	      40 B/op	       2 allocs/op
BenchmarkAsType-8   	 4377482	       279.8 ns/op	     192 B/op	      10 allocs/op

Hopefully the compiler will be able to make that better in time.

@neild
Copy link
Contributor

neild commented Mar 22, 2024

@earthboundkid Apologies for responding to this proposal and not #65428; I must have missed yours when it was created.

@rogpeppe

Then consider a function G that runs a collection of functions concurrently and returns all the errors as a single error. That seems OK too to me. Say I'm using G to run multiple calls to F. In that case I think it's perfectly reasonable to use errors.All to iterate over all the errors looking for any that has a given code.

I don't think G should return an error that wraps the results of all the individual sub-operations. If the user is likely to care about individual errors, it should return an error that contains the per-operation errors, but does not wrap them.

As a concrete example, let's say we have a function that downloads a set of URLs to disk:

// Download writes the contents of each URL in urls to a file under target.
func Download(target string, urls ...*url.URL) error {}

If Download wraps the errors of each individual fetch, it could return an error that simultaneously indicates that the operation failed because a server rejected the request, because a file could not be written to, and because a network timeout occurred. All this, while also successfully fetching at least one other file. And unpacking this information requires the user to understand the structure of the error tree.

I'd instead either return an []error:

// Download writes the contents of each URL in urls to a file under target.
// It returns one error for each URL.
func Download(target string, urls ...*url.URL) []error {}

Or define a structured error that contains the individual errors (and does not wrap them):

type DownloadError struct {
  ErrByURL map[*url.URL]error
}

This makes the structure of the error explicit and is easy to work with.

As another concrete example, and part of the motivation for creating this proposal, the Open Container Initiative (OCI) protocol specifies that errors are returned like this:

I would probably represent this as something like:

// CodeError is an OCI error code. 
type CodeError string

func (e CodeError) Error() string { return string(e) }

const (
  ErrBlobUnknown = CodeError("BLOB_UNKNOWN")
  // etc.
)

// ErrorComponent is a single component of an OCI error response.
//
// It intentionally does not implement error, since an error can consist of multiple components.
type ErrorComponent struct {
  Code    CodeError
  Message string
  Detail  any // or string, or whatever
}

// Error is an OCI error response.
type Error struct {
  Errs []ErrorComponent
}

func (e *Error) Error() string { /* return some string here */ }
func (e *Error) Is(target error) bool {
  for _, c := range e.Errs {
    if c.Code == target {
      return true
    }
  }
}

Usage:

// Does this error contain a BLOB_UNKNOWN error?
if errors.Is(err, ErrBlobUnknown) {
}

// Examine the individual components of the error. 
// (Hopefully not common, this seems like a real pain to deal with.)
var e Error
if errors.As(err, &e) {
  for _, c := range e.errs {
    // ...
  }
}

@rogpeppe
Copy link
Contributor Author

@neild I agree that both of your suggestions are viable (although the error message strings implied by your CodeError definitely leave something to be desired), but ISTM that you're essentially arguing against errors.Join for any situation where the underlying errors aren't well known by the caller of Join.

That's a reasonable stance, but given that we have errors.Join (and it's definitely being used), I think it's reasonable to want to traverse the resulting error tree.

@mitar
Copy link
Contributor

mitar commented Mar 24, 2024

When implementing traversing inside my gitlab.com/tozd/go/errors errors package, I realized that there are multiple ways to traverse and that it is really hard to make one function how you want to traverse:

  • Depth-first or breadth-first (example: you are searching if existing error has a stack trace or not)?
  • Traverse until you find an error which matches any of the provided interfaces/targets (or) - useful when you have errors from different libraries which implement different interfaces to expose some data.
  • Traverse until you find an error which matches all of the provided interfaces/targets.

So I think traversing errors should have API closer to walking directory structure with a function callback and not an iterator. In the case of errors, the directory is an joined error. At that point you have to make a decision how you want to continue iterating and if at all. Instead of files, you can have a chain of parent errors until the joined error, or something.

@Merovius
Copy link
Contributor

Here is an observation, that is IMO interesting, but not terribly actionable: When the errors.Is and errors.As APIs where originally proposed, my main objection to them was that they are effectively implementing nominal subtyping (“inheritance”) - but purely dynamically. Under that lens, introducing errors.Join and Unwrap() []error then added multiple inheritance.

The problem @rogpeppe is now running into is thus the diamond problem of multiple inheritance: There are multiple paths along which his errorWithCode is "inherited" by the error value that is being inspected and you get different error codes, depending on which path you follow. Saying "we need a more complicated API for walking the tree" pretty much means "we should make it simpler to implement custom Method Resolution Order algorithms".

Ultimately, these design problems around the errors API are exactly why Go chose structural subtyping for its interfaces: It avoids the path-dependency in the subtype graph.

That's not actionable, of course. The API is, what it is and we must deal with it, pragmatically. But it describes and predicts the scope of the problems.

@neild
Copy link
Contributor

neild commented Mar 25, 2024

Depth-first or breadth-first (example: you are searching if existing error has a stack trace or not)?

Depth-first. Is and As do a depth-first traversal, and any new functions that traverse an error tree should follow suit.

(Why depth-first? We needed to pick one, and when we added error wrapping to std every other package that I surveyed was using depth-first traversal.)

a function callback and not an iterator

An iterator is a function that takes a callback; see #61897.

@neild
Copy link
Contributor

neild commented Mar 25, 2024

The proposal defines AllAs's type parameter as any. Is there any reason this shouldn't be error?

func AllAs[T error](err error) iter.Seq[T]

I remember there being a reason As permits its target parameter to be a *T, where T is an interface that doesn't implement error, but I can't remember what it is.

@rogpeppe
Copy link
Contributor Author

The proposal defines AllAs's type parameter as any. Is there any reason this shouldn't be error?

func AllAs[T error](err error) iter.Seq[T]

I remember there being a reason As permits its target parameter to be a *T, where T is an interface that doesn't implement error, but I can't remember what it is.

I think it's probably because not all interfaces that we might be interested in inspecting the error for necessarily implement error themselves. As a not-very-good example, it's legitimate to use errors.As to look for an error that implements json.Marshaler, even though json.Marshaler clearly doesn't implement error.

I suspect that consideration was amplified by the fact before Go 1.14, issue #6977 meant that it generally wasn't a good idea to embed interfaces because that ran the risk of conflicts with other types doing the same thing.

In general, I think I agree with you and support using error rather than any (it's always be possible to embed error at the call site).

There's one thing that gives me pause (and was the reason I chose any for this proposal): the inconsistency with errors.As, and the fact that it's possible that some As method implementations may be looking for a pointer to a particular interface type that does not contain error. I'm not sure that consideration is very important though.

@rogpeppe
Copy link
Contributor Author

@mitar

Depth-first or breadth-first (example: you are searching if existing error has a stack trace or not)?

For your particular use case (a package deeply involved with the core error implementation), I'd suggest that you'd be best off just writing your own tree-traversal function. Most users will not be in that situation.

Both of your other points are easily implemented using the proposed API.

FWIW I have also considered whether it might be worth adding an API like this:

package errors

// Find traverses the error tree of err and returns the
// first error e for which test(e) returns true.
func Find(err error, test func(error) bool) error

but on balance I thought this was too trivial to be worthwhile.

So I think traversing errors should have API closer to walking directory structure with a function callback and not an iterator.

You might be amused to see that I've actually proposed an API for walking a directory structure with an iterator: #64341
I think it can work pretty decently.

@Merovius
Copy link
Contributor

In there a benefit in making the constraint error? I don't see anything in the signature or implementation of AllAs that needs - or even benefits from - the type argument being an error.

@neild
Copy link
Contributor

neild commented Mar 25, 2024

Making the constraint error avoids accidentally passing *SomeError when you meant SomeError or vice-versa. We have a vet check to catch this mistake in errors.As, but a compile-time check is better.

@neild
Copy link
Contributor

neild commented Mar 25, 2024

The current proposal as I understand it:

package errors
import "iter"

// All returns an iterator that iterates over all the non-nil errors in err's tree.
//
// The tree consists of err itself, followed by the errors obtained by repeatedly calling its
// Unwrap() error or Unwrap() []error method. When err matches multiple errors,
// All examines err followed by a depth-first traversal of its children.
func All(err error) iter.Seq[error]

// AllAs returns an iterator that iterates over all the errors in err's tree that match target.
//
// See All for a definition of err's tree.
// See As for a definition of how an error matches the target.
func AllAs[T error](err error) iter.Seq[T]

Is that right?

@mitar
Copy link
Contributor

mitar commented Mar 25, 2024

I remember there being a reason As permits its target parameter to be a *T, where T is an interface that doesn't implement error, but I can't remember what it is.

I asked ChatGPT about that and it made a good argument: By accepting interface{} as the target, errors.As allows you to check if an error implements any interface, providing greater flexibility and utility in type assertion scenarios.

So you maybe have your errors implement an interface which does not embed error interface itself, but you still want to search for that interface. I think that is a valuable use case. For example, you could search for an error which implements Marshaler interface.

@neild
Copy link
Contributor

neild commented Mar 25, 2024

That's a great example of ChatGPT saying something that looks useful, but really isn't.

Yes, you can check for a json.Marshaler:

var m json.Marshaler
if errors.As(err, &m) {
  b, err := m.MarshalJSON()
  // ...
}

But you also write that with:

var m interface {
  json.Marshaler
  error
}
if errors.As(err, &m) {
  // ...
}

So this isn't about flexibility or utility, but maybe it's about convenience.

I still don't recall whether there was one motivating example that informed the decision on As, but I do recall now that we drew an analogy to type assertions. As is essentially a type assertion mapped across the error chain. You can type assert from an error to a json.Marshaler, so As permits the equivalent. You can't type assert from an error to an int, so As rejects this. These checks necessarily need to happen at runtime, so vet includes a static check to allow early detection of mistakes.

On one hand, we could say that the same argument applies to AllAs, and AllAs's type constraint should be any. In this case, we will need to extend the As vet check.

On the other hand, we could say that type parameters allow us a greater degree of compile-time checking than was available when we added As and that you can easily write a loop with All in the event that you do need to assert to a non-error interface type, and that therefore AllAs's type constraint should be error.

I lean somewhat towards a constraint of error, as being clearer, offering more compile-time safety, and requiring less mechanism.

@rogpeppe
Copy link
Contributor Author

I lean somewhat towards a constraint of error, as being clearer, offering more compile-time safety, and requiring less mechanism.

I agree. The only slight friction to my mind is the kind of scenario demonstrated here:
https://go.dev/play/p/Ayyuk-oAwiW?v=gotip

In this case there's no way to get errors.AllAs to produce the error that errors.As extracts.
This is, however, a fictional scenario: I'm not convinced that this situation would ever arise in practice,
and the fact that the caller can't use the type as-is is perhaps a good indication to the caller that
there might be an issue.

@rsc rsc changed the title proposal: errors: provide a way to iterate over nested errors proposal: errors: add All and AllAs iterators Apr 24, 2024
@rsc
Copy link
Contributor

rsc commented Apr 24, 2024

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Active
Development

No branches or pull requests

8 participants