Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

errors: Is and As make it impossible to maintain error hygiene #32362

Closed
rogpeppe opened this issue May 31, 2019 · 26 comments
Closed

errors: Is and As make it impossible to maintain error hygiene #32362

rogpeppe opened this issue May 31, 2019 · 26 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.

Comments

@rogpeppe
Copy link
Contributor

$ go version
go version devel +e883d000f4 Wed May 29 13:27:00 2019 +0000 linux/amd64

It's typical that Go code has many lines of code that return errors as a consequence of other errors. In large programs, it can be hard to refactor this code with confidence because by changing the code, we might be changing the set of possible errors that can be returned, and some unknown caller might be broken by the refactor because they're relying on a specific error that's no longer returned.

The nature of such breakage is that it often goes unnoticed for long periods because errors are usually infrequent.

For this reason, I believe it's good to maintain what I call error hygiene: to restrict the set of returned errors to those that are explicitly part of a function's contract.

One good example of this is fmt.Errorf("message: %v", err) which hides the type and value of err from any subsequent Is or As call.

Unfortunately, I don't believe it's possible to preserve error hygiene in general with the scheme used for Is and As.

Suppose that we want to define the following function that we'll use to restrict error values to a particular type. This is a bare-minimum requirement for error hygiene.

// KeepType keeps only the given targetType as a possible target
// type for the As function.
//
// For any error e and types T and U:
//
//	As(KeepType(e, new(T)), new(U))
//
// will return true if and only if As(e, new(T)) is true and U is
// identical to T. In other words, it hides all types other than T from
// future inspections with As.
func KeepType(err error, targetType interface{}) error

How can we implement this function? For errors that don't fit the designated target type, the implementation is easy: just hide the whole error chain. However, if there's an error that does fit the target type, there's a problem: we can't preserve the target type without also preserving the error chain that's below it, because in general the chain is held in a field within the error value that we might not have access to. We can't make a new error value because then it wouldn't be the required target type.

I think that As and Is are thus potentially problematic and we should consider whether it's a good idea to keep them as is. (so to speak :) )

@mvdan mvdan added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label May 31, 2019
@rogpeppe
Copy link
Contributor Author

@mpvl @jba

@bcmills
Copy link
Contributor

bcmills commented May 31, 2019

The generic KeepType function seems like a bit of a red herring: if you're maintaining the sort of error hygiene you describe, you know the specific set of types T in question, and you can use functions or methods specific to those types to construct opaque instances.

You could even imagine adding a new well-known method (perhaps Flatten() error?) that allows the type to construct such an instance itself.

@rogpeppe
Copy link
Contributor Author

I don't believe it's entirely a red-herring.

The type you're wanting to preserve might not be defined in the package that wants to preserve it. It's not uncommon for modules to use a common errors package shared by many packages, for example. Another example: I might want to preserve a database-specific error type. Pq isn't a great example because all the fields are exported, but that might not be the case in future or with other drivers.

Also, AFAICS the flatten operation can need O(num-errors-in-chain) allocations if you want to preserve stack information, which isn't ideal. I guess there might be ways around that though.

@rogpeppe
Copy link
Contributor Author

FWIW the same issue applies to Is, although it's likely to be less of a common problem.
Consider:

var MyEOF = fmt.Errorf("my EOF: %w", io.EOF)

This satisfies both errors.Is(MyEOF) and errors.Is(io.EOF) but there's no way to preserve the former without also preserving the latter.

@bcmills
Copy link
Contributor

bcmills commented May 31, 2019

This satisfies both errors.Is(MyEOF) and errors.Is(io.EOF) but there's no way to preserve the former without also preserving the latter.

That's true, but I think that only reinforces my point: it is up to the author of the error (or error type) — not the user of the error — to decide how much information that type or value is capable of masking.

@bcmills
Copy link
Contributor

bcmills commented May 31, 2019

AFAICS the flatten operation can need O(num-errors-in-chain) allocations if you want to preserve stack information, which isn't ideal. I guess there might be ways around that though.

If you want to preserve the first k errors in a chain of N links, you only need to allocate k new links in general: the remainder can opaquely wrap the tail of the chain (using %v instead of %w).

@earthboundkid
Copy link
Contributor

I think the problem is the errors.As/Is function will check for As/Is methods on the first error, but if it returns false, they will keep unwrapping the chain and checking until they find something that returns true. They should instead short-circuit the first time an error in the chain implements an As/Is method.

@JavierZunzunegui
Copy link

var MyEOF = fmt.Errorf("my EOF: %w", io.EOF)

This satisfies both errors.Is(MyEOF) and errors.Is(io.EOF) but there's no way to preserve the former without also preserving the latter.

If I understand your issue correctly, the more general form of what you ask is

given err = ErrorA /*wraps*/ ErrorB /*wraps*/ ErrorC
can there be a way to convert it to
ErrorA /*wraps*/ ErrorC

I don't think there is a way of doing that in the errors package as it is currently proposed.

It can be done trivially easy if doing error wrapping using a linked list of errors, but that won't be supported under the new changes.

An example: JavierZunzunegui@4489292#diff-653dcb659c69a84ac6ce23e6e48509a3R6

@jba
Copy link
Contributor

jba commented May 31, 2019

I believe this is an implementation of KeepType:

func KeepType(err error, targetType interface{}) error {
    return &keepError{err, reflect.PtrTo(reflect.TypeOf(targetType))}
}

type keepError struct {
    err        error
    targetType reflect.Type
}

func (k *keepError) Error() string {
    return k.err.Error()
}

func (k *keepError) As(target interface{}) bool {
    if reflect.TypeOf(target) != k.targetType {
        return false
    }
    return errors.As(k.err, target)
}

Proof: KeepType should "return true if and only if As(e, new(T)) is true and U is identical to T. "

If direction: if U is identical to T, we skip past the initial if statement, execute As(e, new(T)) (since U is identical to T) and return its value, so if U is identical to T and As(e, new(T)) returns true, so does KeepType.

Only if direction: we need to show that if U is not identical to T or if As(e, new(T)) is false, then KeepType returns false. The initial if handles the first disjunct, and the subsequent call to errors.As handles the second.

@jba
Copy link
Contributor

jba commented May 31, 2019

They should instead short-circuit the first time an error in the chain implements an As/Is method.

We considered that seriously for a while. Ultimately we rejected it. It does give the As/Is methods more power, but the cost is that every such method must follow the chain to get the right answer. We opted for the less powerful but easier to implement semantics.

@jba
Copy link
Contributor

jba commented May 31, 2019

In my comment above I left out an important part of the proof: the keepError type has no Unwrap method, so the return value from its As method is the return value from errors.As.

I also mixed up the wording: I wrote about whatKeepType returns but I meant what errors.As(KeepType(...), ...) returns.

@earthboundkid
Copy link
Contributor

Another way to think about this problem is that there are multiple uses for error chains:

  • you may want to know about all errors in a chain, for logging
  • you may only want to know about some errors in a chain, for handling
  • you may want some ability to format error output or keep stack frames (dropped from 1.13)

The current Is/As functions tie the first two kinds of errors chains together, when conceptually they are different.

@earthboundkid
Copy link
Contributor

earthboundkid commented Jun 1, 2019

every such method must follow the chain to get the right answer.

Just call the function on your wrapped error if you want the default semantics. That’s not hard to implement. Or just don’t have a method at all and get default semantics for free. I think you only want the method at all because you want to do something unusual.

@earthboundkid
Copy link
Contributor

KeepType needs to remember its Fleetwood Mac: never break the chain!

@neild
Copy link
Contributor

neild commented Jun 1, 2019

@rogpeppe

Consider:

var MyEOF = fmt.Errorf("my EOF: %w", io.EOF)

This satisfies both errors.Is(MyEOF) and errors.Is(io.EOF) but there's no way to preserve the former without also preserving the latter.

I struggle to see a situation where you want non-transitive Is: x is MyEOF, MyEOF is io.EOF, but x is not io.EOF.

The similar case with As seems equally confusing: a can be converted to b, b can be converted to c, but a cannot be converted to c?

A concrete example would help here, I think.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Jun 1, 2019

That's true, but I think that only reinforces my point: it is up to the author of the error (or error type) — not the user of the error — to decide how much information that type or value is capable of masking.

This seems odd to me. Why should the power of hiding errors be up to the creator of the error rather than the code that wants to hide it?

There is an alternative to walking the chain. Each time we wrap an error, instead of recording a link in the chain, we instead decide whether to keep or drop the underlying error value. I don't actually believe that it's necessary to expose the concept of an error chain at all.

Now that the printing and source location concerns have been removed from the errors package, it has become clear to me that all we really need here is a way of distinguishing error metadata from the actual error itself. By metadata, I mean extra information such as message annotation and source location info.

One thing that was concerning to me about the current proposal is the amount of mechanism involved. Not only do we have an Unwrap type, we also have other interface conventions, such as interface{Is(error)bool} and interface{As(interface{}) bool}, and the As implementation requires the non-trivial reflectlite package as a dependency.

To me, this feels a bit wrong. Error handling is one of the lowest levels of the language. I would like an errors package that feels irreducibly simple, and the current proposal doesn't currently make me feel that way.

How about something like this instead, as the entire implementation of the errors package?

package errors

// New returns an error that formats as the given text.
//
// The returned error's E-value is itself.
func New(s string) error {
	return &errorString{s}
}

// errorString is a trivial implementation of error.
type errorString struct {
	s string
}

func (e *errorString) Error() string {
	return e.s
}

// Error may be implemented by an error value to signify that the error
// value is adding metadata to some underlying error (the "E-value").
type Error interface {
	error

	// E returns the underlying error, or E-value. If it returns
	// nil, the receiver itself should be treated as the
	// E-value.
	E() error
}

// E returns the "E-value" of an error - the part of the error that
// should be used for error diagnosis.
//
// When writing code that makes a decision based on an error, the E
// value should always be used in preference to the error value itself,
// because that allows functions to add metadata to the error, such as
// extra message annotations or source location information, without
// obscuring the actual error.
func E(err error) error {
	for {
		err1, ok := err.(Error)
		if !ok {
			return err
		}
		e := err1.E()
		if e == nil {
			return err
		}
		err = e
	}
}

We don't need the Is or As methods any more, because they're trivially implemented in normal Go:

 if errors.E(err) == io.EOF { .... }

 if err, ok := errors.E(err).(*os.PathError); ok { ... }

Although the E method looks superficially similar to Unwrap, it's not the same, because error wrappers don't need to preserve the error chain - they can just keep the most recent E-value of the error that's being wrapped. This means that error inspection is O(1). The reason for the loop inside the E function is to keep error implementations honest and to avoid confusion: errors.E(err) will always be the same as errors.E(errors.E(err)).

Once we have this, error annotation and other error metadata can be left to other packages. The %v and %w verbs of the fmt package can retain their current semantics.

Here's a simple error-wrapping example:

// Notef returns err prefixed by the given formatted message.
// The returned error's E-value is left unchanged.
// If err is nil, Notef returns nil.
func Notef(err error, msg string, args ...interface{}) error {
	if err == nil {
		return nil
	}
	if m := fmt.Sprintf(msg, args...); m != "" {
		return noteError{
			msg: m,
			err: errors.E(err),
		}
	}
	return err
}

type noteError struct {
	msg string
	err error
}

func (err noteError) Error() string {
	return err.msg + ": " + err.err.Error()
}

func (err noteError) E() error {
	return err.err
}

@mvdan
Copy link
Member

mvdan commented Jun 1, 2019

One thing that was concerning to me about the current proposal is the amount of mechanism involved.

Just want to add a big +1 to this point in particular. I think the current design's reliance on reflect isn't a good idea, and the need for reflectlite to exist is a good example.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Jun 1, 2019

Responding to a couple of previous questions:

@jba

I believe this is an implementation of KeepType:

Yes, that works. I'd forgotten about the As-method check. But, as I said above, I'm really not comfortable with the complexity of the current implementation. The As and Is interfaces seem cumbersome to me. Why should an error need to be able to pretend to be something else?

With the trivial implementation I've suggested above, all the complexity melts away AFAICS.

@neild

Consider:

var MyEOF = fmt.Errorf("my EOF: %w", io.EOF)

This satisfies both errors.Is(MyEOF) and errors.Is(io.EOF) but there's no way to preserve the former without also preserving the latter.

I struggle to see a situation where you want non-transitive Is: x is MyEOF, MyEOF is io.EOF, but x is not io.EOF.

Think about io.ErrUnexpectedEOF. You don't want errgo.Is(err, io.EOF) to return true when err is io.ErrUnexpectedEOF. They're different kinds of error, even though io.ErrUnexpectedEOF is directly related to io.EOF.

The similar case with As seems equally confusing: a can be converted to b, b can be converted to c, but a cannot be converted to c?

I don't understand this.

@jba
Copy link
Contributor

jba commented Jun 1, 2019

  • you may want to know about all errors in a chain, for logging
  • you may only want to know about some errors in a chain, for handling
    ...

The current Is/As functions tie the first two kinds of errors chains together, when conceptually they are different.

The intent of Unwrap/Is/As is to provide error information to programs, for handling (the second bullet). The Error method returns the string for logging. It should include information for the entire list of errors, even those not returned by Unwrap.

The two cases correspond to two Errorf verbs:

  • fmt.Errorf("...: %v", err): display err for logging, but not handling
  • fmt.Errrof("...: %w", err): display err for logging, and also make it available for handling

@jba
Copy link
Contributor

jba commented Jun 1, 2019

I think the current design's reliance on reflect isn't a good idea, and the need for reflectlite to exist is a good example.

It's unfortunate that we needed reflectlite, but that's an implementation detail. You need reflection for As, and we think As is important because it generalizes the common practice of type-asserting errors to the case of error chains.

@jba
Copy link
Contributor

jba commented Jun 1, 2019

I don't actually believe that it's necessary to expose the concept of an error chain at all.

We address this in our draft design: "We feel that a single error is too limited a view into the error chain. More than one error might be worth examining."

Why should an error need to be able to pretend to be something else?

So Is can be used in place of error predicates. For example, to ask if err is temporary, you write errors.Is(err, os.ErrTemporary). This is used in several places in 1.13, notably the net package.
For example, see the net.DNSError.Is method.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Jun 1, 2019

I don't actually believe that it's necessary to expose the concept of an error chain at all.

We address this in our draft design: "We feel that a single error is too limited a view into the error chain. More than one error might be worth examining."

I find it difficult to buy the justification there. There's only one example in the design document (os.PathError), and os.PathError is almost exclusively used just for its error string annotation. The single counter-example in the standard library is arguably only there because io.Copy doesn't indicate whether the error came from the reader or the writer.

There is nothing stopping you examining more than one error - any given error type can expose access to its underlying error; but I haven't yet seen any persuasive argument that all error inspection needs to traverse the chain. I think that this single aspect of the design accounts for almost all the complexity of the result, so it's worth more justification IMHO.

Why should an error need to be able to pretend to be something else?

So Is can be used in place of error predicates. For example, to ask if err is temporary, you write errors.Is(err, os.ErrTemporary). This is used in several places in 1.13, notably the net package.
For example, see the net.DNSError.Is method.

I'm not sure that the generality of Is is needed, at least for the current use cases in the standard library.

Of the eight implementation of the Is method in the standard library, all but two are there to implement temporary/timeout errors. Is errors.Is(err, os.ErrTemporary) qualitatively better than os.IsTimeout(err)?

If we want syscall.Errno to be associated with os package errors, we could make it define a method that exposes that functionality:

func (e Errno) OSError() error {
	switch e {
	case EACCES, EPERM:
		return oserror.ErrPermission
	case EEXIST, ENOTEMPTY:
		return oserror.ErrExist
	case ENOENT:
		return oserror.ErrNotExist
	}
	return nil
}

Then in the os package, we could define:

// OSError is implemented by errors that may be associated with
// an error defined in the os package.
type OSError interface {
	OSError() error
}

// Error returns the underlying OS error of the
// given error, if there is one, or nil otherwise.
//
// For example, os.IsNotExist(err) is equivalent to
//     os.Error(err) == os.ErrNotExist.
func Error(err error) error {
	err1, ok := err.(OSError)
	if ok {
		return err1.OSError()
	}
	return nil
}

This would then cover the only other place that Is is defined in the standard library: in internal/web.HTTPError:

func (e *HTTPError) OSError() error {
	if e.StatusCode == 404 || e.StatusCode == 410 {
		return os.ErrNotExist
	}
	return nil
}

I'm very sorry that I seem to have seriously derailed this from the original issue. All this really belongs in a separate issue, as it's clear that you can maintain error hygiene with the existing design, albeit with some awkwardness (this is how KeepType would actually need to look, I think).

I'm just uncomfortable with the current design. If it lands now, we'll be stuck with it forever and I don't feel we've had enough experience so far with it to justify it.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Jun 1, 2019

It's unfortunate that we needed reflectlite, but that's an implementation detail. You need reflection for As, and we think As is important because it generalizes the common practice of type-asserting errors to the case of error chains.

Go already has a type assertion statement. The fact that such a low level package as errors needs to resort to interface{} and reflect-based magic is surely not ideal. Beginners will need to be exposed to this from a very early stage, and it's really not easy to explain.

@neild
Copy link
Contributor

neild commented Jun 1, 2019

Think about io.ErrUnexpectedEOF. You don't want errgo.Is(err, io.EOF) to return true when err is io.ErrUnexpectedEOF. They're different kinds of error, even though io.ErrUnexpectedEOF is directly related to io.EOF.

This example confuses me, because errors.Is(io.ErrUnexpectedEOF, io.EOF) does return false. They may be related, but ErrUnexpectedEOF is not EOF.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Jun 1, 2019

This example confuses me, because errors.Is(io.ErrUnexpectedEOF, io.EOF) does return false. They may be related, but ErrUnexpectedEOF is not EOF.

Yes, that's of course true. But if you're thinking of error chains, one could consider that io.ErrUnexpectedEOF is a kind of wrapper for io.EOF. You wouldn't want it to be visible as such, and that's a part of my point. Even though the underlying error is notionally io.EOF (that's what caused the unexpected-EOF error), you'd never want errors.Is(err, io.EOF) to return true, because the outer error takes precedence. I think that's true in general, and that any unwrapping functionality should be domain-specific rather than generic across all errors.

@rogpeppe
Copy link
Contributor Author

rogpeppe commented Jun 3, 2019

I'm very sorry that I seem to have seriously derailed this from the original issue. All this really belongs in a separate issue, as it's clear that you can maintain error hygiene with the existing design

I'm closing this issue in favour of #32405.

@rogpeppe rogpeppe closed this as completed Jun 3, 2019
@golang golang locked and limited conversation to collaborators Jun 2, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

8 participants