Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

context: Cannot tell which deadline or timeout cause expiration #35791

Closed
darkfeline opened this issue Nov 22, 2019 · 7 comments
Closed

context: Cannot tell which deadline or timeout cause expiration #35791

darkfeline opened this issue Nov 22, 2019 · 7 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@darkfeline
Copy link
Contributor

It is not currently possible to tell which deadline or timeout caused a context to expire. This makes debugging timeouts impossible.

Here's an example of when this is a problem: we have some RPC that is constantly failing because of context.DeadlineExceeded. We want to figure out which deadline is being exceed so we can evaluate whether we should increase it. The code looks like the following (may not be correct or idiomatic, demonstrative only):

func main() {
  // ...
  err := Foo(ctx)  // This is failing with context.DeadlineExceeded
}

func Foo(ctx context.Context) error {
  ctx = context.WithDeadline(ctx, ...)
  return Bar(ctx)
}

func Bar(ctx context.Context) error {
  ctx = context.WithDeadline(ctx, ...)
  return Baz(ctx)
}

func Baz(ctx context.Context) error {
  return doThing(ctx)  // May have internal deadlines
}

There is no convenient way to tell whether Foo or Bar's deadline caused the context to expire. It is possible, strictly speaking:

func main() {
  // ...
  err := Foo(ctx)  // This is failing with context.DeadlineExceeded
  switch err {
  case fooExpiredErr:
  case barExpiredErr:
  }
}

func Foo(ctx context.Context) error {
  ctx = context.WithDeadline(ctx, ...)
  err := Bar(ctx)
  if errors.Is(err, context.DeadlineExceeded) && ctx.Err() != nil {
    return fooExpiredErr  // Implements errors.Is context.DeadlineExceeded
  }
  return err
}

func Bar(ctx context.Context) error {
  ctx = context.WithDeadline(ctx, ...)
  err := Baz(ctx)
  if errors.Is(err, context.DeadlineExceeded) && ctx.Err() != nil {
    return barExpiredErr  // Implements errors.Is context.DeadlineExceeded
  }
  return err
}

func Baz(ctx context.Context) error {
  return doThing(ctx)  // May have internal deadlines
}

Note that the above code is non-trivial to get right (I'm still not sure it's completely correct/bulletproof).

@darkfeline
Copy link
Contributor Author

API backward compatibility-wise, I have a few ideas of ways to add support to this. I think the implementation of any solution is straightforward, backward compatibility is the hardest part.

Solutions depend on what exactly should be considered part of the API of the context package.

It's not stated in the docs, but it's suggested in https://github.com/golang/go/wiki/CodeReviewComments#contexts to not create custom Context types. That suggests that adding methods to the context.Context does not break backward compatibility. If so, we can add a Origin() method that returns some value that can be associated with the originating deadline or cancellation.

It seems ambiguous to me whether Context.Err() is allowed to return values other than context.Canceled and context.DeadlineExceeded. If it can return other errors, we can add new error values that contain origination information. These errors can be errors.Is one of the two existing sentinal values.

If none of those pass the backward compatibility check, then we can add a new method/interface for the Origin() method, which all contexts are guaranteed to implement, and/or provide an Origin(Context) function like errors.Unwrap().

For adding originating information, we can add new functions WithCancelOrigin or WithDeadlineOrigin that accept a new parameter which is some origination information that would be queryable via one of the previous APIs, if it caused the context cancellation. The existing With* functions would be equivalent to calling the With*Origin functions with some generic origination value.

@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Nov 26, 2019
@toothrot toothrot added this to the Backlog milestone Nov 26, 2019
@toothrot
Copy link
Contributor

/cc @Sajmani @bradfitz

@darkfeline
Copy link
Contributor Author

See also #26356

I am currently handling an outage that is made more difficult because I can't tell why a context was canceled. It's not fun.

@Sajmani
Copy link
Contributor

Sajmani commented Jun 25, 2020

Unfortunately I think it will be difficult to support this in a backwards-compatible way, because existing client code assumes that ctx.Err() is either nil, context.Canceled, or context.DeadlineExceeded:

    // If Done is not yet closed, Err returns nil.
    // If Done is closed, Err returns a non-nil error explaining why:
    // Canceled if the context was canceled
    // or DeadlineExceeded if the context's deadline passed.
    // After Err returns a non-nil error, successive calls to Err return the same error.
    Err() error

Switching such client code to use errors.Is(ctx.Err(), context.Canceled) would provide the flexibility we need to support this feature, but there's no way to ensure all such code is changed.

In hindsight, these error values should not have been exported, and we should have provided functions like context.IsCanceled(err) and context.IsDeadlineExceeded(err).

If we want to change this, we will need to introduce these functions long with lint/vet checks to complain about directly using the original values, and a go fix command to update client code (tricky, since we'll need to fix code that does switch ctx.Err()). This is possible but will take a year or two to allow client code to migrate before we can break things. (I don't think the Go 1 compatibility guarantee permits even this slow transition, but I think we need to start thinking about how we will make such API changes for Go standard libraries.)

CC @rsc

@Sajmani
Copy link
Contributor

Sajmani commented Jun 25, 2020

There's also this backwards-compatible proposal from @bcmills :
#26356 (comment)

@julien-may
Copy link

julien-may commented Nov 30, 2020

Can we mark this as a duplicate and close it in favor of #26356 ?

@ianlancetaylor
Copy link
Contributor

Yes, done. Thanks.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants