Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: context: add API for performing tasks after Context is Done #19643

Closed
zombiezen opened this issue Mar 21, 2017 · 36 comments
Closed

proposal: context: add API for performing tasks after Context is Done #19643

zombiezen opened this issue Mar 21, 2017 · 36 comments

Comments

@zombiezen
Copy link
Contributor

zombiezen commented Mar 21, 2017

Summary

A common need when writing server code is performing a task that intentionally goes past the deadline of the request while preserving the request's context values (e.g. trace ID). These tasks may block a request handler from returning (synchronous) or may need to happen in the background, past the return of a request handler (asynchronous). Creating a Context that obtains values from a parent context correctly is difficult in the asynchronous case because:

  1. Context keys cannot be iterated over. This is intentional.
  2. Even if you could iterate over Context keys, there's no indication of which values should be preserved. For example, the trace ID of the original request should be preserved, but the background task should create a new span with its lifetime. However, any value that's tied to the lifetime of the request should not be present in the background task's Context.
  3. As we've found when forming this API, there's quite a few ways to get the concurrency wrong. Ensuring the parent and child have cancelation and wait channels piped through requires careful thought.

I would like to add these functions to the context package (but happy to start with x/net/context):

package context

// IgnoreCancel returns a new Context which takes its values from parent but
// ignores any cancelation or deadline on parent.
//
// IgnoreCancel must only be used synchronously.  To detach a Context for use in
// an asynchronous API, use Detach instead.
func IgnoreCancel(parent Context) Context

// A PreserveFunc takes a value stored in a parent Context and returns the
// corresponding value to be stored in a Context derived from it.
//
// If the PreserveFunc returns a nil interface{}, the detached Context
// will not contain any value for the key.
//
// If the detached value is non-nil, the PreserveFunc may return nil or a "close"
// function to destroy or clean up the value, which will be called exactly once
// when the value is no longer in use.
type PreserveFunc func(interface{}) (detached interface{}, close func())

// RegisterPreserveFunc associates a Context key with a PreserveFunc to be used to
// populate it.  RegisterPreserveFunc must only be called during package initialization.
func RegisterPreserveFunc(key interface{}, f PreserveFunc)

// Detach starts a new goroutine that calls f with a new Context that derives values
// from parent.
//
// The values in the new Context are obtained by calling the PreserveFunc for
// each registered key.  When f returns, the detached values will be "closed"
// using the function returned by the PreserveFunc. Keys without a PreserveFunc
// are not associated with any value in the new Context.  If the PreserveFunc
// for a key returns a nil value, the new Context will not have a value for that
// key.
//
// Detach returns a DetachedTask that can be used to cancel and/or wait for the
// function to return.  The new Context will only be Done if DetachedTask is
// canceled; the parent's cancelation or deadline is ignored.  Most users can
// safely ignore the DetachedTask.
//
// Detach must not be called until all packages have been initialized.
func Detach(parent Context, f func(ctx Context)) DetachedTask

// A DetachedTask describes the status of a detached function call
// started by calling Detach.
type DetachedTask struct {
	// ...
}

// Cancel cancels the Context passed to the function.
// After the first call, subsequent calls to Cancel do nothing.
func (t DetachedTask) Cancel()

// Finished returns a channel that is closed when the function has returned and
// the close functions for its Context have completed.
func (t DetachedTask) Finished() <-chan struct{}

/cc @bcmills @rakyll

Use Cases

Making another RPC to clean up a resource (like rolling back a transaction) after receiving a cancel. This would likely be a synchronous action (an IgnoreCancel usage).

func handler(ctx context.Context) {
  // do something...
  cleanupCtx, cancel := context.WithTimeout(context.IgnoreCancel(ctx), 5 * time.Second)
  Cleanup(cleanupCtx)
  cancel()
}

Wanting to log or update a counter after servicing a request but not wanting to block the response. This would be an asynchronous action (a Detach() usage).

func handler(ctx context.Context) {
  context.Detach(ctx, func(ctx context.Context) {
    if err := SendNotification(ctx); err != nil {
      log.Infof(ctx, "couldn't send background notification: %v", err)
    }
  })
@bcmills
Copy link
Contributor

bcmills commented Mar 21, 2017

Within the context package I think the function names Go and Task are a bit too ambiguous. How about Detach or StartDetached and DetachedTask instead?

@zombiezen
Copy link
Contributor Author

Done.

@jba
Copy link
Contributor

jba commented Mar 21, 2017

Go and Task are attractive. Too attractive. Who wouldn't want to use them instead of

ctx, cancel := context.WithCancel(ctx)
finc := make(chan struct{})
go func() {
  defer close(finc)
  ... ctx ...
}()

My point is, you've carved out a nice API that is independent of detach semantics. Is it worth separating the two?

@zombiezen
Copy link
Contributor Author

@jba We have found that separating the detachment semantics is quite error-prone. It's easy to detach and then use the context in a synchronous manner, which isn't how it should be used.

@bradfitz bradfitz changed the title context: add API for performing tasks after Context is Done proposal: context: add API for performing tasks after Context is Done Mar 21, 2017
@gopherbot gopherbot added this to the Proposal milestone Mar 21, 2017
@ianlancetaylor
Copy link
Contributor

// If the detached value is non-nil, the PreserveFunc may also return a "close"
// function to destroy or clean up the value, which will be called exactly once
// when the value is no longer in use.

When you say "may return" I take it that you mean that close may be returned as nil. You should say that.

// DetachedTask must not be called until all packages have been initialized.

That's an odd requirement. It also doesn't make sense as written. Do you mean that methods of DetachTask must not be called? Or do you mean that Detach must not be called until all packages have been initialized, to ensure that all calls to RegisterPreserveFunc have been made? Do we really need this requirement?

@zombiezen
Copy link
Contributor Author

@ianlancetaylor Clarified the nil, and fixed the bad find/replace that caused the confusion in the sentence you highlighted. I meant Detach must not be called.

@ianlancetaylor
Copy link
Contributor

Why do we need to say that Detach must not be called until all packages have been initialized?

@bcmills
Copy link
Contributor

bcmills commented Mar 21, 2017

Because we don't want Detach to race with RegisterPreserveFunc.

@ianlancetaylor
Copy link
Contributor

OK, it's not a big deal, but I sort of think we should say that instead. The current docs seem to be writing a useful guideline into the API. How about "Detach and RegisterPreserveFunc must not be called concurrently."

@bcmills
Copy link
Contributor

bcmills commented Mar 21, 2017

Since both Detach and RegisterPreserveFunc are global functions, they can both be called from libraries that bear no relationship to each other beyond both using context. I think we need to provide scalable guidance on how to avoid calling them concurrently, and the simplest way to do that seems to be to split them into "only in init" and "never in init".

@ianlancetaylor
Copy link
Contributor

I'm fine with guidance, I just don't think we should confuse guidance with API documentation.

@neild
Copy link
Contributor

neild commented Mar 22, 2017

Is RegisterPreserveFunc necessary?

type Preserver interface {
  // Preserve takes a value stored in a parent Context and returns the
  // corresponding value to be stored in a Context derived from it.
  Preserve(interface{}) interface{}
}

type Releaser interface {
  // Release takes a value returned by Preserve. It is called after the
  // detached task associated with the value completes.
  Release(interface{})
}

// Detach starts a new goroutine that calls f with a new Context that derives
// values from parent.
//
// The values in the new Context are obtained by calling the Preserve
// method for each registered key. When f returns, the key's Release
// method (if any) will be called with the new value.  Keys that do not
// implement Preserver are not associated with any value in the new
// Context. If Preserve for a key returns a nil value, the new Context
// will not have a value for that key.
func Detach(parent Context, f func(ctx Context)) DetachedTask

@zombiezen
Copy link
Contributor Author

@neild This is a subtle one. If this were purely an addition to context, then an interface like that would work. (I would be in favor of having just Preserver with a close-function return.) However, since we likely want this to work across versions of Go, the versions in x/net/context would not be able to iterate over the Context keys, making use of Preserver impossible in implementation.

@rogpeppe
Copy link
Contributor

My gut feeling is that I'm not keen, but I'd like to see more concrete use cases.

Firstly, IgnoreCancel seems easy to do, not too controversial and
orthogonal to the detach stuff (does it need interact with that at all?)

The fact that one cannot iterate across keys makes Detach problematic from
a performance perspective - every time you detach a context, you'll need
to traverse a map containing every registered key. The init requirement
also means that you can only preserve singleton keys, but there's no
particular reason keys must be singletons. Also, global registries
can be problematic with respect to package cloning - having a global
registry means that it's not possible to fork the context package and
still preserve the global Detach semantics.

I haven't seen much explanation of why the detached context values need
the "close" function. Surely if you need that, you can arrange to have
the value close itself by listening on the Done channel?

Why is it not viable to inherit all keys from the parent context but
choose to overwrite selected keys (for example to change the new span id)
in the child context? If you do this, then Detach becomes unnecessary AFAICS.

@bcmills
Copy link
Contributor

bcmills commented Mar 22, 2017

every time you detach a context, you'll need to traverse a map containing every registered key

Yes, that is an unfortunate downside of the Context interface in general. (It also prevents Context implementations from performing certain kinds of compactions to speed up Value calls.)

The init requirement also means that you can only preserve singleton keys, but there's no particular reason keys must be singletons.

True, but in practice it is rare for the keys not to be singletons. (If the keys are some dynamic value, how is that value made known to the caller? It's either stored under some other key in the Context or passed explicitly in some other parameter: either way, that suggests that the dynamic key is redundant.)

Also, global registries can be problematic with respect to package cloning

Yep, that's a good point.

I haven't seen much explanation of why the detached context values need the "close" function.

Consider a context-scoped logging package. At some point you need to flush the log entries to their destinations (disk or a monitoring server or similar). The close function indicates that the Context is no longer in use and allows the log entry to be flushed.

Surely if you need that, you can arrange to have the value close itself by listening on the Done channel?

No: the Done channel is closed as soon as the caller is no longer interested in the result, but you have no guarantee that other users of the Context have noticed that cancellation yet. They may still be using the values obtained from the Context.

Why is it not viable to inherit all keys from the parent context but choose to overwrite selected keys (for example to change the new span id) in the child context?

That is another viable approach, but it is more prone to unintended extension of Value lifetimes.

It also retains a reference to the parent Context and all of the values reachable from it, which may result in unexpectedly high memory usage if the parent Context contains many keys or large values.

@bcmills
Copy link
Contributor

bcmills commented Mar 22, 2017

One possible alternative to RegisterPreserveFunc would be to use the Preserver and Releaser interfaces that @neild suggests and augment them with an additional API for iterating over the keys of a Context. We wouldn't be able to Detach from a Context that doesn't implement the new methods, but perhaps that's not so big a problem in practice.

That might look something like:

package context

// Ranger is an optional interface allowing iteration over keys and values in a Context.
type Ranger interface {
  // Range invokes the given function sequentially for every key and value stored
  // in the Context, in no particular order.
  //
  // If the function returns false for any key, Range immediately returns nil.
  //
  // If some parent Context does not implement Ranger, Range returns ErrRangeIncomplete
  // after calling the function for all known keys and values.
  Range(func(key, value interface{}) bool) error
}

// ErrRangeIncomplete indicates that a call to Range may have omitted some keys
// due to a parent Context that does not implement the Ranger interface.
var ErrRangeIncomplete =

That leaves open the question of what Detach should do if Range returns ErrRangeIncomplete: should it panic? Return an error? (But checking Detach errors seems tedious when they aren't likely to fail in practice.) Silently omit the missing keys and values?

@rsc
Copy link
Contributor

rsc commented Mar 27, 2017

/cc @Sajmani

@Sajmani
Copy link
Contributor

Sajmani commented Apr 1, 2017

I like the detached value registry better than the Preserver, Releaser, Ranger approach. A registry avoids the whole incomplete iteration issue, and it allows client code or frameworks to decide how to manage context values defined in other packages.

I'd like to avoid having the registry be a part of the context package. We could do this by pulling the registry out into a separate ctxreg package, but I'm thinking about a different kind of change to the context package that (1) allows the detach registry and behavior to be in a separate package and (2) opens up some new possibilities.

I propose to define a new context.WithNewValues function that creates a Context that uses a fresh set of values:

package context

type Values interface {
  Value(key interface{}) interface{}
}

// WithNewValues returns a Context that satisfies Value calls using vs.Value instead of ctx.Value.
// If vs is nil, the returned Context has no values.
// The returned Context behaves like ctx in all other ways.
func WithNewValues(ctx Context, vs Values) Context

Since Context satisfies Values, IgnoreCancel is just:

ctx := context.WithNewValues(context.Background(), ctx)

(Some risk of arg-swapping confusion in WithNewValues; perhaps there's a way to improve.)

The WithNewValues API opens up some other interesting possibilities, like clearing the set of values in a Context or compacting their representation. This is the main reason for putting it in the main context package instead of a separate one.

A separate package can now define how to get the Values from a Context, in particular, it could use a registry:

package detach

func RegisterPreserveFunc(key interface{}, f PreserveFunc)

// Go starts a new goroutine running f, passing it a ctx with values detached from parent.
func Go(parent Context, f func(ctx Context)) *Task

I prefer the Go for functions that start goroutines. And detach.Task reads well.

Are there any implementation reasons why using a separate detach package is problematic? Does my proposal satisfy the goals of this proposal?

@zombiezen
Copy link
Contributor Author

@Sajmani Your proposal would satisfy the goal. However, I'm not sure how much utility WithNewValues is, and it is certainly easy to write it without aid of the context package.

@rsc
Copy link
Contributor

rsc commented Apr 10, 2017

I'm quite scared of the complexity being proposed here. This feels like atexit for contexts, with all the relevant bugs. So I'm going to push back on this a bit:

A common need when writing server code is performing a task that intentionally goes past the deadline of the request while preserving the request's context values (e.g. trace ID). These tasks may block a request handler from returning (synchronous) or may need to happen in the background, past the return of a request handler (asynchronous).

Examples please?

@bcmills
Copy link
Contributor

bcmills commented Apr 10, 2017

A few examples:

  • Writing logs after serving a (low-latency) request.
  • Finishing a request for an asynchronous protocol (perhaps after committing the request to a low-latency journal).
  • Cleaning up intermediate remote resources after an operation using those resources has been cancelled.
  • Propagating results to distributed caches, replicas, or mirrors.

@rsc
Copy link
Contributor

rsc commented Apr 10, 2017

Why is it not enough to just implement (even outside the context package) context.WithoutDeadline, that returns a Context implementation that has no deadline but reports the same key-values as the original?

@bcmills
Copy link
Contributor

bcmills commented Apr 10, 2017

Because the values stored in the Context are scoped to some particular call tree, and some of them (such as tracing and other per-request logging) need to flush entries to the log once the corresponding operation is finished.

If you know exactly which per-context logging systems are in use, WithoutDeadline would suffice (you can add explicit user code to start and/or connect traces), but that kind of assumption doesn't scale well to large programs (or, especially, to large general-use libraries).

@rsc
Copy link
Contributor

rsc commented Apr 10, 2017

Context works because it is small. The current proposal roughly doubles the API surface, and I can't see that it even covers all the important cases.

My point about context.WithoutDeadline is that having it would let you pass the context to things that shouldn't care the deadline expired. That seems defensible, and also easy to implement outside the context package.

I admit that this does not provide any kind of "atexit" functionality to automatically schedule work when the context expires, but that's a feature. There is no well-defined point when the "main work" is done, at least not that the context implementation knows. The kinds of things listed in #19643 (comment) should be run explicitly, not magically as part of a context finishing.

@bcmills
Copy link
Contributor

bcmills commented Apr 10, 2017

The kinds of things listed in #19643 (comment) should be run explicitly, not magically as part of a context finishing.

Agreed. The problem is that with the current API, there is no valid Context to pass to those explicit calls such that their actions can be included in Context-scoped logs.

The only step which we are proposing should be done "magically" is committing the Context-scoped log entries for the actions taken as part of that explicit work, and the reason to do that centrally is because the code has no way of knowing which (if any) Context-scoped logging systems are in use in the rest of the codebase.

@rsc
Copy link
Contributor

rsc commented Apr 10, 2017

I'm back to not understanding. Now it sounds like you're saying that (1) you want to run some background operation like flush logs, but (2) before that you want to run a background operation like replicate to other servers, and that operation might generate logs. So now it seems like we're kind of talking about a whole job-scheduling system baked into Context.

The tell here is "Detach starts a new goroutine...". In general we avoid APIs that do that. Users should be in charge of creating new goroutines, not libraries. There are exceptions, like time.AfterFunc and maybe finalizers, but those are exceptions, not the rule. I don't think the situation here is nearly strong enough. (I think there was a proposal to add that kind of thing to WaitGroup and we declined that too, but I can't find it.)

@bcmills
Copy link
Contributor

bcmills commented Apr 10, 2017

now it seems like we're kind of talking about a whole job-scheduling system baked into Context.

There's not really any scheduling to it. It is, essentially, "run this function and then flush the Context logs". It just happens that you don't need to flush the logs if you're not also spawning a goroutine, because you can rely on your caller to flush those same logs when they return.

You could easily envision a synchronous version instead (see below). Unfortunately, the version we tried turns out to be (empirically) too difficult to use correctly.

The tell here is "Detach starts a new goroutine...". In general we avoid APIs that do that. Users should be in charge of creating new goroutines, not libraries.

We started with an API based on keeping the goroutines in user code, along the lines of:

// Detached returns a new Context with values derived from parent.
//
// The caller must call the returned "close" function exactly once when the values from
// the new Context are no longer in use by any goroutine.
//
// The values in the new Context are obtained by calling the PreserveFunc for each
// registered key. Keys without a PreserveFunc are not associated with any value in
// the new Context. If the PreserveFunc for a key returns a nil value, the new Context
// will not have a value for that key.
//
// The new Context is never Done, even if the parent Context is canceled or reaches
// its deadline.
func Detached(parent context.Context) (_ context.Context, close func())

Empirically, a significant fraction of attempted users ended up calling Detached from the child goroutine instead of the parent, when the parent's logs (and their links to child logs) might have already been flushed. Another significant minority didn't call the close function at all (resulting in missing logs and/or memory leaks).

@rsc
Copy link
Contributor

rsc commented Apr 10, 2017

But how is "and then flush the Context logs" implemented? I assume that's user code, since otherwise I don't know what a Context log is, so "run this function and then flush the Context logs" is really "run this function and then run this other function". How does that avoid generalizing to arbitrary sequences? / Why is it appropriate to stop at two?

@bcmills
Copy link
Contributor

bcmills commented Apr 10, 2017

I assume that's user code, since otherwise I don't know what a Context log is, so "run this function and then flush the Context logs" is really "run this function and then run this other function". How does that avoid generalizing to arbitrary sequences? / Why is it appropriate to stop at two?

It is 2N+1 calls in the general case:

The pairing of calls before and after is what makes the API interesting: the first batch of N calls must be made while the parent logs have not yet been flushed (so that they can emit links to the child logs), and the last batch of N calls must not be forgotten (so that the child logs are actually written). In the general case, a library function receiving a Context does not know the value of N, nor the set of keys that need these pairings of calls.

@rsc
Copy link
Contributor

rsc commented Apr 11, 2017

OK, I reread this whole thread carefully, and while I think I understand the problem being solved, it seems like a solution tailored to a very specific problem. Is there any evidence that this is a widespread need? Is there any evidence that there aren't N more of these that need solving as well? It just feels like "I have this precisely-shaped problem hole and this precisely-shaped solution exactly fits into that hole exactly." Usually instead we look for solutions that fit an entire class of problem holes. This proposal still seems far too complex and specific, not like typical Go solutions.


Also, @Sajmani proposed a different solution and in response @zombiezen wrote in #19643 (comment):

Your proposal would satisfy the goal. However, I'm not sure how much utility WithNewValues is, and it is certainly easy to write it without aid of the context package.

If (1) Sameer's proposal satisfies the goal and (2) Sameer's proposal can be written without any help from the context package, why not start there? Introduce a new self-contained detach package in your own code base, use it, and learn from it whether there needs to be anything at all in the context package proper.

@zombiezen
Copy link
Contributor Author

@rsc @Sajmani's solution is not just the WithNewValues function. It would also have the registry of "detachable" context keys, which is the more important part. And while it does not explicitly need the aid of the context package (the Google equivalent of this package does not), having a discoverable registry when people are creating context keys is important.

@rsc
Copy link
Contributor

rsc commented Apr 11, 2017

OK, well the main point is the part above the fold in my reply: this seems overfitted to the specific problem being solved.

@Sajmani
Copy link
Contributor

Sajmani commented Apr 17, 2017

@zombiezen @rsc My proposal is specifically to move any registry to a separate package and not the context package. I don't want global state like the registry in the context package, so that it's easier to permit multiple versions of the context package to coexist in a program. Packages with global state need to be singly-versioned in a program, so best to keep them focused on that state.

Also, while context.WithNewValues could be implemented outside the context package, there is an important optimization available to context types within the context package: we can avoid starting a goroutine in propagateCancel when all the context types in a chain are from the context package. It might be interesting to make this optimization available to external Context implementations, but context.WithNewValues is small enough that it seems OK to put in the core, and it allows for detach and other interesting use cases, like optimizing the representation of Value while retaining the core cancelation functionality.

@rsc
Copy link
Contributor

rsc commented Jun 15, 2017

I talked some more with @Sajmani about this, and we both believe that there just isn't enough experience with what's needed here to commit to something in the standard library today. As such, I am going to close this proposal as declined.

This is what I see as the path forward here:

  1. Someone should survey and summarize uses of context-manipulation routines used today, including but not limited to the ones in Google's own code base, and see what commonalities there are, with an eye toward understanding what additional functionality, if any, needs to go into the standard context package.

  2. Initial experimentation with that additional functionality should, if at all possible, happen in a separate package, perhaps in a subdirectory of x/net/context, and then we can easily move into standard context once we have experience. It's OK if the initial version is less efficient because optimizations that could be made within the context package itself are not available.

  3. That experimentation may have its own proposals, but once the experimentation has converged
    to something that may be important to add to context itself, a new proposal should be submitted (separate issue #, not reusing this one).

Thanks.

@Sajmani
Copy link
Contributor

Sajmani commented Jul 6, 2017

(not reopening the issue)

In an earlier comment, I proposed defining this interface:
type Values interface {
Value(key interface{}) interface{}
}
For use with a context.WithNewValues function.

It occurred to me that the ability to separate a context's values from the rest of the context (notably its deadline and cancelation) is also useful in logging and tracing. Some logging APIs want values from a context, but it is somewhat confusing to pass a Context to a Log call, since this suggests that the Log call might be canceled. With a Values interface, we can define:
func Log(vals context.Values, ...)
Which makes it clear that the logger is only consuming the values from the Context.

@zombiezen
Copy link
Contributor Author

@neild and I tossed this exact idea around a few months ago when looking at logging APIs. Things to consider:

  • Creates more API surface that needs to be documented.

  • Unclear whether this will encourage creating types that implement Values but not Context, and whether that is harmful or not. (Requires research/experimentation, not a strict downside.)

  • You definitely would want a function that converts from a Values into a Context with no cancellation/deadline. (Again, a little more API surface.)

  • Since Values can be thought of as a "lite Context", it runs somewhat afoul of the Code Review comment about Contexts:

    Don't create custom Context types or use interfaces other than Context in function signatures.

    Our rationale has been that this can make automated refactoring to add Context difficult. However, if it was a well-known type in the context package — and you are the one that has been generally looking at such automation 😃 — that could tip the scale a bit.

(And while I'm here: thanks @rsc, the resolution here seems reasonable.)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants