Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

context: define that Err() == nil before Done is closed #19856

Closed
alercah opened this issue Apr 5, 2017 · 21 comments
Closed

context: define that Err() == nil before Done is closed #19856

alercah opened this issue Apr 5, 2017 · 21 comments

Comments

@alercah
Copy link

alercah commented Apr 5, 2017

Currently, context.emptyContext (the type of context.Background() and context.TODO()) has an Err() of nil. This is fine, since the context also has nil Done() and so the value of Err() is unspecified.

However, much code ends up erroneously expecting Err() to be nil before Done() is closed (if ever). If emptyContext were to give a non-nil Err(), which is permitted in the interface definition (since, again, it is unspecified), then this incorrect code would likely be caught more frequently. Since there is no requirement that Err() even be context.Canceled or context.DeadlineExceeded in this case, we could make emptyContext.Err() return something like fmt.Errorf("checking %s.Err() without checking %s.Done() first", ctx, ctx) to helpfully inform users of the mistake they are making.

This would possibly break existing code, but only code that was already incorrect. If necessary, an alternate context could be provided which preserves the current behavior for people unwilling or unable to correct their code.

@gopherbot gopherbot added this to the Proposal milestone Apr 5, 2017
@cespare cespare changed the title Proposal: Make context.emptyContext have non-nil Err() proposal: make context.emptyContext have non-nil Err() Apr 5, 2017
@ianlancetaylor
Copy link
Contributor

CC @Sajmani

@rsc
Copy link
Contributor

rsc commented Apr 5, 2017

Maybe instead we should define that until Done, Err returns nil. Are there any implementations where that's not true?

Put a different way, what problem are you trying to solve? The solution would cause pain for (some) people. What's the benefit?

@alercah
Copy link
Author

alercah commented Apr 5, 2017

That would probably be better, since it appears to be a common mis-idiom, but I think that would also more explicitly break the compatibility promise since valid existing Context implementations would become invalid.

The problem I'm trying to solve is that I was working with a dependency which was cancelling early
because the context I was working with had non-nil Err() prior to Done() being closed, and I wasted a fair bit of time debugging it.

@Sajmani
Copy link
Contributor

Sajmani commented Apr 6, 2017 via email

@rsc
Copy link
Contributor

rsc commented Apr 10, 2017

I can only find five Context implementations in go-corpus v0.01:

github.com/coreos/etcd/client/client_test.go
	fakeCancelContext is always Done, always has Err.

github.com/coreos/etcd/clientv3/watch.go
	valCtx is never Done, never has Err

github.com/gin-gonic/gin/context.go
	Context is never Done, never has Err

github.com/kataras/iris/context.go
	Context is never Done, never has Err

github.com/onsi/gomega/format/format_test.go
	ctx is never Done, never has Err

I searched for ' Done() <-chan struct{}' and sifted through all the non-context uses of that.

All these implementations follow the 'Err is nil until Done' rule. I suspect the main implementations do too, but I haven't checked. All evidence for changing

Err's return value is unspecified before Done is closed.

to

Err's return value should be nil before Done is closed.

and not non-nil.

@alercah
Copy link
Author

alercah commented Apr 10, 2017

This would be a better solution from my view; I didn't suggest it because I assumed it was an unacceptable breaking change.

@rsc
Copy link
Contributor

rsc commented Apr 11, 2017

I also looked in the go-corpus for checks of Err() against nil and they are all over the place (I've filtered out vendored copies of these, which would also need to be updated, and I've also exclude *_test.go files). It's telling that we can't even get this right in our own code (first two):

cloud.google.com/go/bigtable/internal/gax/invoke.go

// If the deadline is exceeded...
if ctx.Err() != nil {
	return ctx.Err()
}

golang.org/x/crypto/acme/autocert/cache.go

// prevent overwriting the file if the context was cancelled
if ctx.Err() != nil {
	return // no need to set err
}

github.com/coreos/etcd/client/client.go

case <-hctx.Done():
	// cancel and wait for request to actually exit before continuing
	reqcancel()
	rtresp := <-rtchan
	resp = rtresp.resp
	switch {
	case ctx.Err() != nil: <<< not ok
		err = ctx.Err()
	case hctx.Err() != nil: <<< ok
		err = fmt.Errorf("client: endpoint %s exceeded header timeout", c.endpoint.String())
	default:
		panic("failed to get error from context")
	}

github.com/coreos/etcd/clientv3/client.go

if ctx != nil && ctx.Err() != nil {
	return true
}

github.com/coreos/etcd/mvcc/kvstore.go

if ctx == nil || ctx.Err() != nil {
	...
	return
}

github.com/coreos/etcd/tools/benchmark/cmd/watch_get.go

for {
	st := time.Now()
	_, err := getClient[0].Get(ctx, "abc", v3.WithSerializable())
	if ctx.Err() != nil {
		break
	}
	r.Results() <- report.Result{Err: err, Start: st, End: time.Now()}
}

github.com/docker/docker/daemon/cluster/cluster.go

func initClusterSpec(node *swarmnode.Node, spec types.Spec) error {
	ctx, _ := context.WithTimeout(context.Background(), 5*time.Second)
	for conn := range node.ListenControlSocket(ctx) {
		if ctx.Err() != nil {
			return ctx.Err()
		}
		...
	}
	return ctx.Err()
}

github.com/docker/docker/daemon/cluster/executor/container/adapter.go

func (c *containerAdapter) inspect(ctx context.Context) (types.ContainerJSON, error) {
	cs, err := c.backend.ContainerInspectCurrent(c.container.name(), false)
	if ctx.Err() != nil {
		return types.ContainerJSON{}, ctx.Err()
	}
	if err != nil {
		return types.ContainerJSON{}, err
	}
	return *cs, nil
}

github.com/docker/docker/daemon/cluster/executor/container/controller.go

err := r.adapter.wait(ctx)
if ctx.Err() != nil {
	return ctx.Err()
}

github.com/docker/swarmkit/node/node.go

for {
	if ctx.Err() != nil {
		return
	}
	...
}

@rsc
Copy link
Contributor

rsc commented Apr 11, 2017

Based on the go-corpus results, I sent out CL 40291 narrowing the Context interface semantics. @Sajmani, the main designer of Context, is away this week, so I will leave this until he's back.

@rsc rsc changed the title proposal: make context.emptyContext have non-nil Err() proposal: context: define that Err() == nil before Done is closed Apr 11, 2017
@gopherbot
Copy link

CL https://golang.org/cl/40291 mentions this issue.

@quentinmit
Copy link
Contributor

I think your corpus results argue more that this should be a vet check than that we should change the definition of the interface. Even with the change, many (all?) of the examples you found should be waiting on ctx.Done instead.

@rsc
Copy link
Contributor

rsc commented Apr 11, 2017

@quentinmit, no they are not trying to wait, they are only doing a polling check. What they "should" be doing is:

var err error
select {
case <-ctx.Done():
    err = ctx.Err()
default:
   // ok
}
if err != nil

instead of if ctx.Err() != nil. But that's going to be a hard sell.

@gopherbot
Copy link

CL https://golang.org/cl/40396 mentions this issue.

gopherbot pushed a commit to golang/crypto that referenced this issue Apr 12, 2017
Context.Err() is not valid before Context.Done().

Updates golang/go#19856

Change-Id: I7605bb227bfc4cb542ef3db49870d4928ce704d1
Reviewed-on: https://go-review.googlesource.com/40396
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Alex Vaghin <ddos@google.com>
@RalphCorderoy
Copy link

In addition to the definition change, how about a one-shot stderr complaint from package context if Err() is called before Done() is closed?

@rsc
Copy link
Contributor

rsc commented Apr 12, 2017

I don't think so, for two reasons: (1) if we change the definition, then it's fine to call Err before Done, and (2) Go programs don't chatter to standard error except for very serious problems.

@bcmills
Copy link
Contributor

bcmills commented Apr 12, 2017

See also #14292 (comment).

The scalability implications (introducing a contention point due to the need to synchronize Err) might at least merit a lint check.

@alercah
Copy link
Author

alercah commented Apr 13, 2017 via email

@bcmills
Copy link
Contributor

bcmills commented Apr 13, 2017

@alercah You're correct that Err can use exactly the same synchronization as Done under the hood, and in that respect it's not really any worse. The fact that they're so similar is the problem, though: I would bet that to the median programmer ctx.Err() looks like it should be a lot cheaper than select.

@bcmills
Copy link
Contributor

bcmills commented Apr 13, 2017

Thinking about this some more, the appropriate lint warning would probably be something like:
Warn if ctx.Err() is called in a loop for which, along all codepaths, either ctx is being passed to some other function or there is already a select statement.

That condition may be a bit advanced for lint to try to resolve, though.

@rsc
Copy link
Contributor

rsc commented Apr 17, 2017

Over on CL 40291, @Sajmani and @ianlancetaylor were discussing ctx.Err() check as being racy. Moving discussion here.

At best the check err := ctx.Err(); err != nil is only half-racy: if err != nil then you know that's not going to change. If err == nil, then it might change the next moment, but that's OK and fundamental to the fact that contexts are for notification of cancellation. There's always a propagation delay between the top-level request for cancellation and the pieces of the computation noticing that request. That's not really a race. I think it's perfectly fine to do such a check before kicking off some expensive computation, or even at intervals during such a computation.

As for how to do that, I guess we could go to some trouble that finds places where code says:

err := ctx.Err()

and helpfully remind people that they should instead be writing:

var err error
select {
case <-ctx.Done():
    err = ctx.Err()
default:
   // ok
}

That doesn't seem like an improvement though. Better to redefine ctx.Err() to match existing usage, implementations, and expectations.

@alercah
Copy link
Author

alercah commented Apr 17, 2017

"There's always a propagation delay between the top-level request for cancellation and the pieces of the computation noticing that request."

I agree. For lengthy computations done in-process, this is the correct idiom and desirable behaviour. Because of the cooperative cancellation model in Go, there is no way to actually directly interrupt a computation short of quitting the process. A goroutine cannot simultaneously be block on <-Done() and do computation, so instead it has to periodically check whether or not the context has been cancelled or timed out. This will necessarily introduce unavoidable real-time delay. It's up to the implementer of the process (which is under no obligation to check the Context in any case) to decide how long between checks.

The only time you're going to get an immediate response on a cancellation is when the goroutine checking cancellation has nothing else to do because it otherwise blocked. In that case it will select on both <-Done() and at least one other channel, and it will return as soon as the channel is closed (and the goroutine can be scheduled). Even this isn't foolproof, however, because if one of the other operations returns before the select statement is reached, the random behaviour of select means that the cancellation may go undetected unless and until the goroutine checks again.

@rsc
Copy link
Contributor

rsc commented Apr 26, 2017

Sameer is happy with this and I haven't seen any strong arguments against, so I'm going to accept this proposal.

@rsc rsc modified the milestones: Go1.9, Proposal Apr 26, 2017
@rsc rsc changed the title proposal: context: define that Err() == nil before Done is closed context: define that Err() == nil before Done is closed Apr 26, 2017
@golang golang locked and limited conversation to collaborators Apr 26, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants