Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

context: recursive implementation of Context.Value triggers stack resize #47292

Closed
roudkerk opened this issue Jul 19, 2021 · 14 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@roudkerk
Copy link
Contributor

As discussed in #33283, Go's implementation or valueCtx.Value is recursive which means that the cost grows linearly with the depth of the context chain.

But one issue not discussed there is that when the chain is long enough it will trigger an expensive stack resize for short lived goroutines. This is something that we see in production: according to pprof we have a server that spends 2.3% of CPU time in runtime.newstack for stacks with a chain of 12 or more Context.Value calls. (https://screenshot.googleplex.com/AE9xKYYfQ6ijJeK.)

With an iterative implementation, long context chains should be less problematic.

https://go-review.googlesource.com/c/go/+/210697 has a proposed iterative implementation originally created in 2018. I ended up writing another one before I discovered that:

@gopherbot
Copy link

Change https://golang.org/cl/335790 mentions this issue: context: implement Context.Value using iteration rather than recursion

@gopherbot
Copy link

Change https://golang.org/cl/335789 mentions this issue: context: add benchmarks for Value with a long chain

@robpike
Copy link
Contributor

robpike commented Jul 19, 2021

It seems to me that a context chain 12 long is the anomaly here. Why is it so long? That, to me, is the real problem here. We shouldn't rewrite every recursive call because in some cases the stack can grow. That removes recursion as a tool, and recursion is too elegant a tool to be castigated.

I am not in favor of going down this road.

@zikaeroh
Copy link
Contributor

I'm not sure how to measure the lengths of context chains in my own projects, but don't find 12 to be an anomaly at all; contexts are frequently used for tracing/metrics/logging, where the length of the chain corresponds to the number of interesting calls and logged values. Throw in some WithTimeout/WithDeadline/WithCancel, and it's definitely above 12 on average for my service.

@robpike
Copy link
Contributor

robpike commented Jul 20, 2021

Good to know. Thanks.

I still feel that rewriting library code to avoid recursion because it's sometimes expensive is not wise practice. Certainly not a precedent to establish.

By the way, the screenshot link above is Google-internal, I think.

@roudkerk
Copy link
Contributor Author

The context.WithValue API encourages the creation of long chains. (Note that #33283 discusses the lack of advice on how context.WithValue should be used.)

In the case of the server I am talking about, I don't think the application logic creates any of the value contexts -- they are all created at the framework level.

A cropped view of the flame graph:

flame

@roudkerk
Copy link
Contributor Author

Here is an example from another server with a chain of 26 contexts:

flame2

@roudkerk
Copy link
Contributor Author

I have been pointed to other services with chains of over 50.

@roudkerk
Copy link
Contributor Author

I managed to print a context for a server which handles HTTP traffic where the length of the chain was 19:

  • 19 contexts:
    • 1 context.Background()
    • 2 context.WithCancel()
    • 16 context.WithValue()
      • 5 were created by the standard library with keys:
        • http.ServerContextKey
        • http.LocalAddrContextKey
        • pprof.labelContextKey{}
        • trace.traceContextKey{} * 2
      • 4 were created by a server framework
      • 3 were created by an experiments framework
      • 1 was created by the application logic
      • 3 were created by other things

So the standard library "owns" 8 out of 19 links in the chain.

@neild
Copy link
Contributor

neild commented Jul 20, 2021

There doesn't seem to be a natural limit to the size of a context chain, and it's clear that long chains occur in practice.

While I agree that we shouldn't avoid recursion as a general principle, it doesn't seem like a recursive traversal of an arbitrarily-long linked list is the obviously correct approach.

@robpike
Copy link
Contributor

robpike commented Jul 21, 2021

It's correct, just not the fastest.

I am persuaded that a careful iterative approach is warranted here, provided it does not encourage a broad rewrite of recursive code in the libraries.

Still, it bothers me that a depth of 12 or 19 or even 50 is considered problematic for recursion, especially in Go, which handles deep recursion better than many languages.

@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 23, 2021
@thanm thanm added this to the Backlog milestone Jul 23, 2021
@roudkerk
Copy link
Contributor Author

Still, it bothers me that a depth of 12 or 19 or even 50 is considered problematic for recursion, especially in Go, which handles deep recursion better than many languages.

That is because it resizes the stack rather than hitting a stack overflow. For long lived goroutines the resize cost is amortized so does not matter.

But for short lived goroutines that cost can be significant.

Off topic: What I would really like is to have the option of starting sensitive goroutines with an initial stacksize chosen based on past history. Doing that might look like

var runner = smartstacksize.New()

func foo(...) {
  runner.Go(func() { ... })
  ...
}

One could build such a runner if there was a runtime.CurrentStacksize() and runtime.GoWithStackHint().

@neild
Copy link
Contributor

neild commented Jul 26, 2021

See: #18138

@komuw
Copy link
Contributor

komuw commented Sep 30, 2021

The CL https://go-review.googlesource.com/c/go/+/335790 has two thumbs-up reviews and one thumbs-up for trust. Is there anything that is holding it back from merging?

@golang golang locked and limited conversation to collaborators Oct 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

7 participants