New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net/http: graceful shutdown race condition #36819
Comments
#48642 seems to be a duplicate, I've proposed a fix https://golang.org/cl/353714 for it |
#48642 was fixed |
c := srv.newConn(rw)
if !srv.trackConn(c) {
rw.Close()
return ErrServerClosed
} ooooooh, this change makes the solution non-graceful either, right? A graceful shutdown should either:
But with the code above it accepts the connection then closes it. |
I think you are right, moreover the problem seems to be a duplicate of #33313 which was fixed by https://golang.org/cl/213442 but now the "fix" for #48642 https://golang.org/cl/353714 reintroduced it back - client gets
To avoid patching the runtime a delay could be achieved via package main
import (
"context"
"fmt"
"net"
"net/http"
"os"
"os/signal"
"syscall"
"time"
)
func main() {
sig := make(chan os.Signal, 1)
signal.Notify(sig, syscall.SIGTERM, os.Interrupt)
mux := http.NewServeMux()
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {})
srv := &http.Server{
Addr: ":8080",
Handler: mux,
ConnContext: func(ctx context.Context, c net.Conn) context.Context {
time.Sleep(3 * time.Second)
return ctx
},
}
go func() {
fmt.Println(srv.ListenAndServe())
}()
<-sig
fmt.Println(srv.Shutdown(context.Background()))
} The proper fix for #48642 should close the connection but only when server is |
Change https://go.dev/cl/405454 mentions this issue: |
The https://golang.org/cl/353714 wrongly closes accepted connection if server is shutting down. Updates golang#33313 Fixes golang#36819 Fixes golang#48642
The https://golang.org/cl/353714 wrongly closes accepted connection if server is shutting down. Updates golang#33313 Fixes golang#36819 Fixes golang#48642
The https://golang.org/cl/353714 wrongly closes accepted connection if server is shutting down. Updates golang#33313 Fixes golang#36819 Fixes golang#48642
The https://golang.org/cl/353714 wrongly closes accepted connection if server is shutting down. Updates golang#33313 Fixes golang#36819 Fixes golang#48642
This reverts CL 353714. The change closes accepted connection also in graceful shutdown which breaks the fix for #33313 (and apparent duplicate #36819). The proper fix should close accepted connection only if server is closed but not in graceful shutdown. Updates #48642 Change-Id: I2f7005f3f3037e6563745731bb2693923b654004 GitHub-Last-Rev: f6d885a GitHub-Pull-Request: #52823 Reviewed-on: https://go-review.googlesource.com/c/go/+/405454 Reviewed-by: Damien Neil <dneil@google.com> Reviewed-by: Ian Lance Taylor <iant@google.com>
Previous attempt https://golang.org/cl/353714 also incorrectly closed accepted connection while in shutdown. Fixes golang#48642 Updates golang#33313, golang#36819
Change https://go.dev/cl/409154 mentions this issue: |
Change https://go.dev/cl/409537 mentions this issue: |
Avoid race conditions when a new connection is accepted just after Server.Close or Server.Shutdown is called by waiting for the listener goroutines to exit before proceeding to clean up active connections. No test because the mechanism required to trigger the race condition reliably requires such tight coupling to the Server internals that any test would be quite fragile in the face of reasonable refactorings. Fixes #48642 Updates #33313, #36819 Change-Id: I109a93362680991bf298e0a95637595dcaa884af Reviewed-on: https://go-review.googlesource.com/c/go/+/409537 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Damien Neil <dneil@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com>
Avoid race conditions when a new connection is accepted just after Server.Close or Server.Shutdown is called by waiting for the listener goroutines to exit before proceeding to clean up active connections. No test because the mechanism required to trigger the race condition reliably requires such tight coupling to the Server internals that any test would be quite fragile in the face of reasonable refactorings. Fixes golang#48642 Updates golang#33313, golang#36819 Change-Id: I109a93362680991bf298e0a95637595dcaa884af Reviewed-on: https://go-review.googlesource.com/c/go/+/409537 TryBot-Result: Gopher Robot <gobot@golang.org> Run-TryBot: Damien Neil <dneil@google.com> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Bryan Mills <bcmills@google.com>
I think this was fixed by the https://go.dev/cl/409537 |
Update: I have found a super trivial way to reproduce it, see the step at the very bottom
Preamble: I think it's relevant to #23829
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (
go env
)?GOHOSTARCH="amd64"
GOHOSTOS="linux"
go env
OutputWhat did you do?
With the following code it's possible to get a case when the new connection in accepted, then closed within milliseconds, which makes the shutdown not graceful
What did you expect to see?
A graceful shutdown
What did you see instead?
How to reproduce:
Just
ab -t 20 -c 5 -A foo:foo http://localhost:8080/
is enough to capture the case. The former is much more rare, the latter is quite frequent.What you can see on screenshots: the connection is accepted, then as per https://go-review.googlesource.com/c/go/+/121419/ it should be treated as "new" (but it's not due to race), and is closed.
In first case the client had chance to send a request.
In second - it was closed even before that.
In both cases it happens with the
Connection reset by peer
on the client side, which should never be the case with non-keepalived connections and graceful termination.Both pcap files are available at: https://www.dropbox.com/s/2co7wkucfey90zz/bug.tgz?dl=0
My assumption on why it's the race and how it happens:
rw, e := l.Accept()
successfullyc.setState(c.rwc, StateNew)
statement is reached the server completes shutdownhttp.Server
is not aware of itUPDATE:
It can be easier reproducible by adding
time.Sleep(3 * time.Second)
right beforec := srv.newConn(rw)
Steps:
http/server.go
withtime.Sleep(3 * time.Second)
beforec := srv.newConn(rw)
Expected: the request should complete
Actual:
And a corresponding pcap: https://www.dropbox.com/s/5vsvwoow4ydcs3f/with-timeout.tgz?dl=0
And a corresponding wireshark screenshot:
The text was updated successfully, but these errors were encountered: