Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/http: local IP address change caused timeouts for outgoing HTTPS requests #23008

Open
chris-l-ross opened this issue Dec 6, 2017 · 3 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@chris-l-ross
Copy link

Using Go 1.9.2 on linux amd64.

Using the net/http client on a device whose IP address changes while the program is running, we started getting timeouts for HTTPS requests. To troubleshoot, we set up a DHCP server with very short leases and a minimal program that made requests every 10s and were able to reproduce the problem. We used a client with a transport based on http.DefaultTransport and a timeout (you can see it below).

I found I had to set DisableKeepAlives on the transport or set the IdleConnTimeout shorter than the frequency of our requests, or HTTPS requests would fail consistently after the IP address changed.

The following shows one run of the program. It was compiled with Go 1.9.2. The four routines are: HTTP with keep alive, HTTPS with keep alive disabled, HTTPS with keep alive, and HTTPS with the default http.Client (no timeout, keep alive enabled).

After the IP changes at about 4:04:10, you can see both the HTTP and HTTPS with keep alive fail. HTTP recovers (it isn't worth caching?), but HTTPS keeps timing out. As expected, HTTPS with the default client never times out or recovers.

I'll include the code used to generate this output below. But it's not trivial to reproduce because you need a DHCP server with a short lease and configurable static IPs.

In the short-term, we're just going to disable keep alives for our transports; but I'm reporting this because graceful IP renewal handling seems something Go should have.


2017/12/05 04:03:46  HTTP  / keep alive   204
2017/12/05 04:03:47  HTTPS / default      204
2017/12/05 04:03:47  HTTPS / keep alive   204
2017/12/05 04:03:47  HTTPS / disabled     204
2017/12/05 04:03:56  HTTP  / keep alive   204
2017/12/05 04:03:57  HTTPS / default      204
2017/12/05 04:03:57  HTTPS / keep alive   204
2017/12/05 04:03:57  HTTPS / disabled     204
2017/12/05 04:04:06  HTTP  / keep alive   204
2017/12/05 04:04:07  HTTPS / default      204
2017/12/05 04:04:07  HTTPS / keep alive   204
2017/12/05 04:04:07  HTTPS / disabled     204
2017/12/05 04:04:17  HTTPS / disabled     204
2017/12/05 04:04:27  HTTPS / disabled     204
2017/12/05 04:04:36  HTTP  / keep alive   Get http://www.google.com/generate_204: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2017/12/05 04:04:37  HTTPS / keep alive   Get https://www.google.com/generate_204: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2017/12/05 04:04:37  HTTPS / disabled     204
2017/12/05 04:04:47  HTTP  / keep alive   204
2017/12/05 04:04:47  HTTPS / disabled     204
2017/12/05 04:04:57  HTTP  / keep alive   204
2017/12/05 04:04:57  HTTPS / disabled     204
2017/12/05 04:05:07  HTTPS / keep alive   Get https://www.google.com/generate_204: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2017/12/05 04:05:07  HTTP  / keep alive   204
2017/12/05 04:05:08  HTTPS / disabled     204
2017/12/05 04:05:17  HTTP  / keep alive   204
2017/12/05 04:05:18  HTTPS / disabled     204
2017/12/05 04:05:27  HTTP  / keep alive   204
2017/12/05 04:05:28  HTTPS / disabled     204
2017/12/05 04:05:37  HTTPS / keep alive   Get https://www.google.com/generate_204: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2017/12/05 04:05:37  HTTP  / keep alive   204
2017/12/05 04:05:38  HTTPS / disabled     204
2017/12/05 04:05:47  HTTP  / keep alive   204
2017/12/05 04:05:48  HTTPS / disabled     204
2017/12/05 04:05:57  HTTP  / keep alive   204
2017/12/05 04:05:58  HTTPS / disabled     204
2017/12/05 04:06:07  HTTPS / keep alive   Get https://www.google.com/generate_204: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2017/12/05 04:06:07  HTTP  / keep alive   204
2017/12/05 04:06:08  HTTPS / disabled     204
2017/12/05 04:06:17  HTTP  / keep alive   204
2017/12/05 04:06:18  HTTPS / disabled     204
2017/12/05 04:06:27  HTTP  / keep alive   204
2017/12/05 04:06:28  HTTPS / disabled     204
2017/12/05 04:06:37  HTTPS / keep alive   Get https://www.google.com/generate_204: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
2017/12/05 04:06:37  HTTP  / keep alive   204
2017/12/05 04:06:38  HTTPS / disabled     204

package main

import (
	"log"
	"net"
	"net/http"
	"time"
)

func main() {
	clients := []struct {
		label  string
		client *http.Client
		url    string
	}{
		{"HTTPS / disabled", client(true), "https://www.google.com/generate_204"},
		{"HTTPS / keep alive", client(false), "https://www.google.com/generate_204"},
		{"HTTPS / default", http.DefaultClient, "https://www.google.com/generate_204"},
		{"HTTP  / keep alive", client(false), "http://www.google.com/generate_204"},
	}

	var i int
	for i = 0; i < len(clients)-1; i++ {
		go makeRequests(clients[i].label, clients[i].client, clients[i].url)
	}
	makeRequests(clients[i].label, clients[i].client, clients[i].url)
}

func client(dka bool) *http.Client {
	return &http.Client{
		Timeout: 20 * time.Second,
		Transport: &http.Transport{
			DisableKeepAlives: dka,
			Proxy:             http.ProxyFromEnvironment,
			DialContext: (&net.Dialer{
				Timeout:   30 * time.Second,
				KeepAlive: 30 * time.Second,
				DualStack: true,
			}).DialContext,
			MaxIdleConns:          100,
			IdleConnTimeout:       50 * time.Second,
			TLSHandshakeTimeout:   10 * time.Second,
			ExpectContinueTimeout: 1 * time.Second,
		},
	}
}

func makeRequests(label string, client *http.Client, url string) error {
	for {
		sc, err := makeRequest(client, url)
		if err != nil {
			log.Printf(" %-20s %v\n", label, err)
		} else {
			log.Printf(" %-20s %d\n", label, sc)
		}
		time.Sleep(10 * time.Second)
	}
}

func makeRequest(client *http.Client, url string) (int, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return 0, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}
@gbbr gbbr changed the title Local IP address change caused timeouts for outgoing HTTPS requests net/http: local IP address change caused timeouts for outgoing HTTPS requests Dec 6, 2017
@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Dec 7, 2017
@bradfitz bradfitz added this to the Go1.11 milestone Dec 7, 2017
@meirf
Copy link
Contributor

meirf commented Jun 8, 2018

For completeness, #23427 is the server-side counterpart issue, though it does not have a breakdown of http vs https behavior.

HTTP recovers (it isn't worth caching?), but HTTPS keeps timing out.

(I wonder if this more of a difference between http/1.1 and http/2 implementations as opposed to a difference between http and https.)

@rsc
Copy link
Contributor

rsc commented Jun 11, 2018

It sounds like the problem is:

  1. connection gets made
  2. DHCP client on local system changes IP address
  3. connection gets reused, is worthless

Why is it Go's responsibility to work around this? I would have expected the kernel to make the operations on the socket in step 3 return some kind of error. What is happening instead? Do the writes just get discarded? If it's HTTPS-specific, can someone find out if maybe crypto/tls is not handling an error from the socket correctly?

@rsc rsc added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 11, 2018
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jun 11, 2018
@bradfitz bradfitz modified the milestones: Go1.11, Unplanned Jun 29, 2018
@drigz
Copy link

drigz commented Oct 5, 2023

The kernel does not detect this quickly: https://stackoverflow.com/a/60366786. I haven't tested it but the default seems to be > 10 minutes. Red Hat say "Too low a setting will result in TCP connections apparently failing during load peaks".

This is also difficult to work around in the application layer without effectively disabling connection pooling (with DisableKeepAlives or low IdleConnTimeout), as even if you can detect this, http.Transport only offers you CloseIdleConnections() and there is no way to close the connections that are trying HTTP requests in vain.

Another possible workaround that I haven't tested is to log.Fatal() when we see Client.Timeout exceeded while awaiting headers and rely on systemd/Kubernetes/etc to restart the process. It's not clear how it's best to detect this, either os.IsTimeout(err) or:

if err, ok := err.(net.Error); ok && err.Timeout() {
    log.Fatal("Unexpected request timeout: This can indicate that the local IP changed and our connection pool is now useless.")
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants