net/http: Client fails to connect to web server if the linux namespace is switched #26698

Closed
davrodpin opened this issue Jul 30, 2018 · 7 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@davrodpin

What version of Go are you using (go version)?

$ go version
go version go1.10.3 linux/amd64

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

$ go env | grep -v GOPATH
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/davrodpin/.cache/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build822424801=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I was trying to write a program that sends HTTP requests to a web server running on the same Linux machine but in a different network namespace, using the HTTP client provided by the net/http package. Every request failed with the message Get http://127.0.0.1:8080: dial tcp 127.0.0.1:8080: connect: connection refused.

Steps to reproduce

The code to reproduce the issue is published as a gist. Link below:

https://gist.github.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692

A few preparatory steps are needed before running the code in order to create a new Linux namespace, so here is a step-by-step process to fully reproduce the bug:

  1. Create a local directory to store the source code and binaries
mkdir golang-http-client-bug && cd golang-http-client-bug
  2. Download the client and server source code from the gist I've created
curl -O https://gist.githubusercontent.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692/raw/140a40e17e6dfabd8be5d8dafbb5c49da2330420/client.go
curl -O https://gist.githubusercontent.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692/raw/140a40e17e6dfabd8be5d8dafbb5c49da2330420/server.go
  3. Create a Linux network namespace named gobug
ip netns add gobug
ip netns exec gobug ifconfig lo 127.0.0.1 netmask 255.0.0.0 up
  4. Download the library dependency and build both client and server binaries
go get github.com/vishvananda/netns && go build server.go && go build client.go
  5. Run the web server in the newly created Linux namespace (gobug)
ip netns exec gobug ./server
  6. Run the client in the current namespace
./client

The output that you should see is:

switching linux namespace to 'gobug'
request to http server using exec.Command(curl) returned with success: Hello, "/foo"
request to http server using net.Dial returned with success: HTTP/1.0 200 OK
error while sending http request using http.Client: Get http://127.0.0.1:8080: dial tcp 127.0.0.1:8080: connect: connection refused
switching linux namespace to previous one

The client first switches to the new Linux namespace, gobug, using an open source library, https://github.com/vishvananda/netns, and then tries to perform an HTTP request to a web server listening on 127.0.0.1:8080 in three different ways: using curl, using net.Dial directly, and using http.Client.

The first two methods (curl and net.Dial) work, which means they can reach the server running in the gobug namespace, but the third fails.

I suspect this is related to the goroutines that http.Transport creates to manage connections (links to source code below), since they run on OS threads that are still in the default Linux namespace instead of the gobug namespace.

This behavior is explained here: https://golang.org/doc/go1.10#runtime

Links to http.Transport source code that creates goroutines:

What did you expect to see?

I expected the HTTP request, GET http://127.0.0.1:8080/foo, to succeed.

What did you see instead?

The error message below:

Get http://127.0.0.1:8080: dial tcp 127.0.0.1:8080: connect: connection refused

@ianlancetaylor added the NeedsInvestigation label Aug 3, 2018
@ianlancetaylor added this to the Go1.12 milestone Aug 3, 2018
@erikdubbelboer
Contributor

Your problem is indeed that Go is dialing from another goroutine, which will run on a different thread. To fix this, you can provide a custom dialer that locks the thread and sets the namespace right before the dial.

I have not tested this code as I am not able to at the moment:

func main() {
	// ...

	if err = RequestUsingHttpClient(func() {
		// Setup
		err := netns.Set(ns)
		if err != nil {
			panic(fmt.Sprintf("can't switch to linux namespace 'gobug': %v", err))
		}
	}, func() {
		// Teardown
		netns.Set(origin)
	}); err != nil {
		fmt.Printf("%v\n", err)
	}

	// ...
}

func RequestUsingHttpClient(setup, teardown func()) error {
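	// Both deferred calls run on the calling goroutine, not on the
	// goroutine (and OS thread) that the Transport uses to dial.
	defer teardown()
	defer runtime.UnlockOSThread()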

	c := http.Client{
		Transport: &http.Transport{
			Proxy: http.ProxyFromEnvironment,
			DialContext: func(ctx context.Context, network, address string) (net.Conn, error) {
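				// Pin the dialing goroutine to its OS thread, then switch that
				// thread to the target namespace before the socket is created.
				runtime.LockOSThread()
				setup()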

				return net.Dialer{
					Timeout:   30 * time.Second,
					KeepAlive: 30 * time.Second,
					DualStack: true,
				}.DialContext(ctx, network, address)
			},
			MaxIdleConns:          100,
			IdleConnTimeout:       90 * time.Second,
			TLSHandshakeTimeout:   10 * time.Second,
			ExpectContinueTimeout: 1 * time.Second,
		},
	}

	if _, err := c.Get(fmt.Sprintf("http://%s", serverAddress)); err != nil {
		return fmt.Errorf("error while sending http request using http.Client: %v", err)
	}

	fmt.Println("http.Client is working as expected")

	return nil
}

@davrodpin
Author

Hi @erikdubbelboer,

Thank you very much for providing the code snippet. It works!

I have updated the gist with the working version of your code: https://gist.github.com/davrodpin/6d0e7cbd8aea477a7990f9ba3e5d3692#file-client_custom_dialer-go

That solves my specific issue with the http client, but I believe there is a broader problem to be solved when your program has a component that relies on concurrency (it spawns goroutines) and you want to make sure all of them will be executed on a given linux namespace.

I was wondering if we could have a way to pass some sort of context to a goroutine to give hints to the scheduler on how a goroutine should be executed, which could include a given OS thread that was previously changed to run on a specific linux namespace.

Pseudo golang code below:

osThread := runtime.GetCurrentOSThread()

//code to change `osThread` to a different linux namespace

hints := SchedulerHints(Hints{
  "osThread": osThread
})

go(hints) func() {
  // go scheduler use the given hints to schedule the goroutine

  //do something

  go(hints) func() {
    //do something else
  }()
}()

@erikdubbelboer
Contributor

To be honest I don't see anything like that ever being added to Go. The use case is too specific.

If all the operations that require a specific thread are fast you could also run all of them on the main thread. To do this you can use this library: https://github.com/faiface/mainthread so you do your netns.Set in main and do Dial in a mainthread.Call(func() { ... }) closure.

If you need multiple threads you could even expand on the idea of this library and make a work queue per thread. Spawning new threads would be spawning a Goroutine that calls runtime.LockOSThread() and then waits for work to be done on that thread. Using this you could in theory have different threads with different namespaces that you can dispatch work to.

@davrodpin
Author

Agreed. The use case might be too specific, and there is already a solution for making HTTP requests in another Linux namespace: providing a custom Dialer. Maybe this is not a real issue at all.

All very valuable thoughts, @erikdubbelboer. Thanks for sharing your ideas. It will help me a lot with what I am currently working on.

@bradfitz
Contributor

bradfitz commented Dec 4, 2018

Yeah, sorry, we're not going to modify the standard library to accommodate different OS threads being in different namespaces. If you need to do that, do it early in init before goroutines are created (or from a parent process) so all your threads (and thus goroutines) are running in a consistent environment.

@bradfitz closed this as completed Dec 4, 2018
@judavi

judavi commented Jul 31, 2019

I'm just leaving this here in case it is useful for someone. I had a similar issue, but I was using the library inside a container and calling another container on a different port. I replaced 127.0.0.1 with localhost and it worked.

@francisfuzz

@judavi - That helped me for my case. I really appreciate you sharing! 🙇
