Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: Dial is not safe to use in namespaces with dual-stack enabled #44922

Closed
aojea opened this issue Mar 10, 2021 · 5 comments
Closed

net: Dial is not safe to use in namespaces with dual-stack enabled #44922

aojea opened this issue Mar 10, 2021 · 5 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@aojea
Copy link
Contributor

aojea commented Mar 10, 2021

What version of Go are you using (go version)?

go version go1.16 linux/amd64

Does this issue reproduce with the latest release?

yes

https://play.golang.org/p/2kpRWCMdZkj

What operating system and processor architecture are you using (go env)?

Linux amd64

What did you do?

Using net.Dial inside a "namespace" is not "safe" , at least for the developer :), in dual-stack environments, it creates the connections in the parent namespace.

Golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs) in net.Dial by default since 1.12, basically, if the destination address resolves to multiple IPs, it tries to connect to them spawning different goroutines and using the working connection..

There is a lot of literature about the topic of namespaces and goroutines in golang, perfectly summarised in this article https://www.weave.works/blog/linux-namespaces-golang-followup

do not spawn a new goroutine from a locked one if the new goroutine expected to be run on the same thread or a thread with the same modified state

What did you expect to see?

I expect to be able to use net.Dial safely in all kind of environments, namespaced, single or dual-stack, or at least have some way to know that I'm not leaking connections, in this case, I was expecting to create connections inside a namespace but it turned out it was creating the connections in the parent namespace.

I've found this trying to implement port-forwarding for IPv4 and IPv6 in the main kubernetes containers runtimes:

cri-o/cri-o#4639
containerd/containerd#5145

I think that due to the adoption of golang in containers environment, and the "slow adoption of IPv6" using dual-stack (kuberntes just graduated it to beta in 1.21), it will be nice to make golang more container/dual-stack friendly or at least more predictable.

If we add this to the "net: Listen is unfriendly to multiple address families, endpoints and subflows #9334" issue, that means that it will only will open a listener in one address you practically are rolling a dice with golang when you use "localhost" as address in a dual-stack containerized environment.

@seankhliao seankhliao changed the title net.Dial is not safe to use in namespaces with dual-stack enabled net: Dial is not safe to use in namespaces with dual-stack enabled Mar 10, 2021
@seankhliao seankhliao added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 10, 2021
@seankhliao
Copy link
Member

CC @bradfitz @ianlancetaylor via owners.

@ianlancetaylor
Copy link
Contributor

I'm sorry, I don't understand. Yes, it's very hard to correctly use namespaces within a single Go program. I don't understand the connection to net.Dial. What can we do to change this?

@aojea
Copy link
Contributor Author

aojea commented Mar 10, 2021

I'm sorry, I don't understand. Yes, it's very hard to correctly use namespaces within a single Go program. I don't understand the connection to net.Dial. What can we do to change this?

The problem is that net.Dial uses go routines for the dual-stack fallback implementation, and this is not evident for the user, so you expect it to make the connection inside the namespace, but it is actually making the connection from the parent namespace.

After reading all the related issues and articles, I have clear that is not easy to solve, I think that I will wrap the function and serialise the connection attempts ...
I also wanted document the issue for others, at least it took me a while to understand why the code wasn't workin

@ianlancetaylor
Copy link
Contributor

The Go standard library, like many Go packages, uses goroutines freely as appropriate. When using any mechanism, such as namespaces, that make a goroutine special in some way, that goroutine has to be extremely careful in what it calls. It basically has to stick to trivial functions, or system calls, or cgo calls.

I'm going to close this issue since I don't think there is anything we can change, but it will serve as documentation. Thanks.

@aojea
Copy link
Contributor Author

aojea commented Mar 11, 2021

Well, it seems that the solution was easier than expected, it turns out that if a host resolves to multiple IP addresses and the FastFallback is disabled net.Dial tries the connections serially, so the solution is as simple as:

	var d net.Dialer
	d.FallbackDelay = -1
	conn, err := d.Dial("tcp", "localhost:8080")

aojea pushed a commit to aojea/containerd that referenced this issue Mar 22, 2021
golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs)
by default in 1.12.
It means that if a host resolves to both IPv6 and IPv4,
it will try to connect to any of those addresses and use the
working connection.
However, the implementation uses go routines to start both connections in parallel,
and this has limitations when running inside a namespace, so we try to the connections
serially, trying IPv4 first for keeping the same behaviour.
xref golang/go#44922

Signed-off-by: Antonio Ojea <aojea@redhat.com>
@golang golang locked and limited conversation to collaborators Mar 11, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants