Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: make Dial(listener.Addr().String()) always work, even if machine's IPv6 config is not ideal #18806

Closed
bradfitz opened this issue Jan 26, 2017 · 14 comments
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@bradfitz
Copy link
Contributor

I just hit a problem in grpc-go (grpc/grpc-go#1058) where their tests were assuming they could write code like this:

        lis, err := net.Listen("tcp", ":0")
        if err != nil {
                t.Fatalf("Failed to listen: %v", err)
        }
        conn, err := net.Dial("tcp", lis.Addr().String())

The code looks reasonable, and often works (on both Go 1.7 and Go 1.8, on Linux and Darwin).

But not always.

lis.Addr().String() returns [::]:NNNNN, which looks like an IPv6 address literal for all zeros (like 0.0.0.0), regardless of whether the machine has IPv6.

And if a machine doesn't have an IPv6 address on the loopback interface, then net.Dial("tcp", "[::]:12345") fails.

But it passes otherwise.

The net.Dial documentation (https://beta.golang.org/pkg/net/#Dial) is kinda fuzzy on whether dialing the all zero address works. It does document and implement this behavior this, though:

If the host is empty, as in ":80", the local system is assumed.

Thus, the solution for gRPC's tests and anybody else is apparently adding something hacky like:

// dialAddr returns a valid net.Dial address to dial using network "tcp".
func dialAddr(ln net.Listener) string {
	addr := ln.Addr().String()
	addr = strings.TrimPrefix(addr, "[::]")
	addr = strings.TrimPrefix(addr, "0.0.0.0")
	return addr
}

But that's kinda lame and tedious.

Maybe we should just treat a net.Dial of [::] or 0.0.0.0 the same as an empty string for the host part?

/cc @mikioh @rsc @ianlancetaylor @MakMukhi

@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jan 26, 2017
@bradfitz bradfitz added this to the Go1.9 milestone Jan 26, 2017
@ghost
Copy link

ghost commented Jan 27, 2017

I vaguely remember some problem with a Go code when I was using a linux kernel with IPv6 disabled.

@rsc
Copy link
Contributor

rsc commented Jan 27, 2017

Maybe we should just treat a net.Dial of [::] or 0.0.0.0 the same as an empty string for the host part?

@bradfitz I'm pretty sure that's how it worked originally.

@bradfitz
Copy link
Contributor Author

@rsc, I'll try older Go versions to see when/if it changed.

@pmarks-net
Copy link
Contributor

That's precisely why you shouldn't remove ::1 from your loopback interface. It creates an asymmetry between the bind and connect socket APIs: a server can bind to [::]:port, but clients can't connect to it unless ::1 is available.

We've been trying to stomp out this configuration across Google machines because it's a pain to work around.

@pmarks-net
Copy link
Contributor

But in terms of a practical workaround, Go could probably interpret dialing "[::]:80" or ":80" as:

"[::]:80, with sequential fallback to 0.0.0.0:80"

Replacing it with localhost:80 also works in most cases, unless your server socket is actually IPv6-only, and localhost is 127.0.0.1-only.

@bradfitz
Copy link
Contributor Author

Paul, I didn't manually do anything to this machine. It started life as a default GCE VM and has remained untouched networking wise.

@pmarks-net
Copy link
Contributor

pmarks-net commented Jan 28, 2017

Does ping6 ::1 work? What's the Linux distro and version?

@bradfitz
Copy link
Contributor Author

bradfitz commented Feb 2, 2017

$ ping6 ::1 
connect: Network is unreachable

$ lsb_release  -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 8.6 (jessie)
Release:        8.6
Codename:       jessie

$ ip route
default via 10.240.0.1 dev eth0 
10.240.0.1 dev eth0  scope link 
172.17.0.0/16 dev docker0  proto kernel  scope link  src 172.17.0.1 

$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 42:01:0a:f0:00:0c brd ff:ff:ff:ff:ff:ff
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default 
    link/ether 02:42:4c:05:d8:a5 brd ff:ff:ff:ff:ff:ff

$ cat /proc/net/tcp6 
  sl  local_address                         remote_address                        st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000000000000000000000000000:0016 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 665 1 ffff8800ba94c800 100 0 0 10 0
   1: 00000000000000000000000000000000:1F41 00000000000000000000000000000000:0000 0A 00000000:00000000 00:00000000 00000000  1000        0 12321728 1 ffff8800ba94c040 100 0 0 10 0

Again, this was just a GCE instance I created for development about a year ago and occasionally turn off and turn back on.

@pmarks-net
Copy link
Contributor

about a year ago

http://b/18777177 was fixed in April 2016, so your GCE image likely predates it.

This was the relevant config change: andsens/bootstrap-vz@15c7d1c

@bradfitz
Copy link
Contributor Author

bradfitz commented Feb 2, 2017

@pmarks-net, regardless, this bug is about whether Go guarantees my posted program at top is always valid.

I don't want those guarantees to depend on your sysctl knobs being set a certain way.

But for the record,

# /sbin/sysctl  -a | grep ipv6 | grep able
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.docker0.disable_ipv6 = 1
net.ipv6.conf.eth0.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
net.ipv6.neigh.default.base_reachable_time_ms = 30000
net.ipv6.neigh.docker0.base_reachable_time_ms = 30000
net.ipv6.neigh.eth0.base_reachable_time_ms = 30000
net.ipv6.neigh.lo.base_reachable_time_ms = 30000

I only have that one line in:

$ cat /etc/sysctl.d/70-disable-ipv6.conf 
net.ipv6.conf.all.disable_ipv6 = 1

But looks sane, and like something somebody might write.

I see no reason Go shouldn't be consistent even on such a machine.

@gopherbot
Copy link

CL https://golang.org/cl/42496 mentions this issue.

gopherbot pushed a commit to golang/crypto that referenced this issue May 3, 2017
Dialing the 0.0.0.0 address (as returned by net.Addr().String() for a
net.Listen("tcp", ":1") address) is not yet guaranteed to work. It's
currently OS-dependent.

For some reason it works on Go 1.8+, but it hasn't yet been defined to
work reliably.

Fix the tests for now (since we need to support older Go releases),
even if this might work in the future.

Updates golang/go#18806

Change-Id: I2f0476b1d4f2673ab64ffedfa733f2d92fceb6ff
Reviewed-on: https://go-review.googlesource.com/42496
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Han-Wen Nienhuys <hanwen@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Han-Wen Nienhuys <hanwen@google.com>
@bradfitz bradfitz self-assigned this Jun 7, 2017
@bradfitz bradfitz added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. labels Jun 7, 2017
@bradfitz
Copy link
Contributor Author

bradfitz commented Jun 7, 2017

@ianlancetaylor and I looked into this. What @pmarks-net said in #18806 (comment) looks doable. I'll send a change.

@bradfitz bradfitz changed the title net: should Dial(listener.Addr().String()) always work? net: make Dial(listener.Addr().String()) always work, even if machine's IPv6 config is not ideal Jun 8, 2017
@gopherbot
Copy link

CL https://golang.org/cl/45088 mentions this issue.

gopherbot pushed a commit that referenced this issue Jun 9, 2017
It listens on all addresses, which users might not want.

Updates #18806 (follow-up to feedback from CL 45088)

Change-Id: I51de2d3fc3cd88a61eb3c63018c47aea920c0549
Reviewed-on: https://go-review.googlesource.com/45157
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Mikio Hara <mikioh.mikioh@gmail.com>
tg123 pushed a commit to tg123/sshpiper that referenced this issue Jun 20, 2017
Dialing the 0.0.0.0 address (as returned by net.Addr().String() for a
net.Listen("tcp", ":1") address) is not yet guaranteed to work. It's
currently OS-dependent.

For some reason it works on Go 1.8+, but it hasn't yet been defined to
work reliably.

Fix the tests for now (since we need to support older Go releases),
even if this might work in the future.

Updates golang/go#18806

Change-Id: I2f0476b1d4f2673ab64ffedfa733f2d92fceb6ff
Reviewed-on: https://go-review.googlesource.com/42496
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Han-Wen Nienhuys <hanwen@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Han-Wen Nienhuys <hanwen@google.com>
gopherbot pushed a commit that referenced this issue Jun 22, 2017
This change clarifies the documentation on Dial and its variants to
avoid unnecessary confusion about how the arguments for the connection
setup functions are used to make connections.

Also replaces "name" or "hostname" with "host name" when the term
implies the use of DNS.

Updates #17613.
Fixes #17614.
Fixes #17738.
Fixes #17956.
Updates #18806.

Change-Id: I6adb3f2ae04a3bf83b96016ed73d8e59926f3e8a
Reviewed-on: https://go-review.googlesource.com/34875
Reviewed-by: Ian Lance Taylor <iant@golang.org>
gopherbot pushed a commit that referenced this issue Jul 6, 2017
…::]:n

ResolveTCPAddr, ResolveUDPAddr, and ResolveIPAddr return at most one
address. When given a name like "golang.org" to resolve that might
have more than 1 address, the net package has historically preferred
IPv4 addresses, with the assumption that many users don't yet have
IPv6 connectivity and randomly selecting between an IPv4 address and
an IPv6 address at runtime wouldn't be a good experience for IPv4-only
users.

In CL 45088 (78cf0e5) I modified the resolution of the
unspecified/empty address to internally resolve to both IPv6 "::" and
0.0.0.0 to fix issue #18806.

That code has 3 other callers I hadn't considered, though: the
Resolve*Addr functions. Since they preferred IPv4, any Resolve*Addr of
"[::]:port" or "::" (for ResolveIPAddr) would internally resolve both
"::" and 0.0.0.0 and then prefer 0.0.0.0, even though the user was
looking up an IPv6 literal.

Add tests and fix it, not by undoing the fix to #18806 but by
selecting the preference function for Resolve*Addr more explicitly: we
still prefer IPv4, but if the address being looked up was an IPv6
literal, prefer IPv6.

The tests are skipped on machines without IPv6.

Fixes #20911

Change-Id: Ib7036cc43182ae4118cd1390c254e17c04a251a3
Reviewed-on: https://go-review.googlesource.com/47554
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
sttts added a commit to sttts/kubernetes that referenced this issue Jul 10, 2017
gmarek pushed a commit to gmarek/kubernetes that referenced this issue Jul 11, 2017
enisoc pushed a commit to enisoc/kubernetes that referenced this issue Jul 12, 2017
saad-ali pushed a commit to saad-ali/kubernetes that referenced this issue Jul 13, 2017
@gopherbot
Copy link

Change https://golang.org/cl/47554 mentions this issue: net: don't return IPv4 unspecified addr for Resolve*Addr of [::] or [::]:n

perotinus pushed a commit to kubernetes-retired/cluster-registry that referenced this issue Sep 2, 2017
jonaz added a commit to jonaz/apcupsd that referenced this issue Jan 21, 2018
mdlayher pushed a commit to mdlayher/apcupsd that referenced this issue Jan 21, 2018
* Run test on machine without IPV6 on older go versions

References golang/go#18806

* Added InternalTemp, LineFrequency and OutputVoltage
@golang golang locked and limited conversation to collaborators Jul 27, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

4 participants