net: Pure-Go DNS resolver does not properly Round-Robin DNS Names #13283

Closed
bmhatfield opened this issue Nov 17, 2015 · 14 comments

@bmhatfield

I was recently debugging an issue with an Amazon Elastic Load Balancer where our traffic was not being evenly balanced across ELB Availability Zones. AWS's ELB uses a number of DNS entries and low TTLs to balance "front-door" client traffic, expecting clients to properly round-robin the addresses. This technique works for a large number of clients on the internet.

Unfortunately, the new pure-Go DNS resolver in Go 1.5 appears to favor lower-numbered addresses: they consistently appear first in the returned list instead of rotating with the rest. The cgo resolver does not exhibit this behavior.

After tracing through some of the code, I believe I have a good reproduction case for this problem. The ELB in question returns 6 IP addresses.

I have written a small program to demonstrate the affinity behavior:

package main

import (
    "fmt"
    "net"
)

func main() {
    // Resolve the same name repeatedly and print each result, so that any
    // bias in the ordering of the returned addresses becomes visible.
    for i := 0; i <= 50; i++ {
        addrs, err := net.LookupHost("REMOVED-ELB-HOSTNAME")
        if err != nil {
            fmt.Println(err)
            continue
        }
        fmt.Println(addrs)
    }
}

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=go
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]
[23.21.50.150 23.23.172.185 23.23.134.56 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.134.56 23.21.50.150 23.23.172.185 54.83.193.112 75.101.148.21 184.72.238.214]
[23.23.172.185 23.23.134.56 23.21.50.150 54.83.193.112 75.101.148.21 184.72.238.214]

An alternate form of this program highlights the issue:

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=go
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
23.21.50.150:80 resolved 26 times
23.23.134.56:80 resolved 16 times
23.23.172.185:80 resolved 8 times
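
The alternate program is not shown in this thread. The following is a minimal sketch of one way to produce a similar tally, assuming it counts which address each lookup lists first. The names `lookupFunc`, `tallyFirst`, and `rotating` are hypothetical, and the stub resolver exists only so the sketch runs without network access.

```go
package main

import (
	"fmt"
	"sort"
)

// lookupFunc has the same signature as net.LookupHost, so either the real
// resolver or a stub can be passed in.
type lookupFunc func(host string) ([]string, error)

// tallyFirst calls lookup n times and counts how often each address appears
// first in the result. Failed or empty lookups are skipped.
func tallyFirst(lookup lookupFunc, host string, n int) map[string]int {
	counts := make(map[string]int)
	for i := 0; i < n; i++ {
		addrs, err := lookup(host)
		if err != nil || len(addrs) == 0 {
			continue
		}
		counts[addrs[0]]++
	}
	return counts
}

// rotating returns a stub resolver that rotates addrs by one position per
// call, imitating an ideal round-robin answer. addrs must not be empty.
func rotating(addrs []string) lookupFunc {
	calls := 0
	return func(string) ([]string, error) {
		k := calls % len(addrs)
		calls++
		out := append([]string{}, addrs[k:]...)
		return append(out, addrs[:k]...), nil
	}
}

func main() {
	// Pass net.LookupHost instead of the stub to query a real name.
	stub := rotating([]string{"23.23.134.56", "54.83.193.112", "184.72.238.214"})
	counts := tallyFirst(stub, "elb.example.invalid", 51)

	// Sort the keys so the report is stable from run to run.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s listed first %d times\n", k, counts[k])
	}
}
```

With a resolver that rotates evenly, every address ends up with the same count; a skewed tally like the one above points at the ordering step rather than at the DNS answers themselves.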

However, switching the resolver to cgo (on Ubuntu 12.04) causes the resolution to properly round-robin:

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=cgo
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]
[23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21]
[75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185]
[23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112 184.72.238.214]
[184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150 54.83.193.112]
[54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56 23.21.50.150]
[23.21.50.150 54.83.193.112 184.72.238.214 23.23.172.185 75.101.148.21 23.23.134.56]

And again, the even resolution behavior is highlighted by an alternate form of this program:

ubuntu@ubuntu-scratchbox:~$ export GODEBUG=netdns=cgo
ubuntu@ubuntu-scratchbox:~$ go run dialtest.go
23.23.172.185:80 resolved 9 times
23.21.50.150:80 resolved 9 times
54.83.193.112:80 resolved 8 times
184.72.238.214:80 resolved 8 times

I believe the pure-Go DNS resolver should be updated to round-robin across all returned addresses.

@bmhatfield
Author

I have an instinct that the problem lies in here (where returned addresses are being sorted) but I'm having a little trouble processing the code to understand the exact effects: https://golang.org/src/net/addrselect.go

@mdempsky
Member

I'd suspect you're getting bit by "Rule 9: Use longest matching prefix". Does your server happen to have a 23.21.x.x address itself?

@bmhatfield
Author

@mdempsky it does not - it is only aware of an RFC1918 address (10.x.x.x)

@mdempsky
Member

That at least explains why it's favoring the 23.x.x.x addresses over the 54.x.x.x, 75.x.x.x, and 184.x.x.x addresses: 23 shares its 3 leading bits with 10, whereas 54, 75, and 184 share only 2, 1, and 0 leading bits with 10, respectively.
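
The bit counting behind Rule 9 can be checked with a short sketch. `commonPrefixLen` here is a hypothetical helper written for this illustration, not the function in `addrselect.go`, and `10.0.0.1` is an assumed source address (the report only says the host has a 10.x.x.x address).

```go
package main

import (
	"fmt"
	"math/bits"
	"net"
)

// commonPrefixLen returns the number of leading bits that the IPv4 addresses
// a and b have in common. It returns 0 if either string is not an IPv4 address.
func commonPrefixLen(a, b string) int {
	ipA, ipB := net.ParseIP(a).To4(), net.ParseIP(b).To4()
	if ipA == nil || ipB == nil {
		return 0
	}
	n := 0
	for i := 0; i < 4; i++ {
		diff := ipA[i] ^ ipB[i]
		if diff == 0 {
			n += 8 // whole octet matches, keep going
			continue
		}
		// The leading zeros of the XOR are the bits that still match.
		n += bits.LeadingZeros8(diff)
		break
	}
	return n
}

func main() {
	src := "10.0.0.1" // assumed source address
	dsts := []string{
		"23.23.134.56", "23.21.50.150", "23.23.172.185",
		"54.83.193.112", "75.101.148.21", "184.72.238.214",
	}
	for _, d := range dsts {
		fmt.Printf("%-15s shares %d leading bits with %s\n", d, commonPrefixLen(src, d), src)
	}
}
```

The three 23.x addresses tie at 3 bits, followed by 54.x (2), 75.x (1), and 184.x (0), which matches the fixed tail order seen in the pure-Go output above.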

@mdempsky
Member

It looks like glibc doesn't strictly follow RFC 3484/6724 for IPv4 addresses:

  /* Outside of subnets, as defined by the network masks,
     common address prefixes for IPv4 addresses make no sense.
     So, define a non-zero value only if source and
     destination address are on the same subnet.  */

See http://bazaar.launchpad.net/~vcs-imports/glibc/master/view/head:/sysdeps/posix/getaddrinfo.c#L1710

@bmhatfield
Author

EDIT: I lost the race on this comment to @mdempsky - it was written before the comment about glibc :-)

Hrm. I'm not sure what to make of that.

Does the getaddrinfo/getnameinfo implementation on Linux not implement rule 9? Does it want a full octet match for a matching prefix? Or some other interpretation of the rule I am not understanding?

One thing I see in the RFC is this:

Rules 9 and 10 MAY be superseded if the implementation has other
   means of sorting destination addresses.  For example, if the
   implementation somehow knows which destination addresses will result
   in the "best" communications performance.

In practice, the behavior of routing by "best prefix" in this context is problematic, as it causes a significant amount of traffic to be pointed at a small subset of nodes that otherwise have no meaningful routing value over the others.

@bmhatfield
Author

Ah @mdempsky I think I would agree with their interpretation that it doesn't make sense once you're outside of the subnet.

@mdempsky
Member

It's unfortunate that UDPConns don't have a way to discover their local IPNet, only their IP. To match glibc's behavior, it looks like we'll need to call InterfaceAddrs and find the enclosing IPNet that way. (Which is basically how glibc finds IPv4 prefix lengths anyway.)

Alternatively, we could just skip Rule 9 for IPv4 addresses. I would suspect in practice it doesn't matter.
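
A minimal sketch of the InterfaceAddrs approach described above, assuming the goal is to treat the common prefix as meaningful only when source and destination sit inside the same local network. `localCIDRs` and `sameSubnet` are hypothetical helpers, and this is not the code from any CL; `10.0.1.5` is an assumed source address.

```go
package main

import (
	"fmt"
	"net"
)

// localCIDRs returns the host's interface addresses as strings, as reported
// by net.InterfaceAddrs (typically in CIDR form such as "10.0.1.5/24").
func localCIDRs() ([]string, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(addrs))
	for _, a := range addrs {
		out = append(out, a.String())
	}
	return out, nil
}

// sameSubnet reports whether src and dst both fall inside one of the given
// interface networks. Entries that are not valid CIDR strings are ignored.
func sameSubnet(src, dst string, ifaceCIDRs []string) bool {
	s, d := net.ParseIP(src), net.ParseIP(dst)
	if s == nil || d == nil {
		return false
	}
	for _, c := range ifaceCIDRs {
		_, ipnet, err := net.ParseCIDR(c)
		if err != nil {
			continue
		}
		if ipnet.Contains(s) && ipnet.Contains(d) {
			return true
		}
	}
	return false
}

func main() {
	cidrs, err := localCIDRs()
	if err != nil {
		fmt.Println("InterfaceAddrs failed:", err)
		return
	}
	// 10.0.1.5 is an assumed source address; 23.23.134.56 is one of the
	// ELB addresses from the report.
	fmt.Println(sameSubnet("10.0.1.5", "23.23.134.56", cidrs))
}
```

Under this rule a public ELB address would never share a subnet with a 10.x.x.x host, so all six addresses would compare as equal for Rule 9.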

CC @bradfitz

@mdempsky
Member

Actually, we can still apply it for RFC 1918 private networks (i.e., 10/8, 172.16/12, and 192.168/16) relatively easily.
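
As an illustration of the idea in the comment above, a membership test for the three RFC 1918 blocks could look like the sketch below. `samePrivateBlock` is a hypothetical name and this is not the code from the CL; it only shows how such a check can be expressed with the standard `net` package.

```go
package main

import (
	"fmt"
	"net"
)

// rfc1918 holds the three private IPv4 blocks: 10/8, 172.16/12, 192.168/16.
var rfc1918 = mustParseCIDRs("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")

// mustParseCIDRs parses fixed CIDR literals and panics on a malformed one.
func mustParseCIDRs(cidrs ...string) []*net.IPNet {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		nets = append(nets, n)
	}
	return nets
}

// samePrivateBlock reports whether a and b both lie inside the same
// RFC 1918 block. Invalid addresses never match.
func samePrivateBlock(a, b string) bool {
	ipA, ipB := net.ParseIP(a), net.ParseIP(b)
	if ipA == nil || ipB == nil {
		return false
	}
	for _, n := range rfc1918 {
		if n.Contains(ipA) && n.Contains(ipB) {
			return true
		}
	}
	return false
}

func main() {
	// 10.0.0.5 is an assumed source address for illustration.
	fmt.Println(samePrivateBlock("10.0.0.5", "10.200.1.1"))   // both in 10/8
	fmt.Println(samePrivateBlock("10.0.0.5", "23.23.134.56")) // public destination
}
```

A resolver could then apply the longest-prefix comparison only when this check passes and otherwise treat the candidates as tied.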

mdempsky self-assigned this Nov 17, 2015
@mdempsky
Member

@bmhatfield Are you able to test whether https://go-review.googlesource.com/#/c/16995/ fixes the problem for you?

(Unfortunately I have to head out for a bit, hence the incomplete CL.)

@bmhatfield
Author

Yes, I can give it a shot.

ianlancetaylor changed the title from "[1.5] Pure-Go DNS resolver does not properly Round-Robin DNS Names" to "net: Pure-Go DNS resolver does not properly Round-Robin DNS Names" Nov 17, 2015
ianlancetaylor added this to the Go1.6 milestone Nov 17, 2015
@bmhatfield
Author

I canary-deployed this to a single host, and I can confirm that it is now including the other 3 non-23.x IP addresses in the connections that it's making.

@gopherbot

CL https://golang.org/cl/16995 mentions this issue.

golang locked and limited conversation to collaborators Nov 16, 2016
@gopherbot

CL https://golang.org/cl/34914 mentions this issue.
