net: Resolver doesn't use provided Dial function in all cases #60712

Open
chriso opened this issue Jun 9, 2023 · 4 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
@chriso
Contributor

chriso commented Jun 9, 2023

What version of Go are you using (go version)?

$ go version
go version go1.20.3 linux/arm64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/tmp/go"
GOENV="/home/ubuntu/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/usr/local/lib/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/usr/local/lib/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go-1.20"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go-1.20/pkg/tool/linux_arm64"
GOVCS=""
GOVERSION="go1.20.3"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -fno-caret-diagnostics -Qunused-arguments -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build1714904807=/tmp/go-build -gno-record-gcc-switches"

What did you do?

net.Resolver has an optional Dial field, documented as follows:

type Resolver struct {
	// Dial optionally specifies an alternate dialer for use by
	// Go's built-in DNS resolver to make TCP and UDP connections
	// to DNS services. The host in the address parameter will
	// always be a literal IP address and not a host name, and the
	// port in the address parameter will be a literal port number
	// and not a service name.
	// If the Conn returned is also a PacketConn, sent and received DNS
	// messages must adhere to RFC 1035 section 4.2.1, "UDP usage".
	// Otherwise, DNS messages transmitted over Conn must adhere
	// to RFC 7766 section 5, "Transport Protocol Selection".
	// If nil, the default dialer is used.
	Dial func(ctx context.Context, network, address string) (Conn, error)
}

I created a script that logs Dial calls when using the pure Go resolver: https://go.dev/play/p/0O_ARZyK2eG

If I run this script locally, I see something like this:

$ ./resolve
Dial(udp, 127.0.0.53:53)
Dial(udp, 127.0.0.53:53)
{172.217.24.46 }
{2404:6800:4006:804::200e }

However, if I run the script with strace, I see that Go is making additional connections some other way:

$ strace ./resolve 2>&1 | grep '^connect'
connect(7, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.53")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(9), sin_addr=inet_addr("172.217.24.46")}, 16) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(9), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "2404:6800:4006:804::200e", &sin6_addr), sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)

There is one hardcoded call to net.DialUDP here, which appears to be the source of the additional connections.

What did you expect to see?

I expect to see the Dial function used for all connections made by the pure Go resolver.

What did you see instead?

I see that the Dial function is only used in some cases.

Additional context

CL 500576 fixes the issue by using net.Resolver.Dial in all cases.

For context, this change is important for targets with limited networking capabilities (e.g. GOOS=wasip1). It means that users can provide their own Dial function to make use of the pure Go resolver. At the moment the hardcoded net.DialUDP call makes the pure Go resolver off limits for these targets.

There was some concern in the CL about whether making this change for all targets would break code in the wild. I'm submitting it as a bug report so we can discuss here instead.

cc GOOS=wasip1 maintainers: @achille-roussel @johanbrandhorst @Pryz

cc those that commented on CL 500576: @mateusz834 @ianlancetaylor

@chriso
Contributor Author

chriso commented Jun 9, 2023

If I replace the hardcoded DialUDP call with r.dial("udp") then the provided Dial function is used in all cases.

-c, err = DialUDP("udp", nil, &dst)
+c, err = r.dial(ctx, "udp", dst.IP.String())

This has the additional benefit of threading the lookup context through to the underlying dialer.

If we're concerned about breaking code in the wild, we could instead opt-in by target, and take this path for GOOS=wasip1 only for now (since it has limited networking capabilities, and DialUDP always fails).

This approach was suggested by @mateusz834:

if runtime.GOOS == "wasip1" {
    c, err = r.dial(ctx, "udp", dst.IP.String())
} else {
    c, err = DialUDP("udp", nil, &dst)
}

@ianlancetaylor suggested that we might instead require an additional hook:

type Resolver struct {
    Dial func(ctx context.Context, network, address string) (Conn, error)
    
    // Extra hook:
    DialUDP func(ctx context.Context, network, address string) (Conn, error)
}

or something like this:

type Resolver struct {
    Dial func(ctx context.Context, network, address string) (Conn, error)
    
    // Extra hook:
    UDPConnect func(ctx context.Context, addr *UDPAddr) (*UDPAddr, bool)
}

@gopherbot

Change https://go.dev/cl/500576 mentions this issue: net: prefer Resolver.Dial over DialUDP on wasip1

@ianlancetaylor ianlancetaylor added this to the Go1.22 milestone Jun 10, 2023
@ianlancetaylor ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jun 10, 2023
@mateusz834
Member

mateusz834 commented Jun 10, 2023

The runtime.GOOS == "wasip1" guard was just a simple fix idea, but I agree with @ianlancetaylor that per-platform behaviour is not ideal in this case.

I think this hook should be named something like IsAddrReachable, so that the intention is clear.
It should probably also use netip.Addr at this point.

type Resolver struct {
    // IsAddrReachable is used for address sorting by the go resolver.
    // When this field is nil, the default dialer is used: addr is considered reachable
    // when the default dialer successfully establishes a UDP connection to addr.
    IsAddrReachable func(ctx context.Context, addr netip.Addr) (local netip.Addr, reachable bool)
}

@chriso
Contributor Author

chriso commented Jun 14, 2023

CL 502315 improved the situation for wasip1 by addressing the panic in net.DialUDP. Since it no longer panics, an error from the hardcoded call only affects the sort order.

@odeke-em odeke-em modified the milestones: Go1.22, Go1.23 Dec 12, 2023