net: Go's DNS resolver retry AAAA request to next nameserver even A response has records #64783

Closed
WeiZhixiong opened this issue Dec 18, 2023 · 8 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

@WeiZhixiong

Go version

go version go1.21.5 linux/amd64

What operating system and processor architecture are you using (go env)?

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/root/.cache/go-build'
GOENV='/root/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/root/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/root/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21.5'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build403659500=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Problem

Here is /etc/resolv.conf:

nameserver 10.154.200.27
nameserver 10.66.200.202
options timeout:1 attempts:1

We have some private domains that have only A records and no AAAA records. When I use net.Dial("tcp", "domain:80") to open a connection, Go's DNS resolver retries the AAAA query against the next nameserver even though the A response contains records.

With the cgo resolver there is no retry, for example when the same program is built with go build -tags 'netcgo'.
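
A minimal reproduction sketch, assuming the goal is only to trigger the A and AAAA lookups that net.Dial performs. The helper newGoResolver and the default host example.com are hypothetical; substitute the private name. Setting PreferGo selects Go's built-in resolver for this one Resolver, so the result should not depend on build tags.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// newGoResolver (hypothetical helper) returns a resolver that prefers Go's
// built-in DNS client over the cgo (libc) path.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	// Placeholder default; pass the private name as the first argument.
	host := "example.com"
	if len(os.Args) > 1 {
		host = os.Args[1]
	}

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Network "ip" requests both IPv4 and IPv6 addresses, so both an A and
	// an AAAA query are sent.
	addrs, err := newGoResolver().LookupIP(ctx, "ip", host)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, a := range addrs {
		fmt.Println(a)
	}
}
```

Running this while capturing port 53 traffic should show which nameservers receive the AAAA query.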

Reason

While debugging, we found that the resolver hits the errLameReferral error. glibc behaves differently.

func checkHeader(p *dnsmessage.Parser, h dnsmessage.Header) error {
    ...
    // libresolv continues to the next server when it receives
    // an invalid referral response. See golang.org/issue/15434.
    if h.RCode == dnsmessage.RCodeSuccess && !h.Authoritative && !h.RecursionAvailable && err == dnsmessage.ErrSectionDone {
        return errLameReferral
    }
    ...
}

func (r *Resolver) tryOneName(ctx context.Context, cfg *dnsConfig, name string, qtype dnsmessage.Type) (dnsmessage.Parser, string, error) {
    ...
    for i := 0; i < cfg.attempts; i++ {
        for j := uint32(0); j < sLen; j++ {
            server := cfg.servers[(serverOffset+j)%sLen]
 
            p, h, err := r.exchange(ctx, server, q, cfg.timeout, cfg.useTCP, cfg.trustAD)
            ...
            if err := checkHeader(&p, h); err != nil {
                dnsErr := &DNSError{
                    Err:    err.Error(),
                    Name:   name,
                    Server: server,
                }
                ...
            }
            ...
        }
    }
    ...
}
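
The effect of that loop on the configuration above (two nameservers, attempts:1) can be sketched with a simplified stand-in. This is not the real tryOneName: tryServers and exchangeFunc are hypothetical names, server rotation is left out, and every error is assumed to move on to the next server.

```go
package main

import (
	"errors"
	"fmt"
)

// errLameReferral stands in for the resolver's internal error of the same name.
var errLameReferral = errors.New("lame referral")

// exchangeFunc stands in for one query/response round trip to one server.
type exchangeFunc func(server string) error

// tryServers walks the server list once per attempt and stops at the first
// server whose reply is accepted. It returns the servers queried, in order.
func tryServers(servers []string, attempts int, exchange exchangeFunc) (queried []string, err error) {
	for i := 0; i < attempts; i++ {
		for _, s := range servers {
			queried = append(queried, s)
			if err = exchange(s); err == nil {
				return queried, nil
			}
			// Simplification: any error, including a lame referral,
			// moves on to the next server.
		}
	}
	return queried, err
}

func main() {
	servers := []string{"10.154.200.27", "10.66.200.202"}

	// A query: the first server returns a usable answer, so the loop stops.
	q, _ := tryServers(servers, 1, func(string) error { return nil })
	fmt.Println("A   :", q)

	// AAAA query: each empty reply is classified as a lame referral,
	// so every configured server is queried.
	q, err := tryServers(servers, 1, func(string) error { return errLameReferral })
	fmt.Println("AAAA:", q, err)
}
```

Under these assumptions the A query stops at the first server, while the AAAA query reaches both, which matches the retry described in this report.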

In glibc, if either the A or the AAAA response has records, there is no retry.

next_ns:
    if (recvresp1 || (buf2 != NULL && recvresp2)) {
      *resplen2 = 0;
      return resplen;
    }
 
...
 
if (anhp->rcode == NOERROR && anhp->ancount == 0
    && anhp->aa == 0 && anhp->ra == 0 && anhp->arcount == 0) {
    goto next_ns;
}
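
The difference between the two checks can be sketched as a pair of predicates. This is an illustration, not code from either project: the header struct and both function names are hypothetical, and reading err == dnsmessage.ErrSectionDone as "the answer section is empty" is an assumption about the elided lines of checkHeader.

```go
package main

import "fmt"

// header holds only the DNS header fields that the two checks look at.
type header struct {
	rcodeSuccess       bool // rcode == NOERROR
	authoritative      bool // aa flag
	recursionAvailable bool // ra flag
	answers            int  // ancount
	additionals        int  // arcount
}

// goLameReferral mirrors the condition quoted from checkHeader above,
// assuming ErrSectionDone means that there are no answer records.
func goLameReferral(h header) bool {
	return h.rcodeSuccess && !h.authoritative && !h.recursionAvailable && h.answers == 0
}

// glibcLameReferral mirrors the quoted glibc condition, which additionally
// requires an empty additional section (arcount == 0).
func glibcLameReferral(h header) bool {
	return goLameReferral(h) && h.additionals == 0
}

func main() {
	// An empty NOERROR reply without aa or ra that carries one additional
	// record, for example an EDNS(0) OPT record.
	aaaa := header{rcodeSuccess: true, answers: 0, additionals: 1}

	fmt.Println("quoted Go check   :", goLameReferral(aaaa))
	fmt.Println("quoted glibc check:", glibcLameReferral(aaaa))
}
```

For such a reply the first predicate reports a lame referral and the second does not, so only the Go-style check would move on to the next nameserver.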

What did you expect to see?

If the A response for a domain has records, the AAAA query should not be retried; if the AAAA response has records, the A query should not be retried.

What did you see instead?

If the A response for a domain has records and the AAAA response has none, the AAAA query is retried against the next nameserver.

@mateusz834
Member

Does CL 550435 help with this?
Can you share the output of these commands?

dig @10.154.200.27 abtest.bilibili.co A
dig @10.154.200.27 abtest.bilibili.co AAAA
dig @10.66.200.202 abtest.bilibili.co A
dig @10.66.200.202 abtest.bilibili.co AAAA

@WeiZhixiong
Author

Does CL 550435 help with this?

Yes, but there are two problems, and CL 550435 fixes only one.

dig @10.154.200.27 abtest.bilibili.co A

; <<>> DiG 9.10.3-P4-Debian <<>> @10.154.200.27 abtest.bilibili.co A
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27383
;; flags: qr rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;abtest.bilibili.co.		IN	A

;; ANSWER SECTION:
abtest.bilibili.co.	80	IN	A	10.154.200.12

;; Query time: 0 msec
;; SERVER: 10.154.200.27#53(10.154.200.27)
;; WHEN: Mon Dec 18 23:42:46 HKT 2023
;; MSG SIZE  rcvd: 63

dig @10.154.200.27 abtest.bilibili.co AAAA

; <<>> DiG 9.10.3-P4-Debian <<>> @10.154.200.27 abtest.bilibili.co AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 64271
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;abtest.bilibili.co.		IN	AAAA

;; AUTHORITY SECTION:
bilibili.co.		542	IN	SOA	shylf-dns-03. hostmaster.bilibili.co. 2023121803 16384 900 1048576 2560

;; Query time: 0 msec
;; SERVER: 10.154.200.27#53(10.154.200.27)
;; WHEN: Mon Dec 18 23:45:05 HKT 2023
;; MSG SIZE  rcvd: 106

dig @10.66.200.202 abtest.bilibili.co A

; <<>> DiG 9.10.3-P4-Debian <<>> @10.66.200.202 abtest.bilibili.co A
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50743
;; flags: qr rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;abtest.bilibili.co.		IN	A

;; ANSWER SECTION:
abtest.bilibili.co.	120	IN	A	10.154.200.12

;; Query time: 1 msec
;; SERVER: 10.66.200.202#53(10.66.200.202)
;; WHEN: Mon Dec 18 23:45:36 HKT 2023
;; MSG SIZE  rcvd: 63

dig @10.66.200.202 abtest.bilibili.co AAAA

; <<>> DiG 9.10.3-P4-Debian <<>> @10.66.200.202 abtest.bilibili.co AAAA
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62038
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;abtest.bilibili.co.		IN	AAAA

;; AUTHORITY SECTION:
bilibili.co.		600	IN	SOA	shylf-dns-03. hostmaster.bilibili.co. 2023121803 16384 900 1048576 2560

;; Query time: 1 msec
;; SERVER: 10.66.200.202#53(10.66.200.202)
;; WHEN: Mon Dec 18 23:46:10 HKT 2023
;; MSG SIZE  rcvd: 106

@mateusz834
Member

First of all, your resolvers are not recursive. All resolvers in /etc/resolv.conf should be recursive.

I suspect that CL 550435 helps because the responses carry an EDNS(0) resource (arcount != 0), so errLameReferral is not returned.
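
Both observations can be read off dig's flags line. A rough sketch, assuming the line format shown in the dig output earlier in this thread; parseDigFlags and digHeader are hypothetical helpers, not part of any library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// digHeader holds what can be read from dig's ";; flags: ..." line.
type digHeader struct {
	flags  map[string]bool
	counts map[string]int
}

// parseDigFlags parses a line such as
// ";; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1".
func parseDigFlags(line string) (digHeader, error) {
	h := digHeader{flags: map[string]bool{}, counts: map[string]int{}}
	const prefix = ";; flags:"
	if !strings.HasPrefix(line, prefix) {
		return h, fmt.Errorf("not a dig flags line: %q", line)
	}
	// Flags come before the first ';', section counts after it.
	flagPart, countPart, _ := strings.Cut(strings.TrimPrefix(line, prefix), ";")
	for _, f := range strings.Fields(flagPart) {
		h.flags[f] = true
	}
	for _, kv := range strings.Split(countPart, ",") {
		name, val, ok := strings.Cut(kv, ":")
		if !ok {
			continue
		}
		n, err := strconv.Atoi(strings.TrimSpace(val))
		if err != nil {
			return h, fmt.Errorf("bad count %q: %w", kv, err)
		}
		h.counts[strings.TrimSpace(name)] = n
	}
	return h, nil
}

func main() {
	// Flags line taken from the AAAA responses posted above.
	h, err := parseDigFlags(";; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("recursion available (ra):", h.flags["ra"])
	fmt.Println("authoritative (aa):", h.flags["aa"])
	fmt.Println("additional records:", h.counts["ADDITIONAL"])
}
```

For the posted AAAA responses neither ra nor aa is set, and the additional section holds one record.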

Yes, but there are two problems, and CL 550435 fixes only one.

What do you mean by two problems? What does it fix, and what does it not fix?

@thanm added the NeedsInvestigation label Dec 18, 2023
@thanm
Contributor

thanm commented Dec 18, 2023

@ianlancetaylor @neild per owners

@seankhliao
Member

Duplicate of #57697

@gopherbot

Change https://go.dev/cl/550435 mentions this issue: net: Prevent unintended retries upon receiving an empty answer response from the DNS server.

gopherbot pushed a commit that referenced this issue Feb 19, 2024
…se from the DNS server.

CL https://golang.org/cl/37879 migrates DNS message parsing to the golang.org/x/net/dns/dnsmessage package. However, during the modification of the "lame referral" error check introduced by CL https://golang.org/cl/22428, a condition was overlooked. This omission results in unexpected retries when a DNS server returns an empty response (not an invalid response, but one that includes an additional section).

Fixes #57697
Fixes #64783

Change-Id: I203896aa2902c305569005c1712fd2f9f13a9b6b
Reviewed-on: https://go-review.googlesource.com/c/go/+/550435
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Commit-Queue: Ian Lance Taylor <iant@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
Reviewed-by: Dexter Ouyang <kkhaike@gmail.com>
@gopherbot

Change https://go.dev/cl/565295 mentions this issue: net: prevent unintended retries upon receiving an empty answer response from the DNS server.

@gopherbot

Change https://go.dev/cl/565296 mentions this issue: net: prevent unintended retries upon receiving an empty answer response from the DNS server.
