net: Interfaces on Linux should check for rtnetlink dump interrupted #52137
Thanks for looking into it. Would it be possible to provide a standalone reproducer? Thanks.
This script demonstrates the issue.
The result should look something like this:
Then you can compare the result files.
Given that this issue is very unlikely to happen, and that an API change is unwanted, my suggestion is that the library does a small number (one or two) of retries when it detects 'NLM_F_DUMP_INTR', and only returns an error in the extremely unlikely event that the retries also fail.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
As far as I can tell, yes.
What operating system and processor architecture are you using (go env)?
What did you do?
See #51934 for the background.
I have helped analyze the issue, mainly from a Linux kernel perspective, and I have found two issues:
We see a 'golang' program (kube-proxy) sometimes receive a truncated list of addresses when calling "net.InterfaceAddrs()".
The library issues an 'RTM_GETADDR' request to the kernel.
The 'RTM_GETADDR' request ends up in the kernel routine "inet_dump_ifaddr()":
https://elixir.bootlin.com/linux/v4.15.18/source/net/ipv4/devinet.c#L1638
The kernel keeps two 256-position hash tables through which IP addresses are reached.
One is hashed by interface name, and one by the interface index.
The one used to retrieve addresses with 'netlink' is the index hash
table. Each hash bucket contains a linked list of interfaces, and
each interface has a linked list of addresses.
The Linux kernel will fill in as many addresses as fit in one
SKB buffer (max size 32k) by iterating over the buckets/devices/addresses.
Depending on how many per-address attributes exist, approximately 400
addresses are included in one buffer.
When a buffer is full, the positions of the bucket/interface/address
iterators are saved until the next buffer is to be filled.
That will only happen after the user mode program has read the
previous one.
To visualize the issue, let's instead assume that each buffer
can only contain 6 addresses.
So, after having filled the buffer with the addresses from
interfaces 'a', 'b', 'c', and the first three addresses from
interface 'd', the buffer is full, and we save the indexes
A=4, B=2, C=3 as the iterator bucket/interface/address indexes.
Now, interface 'c' gets deleted, so the state changes like the picture
below.
Notice that when we continue, the iterator interface index will now
select interface 'e' instead of 'd'.
Furthermore, the iterator address index is now greater than the number
of addresses, so the next available address to store will be the one for
interface 'g'.
The result is that all the remaining addresses from interface 'd' and
the first three (in this case all) addresses of interface 'e' will
not be part of the reported set of addresses.
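The walk-through above can be reproduced with a small user-space simulation. This is a sketch, not kernel code: the bucket layout (interface 'a' in bucket 0; 'b', 'c', 'd', 'e' in bucket 4; 'g' in bucket 5) and the address counts are assumptions chosen so that the saved indexes come out as A=4, B=2, C=3, and 'dump', 'cursor' and 'exampleTable' are hypothetical names.

```go
package main

import "fmt"

type iface struct {
	name  string
	addrs []string
}

// cursor mirrors the saved bucket/interface/address indexes (A, B, C).
type cursor struct{ bucket, ifIdx, addrIdx int }

// dump copies at most max addresses into one "buffer", resuming from cur.
// Positions are plain indexes, so they go stale if the lists change.
func dump(table [][]iface, cur cursor, max int) (out []string, next cursor, done bool) {
	startIf, startAddr := cur.ifIdx, cur.addrIdx
	for h := cur.bucket; h < len(table); h++ {
		for i, ifc := range table[h] {
			if i < startIf {
				continue
			}
			if h > cur.bucket || i > startIf {
				startAddr = 0 // past the saved interface: start at its first address
			}
			for j, a := range ifc.addrs {
				if j < startAddr {
					continue
				}
				if len(out) == max {
					return out, cursor{h, i, j}, false // buffer full: save position
				}
				out = append(out, a)
			}
		}
		startIf = 0
	}
	return out, cursor{}, true
}

// remove deletes the named interface from one bucket's list.
func remove(bucket []iface, name string) []iface {
	var kept []iface
	for _, ifc := range bucket {
		if ifc.name != name {
			kept = append(kept, ifc)
		}
	}
	return kept
}

// exampleTable builds the assumed layout described in the lead-in.
func exampleTable() [][]iface {
	t := make([][]iface, 6)
	t[0] = []iface{{"a", []string{"a1"}}}
	t[4] = []iface{
		{"b", []string{"b1"}},
		{"c", []string{"c1"}},
		{"d", []string{"d1", "d2", "d3", "d4", "d5"}},
		{"e", []string{"e1", "e2", "e3"}},
	}
	t[5] = []iface{{"g", []string{"g1"}}}
	return t
}

func main() {
	table := exampleTable()
	first, cur, _ := dump(table, cursor{}, 6)
	fmt.Println("first buffer: ", first, "saved:", cur)

	table[4] = remove(table[4], "c") // interface 'c' is deleted between reads

	second, _, _ := dump(table, cur, 6)
	fmt.Println("second buffer:", second)
}
```

In this model the second buffer resumes at interface 'e' with an address index past the end of its list, so the remaining addresses of 'd' and all addresses of 'e' are never reported.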
To help the user become aware that the returned data may be inconsistent,
a flag 'NLM_F_DUMP_INTR' is set in the header of the message when something
like this is discovered (routine 'nl_dump_check_consistent()'). Unfortunately,
the 'golang' routine 'interfaceTable()', which is in a position to detect
this, does not check for it. (https://go.dev/src/net/interface_linux.go).
Also, if the request is for AF_UNSPEC (as in the 'golang' net 'interfaceTable()'),
then the 'NLM_F_DUMP_INTR' flag may get wiped out by the IPv6 addresses
that follow, if there are no IPv4 interfaces with addresses after the
interruption (in our example, if the 'g' interface did not exist).
Another unfortunate thing is that new interfaces are added at the head
of the hash table list, so if there are more than 256 interfaces (one in
each bucket), the issue may happen on every subsequent create/delete cycle.
So, to reproduce the issue:
# ./list_addresses.py > A
in the first column by 'ip link') masked by 0xff should now be the same
for both 'kube-proxy0' and 'transient0'.
# (./list_addresses.py >B)& sleep .5; ip link del transient0
the '* DUMP INTERESTTED *' message depending on what index your devices
have and if they have IPv6 addresses only.
So, what is needed to fix the problem:
A short-term solution for our issue would be, instead of 2), to have
the 'golang' library not use 'AF_UNSPEC'.
/Per