New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
net: ListenPacket can't be used on multicast address #34728
Comments
/cc @bradfitz |
/cc @mikioh |
Last issue-tracker comments by mikioh were last spring, despite several CCs since. Maybe his github notifications are off? https://github.com/golang/go/issues?utf8=%E2%9C%93&q=is%3Aissue+commenter%3Amikioh |
I have hit up against this problem myself. Actually there is two bugs in my particular case, it's not nearly as clean as I would like it but I found the fix. First issue is that As the bug report here says, listening to that multicast address also binds to 0.0.0.0 and if it had no direct impact on functionality I'd say it's not that important. But because listening to the multicast address binds to all interfaces on the same port as the multicast listener, I can't also send to the address from the same machine (which led me to investigating how to tell it what listen port to use for sending). Second issue is when I have a tun0 interface (VPN) running, it doesn't multicast out the physical network interface, so I filter out interfaces without hardware addresses that don't have the mulitcast flag set. These both relate to the dialer. Note, it works fine with no tunnel interface, presumably set to higher priority, and the network stack dutifully sends it only out of the higher prioritized virtual interface and the rest of the lan hears nothing. But it is then impossible to listen on the same port on the same machine that is connecting. To solve it:
As I see it there is two clear bugs involved. The function
In fact, what the OP says about it being wrong that the multicast listener binds to 0.0.0.0:portnumber is not really a problem. The problem is in the dialler, which doesn't enable SO_REUSEPORT and monopolises the port number thereby for all interfaces. Second, it doesn't discriminate which local IP to bind to, and lets the operating system network stack priority list to choose the first one it sees, which is the wrong one. Dialler should not pick a virtual interface for the LAN multicast addresses by default, as this makes LAN multicasting impossible, and it should not do that when multicast address especially that is for LAN multicasting is being sent to. Third is the dialler uses the same port on 0.0.0.0 (with the wrong virtual interface IP) as you are trying to broadcast to, which causes a port conflict, but I think it is trying this without REUSEPORT or even REUSEADDR enabled so you can't send and receive at the same time. It's not that difficult to fix these things but it seems like a pretty bad omission that it in two cases automatically chooses things that cause a binding conflict when there is enough information just from the IP to indicate that it's a LAN address and thus should not use a virtual interface by default. Here is the code I wrote that resolves the issue:
|
I see there has been no progress on this issue. I have also discovered that the multicast group join function in dgramopt sets the wrong options for darwin and freebsd platforms. I suspect there needs to be a different implementation between linux/android/windows and *bsd/darwin. I wrote the code above primarily for the broadcast purpose anyway, I can't figure out why but this code works for the same exact purpose: package main
import (
"fmt"
"net"
"time"
"github.com/p9c/pod/pkg/log"
)
const UDP4MulticastAddress = "224.0.0.1:11049"
func main() {
ready := make(chan struct{})
go func() {
log.INFO("listening")
ready <- struct{}{}
Listen(UDP4MulticastAddress, func(addr *net.UDPAddr, count int, data []byte) {
log.INFO(addr, count, string(data[:count]))
})
}()
bc, err := NewBroadcaster(UDP4MulticastAddress)
if err != nil {
log.ERROR(err)
}
<-ready
for i := 0; i < 10; i++ {
log.INFO("sending", i)
// var n int
_, err = bc.Write([]byte(fmt.Sprintf("this is a test %d", i)))
// log.INFO(n, err)
if err != nil {
log.ERROR(err)
}
}
time.Sleep(time.Second * 1)
}
const (
maxDatagramSize = 8192
)
// NewBroadcaster creates a new UDP multicast connection on which to broadcast
func NewBroadcaster(address string) (*net.UDPConn, error) {
addr, err := net.ResolveUDPAddr("udp", address)
if err != nil {
return nil, err
}
conn, err := net.DialUDP("udp", nil, addr)
if err != nil {
return nil, err
}
return conn, nil
}
// Listen binds to the UDP address and port given and writes packets received
// from that address to a buffer which is passed to a hander
func Listen(address string, handler func(*net.UDPAddr, int, []byte)) {
// Parse the string address
addr, err := net.ResolveUDPAddr("udp", address)
if err != nil {
log.ERROR(err)
}
// Open up a connection
conn, err := net.ListenMulticastUDP("udp", nil, addr)
if err != nil {
log.ERROR(err)
}
conn.SetReadBuffer(maxDatagramSize)
// Loop forever reading from the socket
for {
buffer := make([]byte, maxDatagramSize)
numBytes, src, err := conn.ReadFromUDP(buffer)
if err != nil {
log.ERROR("ReadFromUDP failed:", err)
}
handler(src, numBytes, buffer)
}
} |
EDIT2: This code does work on freebsd: https://github.com/p9c/pod/blob/master/pkg/transport/cmd/example/listenservemulticast.go I must be doing something wrong. EDIT: I may need to not set the options for sends to non-multicast addresses here? I got it working on both darwin and linux now, here is the altered code related to the NewConnection function presented above. The trick seems to be that the listener has to have the SO_REUSEADDR toggled, and it seems to need a net.PacketConn (that might not be essential but yes, this works). I'm not sure if this is actually relevant or helpful relating to the bug in this topic exactly, but as you can see it uses ListenPacket and if you feed NewConnection a multicast address it works (and doesn't hear its own packets also, importantly). func reusePort(network, address string, conn syscall.RawConn) error {
return conn.Control(func(descriptor uintptr) {
syscall.SetsockoptInt(int(descriptor), syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1)
})
}
// NewConnection creates a new connection with a defined default send
// connection and listener and pre shared key password for encryption on the
// local network
func NewConnection(send, listen, preSharedKey string, maxDatagramSize int,
ctx context.Context, multicast bool) (c *Connection, err error) {
sendAddr := &net.UDPAddr{}
sendConn := &net.UDPConn{}
listenAddr := &net.UDPAddr{}
var listenConn net.PacketConn
if listen != "" {
config := &net.ListenConfig{Control: reusePort}
listenConn, err = config.ListenPacket(context.Background(), "udp4", listen)
if err != nil {
log.ERROR(err)
}
}
if send != "" {
sendAddr, err = net.ResolveUDPAddr("udp4", send)
if err != nil {
log.ERROR(err)
}
sendConn, err = net.DialUDP("udp4", nil, sendAddr)
if err != nil {
log.ERROR(err, sendAddr)
}
// log.SPEW(sendConn)
}
ciph := gcm.GetCipher(preSharedKey)
return &Connection{
maxDatagramSize: maxDatagramSize,
buffers: make(map[string]*MsgBuffer),
sendAddress: sendAddr,
SendConn: sendConn,
listenAddress: listenAddr,
listenConn: &listenConn,
ciph: ciph, // gcm.GetCipher(*cx.Config.MinerPass),
ctx: ctx,
mx: &sync.Mutex{},
}, err
} I want to also report something that might be helpful in the future: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=149086 The code above works fine on Linux and MacOS for sure, but the sender doesn't work on FreeBSD. I was relieved when I found this issue on the FreeBSD Bugzilla because now I know where it is probably from. So you might be able to add this one to the caveats about known bugs on the root of the net library documentation. But there is definitely an issue with the UDP multicast parts of the net library. Implicitly any sender or listener using a multicast address needs to have the SO_REUSEADDR or SO_REUSEPORT (I lean towards the former as being more consistently supported across windows, linux, darwin, and *bsds. Just my opinion. |
Can we focus on @quentinmit 's original issue report and talk about the setting of the addr to IPv4zero/IPv6unspecified? (I see that the socket and port reuse issue that the subsequent comments focused on has been resolved and really that was an independent problem anyway.) The current implementation behavior is incorrect and it means that if I were to listen for multicast traffic on 239.2.3.101 by calling This is broken behavior for any system that is subscribing to multicast traffic from many different multicast addresses where the ports for the packets happen to overlap. The documentation for https://godoc.org/golang.org/x/net/ipv4 posits that you can look at the control messages and determine if you actually care about the packet of data but this is hugely inefficient and simply won't scale for some workloads. The 'fix' is to remove If someone wants the behavior described in the comment then they can specify the IP address of IPv4zero/IPv6unspecified -- if that is what they want. Does this sound acceptable? |
cc @FiloSottile |
I waited for 5 weeks for @FiloSottile to respond to the tag above and then sent an email and got a response. I wrote:
His response was:
So, bringing it back here. This really doesn't need to be a huge problem -- in fact it isn't for any language by golang. Whoever decided that any golang program on a box would receive all multicast traffic for a specific port (independent of multicast IP addr) was optimizing for the wrong thing. The fundamentals of a language - or any system - should be, "Make the easy things easy, and the hard things possible." How should we proceed here? |
cc @odeke-em for any input... |
I don't claim to understand this code. That said, there is a comment for the code that listens on a multicast address: // We provide a socket that listens to a wildcard
// address with reusable UDP port when the given laddr
// is an appropriate UDP multicast address prefix.
// This makes it possible for a single UDP listener to
// join multiple different group addresses, for
// multiple UDP listeners that listen on the same UDP
// port to join the same group address. The code this refers to dates back to https://golang.org/cl/5562048 for #41921 . It sounds like people here are saying that the current approach in the code is not desirable. But I'm not clear on what the right behavior is. |
To be clear, I am not the owner or even an expert of this code, so I assume @networkimprov cc'd me just in case I had encountered this before, which I haven't, and discussing the issue in my personal email was not going to help. |
Totally understand @FiloSottile - I only reached out through email as I was concerned that they were waiting on you to make forward progress. :). My apologies for bugging you on your personal email.
With the current behavior of the code this call Looking more closely at the comment in the code there are two subparts that we should more closely understand. The first is this:
Indeed this is what the code is currently doing - although only for the same port. It isn't completely generic like "I want a single listener to receive data from X:Y and W:V multicast addresses". Is there some use case out there where the author wanted multiple multicast addresses to be listened to but for the same port? Perhaps. Should this mean that you therefore cannot listen to a single multicast address and port pair? No. The second part of the comment reads:
This use case is much more common but it is independent of the issue that we have narrowed this discussion down to. As you can see in the bsd version of setDefaultMulticastSockOpts - Lines 46 to 56 in 4ce9ea5
The new code, along with updated comment, would read something like this:
This change keeps intact the reuse usecase but removes the shortcut where you get all multicast traffic independent of multicast IP. The general principle of 'Make the easy things easy and the hard things possible' is in effect here. If the caller wants multicast traffic for all multicast IPs then they can pass in 0.0.0.0 themselves. They will have to set the socket options if they want the second behavior but that is up to them. |
I am also hitting this problem. Can someone please explain, why when I ask to bind on specific IP (which is pretty low-level stuff) does the programming language decide to do something else? |
@danielkucera - no other programming language that I am aware of will do this. My best guess is that someone - very early on - had a use case where they wanted all multicast traffic on a specific port to be delivered to the reader. Or someone was very confused about what they were doing. Either way this is a bug with a simple fix that we can apply to Golang to bring its semantics and usefulness on par with all other languages. |
Yes, the current behavior is confusing; I would like to get a fix for this. We have some video monitoring service that is using multicast. From time to time, a different channel than expected channel would appear in the control room. It turns out, the receiver pull program written in Go would get a random stream from the Linux Kernel if the port is overbooked. |
@ianlancetaylor any thoughts? |
Sorry, nothing to add to what I wrote last October. |
Anyone else we could cc? |
@ianlancetaylor you wrote.
In October you wrote "It sounds like people here are saying that the current approach in the code is not desirable. But I'm not clear on what the right behavior is." I detailed - in my comment in October 2020 - the specific change to behavior (along with what the code change would be). I believe that it is "right" in that it satisfies the principle of least surprise and also inline with what other programming languages do. I assume that @networkimprov was asking for your thoughts on the specific proposal for the change. :) |
I'm sorry, I'm not a networking expert at all. Can you send in a change with the exact source code change that you are suggesting? Thanks. |
Absolutely, if we can get agreement from any necessary parties on the semantic changes. I believe that this is a prerequisite before investing any time to make the code changes - especially given the specificity of the description that I provided above. |
I'm hoping to see the change to make sure that I fully understand what you wrote. Code is clearer than words. Thanks. |
@ianlancetaylor see the attached MR #47772 . Is it clear now? |
Change https://golang.org/cl/343149 mentions this issue: |
@danielkucera - thank you very much! This is perfect. |
Thanks for the patch. i'm running into the same issue in that I need to share multicast address between multiple applications on the same interface and the other applications (not golang) behave appropriately ... i'm going to incorporate your change to my personal branch and see if it works for my use case since this code has not yet been merged/incorporated/looked at |
So I have pieced together this example for listening to specific IP addresses on IPv4 without receiving everything and filtering. Does it look okay?
|
What version of Go are you using (
go version
)?go version go1.12.9
Does this issue reproduce with the latest release?
Yes, the code is unchanged on master.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
What did you expect to see?
A socket bound to 224.0.0.128:5076 (but not joined to a multicast group)
What did you see instead?
A socket bound to 0.0.0.0:5076 (and not joined to any multicast group)
net.ListenPacket
's documentation says:But in fact it also applies this logic if the address is a multicast address. I suspect this is because
net.ListenMulticastUDP
is documented as having this behavior:and it shares an implementation function with
ListenPacket
:https://github.com/golang/go/blob/master/src/net/sock_posix.go#L210
And no, golang.org/x/net/ipv4 does not provide a workaround here, because all of its APIs require that you start with
net.Listen*
to get anet.PacketConn
. Likewise, thesyscall
package doesn't provide an escape hatch becausenet.FilePacketConn
is not implemented on Windows (though I will likely end up doing that anyway, with my resulting code only supporting POSIX environments).The text was updated successfully, but these errors were encountered: