x/net/tcp: provide a way to set keepalive time and interval separately #8328

Open
gopherbot opened this issue Jul 4, 2014 · 7 comments

@gopherbot

by redforks:

func setKeepAlivePeriod() in net/tcpsockopt_unix.go is the back end of
TCPConn.SetKeepAlivePeriod() on the Linux platform. It sets both TCP_KEEPINTVL and
TCP_KEEPIDLE to the period passed in:

func setKeepAlivePeriod(fd *netFD, d time.Duration) error {
    if err := fd.incref(); err != nil {
        return err
    }
    defer fd.decref()

    // The kernel expects seconds so round to next highest second.
    d += (time.Second - time.Nanosecond)
    secs := int(d.Seconds())

    err := os.NewSyscallError("setsockopt", syscall.SetsockoptInt(fd.sysfd, syscall.IPPROTO_TCP, syscall.TCP_KEEPINTVL, secs))
    if err != nil {
        return err
    }
    return os.NewSyscallError("setsockopt", syscall.SetsockoptInt(fd.sysfd, syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE, secs))
}

Linux has three settings that affect keepalive:

tcp_keepalive_time (socket option TCP_KEEPIDLE)

 the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further

tcp_keepalive_intvl (socket option TCP_KEEPINTVL)

 the interval between subsequent keepalive probes, regardless of what the connection has exchanged in the meantime

tcp_keepalive_probes (socket option TCP_KEEPCNT)

 the number of unacknowledged probes to send before considering the connection dead and notifying the application layer

If I am not wrong, the kernel first waits TCP_KEEPIDLE seconds and then sends a keepalive
packet (the first probe). If it receives no response from the peer, it waits TCP_KEEPINTVL
seconds and sends another keepalive packet, repeating until TCP_KEEPCNT probes have gone
unanswered, and then closes the TCP connection.

The Linux default parameters are:
  TCP_KEEPIDLE: 7200 // 2 hours
  TCP_KEEPINTVL: 75 // 75 seconds
  TCP_KEEPCNT: 9 // try 9 times
So by default a dead connection is closed after 7200+75*9 = 7875 seconds, about 2 hours
and 11 minutes.

Setting both TCP_KEEPINTVL and TCP_KEEPIDLE to the same value is normally not what we
want. For example:

  conn.SetKeepAlivePeriod(time.Hour)

The actual effect is 1 hour + 9 hours: the kernel closes the connection only after 10
hours, which is much longer than with the OS default settings.
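
The arithmetic above can be checked with a small sketch. The helper name worstCaseDetection is hypothetical (it is not part of the net package), and the formula assumes the simple model described in this report: idle time plus one interval per unanswered probe.

```go
package main

import (
	"fmt"
	"time"
)

// worstCaseDetection is a hypothetical helper. It estimates how long the
// kernel takes to give up on a dead peer, assuming the simple model
// idle + count*interval described in the report.
func worstCaseDetection(idle, interval time.Duration, count int) time.Duration {
	return idle + time.Duration(count)*interval
}

func main() {
	// Linux defaults: TCP_KEEPIDLE=7200s, TCP_KEEPINTVL=75s, TCP_KEEPCNT=9.
	fmt.Println(worstCaseDetection(7200*time.Second, 75*time.Second, 9)) // 2h11m15s

	// SetKeepAlivePeriod(time.Hour) sets idle and interval to one hour each,
	// while the probe count stays at the default of 9.
	fmt.Println(worstCaseDetection(time.Hour, time.Hour, 9)) // 10h0m0s
}
```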

I guess that for compatibility reasons SetKeepAlivePeriod() cannot support all three
parameters that Linux uses, but it should at least not set TCP_KEEPINTVL.
@ianlancetaylor
Contributor

Comment 1:

Labels changed: added repo-main, release-go1.4.

@gopherbot
Author

Comment 3:

CL https://golang.org/cl/136480043 mentions this issue.

@rsc
Contributor

rsc commented Sep 16, 2014

Comment 5:

Did CL 136480043 fix this? The gobot said that the CL mentioned this issue, but it
doesn't look like it did when it was submitted.

@ianlancetaylor
Contributor

Comment 6:

CL 136480043 was simplified. It did not fix this issue.
Actually, I think this issue should be changed to request a more complex interface in
go.net. Adjusting summary and labels accordingly.

Labels changed: added repo-net, release-none, removed repo-main, release-go1.4.

@bradfitz bradfitz removed the new label Dec 18, 2014
@mikioh mikioh changed the title go.net: provide a way to set keepalive time and interval separately x/net: provide a way to set keepalive time and interval separately Dec 23, 2014
@mikioh mikioh added repo-net and removed repo-net labels Dec 23, 2014
@mikioh mikioh changed the title x/net: provide a way to set keepalive time and interval separately net: provide a way to set keepalive time and interval separately Jan 4, 2015
@rsc rsc added this to the Unplanned milestone Apr 10, 2015
@rsc rsc removed the release-none label Apr 10, 2015
@rsc rsc changed the title net: provide a way to set keepalive time and interval separately x/net: provide a way to set keepalive time and interval separately Apr 14, 2015
@rsc rsc modified the milestones: Unreleased, Unplanned Apr 14, 2015
@rsc rsc removed the repo-net label Apr 14, 2015
@mikioh mikioh changed the title x/net: provide a way to set keepalive time and interval separately x/net/tcp: provide a way to set keepalive time and interval separately May 2, 2015
@luigiberrettini

IMHO it would be useful to support, in a platform-independent way, the configuration of all keep-alive settings:

  • retry count
  • time
  • interval

An attempt has been made by @felixge with his tcpkeepalive library.

For Windows, inspiration can be taken from the work performed to close https://github.com/dotnet/corefx/issues/25040 and https://github.com/dotnet/corefx/issues/33111

@madflojo

Bumping this issue.

Having the ability to set custom keep-alive settings would be a huge benefit to folks who write TCP server applications (folks like myself). At the moment, there is no way to set the idle time (tcp_keepalive_time) or probe count (tcp_keepalive_probes) via the standard library implementation.

These settings can be set per session, and are very useful when dealing with long-lived TCP sessions. At the moment you either need to use the tcpkeepalive library, which by its own admission has issues, or export the raw connection to set these values with syscall.
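
For reference, the raw-connection workaround mentioned above might look roughly like the following on Linux. This is only a sketch under stated assumptions: setKeepAliveParams, getKeepAliveParams, ceilSeconds and demo are hypothetical helpers, not standard library APIs; the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT constants used here come from package syscall on Linux, so the code is not portable as written; and the values (60s idle, 3s interval, 10 probes) are purely illustrative.

```go
// Linux-only sketch: the TCP_KEEP* constants used below are the ones that
// package syscall defines on Linux.
package main

import (
	"fmt"
	"net"
	"syscall"
	"time"
)

// ceilSeconds rounds d up to whole seconds, because the kernel takes seconds.
func ceilSeconds(d time.Duration) int {
	return int((d + time.Second - time.Nanosecond) / time.Second)
}

// setKeepAliveParams is a hypothetical helper (not a standard library API).
// It enables keepalive and sets idle time, interval and probe count separately.
func setKeepAliveParams(c *net.TCPConn, idle, interval time.Duration, count int) error {
	if err := c.SetKeepAlive(true); err != nil { // turns on SO_KEEPALIVE
		return err
	}
	raw, err := c.SyscallConn()
	if err != nil {
		return err
	}
	opts := []struct{ name, value int }{
		{syscall.TCP_KEEPIDLE, ceilSeconds(idle)},
		{syscall.TCP_KEEPINTVL, ceilSeconds(interval)},
		{syscall.TCP_KEEPCNT, count},
	}
	var sockErr error
	err = raw.Control(func(fd uintptr) {
		for _, o := range opts {
			sockErr = syscall.SetsockoptInt(int(fd), syscall.IPPROTO_TCP, o.name, o.value)
			if sockErr != nil {
				return
			}
		}
	})
	if err != nil {
		return err
	}
	return sockErr
}

// getKeepAliveParams reads back idle seconds, interval seconds and probe count.
func getKeepAliveParams(c *net.TCPConn) (vals [3]int, err error) {
	raw, err := c.SyscallConn()
	if err != nil {
		return vals, err
	}
	names := [3]int{syscall.TCP_KEEPIDLE, syscall.TCP_KEEPINTVL, syscall.TCP_KEEPCNT}
	var sockErr error
	err = raw.Control(func(fd uintptr) {
		for i, name := range names {
			vals[i], sockErr = syscall.GetsockoptInt(int(fd), syscall.IPPROTO_TCP, name)
			if sockErr != nil {
				return
			}
		}
	})
	if err != nil {
		return vals, err
	}
	return vals, sockErr
}

// demo sets illustrative values on a loopback connection and reads them back.
func demo() ([3]int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return [3]int{}, err
	}
	defer ln.Close()
	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return [3]int{}, err
	}
	defer conn.Close()
	tcp := conn.(*net.TCPConn)
	if err := setKeepAliveParams(tcp, time.Minute, 3*time.Second, 10); err != nil {
		return [3]int{}, err
	}
	return getKeepAliveParams(tcp)
}

func main() {
	vals, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("idle=%ds interval=%ds count=%d\n", vals[0], vals[1], vals[2])
}
```

Other platforms expose different option names (or none at all), which is part of why a portable interface is being requested here.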

@Gr33nbl00d

Gr33nbl00d commented Apr 25, 2023

Indeed, this looks like a bug to me. Why is timeout = interval? That does not make any sense.

We usually receive a request from our IoT devices every minute, so we would like to set the timeout to more than a minute, but the interval to something small like 3 seconds.

This way we don't flood the network with keepalives. If we do not receive anything for 1 minute, the connection is timed out after 10 (keepalive count) x interval = 30 seconds.

With the Go implementation, the only options I have are to set it to 3 seconds, flooding the network with unneeded keepalive messages, or to set it to 1 minute and accept that a connection takes 9 minutes to time out! This is not acceptable. If we have 50,000 IoT devices connected and there is a network break, it takes 9 minutes to clean up the mess. But sending an ACK request every 3 seconds to that number of devices is a huge increase in traffic over a year.
