Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: test is failing on Dragonfly builder #34368

Closed
timdarbydotnet opened this issue Sep 18, 2019 · 35 comments
Closed

net: test is failing on Dragonfly builder #34368

timdarbydotnet opened this issue Sep 18, 2019 · 35 comments
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. OS-Dragonfly Testing An issue that has been verified to require only test changes, not just a test failure.
Milestone

Comments

@timdarbydotnet
Copy link

timdarbydotnet commented Sep 18, 2019

What version of Go are you using (go version)?

1.13

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

DragonflyBSD amd64

What did you do?

The net test is consistently failing after I upgraded the OS to a recent commit:
https://build.golang.org/log/58be31cfd1a92ba9582fdf33e01f79e03184e59b

At first glance, this appears to be a Dragonfly bug
http://bugs.dragonflybsd.org/issues/3205

@dmitshur dmitshur added OS-Dragonfly Testing An issue that has been verified to require only test changes, not just a test failure. labels Sep 18, 2019
@timdarbydotnet
Copy link
Author

According to Matt Dillon, the structural elements in the route messages have changed. DF team is working on a fix.

@toothrot toothrot changed the title net: test is failing on Dragonfly builder x/net: test is failing on Dragonfly builder Sep 20, 2019
@gopherbot gopherbot added this to the Unreleased milestone Sep 20, 2019
@toothrot toothrot changed the title x/net: test is failing on Dragonfly builder net: test is failing on Dragonfly builder Sep 20, 2019
@toothrot toothrot removed this from the Unreleased milestone Sep 20, 2019
@toothrot toothrot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 20, 2019
@toothrot toothrot added this to the Go1.14 milestone Sep 20, 2019
@tuxillo
Copy link
Contributor

tuxillo commented Sep 26, 2019

It seems we need to adjust a defs file in https://github.com/golang/net but we're not sure what's the workflow to get those changes into Go. Can you please let us know?

@bcmills
Copy link
Contributor

bcmills commented Sep 27, 2019

we're not sure what's the workflow to get those changes into Go. Can you please let us know?

The process for the golang.org/x repos is the same as for the main go repo: https://golang.org/doc/contribute.html

CC @mikioh @bradfitz

@andybons
Copy link
Member

andybons commented Oct 3, 2019

@tdfbsd the dragonfly builder is a line of red on build.golang.org with a number of failures still: https://build.golang.org/log/e33c85c1a7f2e094141d05bafa426c9038a2bce3

Any update on progress?

@timdarbydotnet
Copy link
Author

I think @tuxillo is still working this.

@tuxillo
Copy link
Contributor

tuxillo commented Oct 3, 2019

FAIL golang.org/x/net/ipv4 0.037s
And the test that belong in there didn't fail before? I did a quick test in 5.6 and they do fail there too.

@timdarbydotnet
Copy link
Author

@tuxillo checkout out the dragonfly column on https://build.golang.org/ to see what @andybons is referring to.

@tuxillo
Copy link
Contributor

tuxillo commented Oct 3, 2019

Ah, I see, still the route mistmatch thing. There was a commit to x/net, see: golang/net@c5a3c61

I don't know how x/net is imported into Golang, how can I help?

@bradfitz
Copy link
Contributor

bradfitz commented Oct 4, 2019

@tuxillo, see https://github.com/golang/go/blob/master/src/README.vendor

But I don't see how golang/net@c5a3c61 could help --- that just removes some symbols, no?

@tuxillo
Copy link
Contributor

tuxillo commented Oct 4, 2019

@bradfitz as far as I understand it, this file is autogenerated based on the defs file:
vendor/golang.org/x/net/route/zsys_dragonfly.go

So when that file is generated again, it will have the correct RTM_VERSION and the "message mismatch" should go away.

@andybons
Copy link
Member

andybons commented Oct 4, 2019

@tuxillo you have to generate the file (on a Dragonfly machine) then send that change for review. The instructions are at the top of the generated file. As a tip, cmd/cgo translates to go tool cgo.

@tuxillo
Copy link
Contributor

tuxillo commented Oct 4, 2019

@andybons oh I thought there was some merging from 'x/net' to golang somehow and frequently. I'll do that then.

@gopherbot
Copy link

Change https://golang.org/cl/199557 mentions this issue: net/route: adjust some constants for DragonFly BSD

@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019
@bradfitz
Copy link
Contributor

I installed DragonFly 5.6.2 and ran the tests as root at tip and they all passed. Including golang.org/x/net/route.

What version are you running, @tdfbsd?

@bradfitz
Copy link
Contributor

Nevermind. I see this a problem at Dragonfly master.

@tuxillo
Copy link
Contributor

tuxillo commented Oct 11, 2019

@bradfitz it affects only master because we've bumped RTM_VERSION there and not in RELEASE (5.6.2) but in the next release it will also fail.

With regards to: "In any case, even if we re-generate the zsys_dragonfly.go files in x/net/route and x/net/ipv6, it only removes 3 lines of unexported constants that nothing ever uses, so that won't help fix the broken builders."

That's only partially true. It would also get the new RTM_VERSION. I've regenerated it locally and it did:

 const (
-       sysRTM_VERSION = 0x6
+       sysRTM_VERSION  = 0x7

In any case, I have two things that I'd like to know:

  1. Can we have two separate builders, one for 'master' and one for 'release' ?
  2. How to deal with such cases in Go so that programs work in both 'master' and 'release' ?

@bcmills
Copy link
Contributor

bcmills commented Oct 11, 2019

  1. Can we have two separate builders, one for 'master' and one for 'release' ?

Absolutely! See http://golang.org/wiki/DashboardBuilders for instructions on how to add a new builder.

  1. How to deal with such cases in Go so that programs work in both 'master' and 'release' ?

That seems like more of a Dragonfly question — does the Dragonfly kernel not provide a stable syscall ABI?

(If it does not, perhaps we should change Dragonfly to use a C syscall library, as we did for macOS in #17490; CC @randall77.)

@tuxillo
Copy link
Contributor

tuxillo commented Oct 11, 2019

It's presenting a new RTM_VERSION in the headers so it's stable, right?. The way I see, the problem is that a statically generated file in x/net won't match the OS headers and hence cause problems.

@bcmills
Copy link
Contributor

bcmills commented Oct 11, 2019

It's presenting a new RTM_VERSION in the headers so it's stable, right?

What effect does changing RTM_VERSION have on the kernel's syscall ABI?
(I'm not familiar with the Dragonfly kernel.)

Do Dragonfly kernels provide ABI compatibility with previous RTM_VERSIONs? If so, we should probably stay on the older RTM_VERSION.

@tuxillo
Copy link
Contributor

tuxillo commented Oct 12, 2019

Sorry, I mean the API is the same but indeed the ABI is not backwards compatible after the bump of RTM_VERSION. We don't have a compat layer, unfortunately.
For our 'master' branch users, any binary using the old RTM_VERSION won't work. It hasn't been bumped in 'release' (stable) yet, it will be on the next release cycle.

@bradfitz
Copy link
Contributor

That's unfortunate. Is that the official Dragonfly policy or was that just an accidental ABI/compat breakage?

Should Go be detecting the kernel version at runtime and using a different ABI depending on where we find ourselves?

Or should Go only target the latest release? Or should Go only target master?

For instance, for other operating systems we have policies & docs on this, like:

https://github.com/golang/go/wiki/OpenBSD

Go aims to support the two most recent OpenBSD releases, because OpenBSD officially supports only the two most recent releases, and makes a best-effort attempt to maintain ABI support in consecutive releases.

We need to figure out what the Dragonfly policy is.

@rsmarples
Copy link

It's not acciental breakage and it's no different from OpenBSD. How Go wants to support it is up to Go. My recommendation would be to figure it out at compile time like every other program that depends on RTM_VERSION.

http://lists.dragonflybsd.org/pipermail/users/2019-September/358280.html

@bradfitz
Copy link
Contributor

How Go wants to support it is up to Go.

To be clear, the Go team doesn't maintain the Dragonfly port.

My recommendation would be to figure it out at compile time like every other program that depends on RTM_VERSION

No other Go port does, that, IIRC.

We either probe the kernel at runtime and figure out what ABI we need to speak (e.g. https://github.com/golang/net/blob/146acd2/route/sys_freebsd.go#L59) or we call C functions instead (Solaris, Darwin, Windows?), but it sounds like that's not even stable here.

Dragonfly-Go maintainers need to decide: do you support stable? Or only tip? Or both tip+stable? If the latter, I suggest one of the Dragonfly-Go maintainers starts writing some code to probe which kernel/ABI version is applicable or add some conditional paths to do the right thing depending on the environment.

@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Oct 16, 2019
@gopherbot gopherbot removed the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 16, 2019
@tuxillo
Copy link
Contributor

tuxillo commented Oct 16, 2019

Thanks @bradfitz, we'll discuss it and we'll update this issue when we come to a conclusion. Please don't close it.

@gopherbot
Copy link

Change https://golang.org/cl/201482 mentions this issue: net: skip some interface tests on Dragonfly for now

@tuxillo
Copy link
Contributor

tuxillo commented Oct 17, 2019

@bradfitz initially we're going to try to support both directly in Go. The alternative would be supporting tip (master branch) and do some patching in our package system but I think that doesn't scale for future changes, so we're leaving it as last resort for now.

I've been taking a look at x/net defs for FreeBSD to get a grasp on how it works. I can see they define structures for specific FreeBSD versions (i.e. if_data_freebsd7, ..., if_msghdr_freebsd8 etc) and that works for struct size mismatch case. But for the constant value cases like RTM_VERSION, how do you handle it in the defs file? Maybe you can't handle it there and you have to handle it in the route code directly?

@bcmills
Copy link
Contributor

bcmills commented Oct 17, 2019

@tuxillo, is RTM_VERSION actually needed for Go programs?

The only use of it I see is in syscall.ParseRoutingMessage function, but since that's a deprecated function in a deprecated package, maybe it's not such a big deal: as far as I can tell, syscall.ParseRoutingMessage is not called anywhere in the standard library — even in tests!


Can syscall.ParseRoutingMessage be implemented without depending on RTM_VERSION? If not, what would break if syscall.ParseRoutingMessage unconditionally returned an error on Dragonfly?

@tuxillo
Copy link
Contributor

tuxillo commented Oct 17, 2019

In x/net it seems to be used in several functions. It's also part of the RouteMessage struct.

About returning an error unconditionally, I'm not sure. We'd have to test.

@bcmills
Copy link
Contributor

bcmills commented Oct 17, 2019

The sysRTM_VERSION constant in x/net/route is at least unexported, so the functions that use it could presumably be changed to do something conditional without a breaking change in the API.

@gopherbot
Copy link

Change https://golang.org/cl/201740 mentions this issue: doc/go1.14.html: add some TODOs about various ports

gopherbot pushed a commit that referenced this issue Oct 17, 2019
Skipping tests isn't great, but neither is a wall of red masking other
potential regressions.

Updates #34368

Change-Id: I5fdfa54846dd8d648001594c74f059af8af52247
Reviewed-on: https://go-review.googlesource.com/c/go/+/201482
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Bryan C. Mills <bcmills@google.com>
gopherbot pushed a commit that referenced this issue Oct 17, 2019
Updates #15581
Updates #34368

Change-Id: Ife3be7ed484cbe87960bf972ac701954d86127d8
Reviewed-on: https://go-review.googlesource.com/c/go/+/201740
Reviewed-by: Bryan C. Mills <bcmills@google.com>
@tuxillo
Copy link
Contributor

tuxillo commented Oct 18, 2019

As a reference, the error that appeared on the tests is errMessageMismatch which comes from ParseRIB in x/net.

The call stack is something like this:

(Multiple places)
Interfaces()
InterfaceTable()
interfaceMessages()
ParseRIB()

I wonder if I can hardcode something like sysRTM_VERSION56 in the defs file (which is then parsed by cgo) and put a conditional check in ParseRIB, maybe that would work?

@gopherbot
Copy link

Change https://golang.org/cl/202317 mentions this issue: all: fix tests on dragonfly

gopherbot pushed a commit to golang/net that referenced this issue Oct 21, 2019
Detect the ABI version based on kern.osreldate.

Only use 32-bit cmsg alignment for versions before the September 2019
ABI changes:
http://lists.dragonflybsd.org/pipermail/users/2019-September/358280.html

Use RTM_VERSION 7 on Dragonfly master (5.8 release).

Determine sizeof struct ifa_msghdr at runtime based on the ABI version.

Temporarily skip some test relying on the net package which will only be
fixed once this CL is re-vendored into std.

Updates golang/go#34368

Change-Id: I732fab21d569b303f45dfb6a0bbbb11469511a07
Reviewed-on: https://go-review.googlesource.com/c/net/+/202317
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/202438 mentions this issue: vendor: re-vendor golang.org/x/net to fix Dragonfly build

@gopherbot
Copy link

Change https://golang.org/cl/202457 mentions this issue: ipv6: re-enable tests on Dragonfly

gopherbot pushed a commit to golang/net that referenced this issue Oct 21, 2019
Now that golang.org/x/net was re-vendored into std, these tests should
pass again.

Updates golang/go#34368

Change-Id: I6f253896836fb18d46875a9420de58ca05a49646
Reviewed-on: https://go-review.googlesource.com/c/net/+/202457
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
@gopherbot
Copy link

Change https://golang.org/cl/217358 mentions this issue: doc/go1.14: remove TODO about Dragonfly passing

gopherbot pushed a commit that referenced this issue Feb 4, 2020
Both the Dragonfly release and tip builder have been passing for a
while. The net package's interface API is working on both builders since
CL 202317 which has been re-vendored in CL 202438.

Updates #34368
Updates #36878

Change-Id: I187178b3a59f2604187af453207fb4e24a56105c
Reviewed-on: https://go-review.googlesource.com/c/go/+/217358
Reviewed-by: Ian Lance Taylor <iant@golang.org>
@golang golang locked and limited conversation to collaborators Feb 1, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. OS-Dragonfly Testing An issue that has been verified to require only test changes, not just a test failure.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants