Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

array to slice with generics, array type/length specialization with generics #51740

Open
zx2c4 opened this issue Mar 17, 2022 · 7 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@zx2c4
Copy link
Contributor

zx2c4 commented Mar 17, 2022

I've got a trie structure that works over 4 byte arrays and 16 byte arrays, for IPv4 and IPv6 respectively. I used to just use a []byte slice for this, and adjust accordingly based on len(x) when it mattered.

Actually, pretty much the only place it mattered was here:

func commonBits(ip1, ip2 []byte) uint8 {
	if len(ip1) == 4 {
		a := binary.BigEndian.Uint32(ip1)
		b := binary.BigEndian.Uint32(ip2)
		x := a ^ b
		return uint8(bits.LeadingZeros32(x))
	} else if len(ip1) == 16 {
		a := binary.BigEndian.Uint64(ip1)
		b := binary.BigEndian.Uint64(ip2)
		x := a ^ b
		if x != 0 {
			return uint8(bits.LeadingZeros64(x))
		}
		a = binary.BigEndian.Uint64(ip1[8:])
		b = binary.BigEndian.Uint64(ip2[8:])
		x = a ^ b
		return 64 + uint8(bits.LeadingZeros64(x))
	} else {
		panic("Wrong size bit string")
	}
}

So in converting this all away from slices and toward static array sizes, I made this new type constraint:

type ipArray interface {
	[4]byte | [16]byte
}

Then I broke out those two if clauses into their own functions:

func commonBits4(ip1, ip2 [4]byte) uint8 {
	a := binary.BigEndian.Uint32(ip1[:])
	b := binary.BigEndian.Uint32(ip2[:])
	x := a ^ b
	return uint8(bits.LeadingZeros32(x))
}

func commonBits16(ip1, ip2 [16]byte) uint8 {
	a := binary.BigEndian.Uint64(ip1[:8])
	b := binary.BigEndian.Uint64(ip2[:8])
	x := a ^ b
	if x != 0 {
		return uint8(bits.LeadingZeros64(x))
	}
	a = binary.BigEndian.Uint64(ip1[8:])
	b = binary.BigEndian.Uint64(ip2[8:])
	x = a ^ b
	return 64 + uint8(bits.LeadingZeros64(x))
}

So far, so good, but what is the implementation of commonBits?

func commonBits[B ipArray](ip1, ip2 B) uint8 {
	// ???
}

If you try to convert the array to a slice, the compiler will bark at you. You can use any(ip1).(type) in a switch, but then you get runtime overhead. I've figured out a truly horrific trick that combines two Go 1.17 features into an unholy mess:

func giveMeA4[B ipArray](b B) [4]byte {
	return *(*[4]byte)(unsafe.Slice(&b[0], 4))
}

func giveMeA16[B ipArray](b B) [16]byte {
	return *(*[16]byte)(unsafe.Slice(&b[0], 16))
}

func commonBits[B ipArray](ip1, ip2 B) uint8 {
	if len(ip1) == 4 {
		return commonBits4(giveMeA4(ip1), giveMeA4(ip2))
	} else if len(ip1) == 16 {
		return commonBits16(giveMeA16(ip1), giveMeA16(ip2))
	}
	panic("Wrong size bit string")
}

This... works, amazingly. Similarly, when I needed to adjust my randomized unit tests, I wound up going with code that looks like this:

var addr B
rand.Read(unsafe.Slice(&addr[0], len(addr)))

I asked some Go experts if there was a better way, and the answer I got was that generics aren't yet well suited for arrays. So, I'm opening this rather vague report in hopes that it can turn into a proposal for something useful.

CC @danderson @sebsebmc @josharian

@ianlancetaylor
Copy link
Contributor

I think that at a high level you are looking for something like C++ template specialization: a way to write an implementation of a generic function to use for a specific type. The current generics support in Go does not provide a way to do that, which is intentional (https://go.googlesource.com/proposal/+/refs/heads/master/design/43651-type-parameters.md#omissions).

Please let me know if I misunderstand.

That said, if using any(ip1).(type) in a type switch does introduce runtime overhead (I haven't checked) we can consider compiler optimizations for that case. For example perhaps a type switch on a generic argument with a small number of cases should be used as an indication that we should stencil out those cases rather than using a dictionary. (And of course more generally if the constraint(s) only permit a couple of types we could consider stenciling out those types always.)

That is, perhaps we can approach this as a compiler optimization issue rather than as a language issue.

@zx2c4
Copy link
Contributor Author

zx2c4 commented Mar 17, 2022

The other funny aspect of this is that you can't slice the type, even though you can call len(x) on it.

type ipArray interface {
	[4]byte | [16]byte
}
func legal[B ipArray](x B) {
	_ = len(x)
}
func illegal[B ipArray](x B) {
	_ = x[:]
}

I'm not sure I understand why len(x) would be okay but x[:] would not. It's using the same information in both cases to derive the result. I wind up hacking around that using the unsafe.Slice trick above, but I don't quite see why that should be necessary?

@ianlancetaylor
Copy link
Contributor

The inability to slice is a deficiency of the current implementation, not a part of the language design. There are a number of similar infelicities in 1.18.

@zx2c4
Copy link
Contributor Author

zx2c4 commented Mar 17, 2022

Oh that's good news. So it sounds like we've identified two compiler improvements that might actually be actionable without having to have a complicated language proposal:

  1. Optimizing switch x := any(t).(type) to not require any runtime overhead for reasonably contoured cases.
  2. Allow slicing types without having to resort to unsafe.Slice to force it.

The first would help with a sort of poorman's type specialization via a dispatcher that could disappear at compile time. This would help with the generic case.

The second would help with array specialization, among other things, which could also disappear at compile time depending on the context.

That seems like a straightforward way to address this, right?

@ianlancetaylor
Copy link
Contributor

Sounds plausible to me.

@davidmdm
Copy link

davidmdm commented Mar 17, 2022

This may be a bit naive of me, but if this proposal was accepted: #45380 (type switch on parametric types) wouldn't this solve this issue?

@zx2c4
Copy link
Contributor Author

zx2c4 commented Mar 17, 2022

That's similar but it's a language proposal, right? The above is just about adding some small compiler optimization to accomplish the same thing.

@heschi heschi added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Mar 17, 2022
@heschi heschi added this to the Backlog milestone Mar 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants