Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/exp/slices: Allow different types for haystack/needle in BinarySearchFunc #57348

Closed
Merovius opened this issue Dec 15, 2022 · 12 comments
Closed

Comments

@Merovius
Copy link
Contributor

I propose relaxing the constraints on BinarySearchFunc to allow using two different types:

func BinarySearchFunc[E, T any](x []E, target T, cmp func(E, T) int) (int, bool)

As it adds a type parameter, this is not backwards compatible, but it's exp and the extra type argument is inferable. We could also delay this until, if ever, we move the slices package to the stdlib.

Rationale

A usecase this just came up for me is that I wrote a helper package for Advent of Code to deal with sparse interval sets. A set is represented as a slice of intervals [a,b), [c,d),…, [y,z) with a<b<…<y<z. Looking up an element in this set is done via a binary search for an interval containing it. Similarly, intersections and unions of these sets involve a bunch of binary searching. In all of these cases, the algorithm naturally looks up an integer, not an interval.

With the current signature, the union version would be written as

lo, lok := slices.BinarySearchFunc(s, Interval[T]{i.Min, i.Min}, func(i, j Interval[T]) int {
	if j.Min > i.Max {
		return -1
	}
	if j.Min+1 < i.Min {
		return 1
	}
	return 0
})
// vs.
lo, lok := slices.BinarySearchFunc(s, i.Min, func(i Interval[T], v T) int {
	if v > j.Max {
		return -1
	}
	if v+1 < j.Min {
		return 1
	}
	return 0
}

To me, the second version is far more clearer. It makes obvious which argument is the needle and which is the haystack element. It does not involve creating an artificial Interval, nor taking a .Min to extract the needle from said Interval again (and from the code, it's really not clear whether that's semantically meaningful).

The two type parameter version seems far more natural to me and anecdotally, I believe it would have always been the right choice for me when I looked at slices.BinarySearchFunc. And I remember not using it at least once or twice because of this (instead resorting to sort.Search). As another example, if I have a slice of Person, sorted by name, looking up the index by name would be

slices.BinarySearchFunc(s, Person{Name:name}, func(a, b Person) int { return cmp(a.Name, b.Name) })
// vs.
slices.BinarySearchFunc(s, name, func(p Person, n name) int { return cmp(a.Name, n) })

Again, this just seems clearer to me.

@gopherbot gopherbot added this to the Proposal milestone Dec 15, 2022
@David-Mao
Copy link

I have also encountered this issue before. I agree it's annoying. Sometimes I have to create a temp element simply to make the comparison works, while all I really care is just a member value of that element.

But if backwards compatibility is not a problem, maybe changing the signature to

func BinarySearchFunc[E any](x []E, cmp func(E) int) (int, bool) 

would be enough and simpler? the target can be captured into the cmp function anyway.

@Merovius
Copy link
Contributor Author

I suggested that in the original issue but it did not happen. Perhaps my arguments at the time where weak, though.

@rogpeppe
Copy link
Contributor

I support this proposal. For the same reasons I gave in the original proposal, I'm not keen on the one-argument form, because it's not clear which way the comparison goes (and if anyone else is similar to me, it's can be a real brain bender to work out exactly which is correct), and in practice you'll always end up creating a closure.

As an example, say we have a slice of objects ordered by time. Using the more general API is nice here:

type entry struct {
	key string
	val int
}

func (e *entry) cmpKey(k string) int {
	return strings.Compare(e.key, k)
}

func findEntry(es []*entry, k string) *entry {
	i, ok := slices.BinarySearchFunc(es, k, (*entry).cmpKey)
	if ok {
		return es[i]
	}
	return nil
}

@earthboundkid
Copy link
Contributor

My preference is for the closure, one argument form. In the cases where you do use a closure instead of a method, the two argument form ends up being confusing in its own way.

@kylelemons
Copy link
Contributor

I think the one argument version is the most consistent with sort.Search, but I think the two argument version gives the most flexibility by allowing for the use of non-closure arguments. I think it's clear what the arguments are and what the return value is. I am also in the boat where I'm left scratching my head about what to write inside a 1-ary cmp function, and I get sort.Search wrong often enough that I nowadays always look up an example.

@rsc
Copy link
Contributor

rsc commented Jan 18, 2023

It sounds like people are generally in favor of

func BinarySearchFunc[E, T any](x []E, target T, cmp func(E, T) int) (int, bool)

Is E, T the standard order in other languages for the comparison arguments in a call like this?

Are there any other concerns?

@aclements
Copy link
Member

Is E, T the standard order in other languages for the comparison arguments in a call like this?

These are the closest equivalents I can find in other languages:

Python: bisect
Java: For arrays and Lists

They each take the slice-equivalent first and the target element second, like the proposed BinarySearchFunc, and presumably the cmp function should take its arguments in the same order.

@rsc
Copy link
Contributor

rsc commented Jan 25, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Feb 1, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: x/exp/slices: Allow different types for haystack/needle in BinarySearchFunc x/exp/slices: Allow different types for haystack/needle in BinarySearchFunc Feb 1, 2023
@rsc rsc modified the milestones: Proposal, Backlog Feb 1, 2023
@gopherbot
Copy link

Change https://go.dev/cl/464456 mentions this issue: slices: allow different types in BinarySearchFunc

@gopherbot
Copy link

Change https://go.dev/cl/485995 mentions this issue: slices: update BinarySearchFunc docs for CL 464456

gopherbot pushed a commit to golang/exp that referenced this issue Apr 19, 2023
In CL 464456 we updated BinarySearchFunc to permit different types
for the slice element and the target. Update the docs accordingly.

For golang/go#57348
Fixes golang/go#59685

Change-Id: I8bc3b81051628807fb2426a425101b8bad041ff9
Reviewed-on: https://go-review.googlesource.com/c/exp/+/485995
Reviewed-by: Eli Bendersky <eliben@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
@gopherbot
Copy link

Change https://go.dev/cl/498395 mentions this issue: doc/go1.21: merge x/exp/slices issue into slices package

gopherbot pushed a commit that referenced this issue May 25, 2023
For #57348

Change-Id: I84943711b033d63f0993133f93d9f09ce2af5965
Reviewed-on: https://go-review.googlesource.com/c/go/+/498395
Reviewed-by: Eli Bendersky <eliben@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
@dmitshur dmitshur modified the milestones: Backlog, Unreleased May 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants