Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slices: add DeleteFunc #54768

Closed
Deleplace opened this issue Aug 30, 2022 · 21 comments
Closed

slices: add DeleteFunc #54768

Deleplace opened this issue Aug 30, 2022 · 21 comments

Comments

@Deleplace
Copy link
Contributor

I propose we add this new func to golang.org/x/exp/slices :

// DeleteFunc removes any elements from s for which del returns true.
// DeleteFunc modifies the contents of the slice s; it does not create a new slice.
// When DeleteFunc removes m elements in total, it might not modify the elements s[len(s)-m:len(s)]. If 
// those elements contain pointers you might consider zeroing those elements so that objects they
// reference can be garbage collected.
func DeleteFunc[S ~[]E, E any](s S, del func(E) bool) S
@gopherbot gopherbot added this to the Proposal milestone Aug 30, 2022
@Deleplace
Copy link
Contributor Author

The intent is similar to these existing funcs:

  • maps.DeleteFunc
  • maps.EqualFunc
  • slices.CompactFunc
  • slices.CompareFunc
  • slices.EqualFunc
  • slices.IndexFunc
  • slices.IsSortedFunc
  • slices.SortFunc

@Deleplace
Copy link
Contributor Author

In some uncommon cases, it could be useful to know the original index of an element, to decide if it should be removed or not. For this we would need to pass the index as an argument to the del func: del func(int, E) bool.

In my opinion we should not pass such an index because it is ambiguous, as DeleteFunc is destructive and changes the indices. To keep things simple, DeleteFunc would work only when elements are independent from each other, and the indices don't matter.

@seankhliao seankhliao changed the title proposal: exp/slices: DeleteFunc proposal: x/exp/slices: DeleteFunc Aug 30, 2022
@earthboundkid
Copy link
Contributor

This is traditionally called Filter or FilterInPlace. The decision was made to deliberately omit map and filter from slices pending discovery of best practices in real experience.

With that said, I think this would be helpful, and maybe even under the name DeleteFunc. Filtering in place without generics is kind of tedious. The idiom I usually use looks like this:

filteredSlice := original[:0]
for _, e := range original {
    if cond(e) {
        filteredSlice = append(filteredSlice, e)
    }
}
original = filteredSlice

It's not so bad, but it's not obvious to people unfamiliar with the idiom how it works, and it's a lot of lines for a simple idea. I would prefer:

slices.DeleteFunc(&original, func(e E) {
    return !cond(e)
})

In terms of the signature, I think DeleteFunc(s *S, del func(E) bool) is better than DeleteFunc(s S, del func(E) bool) S because the original slice isn't very useful after having run DeleteFunc over it, so you have an extra unnecessary assignment. In the case where someone really wants the original slice, they can do filteredSlice := original before calling DeleteFunc. This also better matches maps.DeleteFunc, which does not return the original map.

@Deleplace
Copy link
Contributor Author

I agree about the original slice not being useful for any common purpose, after having been changed by one of the destructive funcs.

  • slices.Compact
  • slices.CompactFunc
  • slices.Delete
  • slices.Insert

all accept S and return S, instead of accepting *S and returning nothing. Maybe they do it that way to mimick append, which is only "half-destructive" (depending on what we want to do with the original slice).

@earthboundkid
Copy link
Contributor

For append, I think it makes sense that it returns a slice instead of taking a pointer because you may want to append to nil or a subslice etc. I don't think it makes as much sense for Compact/Delete/Insert.

@flowchartsman
Copy link

flowchartsman commented Oct 28, 2022

Having played around with implementing this generically myself, one thing that occurred to me is that another potential benefit for a generic DeleteFunc might be to go ahead and automatically zero out "deleted" slice elements so that any pointed-to memory can be GCed, avoiding the need to remember the "trick to it".

The simplest way would just be to always zero from len(new) -> len(original), or you could reflect on the elem type and zero out the elements conditionally if that's worth it (not sure, poc: https://go.dev/play/p/foOKrgQ5gxN).

This brings up some more interesting questions. Because, if you have generics, and you want to put generic slice methods in the standard library, of course you would want to do the right thing for your users in one place if the cost isn't too high. For my part, I can definitely say that the documentation for slices.Delete is unsatisfying to me as a Go educator, since it's one more thing to have to explain upfront about a seemingly-simple process, and explaining append is already a heavy lift. It's not intuitive that you can "delete" something, but still have chunks of uncollected memory floating around. Of course this is nothing new, since slice expressions effectively do the same thing, but if you're going to the trouble of making a library for it, might as well have it do something extra. It would certainly add a little something back on the other side of the scale that we lose to the need to call functions like this without a lambda syntax.

Perhaps this is an argument for Delete(in-place, reflective) and Filter(copy), or perhaps it's an argument that copying is the more natural thing, and we should just always copy values and live with that. That's certainly the more conservative, Go-y thing to do, but we are also in a new generic world, and that brings a lot of opportunity for new conventions. It may even be that it will be perfectly fine to have something like DeleteFunc(s *S, del func(E) bool), though it does feel a little unnatural to me right now.

I think, I'd vote for either

  • give Delete and DeleteFunc the ability to zero out old capacity or
  • don't bother with them at all, and only copy.

@earthboundkid
Copy link
Contributor

You may be interested in #56351 which proposes to add a clear function which would remove all elements from a map and set all values in a slice to the zero value.

@flowchartsman
Copy link

Yes, I think DeleteFunc should probably call that behind the scenes, since there's no reasonable way to zero out the elements yourself unless you keep the original slice and also return the range of indices of the leftover len() after all of the shuffling has completed.

@Deleplace
Copy link
Contributor Author

I agree with the sentiment that slices.Delete, slices.DeleteFunc, slices.Compact, slices.CompactFunc, slices.Clip would all be more useful and less cryptic if they did the extra work of zeroing out the discarded right tail of the original slice. I suggested such behavior but got strong pushback.

@flowchartsman the range of the leftover is always len(new):len(old) but yes leaving this work as an exercise to the developer at call site is not great in my opinion.

@earthboundkid
Copy link
Contributor

Maybe we should propose slice.ClearTail(s S) which runs clear(s[len(s):cap(s)]).

@Deleplace
Copy link
Contributor Author

ClearTail is much better than nothing, in the world in currently live in.

@flowchartsman
Copy link

flowchartsman commented Oct 30, 2022

I agree with the sentiment that slices.Delete, slices.DeleteFunc, slices.Compact, slices.CompactFunc, slices.Clip would all be more useful and less cryptic if they did the extra work of zeroing out the discarded right tail of the original slice. I suggested such behavior but got strong pushback.

I think that pushback is misplaced. Just because slice expressions let us leave hanging references in a situation we might not want to isn't a strong argument in my opinion that a higher-level, generic function should not "do the right thing" for its expressed usecase. Slice expressions are, to my thinking, a basic feature that has no intrinsic intent beyond just "adjust the len and cap for this view into a backing array". They can be used in a variety of contexts, and it's up to the programmer to execute their intent by either zeroing out the elements or not, depending on what their plans are for that slice in the future. In addition, slice expressions do not change the positions of elements, so there's every expectation that everything in the backing array is just as you left it, and is the explicit reason we have three-argument slice expressions in the first place, to allow sharing a subset view without the worry that someone will accidentally append() and overwrite elements.

With something called Delete*, the intent is clear: we want to remove things, and there's no reason to keep around the remaining capacity, especially when the implementation deliberately overwrites values by shifting their indices, as it does in DeleteFunc There's no way for the user to know which elements are going to stay on the end of the slice and which will be overwritten by shifting a "good" element into their place. The only possible piece of extra information we could give them, as @carlmjohnson suggests, is a range of leftover elements, but even that's only useful for clearing them out, and what would be the point of requiring an extra step?

That's why my argument is that we either don't mess with the indices at all, in which case DeleteFunc (and arguably Delete) are non-starters, or we clear out the other items automatically. The middle ground is kind of pointless IMO

@flowchartsman
Copy link

flowchartsman commented Oct 30, 2022

As an aside, I think exp/slices and exp/maps are an interesting window into the conflict between the general conservative attitude towards language changes and side-effects versus the expanded functionality of type parameters and how useful they can actually be if the desire is to do as little as possible behind the scenes.

Delete and DeleteFunc are especially interesting because they are predicated on implementations that move things around, so you have to ask why you wouldn't clear out the remainder anyway. Then you have to ask why it's an assignment in the first place if there's no reason to keep around the (possibly changed) original. Then you have to ask why you shouldn't just make it take a pointer, and then you have to ask where the aversion to methods taking a slice pointer comes from in the first place. It's a lot of precedent to consider for some seemingly-simple generic functionality.

@rsc
Copy link
Contributor

rsc commented Mar 8, 2023

Adding to the proposal process. The usual operation is Filter, though, in which the callback returns whether to keep the item, not to delete it. That might be a better answer, and then the Delete docs can point to Filter. The only problem then is finding a name that people feel is appropriate for filter-in-place. Some people think Filter should return a copy. It's especially confusing because Filter/DeleteFunc has to return a slice. Maybe call the in-place one slices.Keep and then have the Delete docs warn about not calling Delete in a loop and pointing to Keep?

@earthboundkid
Copy link
Contributor

earthboundkid commented Mar 8, 2023

I was skeptical of the name "DeleteFunc" at first because "Filter" is usual, but I've been using it in my code since shortly after this issue was opened, and I've come around to liking it. It fits well with maps.DeleteFunc, which is also an in-place operation. I think the names Filter/Map/Reduce should wait until some generic iterator library is added to the standard library and be used there. Having slices.Map/Filter which return new slices encourages writing throw away intermediate slices. An iter library that looks something like this doesn't have that problem:

it := iter.FromSlice(s1)
it = iter.Map(it, f1) 
// Note: it needs to be it2 :=, if f1 outputs a different type
it = iter.Filter(it, f2)
it = iter.Map(it, f3) // etc
s2 := iter.ToSlice(it)  // or e := iter.Reduce(it, initial, f4)

@rsc
Copy link
Contributor

rsc commented Mar 8, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Mar 15, 2023

Given that maps has maps.DeleteFunc and 'delete' is the name of the operation on maps, and also that we have slices.Delete, adding slices.DeleteFunc with the sense of 'delete the values for which the function is true'. So DeleteFunc is probably better than Keep, and both are much better than Filter.

@rsc rsc changed the title proposal: x/exp/slices: DeleteFunc proposal: x/exp/slices: add DeleteFunc Mar 29, 2023
@rsc rsc changed the title proposal: x/exp/slices: add DeleteFunc proposal: slices: add DeleteFunc Mar 29, 2023
@rsc
Copy link
Contributor

rsc commented Mar 29, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Apr 6, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: slices: add DeleteFunc slices: add DeleteFunc Apr 6, 2023
@rsc rsc removed this from the Proposal milestone Apr 6, 2023
@rsc rsc added this to the Backlog milestone Apr 6, 2023
@gopherbot
Copy link

Change https://go.dev/cl/483175 mentions this issue: slices: add DeleteFunc

@gopherbot
Copy link

Change https://go.dev/cl/509236 mentions this issue: slices: add DeleteFunc

gopherbot pushed a commit to golang/exp that referenced this issue Jul 13, 2023
DeleteFunc was added to the standard library for the 1.21 release.
Add it here in x/exp for people still using earlier releases.

For golang/go#54768
Fixes golang/go#61327

Change-Id: I3c37051c289f46b0068bc1ee5da610149c59cd22
Reviewed-on: https://go-review.googlesource.com/c/exp/+/509236
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Eli Bendersky <eliben@google.com>
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
Fixes golang#54768

Change-Id: I588ae33c13e0bbd9d324c11771667b22a864047d
Reviewed-on: https://go-review.googlesource.com/c/go/+/483175
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Eli Bendersky <eliben@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
Fixes golang#54768

Change-Id: I588ae33c13e0bbd9d324c11771667b22a864047d
Reviewed-on: https://go-review.googlesource.com/c/go/+/483175
Reviewed-by: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Eli Bendersky <eliben@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests

6 participants