Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slices: add Concat #56353

Closed
earthboundkid opened this issue Oct 20, 2022 · 54 comments
Closed

slices: add Concat #56353

earthboundkid opened this issue Oct 20, 2022 · 54 comments

Comments

@earthboundkid
Copy link
Contributor

This came up in the discussion of #55358 and prior discussions. It would be nice to have a simple, generic way to concatenate a slice of slices.

Here's the proposed signature:

func Concat[S ~[]E, E any](dst S, slices ...S) S 

Usage:

// Join slices into a new slice
a := []int{ 1, 2, 3 }
b := []int{ 4, 5, 6 }
c := slices.Concat(nil, a, b) 
// c == int{ 1, 2, 3, 4, 5, 6 }

// overwrite an existing slice
a = slices.Concat(a[:0], c[0:1], c[2:3], c[4:5]) 
// a == int{ 1, 3, 5 }
@gopherbot gopherbot added this to the Proposal milestone Oct 20, 2022
@rittneje
Copy link

a := []int{1, 2, 3}
b := []int{4}
c := slices.Concat(a[:2], b, a[2:])

What is the value of c? Depending on the order of events weird things could happen.

For example, consider the following (perhaps naive) implementation:

func Concat[S ~[]E, E any](dst S, slices ...S) S {
	for _, s := range slices {
		dst = append(dst, s...)
	}
	return dst
}

This will result in c being [1 2 4 4] which is probably unexpected.

@ericlagergren
Copy link
Contributor

Isn't append already generic?

@earthboundkid
Copy link
Contributor Author

Isn't append already generic?

Append can't concatenate multiple slices. This means you need to jump through hoops to precalculate the capacity and to make a one-liner you end up with awkward things like append(s1, append(s2, s3...)...).

@earthboundkid
Copy link
Contributor Author

This will result in c being [1 2 4 4] which is probably unexpected.

I think it's only worth doing Concat if it does one shot preallocation, like

func Concat[S ~[]E, E any](dst S, ss ...S) S {
	c := 0
	for _, s := range ss {
		c += cap(s)
	}
	dst = Grow(dst, c)
	for _, s := range ss {
		dst = append(dst, s...)
	}
	return dst
}

If you run this, you get 1, 2, 4, 3.

@rittneje
Copy link

@carlmjohnson I don't think it should use cap as that could greatly overestimate the size requirement. For example, slices.Concat(nil, make([]int, 0, 1000)) should not grow the dest to 1000 elements.

Really it should be len, but that still causes problems. https://go.dev/play/p/zb1d-MhD6Pj

@earthboundkid
Copy link
Contributor Author

😅 You're right!

@earthboundkid
Copy link
Contributor Author

I think at a certain point, if the user is clobbering the slice they are reading from, it's a user error. You can probably make up an example using slices.Insert that is confusing too.

@rittneje
Copy link

@carlmjohnson It would be all too easy for someone not to realize that the first parameter is special in that regard, especially because it isn't in other languages that have similar APIs. And due to the nature of append/slices.Grow it will sometimes appear to work the way they expect, meaning they might not even notice the problem until it causes some production incident.

What if it were a method instead of a function, to reinforce the fact that the first parameter is actually the output buffer?

type Buffer[T any] []T

func (buf Buffer[T]) Concat(ss ...[]T) Buffer[T] {
	size := 0
	for _, s := range ss {
		size += len(s)
	}
	if size == 0 {
		return buf
	}
	buf = Grow(buf, len(buf)+size)
	for _, s := range ss {
		buf = append(buf, s...)
	}
	return buf
}

(This type name is just for example purposes.)

Then it would be slices.Buffer[int](a).Concat(b, c). For people who don't care to optimize it, they can call slices.Buffer[int](nil).Concat(a, b, c).

@earthboundkid
Copy link
Contributor Author

The objection to destination slices already applies to several functions in exp/slices: Insert, Delete, the upcoming Replace, and arguably Grow and Clip (although not as confusing as the others). If you think it's causing bugs, you should open an issue to make a v2 of the package before the current slices API gets into the standard library.

@rittneje
Copy link

Yes, it is possible to break them, but it is a bit more contrived. https://go.dev/play/p/4lLBHhxrfdN

Regardless, adding a typedef with methods does not itself necessitate a v2. It just means some (or all) of the existing functions would presumably be deprecated.

@AndrewHarrisSPU
Copy link

I feel like two versions of concatenation could make sense - a minimally complicated, more variadic 'Append'-like concatenation, and a profligately eager, maximally allocating 'Collect'-ing concatenation:
https://go.dev/play/p/13N05VZPjwS

@earthboundkid
Copy link
Contributor Author

I don't think it makes sense to have two very similar concatenation functions.

@DeedleFake
Copy link

Especially because the Append() case could be handled by allowing multiple spread operations in variadic arguments, such as append(s, a..., b..., c...).

@rittneje
Copy link

@DeedleFake The Go language spec defines a... to reuse the backing slice rather than copying. Allowing multiple spread operations could not preserve this essential behavior.

@DeedleFake
Copy link

It could, but only for the case of a single spread operator being used. It doesn't seem that different to me to just having multiple elements without using the spread operator.

@AndrewHarrisSPU
Copy link

AndrewHarrisSPU commented Oct 24, 2022

I don't think it makes sense to have two very similar concatenation functions.

The edit distance in LOC from append(...) to a variation with copy-then-append, or clone-then-append, isn't large. I do think they are not at all similar in terms of actual behavior, the invariants broken or preserved, algorithmic uses etc. That said I think it'd be fair enough to just have the append-like behavior in slices (a different take on Collect in an iter package would be more interesting.)

I think what I'm trying to suggest is that, on a first glance Concat can read either way. To me, I'd expect the Collect-like behavior from a slices.Concat. FWIW, there are only a few uses of a func concat or a method concat in the standard library (plus one in x/exp/slog); they all maintain the "no clobbering" invariant, mostly by slice copying rather than slice growing.

@rittneje
Copy link

It could, but only for the case of a single spread operator being used. It doesn't seem that different to me to just having multiple elements without using the spread operator.

It cannot.

func foo(x ...int) {
    // ...
}

a := []int{1, 2, 3}
b := []int{4, 5, 6}
foo(a..., b...) // cannot work

Within foo, the type of x is []int, which means it must be a standard slice backed by a single contiguous array. Thus there is no way to pass both a... and b... without copying at least one of them.

With regards to the original proposal, I think naming it something like AppendAll would be better, as that would at least suggest that it has analogous behavior to built-in append as far as the first parameter goes.

@DeedleFake
Copy link

Thus there is no way to pass both a... and b... without copying at least one of them.

Right, that's what I said. It wouldn't change anything for existing cases where only a single spread operator is used, which is all that's currently legal.

@ianlancetaylor
Copy link
Contributor

Discussion about passing both a... and b... should perhaps go to #18605 rather than here. Thanks.

@go101
Copy link

go101 commented Nov 11, 2022

concat should be built-in to get the most out of it.
Otherwise, the implementation is hard to be performant, even correct.

@aclements
Copy link
Member

I think we should consider this proposal together with #4096, since these are two different solutions to the same problem.

Personally, I would much prefer to have #4096, since that feels like a natural and ergonomic extension to the way people already concatenate multiple slices. While we can do this with a generic function, doing so creates two different ways to do almost the same thing, where neither subsumes the other. It also has some (probably slight) performance disadvantages compared to doing it directly in append.

@rsc
Copy link
Contributor

rsc commented May 24, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@randall77
Copy link
Contributor

Concat suffers from the issue of having a slice in slices alias dst. As described here: #56353 (comment)

This was fixed for slices.{Insert,Replace}, but it is tricky. See CL 494817. I'm not sure how you would handle that overlapping in general.

@earthboundkid
Copy link
Contributor Author

Maybe Concat should just always make a new slice. func Concat[S ~[]T, T any](ss ...S) S. I think I would be okay with that. It's simpler without dst, and reusing an existing slice does seem to be a tricky corner case.

@rsc
Copy link
Contributor

rsc commented May 31, 2023

It does sound like an always-allocating Concat is a lot simpler to understand. Is it also useful enough? I think for most cases that I can think of, I'd be happy with the always-allocating form.

Are there any objections to a Concat that always allocates?

@rsc rsc changed the title proposal: slices: add Concat slices: add Concat Jun 21, 2023
@rsc rsc modified the milestones: Proposal, Backlog Jun 21, 2023
@gopherbot
Copy link

Change https://go.dev/cl/504882 mentions this issue: slices: add Concat

@6543
Copy link
Contributor

6543 commented Aug 7, 2023

@6543
Copy link
Contributor

6543 commented Aug 7, 2023

well this should impl. the proposed api: #61817

correct me if I did messed up somewhere :)

@gopherbot
Copy link

Change https://go.dev/cl/516656 mentions this issue: slices: add Concat

@fzipp
Copy link
Contributor

fzipp commented Aug 7, 2023

@6543 There is already an implementation; see the linked change list in comment #56353 (comment)

@6543
Copy link
Contributor

6543 commented Aug 7, 2023

@fzipp yes but it does not implement the proposed api and looks stale

@earthboundkid
Copy link
Contributor Author

It implements the approved API. It's not stale, just waiting for the Go release cycle: https://github.com/golang/go/wiki/Go-Release-Cycle

@go101
Copy link

go101 commented Sep 1, 2023

Maybe this function should be specially optimized, just like maps.Clone? https://docs.go101.org/std/src/maps/maps.go.html#line-41

@earthboundkid
Copy link
Contributor Author

If you have a performance improvement to benchmark, that can be a new issue.

@go101
Copy link

go101 commented Sep 1, 2023

It is impossible to write normal benchmarks without hacking the runtime code. Someone ever hacked it.

There is no need to create a new issue.

eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
Fixes golang#56353

Change-Id: I985e1553e7b02237403b833e96fb5ceec890f5b8
GitHub-Last-Rev: 96a35e5
GitHub-Pull-Request: golang#60929
Reviewed-on: https://go-review.googlesource.com/c/go/+/504882
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
Fixes golang#56353

Change-Id: I985e1553e7b02237403b833e96fb5ceec890f5b8
GitHub-Last-Rev: 96a35e5
GitHub-Pull-Request: golang#60929
Reviewed-on: https://go-review.googlesource.com/c/go/+/504882
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
eric pushed a commit to fancybits/go that referenced this issue Dec 23, 2023
Fixes golang#56353

Change-Id: I985e1553e7b02237403b833e96fb5ceec890f5b8
GitHub-Last-Rev: 96a35e5
GitHub-Pull-Request: golang#60929
Reviewed-on: https://go-review.googlesource.com/c/go/+/504882
Auto-Submit: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
@justinhwang
Copy link

@ianlancetaylor / @earthboundkid could one of you elaborate on what this comment means and why we shouldn't be using make here?

https://go-review.googlesource.com/c/go/+/504882/3..6/src/slices/slices.go#b510

Since we don't make any promises about the capacity, I recommend

newslice := Grow[S](nil, size)

That should lift the capacity up to the next size class.

@earthboundkid

This comment was marked as off-topic.

@justinhwang
Copy link

If it used make there, the capacity would be exactly the combined size of the slices, but we want it to be rounded up to the next size class, so people don't rely on len(s) == cap(s) being true, and box ourselves into a corner.

Still not quite following, what do we mean by rounded up to the next size class? Do we mean:

cap(Grow[S](nil, size)) != size

@earthboundkid

This comment was marked as off-topic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

Successfully merging a pull request may close this issue.