slices: add Repeat function #65238

New issue

Closed

slices: add Repeat function#65238

Labels

FrozenDueToAgeProposalProposal-Accepted

Milestone

Go1.23

adamroyjones

Proposal Details

Note

The current version of this proposal is here.

The signature in that version of the proposal is func Repeat[T any](count int, xs ...T) []T.

I propose a small, trivial addition to the slices package.

// Repeat returns a new slice containing count copies of x.
func Repeat[T any](x T, count int) []T {
	if count < 0 {
		panic("count must be non-negative")
	}
	xs := make([]T, count)
	for i := 0; i < count; i++ {
		xs[i] = x
	}
	return xs
}

There's prior, related art in other languages with list-like types, including the following. (The Python and JavaScript examples were provided by @Jorropo and @earthboundkid; see below.) Not all of the below are identical in their behaviour, but they're thematically similar.

Elixir, with List.duplicate/2.
Haskell, with replicate in the Prelude.
JavaScript, with Array.prototype.fill.
Python, with * for sequences.
Ruby, with Array.new.
Rust, with std::vec::Vec.repeat.
Scala, with Array.fill.

I've personally found it a useful helper to have—particularly when combined with Concat, which looks like it'll make its way into Go 1.22.

I appreciate that slices oughtn't become a dumping ground for everyone's ideas about what's useful, but this feels like a natural addition to me.

I'm happy to contribute a pull request with tests if this is considered a good idea.

added

added this to the Proposal milestone

on Jan 23, 2024

apparentlymart

At least some of the examples you shared in other languages are of repeating (something equivalent to) []T, rather than just T. For example, in Rust the Vec::repeat repeats all of the elements of the vector n times, rather than just a single element value.

The hypothetical Go Repeat function could potentially support both usage patterns at once by swapping the argument order and using a variadic signature:

func Repeat[T any](count int, vals ...T) []T {
	l := count * len(vals)
	xs := make([]T, l)
	for i := 0; i < l; i += len(vals) {
		copy(xs[i:], vals)
	}
	return xs
}

This new signature still supports repeating just one item, at the expense of the program having to quietly construct a single-element slice for that case.

Jorropo

Member

If we repeat slices and not elements then it should work in-place if the underlying array has enough capacity.

apparentlymart

I would find it quite surprising if I asked this function to repeat a sub-slice of a larger slice and it reacted by overwriting other elements outside of the sub-slice I provided.

I think writing outside of the bounds of a slice into the excess capacity -- particularly if it only happens under specific circumstances like only when there is enough excess capacity -- is likely to lead to bugs where callers don't fully understand the implications. In particular, I imagine an author writing unit tests where the source slice always has no excess capacity because that's the easiest thing to write as a literal, not considering that a caller might provide a sub-slice of a larger backing array in more practical usage

If it's desirable to repeat while avoiding a new allocation, then I would suggest changing the signature to pass a separate slice to write the result into, and document that it's acceptable for the target slice to overlap the source slice only if the two slices share a common prefix, so that the function doesn't have to jump through hoops to avoid overwriting its own input during its work.

However, I would also tend to think that repeating into excess capacity of the backing array of the input slice would be a rare enough need that I would hand-write it that way if I needed it, rather than expecting the standard library to provide it.

YMMV, of course!

Possible alternate signature meeting that additional requirement:

func RepeatCopy[S ~[]E, E any](dst []E, src S)

The function would keep copying src into consecutive indices of dst until it has written to every element of dst.

If src is shorter than dst then src gets repeated to fill that space. If len(dst) is not an integer multiple of len(src) then the final copy is truncated.

If src is longer than dst then it's exactly equivalent to copy(dst, src). Not particularly useful, but consistent with the rest of the behavior.

If dst and src both share the same backing array and &dst[0] == &src[0], this would then perform the in-place repeat-extension as the previous comment suggests.

Repeat as I suggested it in my earlier comment could then potentially be a wrapper around this RepeatCopy, optimizing for the (presumed) more common case:

func Repeat[T any](count int, vals ...T) []T {
	l := count * len(vals)
	xs := make([]T, l)
	RepeatCopy(xs, vals)
	return xs
}

Jorropo

Member

I would find it quite surprising if I asked this function to repeat a sub-slice of a larger slice and it reacted by overwriting other elements outside of the sub-slice I provided.

My original idea is that you want to do that for anything that adds new elements else you risk quadratic time complexity when the operation is repeated on the same slice.
However I now think this is wrong, you can't not have quadratic complexity anyway for this particular operation (at least without doing fancy things like ropes).

Things would be much better if we had type system support for slices that are only up to length long.
It's currently OOB knowledge what functions append and which one never touch extra capacity.
The best hint is if Append is in the name, then I think AppendRepeat would be the better name.

Jorropo

Member

To add to

There's prior, related art in other languages with list-like types, including the following

Python has this builtin on all sequence types (including lists):

s + t the concatenation of s and t

s * n or n * s equivalent to adding s to itself n times

`s + t`	the concatenation of s and t
`s * n` or `n * s`	equivalent to adding s to itself n times

apparentlymart

Looking again at the motivating examples for other languages, I notice that the Scala one would probably translate into Go more like this:

func RepeatFunc[T any](count int, makeElem func () T) []T {
	xs := make([]T, count)
	for i := 0; i < count; i++ {
		xs[i] = makeElem()
	}
	return xs
}

I'm not sure if this additional variant is actually useful enough to justify inclusion, but just exploring the different design tradeoffs that each of the other languages made in their similar library features.

gophun

Another option: make it return an iter.Seq[T] that can be collected via slices.Collect.

earthboundkid

Contributor

FWIW, Python uses *:

>>> ["a"] * 5
['a', 'a', 'a', 'a', 'a']

JavaScript is weirder. You first make an Array with capacity and no items then you fill it.

> new Array(5).fill("a")
< ["a", "a", "a", "a", "a"]

It seems like a pretty common slice utility across languages.

adamroyjones

Author

@apparentlymart: You're quite right. The "related" in "prior, related art" is bearing more load than I made clear. I'm sorry about that!

Thanks for all of the illuminating comments. So far, I'd categorise things into two mirrored families of three variants.

With slices.
a. The original, func(x T, count int) []T.
b. The variadic extension, func(count int, xs ...T) []T.
c. The functional extension, func(count int, fn func() T) []T.
With sequences.
a. As 1a, except it returns iter.Seq[T].
b. As 1b, except it returns iter.Seq[T].
c. As 1c, except it returns iter.Seq[T].

We might think of 1c and 2c as "RepeatFunc", as suggested.

I think there are two (current) weaknesses of 2a and 2b. Firstly, the iter package is crystallising, though it looks like it'll be accepted imminently. Secondly, the slices package operates on and returns slices. I'm not sure what the implications of new ideas like range-over-func and iter will be for existing packages, but I expect there'd be a general preference to keep the standard library's signatures frozen (to avoid breaking changes) and to keep each package's signatures internally "consistent" (and so precluding a mixture of slices and sequences in the slices package).

I think 1b is strictly preferable to 1a. It's simple, it's predictable in its behaviour and performance (with a single natural implementation), and it directly extends 1a without contortion. I think it'd slot easily into the existing slices package.

1c and 2c are interesting. Repeatedly calling a function is a natural way to generate a sequence. As well as the Scala example, there's a loosely-related (lazy, infinite) version of this in Elixir called Stream.unfold/2 which has a kind of beauty to it. But I'm not sure exactly how 1c and 2c would fit with Go just yet; I'll confess that I've not played around with the iter (and xiter, etc.) experiments. It all seems in flux.

earthboundkid

Contributor

1c strikes me as not any easier than just writing a loop.

n := -2
s :=  slices.Repeat(5, func() int {
  n += 2
  return n
})

// vs

s := make([]int, 5)
n := 0
for i := range s {
  s[i] = n
  n += 2
}

adamroyjones

Author

I think there's a case for 1c, but that it's marginal, with the advantages probably being small enough to not warrant extending the slices package. Go's syntax for closures is a bit noisy, but things can be visually compressed to

package main

import "fmt"

func main() {
	n := -2
	fmt.Printf("%v\n", repeat(10, func() int { n += 2; return n }))
}

func repeat[T any](count int, fn func() T) []T {
	xs := make([]T, count)
	for i := 0; i < count; i++ {
		xs[i] = fn()
	}
	return xs
}

which does turn the 6 lines to 2 (but at what cost?). I think it'd be a better fit if Go had syntax like

func main() {
	n := -2
	fmt.Printf("%v\n", repeat(10, () => n += 2))
}

but Go's not such a language—and it probably shouldn't become such a language.

apparentlymart

mentioned this

on Jan 29, 2024

proposal: Go 2: make(...) should have the last (optional) parameter for configuring default values for all types (slices, maps and chans) #65332

adonovan

Member

I agree that slices.Repeat should repeat slices (not elements), just as strings.Repeat repeats strings.

The implementation needs to check for overflow. It should panic with a message containing the values of both n and len(values), instead of unhelpfully calling make with a negative argument, or, worse, spuriously returning an empty slice when the product is a multiple of the word size.

Also, there's a nice optimization you can do here which is to make fewer calls to copy by working in powers of two:
https://github.com/google/starlark-go/blob/f86470692795f8abcf9f837a3c53cf031c5a3d7e/starlark/eval.go#L1190-L1197

38 remaining items

to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

No one assigned

Labels

FrozenDueToAgeProposalProposal-Accepted

Type

No type

Projects

No projects

Milestone

Go1.23
Closed Aug 13, 2024, 100% complete

Relationships

None yet

Development

Code with Copilot Agent Mode

No branches or pull requests

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

slices: add Repeat function #65238

Proposal Details

38 remaining items

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

slices: add Repeat function #65238

Description

Proposal Details

Activity

apparentlymart commented on Jan 23, 2024

Jorropo commented on Jan 23, 2024

apparentlymart commented on Jan 24, 2024

Jorropo commented on Jan 24, 2024

Jorropo commented on Jan 24, 2024

apparentlymart commented on Jan 24, 2024

gophun commented on Jan 24, 2024

earthboundkid commented on Jan 24, 2024

adamroyjones commented on Jan 24, 2024

earthboundkid commented on Jan 25, 2024

adamroyjones commented on Jan 25, 2024

adonovan commented on Feb 1, 2024

38 remaining items

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Issue actions