Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: recognize append loop #14325

Open
bradfitz opened this issue Feb 14, 2016 · 6 comments
Open

cmd/compile: recognize append loop #14325

bradfitz opened this issue Feb 14, 2016 · 6 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Performance
Milestone

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Feb 14, 2016

Loops with appends are common in Go. And often the loops have a fixed iteration count.

cmd/compile could recognize something of the form:

        for ... := range x {
            s = append(s, v)
        }

And compile it as:

        s = runtime_ensurecapacity(s, len(x))
        for ... := range x {
            s = append(s, v)
        }

Likewise with for i := 0; i < n; i++.

That would remove the need for the (often premature) optimization of:

      s := make([]T, 0, len(x))

And let people just write instead:

    var s []T
    for k := range m {
          s = append(s, k)
    }

This would also help eliminating garbage when the number of iterations is very large. (e.g. @campoy's https://news.ycombinator.com/item?id=11098802)

/cc @randall77 @dr2chase @rsc

@bradfitz bradfitz added this to the Unplanned milestone Feb 14, 2016
@cespare
Copy link
Contributor

cespare commented Feb 14, 2016

In the specific case of

s := make([]T, 0, len(x)) // or just var s []T, to avoid the possibly-premature optimization
for _, v := range x {
  s = append(s, v)
}

IMO it's better style anyway to write

s := make([]T, len(x))
for i, v := range x {
  s[i] = v
}

because it's about the same complexity (make is needed, but append is not) and because it's more obviously performant without the need for compiler optimizations. I think that this style preference is borne out by stdlib code as well.

But I suppose your proposal still helps in cases where s was previously initialized with some other elements.

By the way, I find your last code snippet confusing (why are the variable names different from the preceding snippets, and why are we appending slice indexes?) Should it instead read

var s []T
for _, v := range x {
  s = append(s, v)
}

@mrkaspa
Copy link

mrkaspa commented Mar 17, 2016

@cespare I agree, this is a most logical way to proceed if you define a fixed size array

s := make([]T, len(x))
for i, v := range x {
  s[i] = v
}

@mdempsky
Copy link
Member

There are some subtleties to when this is a safe transformation. E.g., in

var a [2]int
p := a[:0]

for _, v := range [...]int{1,2,3,4} {
    p = append(p, v)
}

the first two calls to append(p, x) need to still have the side effect of assigning a[0] = 1 and a[1] = 2, because the Go spec guarantees append will reuse the underlying array if there's available capacity.

Instead of up front calling a hypothetical runtime.ensurecapacity function, we could instead be smarter about calculating the desired length argument for runtime.growslice. Currently, for append(p, v), we just pass len(p)+1 each time.

However, the Go spec makes no guarantees about how large the newly allocated slice will be. That gives us freedom to try to predict the target size better and let that influence what size slice to allocate.

@rsc
Copy link
Contributor

rsc commented Mar 18, 2016

I am skeptical. What's missing from this report is any kind of motivation from performance of a real program. Even Francesc's post points out that this kind of microbenchmark is likely misleading. Any real loop is going to have possible calculations that can cause it to panic, aborting the loop early. You don't want to be left with an enormous piece of unnecessary garbage in that case. And to the extent that you might underallocate during an append loop, you shoot yourself in the foot by breaking the guarantees of append, for example when that loop is enclosed in another loop. I suspect we should not change anything here, certainly not until there are real programs to worry about instead of toys.

@tilinna
Copy link

tilinna commented Aug 16, 2016

Whenever I'm coding in languages like Python or Javascript, I'm much less concerned about memory allocation. With Go, I somehow switch into C-mode and see mallocs everywhere. And because I know how the append() works, it's really hard to not add that make() before every for-range -loop.

I'm sure I'm not the only one having this struggle so I hope the compiler is fixed to do it automatically. Also the lint tools should be aligned with it.

A recent example of litter here: 77e68ea

@iwdgo
Copy link
Contributor

iwdgo commented Jan 6, 2019

The algorithm involves three nested slices of small dimensions. Some duplicated values appeared for larger numbers with no reason. The end size of any slice is not known and can't be calculated within the range of reality w/o writing more code to evaluate the size than for the algorithm itself. After reading this issue, the solution is to duplicate the variable of the range only when it must be updated.
This code is only an illustration.

var s [][]T
for k, v := range s {
    ...
    vl := make([]int, len(v)+1)
    vl = append([]T{}, v...)
    s[i][k] = append(vl, dl) // dl is to be added as determined by the rest of the code
    ...
}

It seems to me that the make/append/update could somehow have an easier solution as identifying such an issue is time consuming.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. Performance
Projects
None yet
Development

No branches or pull requests

8 participants