Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bytes: add Extend method #9638

Closed
rsc opened this issue Jan 19, 2015 · 6 comments
Closed

bytes: add Extend method #9638

rsc opened this issue Jan 19, 2015 · 6 comments
Milestone

Comments

@rsc
Copy link
Contributor

rsc commented Jan 19, 2015

Bryan Ford suggests adding something like this:

// Extend adds capacity to b for n bytes beyond its current length
// and then returns the new slice b and the n-byte suffix just created.
// The suffix is not explicitly initialized.
func Extend(b []byte, n int) (larger, suffix []byte) {
    old := len(b)
    for cap(b) < old+n {
        b = append(b[:cap(b)], 0)
    }
    return b[:old+n], b[old:n]
}

It could be used to grow a slice to make room for a direct write,
instead of writing to a temporary copy and using append.
For example, sha1's Sum method does:

func (d0 *digest) Sum(in []byte) []byte {
    ...
    hash := d.checkSum()
    return append(in, hash[:]...)
}

That code was written carefully to return a fixed-size array so that d.checkSum didn't allocate, and we know that append doesn't capture hash, so there's no allocation other than the append.

The code could instead change checkSum to write the 20 bytes to its argument (renaming to writeSum) and use:

func (d0 *digest) Sum(in []byte) []byte {
    ...
    buf, sum := bytes.Extend(in, Size)
    d.writeSum(sum)
    return buf
}

Brad and Minux talked him out of it, but I am not so sure.

To consider for Go 1.5.

@rsc rsc self-assigned this Jan 19, 2015
@rsc rsc added this to the Go1.5 milestone Jan 19, 2015
@bradfitz
Copy link
Contributor

I didn't try to talk him out of it. I just made some observations and asked questions.

I'll make another: do we want crypto/sha1 to depend on unicode? I doubt it. But because bytes does, we couldn't have sha1 use bytes.Extend, if this were put this in the bytes package.

@extemporalgenome
Copy link
Contributor

Some of the use cases this could replace could be statically analyzed and optimized. For example, using append(x, make([]T, n)...) to grow a slice could be automatically optimized to avoid a memory-to-memory copy (since every element will be zero) and to further avoid the allocation by make (since the element values are constant, and the slice does not change or escape, the slice itself is not needed). SSA-style analysis may allow such optimizations to be general and categorical.

Historically, performance/optimization considerations have not made it into the language (e.g. in the form of compiler hinting keywords) or the stdlib. The allocation and copying avoidance should therefore be given little weight, as an eventual compiler could make such optimizations implicit anyway.

If the non-performance merits of bytes.Extend are enough to still warrant inclusion, the function definition as proposed is a bit unconventional, since it adds convenience at the expense of increasing complexity. I would recommend func Extend([]byte, int) []byte such that the return value is the same as the "larger" value described above. "suffix" can be trivially created with a single slice expression. The supporting precedent for this recommendation is that there is no path.SplitExt, because the equivalent can likewise be created with a straightforward slice expression upon the result of calling path.Ext

@PieterD
Copy link
Contributor

PieterD commented Feb 1, 2015

I agree, optimizing append(x, make([]T, n)...) would be very nice, I ran into that extra allocation the other day.

Absent that optimization, an extend function in bytes would also be very useful.

The function provided is very slow however, and it does not zero out the bytes between len and cap. I came up with this, which allocates less and is faster in every case:

const extendShortLen = 512
var extendShort [extendShortLen]byte
func Extend(b []byte, s int, zero bool) []byte {
    l := len(b)
    e := l
    if l+s <= cap(b) {
        // Short append; extension fits in cap, no allocation.
        b = b[:l+s]
        e += s
    } else {
        // Extension does not fit, use up all cap first
        e = cap(b)
        b = b[:e]
        s -= e - l
        if s <= extendShortLen {
            // Mid append, must allocate new backing array.
            b = append(b, extendShort[:s]...)
        } else {
            // Long append, must allocate new backing array and temporary slice.
            b = append(b, make([]byte, s)...)
        }
    }
    if zero {
        x := b[l:e]
        for len(x) > 0 {
            x = x[copy(x, extendShort[:]):]
        }
    }
    return b
}

@extemporalgenome
Copy link
Contributor

@PieterD why do b = append(b, make([]byte, s)...) instead of a precisely sized make followed by a copy? Avoiding append here would prevent a pointless allocation. Also, the zero argument is redundant in many cases (anywhere an allocation has occurred). It's generally accepted that, when extending the length of a slice further into its capacity, the newly visible elements will be overwritten (or otherwise ignored). In these cases, the zero argument serves no purpose either.

@PieterD
Copy link
Contributor

PieterD commented Feb 1, 2015

Boy, are my ears red. Yeah, that is a lot better. That pretty much invalidates the mid case as well, and what remains is pretty pitiful. And here I thought I was being all clever. Oops!

It's true that zero would be false for most cases, and should you want it zeroed you can always do it yourself after extending. However, should you want to zero it, you would only have to do anything if you weren't allocating. Which I suppose would be most of the time anyway, since at least in my case I re-use a lot of buffers. So I guess you're right, it would be exceptionally rare.

Looks like I agree with you on all counts. Thank you.

@rsc
Copy link
Contributor Author

rsc commented Jun 8, 2015

I do think there's something useful here, but perhaps not useful enough. Not for Go 1.5 at least.

@rsc rsc closed this as completed Jun 8, 2015
@golang golang locked and limited conversation to collaborators Jun 25, 2016
@rsc rsc removed their assignment Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants