Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding/binary: add WriteVarint WriteUVarint methods #29010

Closed
robaho opened this issue Nov 29, 2018 · 20 comments
Closed

proposal: encoding/binary: add WriteVarint WriteUVarint methods #29010

robaho opened this issue Nov 29, 2018 · 20 comments

Comments

@robaho
Copy link

robaho commented Nov 29, 2018

There are binary.ReadUVarint and binary.ReadVarint that accept io.ByteReader but no methods to write to an io.ByteWriter

This requires creating a []byte in order to use the Put methods, which is inefficient in high-volume encoding, due to allocation costs and the "copy back" operation.

The methods are trivial, so I am guessing there is some other reason ?

*** edit: add cost of "copy back" since the array cost is not the only consideration

@ianlancetaylor ianlancetaylor changed the title binary encoding is missing WriteVarint WriteUVarint methods encoding/binary: missing WriteVarint WriteUVarint methods Nov 29, 2018
@ianlancetaylor ianlancetaylor changed the title encoding/binary: missing WriteVarint WriteUVarint methods proposal: encoding/binary: add WriteVarint WriteUVarint methods Nov 29, 2018
@gopherbot gopherbot added this to the Proposal milestone Nov 29, 2018
@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Nov 29, 2018

I turned this into an API proposal.

@gopherbot
Copy link

Change https://golang.org/cl/152057 mentions this issue: fix issue #29010, proposal: encoding/binary: add WriteVarint WriteUVa…

@bradfitz
Copy link
Contributor

bradfitz commented Dec 5, 2018

This was rejected previously in #2748. See that bug.

Perhaps we could add:

type Writer struct {
    w io.Writer
    buf [10]byte
}

func NewWriter(w io.Writer) *Writer { ... }

func (w *Writer) WriteVarint(...) { ... }
func (w *Writer) WriteUvarint(...) { ... }

@robaho
Copy link
Author

robaho commented Dec 5, 2018

@bradfitz I don't understand how #2748 relates - it is just the opposite. This PR allows a buffer not to be allocated. Without these changes, a caller using io.ByteWriter must allocate buffers. There are no buffers allocated in the PR I submitted.

@robaho
Copy link
Author

robaho commented Dec 5, 2018

@bradfitz to clarify, that is why it takes an io.ByteWriter, not an io.Writer - similar to the Read methods

@ianlancetaylor
Copy link
Contributor

CL is at https://go-review.googlesource.com/c/go/+/152057 (golang.org/cl/152057 is hidden by #28836).

@robaho
Copy link
Author

robaho commented Dec 12, 2018

@ianlancetaylor the CL shows a code review score of -2 (if I’m reading that correctly), but I don’t see any review? Is there some docs on this part of the process I can read? Thanks.

@agnivade
Copy link
Contributor

Please see the comment above the -2 score.

Temporary -2 until the proposal is reviewed and accepted.

@rsc
Copy link
Contributor

rsc commented Dec 12, 2018

We rejected Write-based functions in both #2748 and #12009.
It seems weird to introduce WriteByte-based functions now.
None of the other functions here use WriteByte.

The root problem here is that it doesn't really seem worth
adding these very specific things in isolation, especially when
PutVarint can be trivially used instead.

This requires creating a []byte in order to use the Put methods, which is inefficient in high-volume encoding, due to allocation costs.

These are direct calls and they should be recorded by the compiler as not allowing their arguments to escape, so you should be able to use a stack-allocated array just fine.

That is:

var b [binary.MaxVarintLen64]byte
n := binary.PutVarint(b[:], 1000)

does not allocate (go tool compile -m confirms that b does not escape).

@robaho
Copy link
Author

robaho commented Dec 12, 2018

@rsc Then why are there analogous functions that take ByteReader? The proposals that were rejected were based on io.Writer not io.ByteWriter.

Furthermore you are forgetting that the common case for this is when the output is backed by a byte.Buffer. Even with a stack allocated array there is an additional copy required.

Honestly, this is such a trivial fix, and the inability for people to understand its valid underpinnings is somewhat scary.

@robaho
Copy link
Author

robaho commented Dec 12, 2018

@rsc sorry issue #12009 does refer to ByteWriter but it is closed for nonsensical reasons in the CL that have nothing to do with the points I am making here

@robaho
Copy link
Author

robaho commented Dec 12, 2018

The arguments don’t address the double copy requirements not having the methods creates. They are only concerned with the performance of interfaces - which if that is the underlying rejection reason, the performance of interfaces should be improved because having to create intermediate buffers is a bad design in these conditions.

This is where Java’s inlining/intrinsics excel - it allows you to write clean code and not give up the performance. There is no reason the compiler could not in-line this for the common case like byte.Buffer, either that or there need to be methods in the encoding to work with byte.Buffer which seems restrictive.

Rejecting this under the guise of protecting performance is wrong IMO. The additional copy might be worse than a one time interface deterrence if the compiler were smarter. And the poor code is definitely worse. Otherwise, its seems the Go leaders position is don’t use interfaces or your code will be slow, which would be a horrible position - either improve interface performance, or let the caller chose lesser performance but clean code.

@robaho
Copy link
Author

robaho commented Dec 12, 2018

An alternative would be to add varint methods to byte.Buffer, but as pointed out in the CL custom streams may not be using a byte.Buffer

@bradfitz
Copy link
Contributor

An alternative would be to add varint methods to byte.Buffer, but as pointed out in the CL custom streams may not be using a byte.Buffer

What's next? JPEG encoding methods on bytes.Buffer? ASN.1 encoding methods on bytes.Buffer? No, thanks. It'd be better to just accept a *bytes.Buffer if you need fast WriteByte calls not going through an interface value.

@robaho
Copy link
Author

robaho commented Dec 13, 2018

@bradfitz I agree, that is why the correct way is to add this PR. Depending on compiler implementation, the double copy is more expensive than the interface usage, especially given cache locality concerns. The pushback on this is somewhat baffling.

@robaho
Copy link
Author

robaho commented Dec 13, 2018

@bradfitz @rsc I also don't understand the 'interface' argument - the main method of this class is:

func Write(w io.Writer, order ByteOrder, data interface{})

which performs dynamic type checks, and even reflection, so arguing that adding these method would lead developers to write poor performing code doesn't hold water - otherwise this method would not exist.

Again, the proposed methods take a ByteWriter, not Writer, so zero allocations. An interface method lookup should be performed once. They are analogous to the ReadVarint methods that already exist in the package.

There is no suitable performant alternative - except for creating these routines locally - which is what I was forced to do - and what everyone is forced to do if they are using byte.Buffer. Not good code structure IMO, and its hard to understand why you are forcing developers down this route.

robaho added a commit to robaho/go that referenced this issue Dec 14, 2018
robaho added a commit to robaho/go that referenced this issue Dec 14, 2018
@robaho
Copy link
Author

robaho commented Dec 14, 2018

Here are the performance numbers:

goos: darwin
goarch: amd64
BenchmarkFunction-8     	20000000	        68.0 ns/op
BenchmarkBrittle-8      	30000000	        52.6 ns/op
BenchmarkByteWriter-8   	20000000	        83.2 ns/op
and the code
package main

import (
"bytes"
"encoding/binary"
"io"
"math/rand"
"testing"
)

// WriteUvarint encodes a uint64 onto w
func WriteUvarint(w io.ByteWriter, x uint64) error {
for x >= 0x80 {
err := w.WriteByte(byte(x) | 0x80)
if err != nil {
return err
}
x >>= 7
}
return w.WriteByte(byte(x))
}

func FWriteUvarint(b *bytes.Buffer, x uint64) error {
var buf [binary.MaxVarintLen64]byte

n := binary.PutUvarint(buf[:], x)
_, err := b.Write(buf[:n])
return err

}

func BenchmarkFunction(b *testing.B) {
buffer := &bytes.Buffer{}
for i := 0; i < b.N; i++ {
r := rand.Uint64()
FWriteUvarint(buffer, r)
}
}

func BenchmarkBrittle(b *testing.B) {
buffer := &bytes.Buffer{}
var bb [binary.MaxVarintLen64]byte
for i := 0; i < b.N; i++ {
r := rand.Uint64()
n := binary.PutUvarint(bb[:], r)
buffer.Write(bb[:n])
}
}

func BenchmarkByteWriter(b *testing.B) {
buffer := &bytes.Buffer{}
for i := 0; i < b.N; i++ {
r := rand.Uint64()
WriteUvarint(buffer, r)
}
}


I filed issue #29276 to address the interface dispatch performance. The generated code could easily be improved in my opinion, although the interface dispatch is only about 20% slower than the function call - which would be required in a maintainable code base IMO.

@rsc
Copy link
Contributor

rsc commented Dec 19, 2018

Then why are there analogous functions that take ByteReader?

Because when you have an I/O stream you don't have the luxury of necessarily knowing how many bytes are incoming. You can always compute how many bytes you might need to write (outgoing).

Let's leave this as is.

@rsc rsc closed this as completed Dec 19, 2018
@robaho
Copy link
Author

robaho commented Dec 19, 2018

@rsc I'm sorry but this reasoning is pretty weak. Anyone using the varint methods is either 1) write their own that allocates the array and calls the method, 2) copy and paste the varint code and have a local version.

Both are poor design decisions, but this flawed reasoning leaves that as the only option.

Under your reasoning, there shouldn't even be a byte.Buffer type, since the caller should be able to known the size of what it is writing. Not to mention, that the only "real reason" against this, is a performance one, and that is an abstraction cost that holds across all Go code - and should be dealt with elsewhere - not creating poor APIs just to help developers avoid performance problems.

** edit: actually, under your reasoning, why is there a ByteWriter interface at all if every API usage is deemed incorrect due to performance reasons. Further, why are there any interfaces at all? I'm sorry, but it makes zero sense.

@dazoe
Copy link

dazoe commented Jan 16, 2019

I just want to say i agree with @robaho here. I'm currently working on a personal side project and am only a hobbyist programmer, barely worthy of the title of beginner. I wrote a few functions to parse incoming packets utilizing binary.ReadUvarint then I needed write a function to create an outgoing packet. I figured I could just use a bytes.Buffer expecting a binary.WriteUvarint and was a little shocked to not find one.

@golang golang locked and limited conversation to collaborators Jan 16, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants