Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding/binary: Add SizeUvarint() and SizeVarint() #66284

Open
friendlymatthew opened this issue Mar 12, 2024 · 9 comments · May be fixed by #66366
Open

proposal: encoding/binary: Add SizeUvarint() and SizeVarint() #66284

friendlymatthew opened this issue Mar 12, 2024 · 9 comments · May be fixed by #66366
Labels
Milestone

Comments

@friendlymatthew
Copy link

friendlymatthew commented Mar 12, 2024

Proposal Details

It would be nice to have a function like SizeUvarint() and SizeVarint() that calculates the size of a UVarint or Varint representation without actually appending the bytes to a buffer. Exactly like AppendUvarint but it simply counts the bytes that would have been appended.

The current roundabout way of doing this is something like:

lengthBytes := len(binary.AppendUvarint([]byte{}, uint64(k.DataPointer.Length)))
@gopherbot gopherbot added this to the Proposal milestone Mar 12, 2024
@dsnet
Copy link
Member

dsnet commented Mar 13, 2024

What's the intended use-case for this?

You could also use protowire.SizeVarint, which does the same thing.

@kevmo314
Copy link

What's the intended use-case for this?

You could also use protowire.SizeVarint, which does the same thing.

We'd like to preallocate a buffer which includes a uvarint in the middle. Actually, our use case is pretty similar to that of protobuf's implementation looking at the references.

It doesn't really make sense for us to import a whole library just for one line of code though, in fact it seems like the existence of this code in protowire suggests that there is a valid use case for it that's missing in the stdlib, compared to the other Size* functions in that library which reference types specific to the protobuf library.

@Jorropo
Copy link
Member

Jorropo commented Mar 14, 2024

I usually preallocate MaxVarintLen64 (or 32 or 16) because the difference between preallocating 3 to 10 bytes is trivial (unless you have a huge slice of theses or something) and using a fixed size let the compiler stack allocate the buffer.

@kevmo314
Copy link

That doesn't quite work for us because we want to determine if adding a field into our serialized buffer causes the buffer to exceed a certain size, and if so we do a different allocation strategy (potentially writing it to a different buffer). Hardcoding the buffer to MaxVarintLen64 would break this calculation, which would defeat the benefit of using a varint for our use case.

@adonovan
Copy link
Member

Though the use-case seems genuine, it seems like a pretty obscure one as in most cases (as @Jorropo points out) you can allocate a local temporary array of the maximum varint size (10) and encode into that to find the size, then copy from the local buffer to the output buffer.

BTW, the proposal should include a signed variant as well.

@friendlymatthew friendlymatthew changed the title proposal: encoding/binary: Add SizeUvarint() proposal: encoding/binary: Add SizeUvarint() and SizeVarint() Mar 14, 2024
@ericlagergren
Copy link
Contributor

It doesn't really make sense for us to import a whole library just for one line of code though

A little copying never hurt :)

That said, I agree these would be good additions to the stdlib. I've written them myself a number of times when working with serialization code.

@dsnet
Copy link
Member

dsnet commented Mar 17, 2024

There is the existing Size function, which matches the number of bytes that the Write function would generate.

A SizeUvarint and SizeVarint function would match that same precedent.

IIRC, protobuf had another use case for this where you're encoding a large slice of integers, and it is faster to pre-compute the necessary buffer before encoding. Naively assuming that the buffer needs to be 10*len(integers) may grossly over-allocate for small integers.

@friendlymatthew
Copy link
Author

I don't mean to jump the gun, but #66366 merely follows protowire.SizeVarint to compute the encoded size. Signed integers follow the zigzag encoding in PutVarint:

func PutVarint(buf []byte, x int64) int {
ux := uint64(x) << 1
if x < 0 {
ux = ^ux
}
return PutUvarint(buf, ux)
}

@gopherbot
Copy link

Change https://go.dev/cl/572196 mentions this issue: encoding/binary: Add SizeUvarint() and SizeVarint()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

Successfully merging a pull request may close this issue.

7 participants