Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: encoding: add interface BinaryWriter to reduce pressure on GC #72915

Open
halturin opened this issue Mar 18, 2025 · 14 comments
Open

proposal: encoding: add interface BinaryWriter to reduce pressure on GC #72915

halturin opened this issue Mar 18, 2025 · 14 comments
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal
Milestone

Comments

@halturin
Copy link

Proposal Details

The main problem with the existing https://pkg.go.dev/encoding#BinaryMarshaler - is that it puts a huge pressure on the garbage collector, as there is no way to reuse byte buffers.

In 1.24 https://pkg.go.dev/encoding#BinaryAppender was introduced, which partially solves this problem (though it doesn't help much in cases of reallocation)

It would be great to have Marshaler interface which allows you to keep the control over the underlying buffer.

type BinaryWriter interface {
	WriteBinary(io.Writer) error
}

As a working example:
EDF - is a binary format in the network stack of Ergo Framework https://github.com/ergo-services/ergo

It uses:
https://github.com/ergo-services/ergo/blob/master/net/edf/edf.go#L53

type Marshaler interface {
	MarshalEDF(io.Writer) error
}

This approach allows to reuse all the binary buffers

@gopherbot gopherbot added this to the Proposal milestone Mar 18, 2025
@gabyhelp gabyhelp added the LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool label Mar 18, 2025
@ianlancetaylor
Copy link
Member

CC @dsnet

@puellanivis
Copy link

I don’t see how this provides any greater functionality over the BinaryAppender interface… If the io.Writer is a bytes.Buffer, then the later is just wrapping a []byte internally, and thus suffers from the same exact reallocation issues as holding the byte slice yourself. And using a b = b[:0] to reset your byte slice, is not just vaguely the same, it’s the exact same first line of code in bytes.Buffer.Reset(), and if it didn’t need to maintain an io.Reader implementation, it wouldn’t need at all the second two lines of code.

Looking at the linked examples… I’m unsure how that code is bringing any advantage to GC pressure over just using BinaryAppender. Indeed, the ergo/lib.Buffer is throwing away just as many backing arrays on reallocations… the only difference is really that you’re keeping a side-band backing slice with a minimum size that you could have just already turned to the pool as soon as it was unused after a append call that would reallocate.

But I don’t want to get too in the weeds on that specific implementation, because my central consideration is what behavior does this provide that is not covered by good BinaryAppender use. I’m not saying that there’s zero benefit over BinaryAppender, but “reduced GC pressure” is not it, right?

@halturin
Copy link
Author

@puellanivis thanks for your comment.

The main idea of BinaryWriter is to provide full control over the underlying buffer, which essentially gives you control over the pressure on the GC. That's basically what I do in my project (Ergo Framework).

With BinaryAppender, double reallocation is the usual case

@mateusz834
Copy link
Member

With BinaryAppender, double reallocation is the usual case

But then the BinaryWriter would cause allocations on the caller side, right? BinaryWriter.WriteBinary(slice) would in most cases cause the slice to escape to the heap.

@halturin
Copy link
Author

would in most cases cause the slice to escape to the heap.

escapes only if buffer > 2^16

const size = 1 << 16

func write(b []byte) error {
	b[0] = 1
	return nil
}

func Benchmark_LargeSize_Stack_EqualOrLess65535(b *testing.B) {
	for i := 0; i < b.N; i++ {
		// not escape to heap when size < 2^16
		dataLarge := make([]byte, size)
		write(dataLarge)
	}
}
func Benchmark_LargeSize_Heap_LargerThan65535(b *testing.B) {
	for i := 0; i < b.N; i++ {
		// escape to heap when size is 2^16 and greater
		dataLarge := make([]byte, size+1)
		write(dataLarge)
	}
}
Image

@puellanivis
Copy link

puellanivis commented Mar 19, 2025

With BinaryAppender, double reallocation is the usual case

I don’t believe I understand what you mean here. Can you demonstrate some code that would “double reallocate” here, that could not easily be solved ahead of the code with a b = slices.Grow(b, expectedAdditionalNecessaryLength)?

Like:

func (f *Foo) AppendBinary(b []byte) ([]byte, error) {
  b = slices.Grow(b, 42)
  b = append(b, []byte{ /* 32 bytes of data */ }...)
  if cond {
    return append(b, []byte{ /* 10 bytes of data */ }...), nil
  }
  return append(b, []byte{ /* 10 different bytes of data */ }...), nil
}

Like, equally in your example, one could use b := make([]byte, 0, size) and then pass that through the various BinaryAppenders, right? And it wouldn’t escape into the heap the same as your benchmem demonstrates? And it would have at most one allocations on the demonstrated Foo.AppendBinary() as well.

@seankhliao
Copy link
Member

isn't this just the io.WriterTo interface?
https://pkg.go.dev/io#WriterTo

@mateusz834
Copy link
Member

escapes only if buffer > 2^16

You are testing a completely different case, you have to call an interface with slice in a way that the call is not devirtualized.

@halturin
Copy link
Author

escapes only if buffer > 2^16

You are testing a completely different case, you have to call an interface with slice in a way that the call is not devirtualized.

is it ok now or still a different case? :)

const size = 1 << 16

type A struct {
}

func (a *A) Write(b []byte) (int, error) {
	b[0] = 1
	return 0, nil
}

var (
	aaa = &A{}
)

func Benchmark_LargeSize_Stack_EqualOrLess65535(b *testing.B) {
	var writer io.Writer = aaa
	for i := 0; i < b.N; i++ {
		// not escape to heap when size < 2^16
		dataLarge := make([]byte, size)
		writer.Write(dataLarge)
	}
}
func Benchmark_LargeSize_Heap_LargerThan65535(b *testing.B) {
	var writer io.Writer = aaa
	for i := 0; i < b.N; i++ {
		// escape to heap when size is 2^16 and greater
		dataLarge := make([]byte, size+1)
		writer.Write(dataLarge)
	}
}
Image

@mateusz834
Copy link
Member

mateusz834 commented Mar 19, 2025

The global aaa must be of an iface type (so var aaa io.Writer = &A{}), since otherwise writer.Write is devirtualized directly to (*A).Write.

@halturin
Copy link
Author

The global aaa must be of an iface type (so var aaa io.Writer = &A{}), since otherwise writer.Write is devirtualized directly to (*A).Write.

thanks, missed that

@puellanivis
Copy link

isn't this just the io.WriterTo interface? https://pkg.go.dev/io#WriterTo

:blink: :blink: It never occurred to me that one could totally use it like that? 🤦‍♀ Like, to write a non-data stream thing to an io.Writer.

@mateusz834
Copy link
Member

:blink: :blink: It never occurred to me that one could totally use it like that? 🤦‍♀ Like, to write a non-data stream thing to an io.Writer.

Also this is a pattern that we already expose in few APIs like: https://pkg.go.dev/golang.org/x/tools/go/ssa#Function.WriteTo

@puellanivis
Copy link

:blink: :blink: It never occurred to me that one could totally use it like that? 🤦‍♀ Like, to write a non-data stream thing to an io.Writer.

Also this is a pattern that we already expose in few APIs like: https://pkg.go.dev/golang.org/x/tools/go/ssa#Function.WriteTo

Also, from http.Request, we have Write(w io.Writer) error which would match this proposed interface but for name. But it also writes in a textproto, right? 🤔

I think the interface might be a good idea in general… even if I’m not 100% sold on the “reduce GC pressure” argument.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool Proposal
Projects
Status: Incoming
Development

No branches or pull requests

7 participants