
proposal: bytes: add Pool for buffer management #48535

Closed
lesismal opened this issue Sep 22, 2021 · 16 comments
@lesismal

lesismal commented Sep 22, 2021

Background

Unlike pools for object types that are specific to individual repositories, a []byte pool is a common need across many repositories.
Many repositories or packages with this need still roll their own, or use buffers without any pool at all, for example:
https://github.com/golang/go/blob/master/src/crypto/tls/conn.go#L930
Or here, where using a pool might be better:
https://github.com/golang/protobuf/blob/master/proto/wire.go#L23

I am working on a non-blocking library, nbio, that aims to solve the 1000k-connections problem, which is still a difficult job for net/http.
nbio is event-driven and lets users plug in their own size-limited goroutine pool to handle connection messages, instead of serving each connection with at least one goroutine. This saves a large number of goroutines, reduces memory usage, and avoids STW pauses.
nbio already supports non-blocking TLS/HTTP 1.x/WebSocket and has mostly achieved these results. Here is a load test compared with the std-based stack: #15735 (comment).

I ran into some difficulties while implementing this non-blocking I/O and its streaming parsers, especially around buffer life-cycle management.
Unlike std-based frameworks, where a buffer can easily be reused in a for loop within the same goroutine, the buffers in nbio are passed between different goroutines and different protocol layers (TCP/TLS/HTTP/WebSocket).
To reuse buffers and reduce memory usage as much as possible, I manage all of them through the same []byte pool; I have already made this work, but it is not convenient.
I think it would be much better if the std library provided such a []byte pool: many different packages could then share it, which would raise the reuse rate and further reduce memory usage and GC pressure.

Possible Implementation

I would prefer to add a []byte pool to the bytes package, like this:

package bytes

import (
	"sync"
)

// maxAppendSize represents the max size to append to a slice.
const maxAppendSize = 1024 * 1024 * 4

// Pool is the default instance of the []byte pool.
// Users can provide a custom Pool implementation and replace this instance if needed.
var Pool interface {
	Get() []byte
	GetN(size int) []byte
	Put(b []byte)
} = NewPool(64)

// bufferPool is the default implementation of the []byte Pool.
type bufferPool struct {
	sync.Pool
	MinSize int
}

// NewPool creates and returns a bufferPool instance.
// All slices created by this instance have an initial capacity of minSize.
func NewPool(minSize int) *bufferPool {
	if minSize <= 0 {
		minSize = 64
	}
	bp := &bufferPool{
		MinSize: minSize,
	}
	bp.Pool.New = func() interface{} {
		buf := make([]byte, bp.MinSize)
		return &buf
	}
	return bp
}

// Get gets a slice from the pool and returns it with length 0.
// Users can append to the slice and should Put it back to the pool when they are done with it.
func (bp *bufferPool) Get() []byte {
	pbuf := bp.Pool.Get().(*[]byte)
	return (*pbuf)[0:0]
}

// GetN returns a slice with length size.
// To reuse slices as much as possible,
// if the capacity of the slice taken from the pool is not enough,
// it grows the slice by appending,
// or puts the slice back to the pool and allocates a new slice with capacity size.
//
// Users can use the slice up to size or append to it,
// and should Put it back to the pool when they are done with it.
func (bp *bufferPool) GetN(size int) []byte {
	pbuf := bp.Pool.Get().(*[]byte)
	need := size - cap(*pbuf)
	if need > 0 {
		if need <= maxAppendSize {
			*pbuf = (*pbuf)[:cap(*pbuf)]
			*pbuf = append(*pbuf, make([]byte, need)...)
		} else {
			bp.Pool.Put(pbuf)
			newBuf := make([]byte, size)
			pbuf = &newBuf
		}
	}

	return (*pbuf)[:size]
}

// Put puts a slice back to the pool.
// If the slice's cap is smaller than MinSize,
// it will not be put back to the pool but dropped.
func (bp *bufferPool) Put(b []byte) {
	if cap(b) < bp.MinSize {
		return
	}
	bp.Pool.Put(&b)
}
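
For illustration, here is a minimal sketch of how a caller might use the proposed API, assuming it were added exactly as above (bytes.Pool, GetN, and Put are part of this proposal, not an existing standard-library API; the echo helper and the fixed message size are assumptions for this example):

package echoexample

import (
	"bytes" // assumes the proposed bytes.Pool shown above; not a real stdlib API today
	"io"
	"net"
)

// echo reads a fixed-size message into a pooled buffer and writes it back.
func echo(conn net.Conn, size int) error {
	buf := bytes.Pool.GetN(size) // borrow a buffer with len(buf) == size
	defer bytes.Pool.Put(buf)    // return it as soon as the message is done

	if _, err := io.ReadFull(conn, buf); err != nil {
		return err
	}
	_, err := conn.Write(buf)
	return err
}

Whether Put is deferred or called explicitly right after the write is up to the caller; the point is that the buffer's lifetime ends with the message.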
@gopherbot gopherbot added this to the Proposal milestone Sep 22, 2021
@komuw
Contributor

komuw commented Sep 22, 2021

// Put puts a slice back to the pool.
// If the slice's cap is smaller than MinSize,
// it will not be put back to the pool but dropped.

see: #23199

@lesismal
Author

lesismal commented Sep 22, 2021

// Put puts a slice back to the pool.
// If the slice's cap is smaller than MinSize,
// it will not be put back to the pool but dropped.

see: #23199

They are different!

The problem described in #23199 is about a Pool of bytes.Buffer values that are usually bound to a live object, such as a Connection; the bytes.Buffer is held by that object for a long time, which means it cannot be reused while the object is alive.

But this issue is about a plain []byte, which would not, or should not, be bound to any object with a long life cycle. For example, in nbio every body/buffer is held by a message: we get it from the Pool when we start parsing a new message and put it back soon after the message has been handled. It is not bound to the Connection and is not held for long.
Or, for protobuf, we usually Marshal a Message into a buffer, send the buffer, and can then put the buffer back into the Pool.

We get the []byte from the Pool and put it back soon afterwards, which raises the reuse rate of the buffers and keeps the number of buffers in the Pool small.

As an example with an std-based solution: say there are 1000k online connections and the qps/tps is not too large, maybe 10k. With two bytes.Buffer per connection (one for reading, one for writing), that costs us 2000k bytes.Buffer values.
But for the same 1000k online connections and the same 10k qps/tps, we only need one []byte for each message that is being processed or waiting in the processing queue, and that number should be smaller than 10k.
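
As a rough sketch of this short-lived, per-message pattern, something like the following already works with a plain sync.Pool (the pool, the 4 KiB starting size, and the process function here are illustrative, not nbio's actual API):

package msgexample

import "sync"

// msgBufPool hands out buffers that live only for the duration of one message.
var msgBufPool = sync.Pool{
	New: func() interface{} { b := make([]byte, 0, 4096); return &b },
}

func process(msg []byte) { /* hypothetical message handler */ }

// handleMessage borrows a buffer when parsing starts and returns it right
// after the message has been handled, so the buffer is never tied to the
// connection's lifetime.
func handleMessage(payload []byte) {
	pb := msgBufPool.Get().(*[]byte)
	buf := append((*pb)[:0], payload...) // decode/copy the message into the pooled buffer

	process(buf)

	*pb = buf // keep any growth for the next user
	msgBufPool.Put(pb)
}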

@lesismal
Author

Here is the simple load test result:

[screenshot: load test result]

@dsnet
Member

dsnet commented Sep 22, 2021

Any pool for variable-sized buffers needs to keep track of utilization and somehow make use of that metric to ensure that the way the buffers are used matches how they are stored. The proposed implementation does not do that.

Have you seen #27735 (comment)? It takes utilization into consideration, but doesn't provide a GetN operation. Personally, I have not found the GetN operation to be all that useful, since N is often provided by the client and cannot be trusted (i.e., if the client claims to be sending 1GiB of data, the server should not blindly allocate 1GiB up front; at worst the client should be forced to send at least 1GiB of data before the server allocates 1GiB of memory).
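
To make the utilization point concrete, here is an illustrative sketch (not the design from #27735; the type, the Put(b, used) signature, and the 1/4 threshold are invented for this example) of a pool that refuses to recycle poorly utilized buffers:

package utilexample

import "sync"

// utilPool only recycles buffers whose capacity roughly matches how much of
// them was actually used, so a single large request does not pin an
// oversized buffer in the pool forever.
type utilPool struct {
	p sync.Pool
}

func (u *utilPool) Get() []byte {
	if v := u.p.Get(); v != nil {
		return (*v.(*[]byte))[:0]
	}
	return make([]byte, 0, 512)
}

// Put takes the buffer and how many bytes of it the caller actually used.
func (u *utilPool) Put(b []byte, used int) {
	if used < cap(b)/4 { // poor utilization: let the GC reclaim it instead
		return
	}
	u.p.Put(&b)
}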

@lesismal
Author

lesismal commented Sep 22, 2021

Any pool for variable-sized buffers needs to keep track of utilization and somehow make use of that metric to ensure that the way the buffers are used matches how they are stored. The proposed implementation does not do that.

Have you seen #27735 (comment)? It takes utilization into consideration, but doesn't provide a GetN operation. Personally, I have not found the GetN operation to be all that useful, since N is often provided by the client and cannot be trusted (i.e., if the client claims to be sending 1GiB of data, the server should not blindly allocate 1GiB up front; at worst the client should be forced to send at least 1GiB of data before the server allocates 1GiB of memory).

GetN does provide convenience for users; without it, users need to Get first and then do extra work on the buffer to make sure it has enough length.

Regarding trusting the client: we usually set a read size limit, which also limits the buffer size we ask the Pool for.
And for different applications, if there is no common buffer solution that covers every scenario, the programmer should design the buffer usage in more detail, as you have mentioned: #23199 (comment).

at worst the client should be forced to send at least 1GiB of data before the server allocates 1GiB of memory).

This sounds right, but how can we know whether the client has actually sent 1GB? We cannot know that before allocating the buffer and reading the data, so we still need to allocate the buffer first. If we read the data into several small buffers first and then copy them into a 1GB buffer, that creates a different performance problem.

The limit is necessary; otherwise, even if the client really does send 1GB or even 1TB, we still OOM.

Although I think we should allocate the buffer first, I don't think allocating 1GB up front is right for most scenarios; setting a read size limit or some other kind of limit is right most of the time. And for a very specific scenario, we should not use a common buffer pool but customize one.
Sticking with the 1GB example, we can use a mutex or a channel to serialize the requests, handle them one by one, and reuse the same 1GB buffer for each request; this is a common way to guard a service against OOM or other huge costs (just an example, not for all scenarios).
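
As a sketch of the read-limit approach described above (the 1 MiB limit, the readBody helper, and the error text are illustrative, not from any existing API):

package limitexample

import (
	"errors"
	"io"
)

const maxReadSize = 1 << 20 // illustrative 1 MiB cap on any single message

// readBody allocates at most maxReadSize up front, regardless of how large a
// body the client claims it will send.
func readBody(r io.Reader, claimed int) ([]byte, error) {
	if claimed <= 0 || claimed > maxReadSize {
		return nil, errors.New("message size out of range")
	}
	buf := make([]byte, claimed) // with the proposed API this could be pool.GetN(claimed)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}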

@lesismal
Author

lesismal commented Sep 22, 2021

BTW, neither a Pool of variable-sized buffers nor a Pool backed by an array of sync.Pools that buckets items by size can cover every scenario; we should not ask for, and cannot get, a Pool that works everywhere.

But it would still help in most scenarios if the std library provided this commonly used kind of Pool.
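
For reference, a rough sketch of the "array of sync.Pools bucketized by size" variant mentioned above could look like this (bucket sizes and function names are arbitrary; this is not a proposed API):

package bucketexample

import "sync"

// One sync.Pool per size class; requests are rounded up to the next bucket.
var bucketSizes = [...]int{256, 1024, 4096, 16384}

var bucketPools = func() []sync.Pool {
	ps := make([]sync.Pool, len(bucketSizes))
	for i, sz := range bucketSizes {
		sz := sz
		ps[i].New = func() interface{} { b := make([]byte, sz); return &b }
	}
	return ps
}()

func getN(n int) []byte {
	for i, sz := range bucketSizes {
		if n <= sz {
			return (*bucketPools[i].Get().(*[]byte))[:n]
		}
	}
	return make([]byte, n) // larger than the biggest bucket: do not pool
}

func put(b []byte) {
	for i := len(bucketSizes) - 1; i >= 0; i-- {
		if cap(b) >= bucketSizes[i] {
			bucketPools[i].Put(&b)
			return
		}
	}
}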

@ianlancetaylor ianlancetaylor added this to Incoming in Proposals (old) Sep 22, 2021
@ericlagergren
Contributor

Quoting #23199:

…every use of sync.Pool is a bug report on the GC.

Outside of your very specific use case, what benefit is there to adding a new pool to the stdlib?

@lesismal
Author

// Put puts a slice back to the pool.
// If the slice's cap is smaller than MinSize,
// it will not be put back to the pool but dropped.

see: #23199

They are different!

The problem described in #23199 is about a Pool of bytes.Buffer values that are usually bound to a live object, such as a Connection; the bytes.Buffer is held by that object for a long time, which means it cannot be reused while the object is alive.

But this issue is about a plain []byte, which would not, or should not, be bound to any object with a long life cycle. For example, in nbio every body/buffer is held by a message: we get it from the Pool when we start parsing a new message and put it back soon after the message has been handled. It is not bound to the Connection and is not held for long.
Or, for protobuf, we usually Marshal a Message into a buffer, send the buffer, and can then put the buffer back into the Pool.

We get the []byte from the Pool and put it back soon afterwards, which raises the reuse rate of the buffers and keeps the number of buffers in the Pool small.

As an example with an std-based solution: say there are 1000k online connections and the qps/tps is not too large, maybe 10k. With two bytes.Buffer per connection (one for reading, one for writing), that costs us 2000k bytes.Buffer values.
But for the same 1000k online connections and the same 10k qps/tps, we only need one []byte for each message that is being processed or waiting in the processing queue, and that number should be smaller than 10k.

@ericlagergren I have already replied about #23199 above; please check the differences.

My case is not that specific: lots of cases that use short-lived buffers could benefit from a common buffer pool.
Of course, we can customize more details, such as in #23199 (comment).
And at the very least, a package could provide a config field to choose between the std pool and its own pool.

@ericlagergren
Contributor

ericlagergren commented Sep 25, 2021 via email

@lesismal
Author

lesismal commented Sep 25, 2021

That doesn’t really address my question, though. This sort of pool (of objects without costly initialization) shouldn’t be necessary: it means there is a deficiency in the GC (for a particular use case, anyway). Go’s compatibility promise means there is a high bar for adding new APIs to the stdlib. It’s not clear how this proposed API—which only works around a known, current deficiency in the GC—meets that bar.

For an std-based HTTP service with one goroutine per connection, allocating buffers that are not too big is goroutine-stack-friendly most of the time, and the read buffer can be reused in a loop:

var buffer []byte = ... // allocated once, reused by this connection's goroutine
for {
    request, err := readRequest(conn, buffer /* ... */)
    if err != nil {
        // ...
        break
    }
    handleRequest(request)
}

But for many other programs (push, game, or IM services, for example), buffers are passed between different goroutines, which makes them not goroutine-stack-friendly and much harder to reuse. That is when a pool becomes necessary.
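
A sketch of that cross-goroutine case (the channel layout, the 4 KiB buffer size, and the handle function are illustrative):

package crossgoroutine

import (
	"net"
	"sync"
)

var bufPool = sync.Pool{
	New: func() interface{} { b := make([]byte, 4096); return &b },
}

func handle(msg []byte) { /* hypothetical message handler */ }

// reader borrows a buffer per read; the worker that consumes the message is
// the one that returns it, so the buffer cannot simply live on the reader
// loop's stack as in the one-goroutine-per-connection model.
func reader(conn net.Conn, work chan<- *[]byte) {
	defer close(work)
	for {
		pb := bufPool.Get().(*[]byte)
		n, err := conn.Read((*pb)[:cap(*pb)])
		if err != nil {
			bufPool.Put(pb)
			return
		}
		*pb = (*pb)[:n]
		work <- pb
	}
}

func worker(work <-chan *[]byte) {
	for pb := range work {
		handle(*pb)
		bufPool.Put(pb) // returned by the consumer, not the producer
	}
}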

@rsc
Contributor

rsc commented Oct 6, 2021

Why does this need to be in the standard library? Why can't it be done in a third-party package?

@rsc
Contributor

rsc commented Oct 6, 2021

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals (old) Oct 6, 2021
@lesismal
Author

lesismal commented Oct 7, 2021

Why does this need to be in the standard library? Why can't it be done in a third-party package?

When a project depends on multiple libraries, their different authors do not all use the same third-party pool library, so the project cannot share one buffer pool across them. But if the std library provided a buffer pool, different authors would be more inclined to use it.

@rsc
Contributor

rsc commented Oct 13, 2021

A global byte pool seems pretty dangerous anyway. It would probably be better to follow sync.Pool and have individual pools. At that point it doesn't matter whether different authors use different pools.

We definitely don't want to encourage everyone to start using a global []byte pool instead of make and relying on the GC for cleanup. That will just lead to tons of dangling pointer errors. A pool is better to use in the few places where it really matters and you can be extra careful to get it right.
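
In that spirit, an individual, package-local pool of the kind sync.Pool already encourages might look like the following sketch (the codec package, the Encode function, and the buffer size are hypothetical; the point is only that the pool stays private to the one hot path that needs it):

package codec

import (
	"fmt"
	"sync"
)

// scratch is private to this package: nobody outside it can Put a buffer
// here, so the dangling-pointer risk stays contained to code we control.
var scratch = sync.Pool{
	New: func() interface{} { b := make([]byte, 0, 1024); return &b },
}

// Encode formats v using a pooled scratch buffer and copies the result out
// before the buffer goes back to the pool.
func Encode(v fmt.Stringer) string {
	pb := scratch.Get().(*[]byte)
	b := append((*pb)[:0], v.String()...)
	s := string(b) // copy before returning the buffer
	*pb = b
	scratch.Put(pb)
	return s
}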

@lesismal
Author

That's all right, I'll close this issue.
Thank you for your time!

@rsc rsc moved this from Active to Declined in Proposals (old) Oct 20, 2021
@rsc
Contributor

rsc commented Oct 20, 2021

This proposal has been declined as retracted.
— rsc for the proposal review group

@golang golang locked and limited conversation to collaborators Oct 20, 2022