New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bufio: set default buffer size from *bytes.Reader #39332
Comments
The default bufio.Reader buffer size is 4096 bytes. That's far larger than most SDPs. Size it to match the input. We could do with smaller yet, but this is simple and easy and pretty high impact. I also filed golang/go#39332. name old time/op new time/op delta Unmarshal-8 9.44µs ± 5% 8.32µs ± 0% -11.83% (p=0.001 n=4+13) name old alloc/op new alloc/op delta Unmarshal-8 7.50kB ± 0% 4.04kB ± 0% -46.10% (p=0.000 n=4+15) name old allocs/op new allocs/op delta Unmarshal-8 133 ± 0% 133 ± 0% ~ (all equal)
The default bufio.Reader buffer size is 4096 bytes. That's far larger than most SDPs. Size it to match the input. We could do with smaller yet, but this is simple and easy and pretty high impact. I also filed golang/go#39332. name old time/op new time/op delta Unmarshal-8 9.44µs ± 5% 8.32µs ± 0% -11.83% (p=0.001 n=4+13) name old alloc/op new alloc/op delta Unmarshal-8 7.50kB ± 0% 4.04kB ± 0% -46.10% (p=0.000 n=4+15) name old allocs/op new allocs/op delta Unmarshal-8 133 ± 0% 133 ± 0% ~ (all equal)
Sounds reasonable. The dependency is already there anyway. We could in theory do even better though and use the same underlying array so bufio doesn't need to allocate its own of any size. |
In theory yes but I worry about Hyrum’s Law and people modifying the buffer underfoot. Maybe we should start there early in cycle and fall back to the less aggressive optimization if needed. |
Should |
@qingyunha i don’t think there’s a bufio->strings dependency yet, and I’d be reluctant to add one. |
If the optimization is based on |
We have been in the past. This same issue came up earlier for http.NewRequest's type switch: Line 874 in 1519bc4
|
(But if we wanted, we could use a combination of reflect + interface assertion to get the strings.Reader length without importing strings, and still verify it's only pkg strings and not something else with Len) |
The comment from Bradford Lamson-Scribner has been deleted since I started replying, but FWIW, you can obtain the bytes.Reader's underlying buffer like: https://play.golang.org/p/t0o_t_lO3ck That kinda violates the WriterTo contract at least in spirit, but limited shenanigans are generally permitted within the standard library when sufficiently documented & tested. Better than unsafe at least. |
@bradfitz thank you! I saw @josharian post this issue, and was interested in putting together a CL. Not long after posting, I realized it could be a great first contribution for someone (maybe depending on which implementation is preferred) and sort of took my hat out of the ring. I will keep 👀 on this though, if no one else gains interest - happy to work on it. |
@bradford-hamilton I say go for it. I’d be tempted to do two CLs, first only using Len, then sharing the underlying buffer, so that we can roll back the second one independently if needed. @bradfitz instead of WriteTo, seems like just Bytes would do it. What am I missing? @bcmills yeah, I wish we could use a method, and we probably could have long ago. Probably not now, sadly. |
@josharian, |
@bcmills oops, indeed. Thanks. |
@josharian sounds good, I'll start with a CL using the bytes.Reader's Len(), and follow it up with another that uses the same underlying byte slice. Sounds like @bcmills suggestion of a |
Adding new API requires opening a new proposal issue first. Feel free to do that! |
Sure, that makes sense to me. I will put some thought into it and maybe put something together after feeling it out a bit in these CLs. I'll keep this conversation posted. |
Change https://golang.org/cl/235977 mentions this issue: |
It occurs to me that the existence of Consider what happens if, say, the Or what happens if the Programs probably already assume that the buffer size of a — In contrast, a poor size is trivial to fix on the caller side when appropriate: var r *bufio.Reader
if br, ok := b.(*bytes.Reader); ok {
r = bufio.NewReaderSize(br, br.Len())
} else {
r = bufio.NewReader(b)
} |
@bcmills I’m still pondering. I wonder whether we should track whether we auto-sizes the buffer originally and be prepared to switch to a large buffer if needed during Reset. Then we potentially have two allocations instead of one, but at most two. That’s for the “use Len” approach. For the more invasive, “re-use the existing buffer” approach, i think it is clearer what to do in Reset—if we are working with a borrowed buffer, switch to a new borrowed buffer (or allocate a new one if there isn’t one to borrow). If that’s all too clever, maybe we should just address this with documentation. (And a staticcheck check? cc @dominikh ) |
I think that's too clever. I'm not even convinced this is worth a staticcheck warning.
This is probably a common pattern inside tests, but how much does the wasted 4k matter in a test? And how often does it happen in non-test code? Just looking at the Go tree, there are 4 instances of the string |
FWIW, I came across it in real code by profiling an actual server. |
Sounds like the consensus is that this isn’t possibly to do now in a way that is simple and uniformly helpful. |
The default bufio.Reader buffer size is 4096 bytes. That's far larger than most SDPs. Size it to match the input. We could do with smaller yet, but this is simple and easy and pretty high impact. I also filed golang/go#39332. name old time/op new time/op delta Unmarshal-8 9.44µs ± 5% 8.32µs ± 0% -11.83% (p=0.001 n=4+13) name old alloc/op new alloc/op delta Unmarshal-8 7.50kB ± 0% 4.04kB ± 0% -46.10% (p=0.000 n=4+15) name old allocs/op new allocs/op delta Unmarshal-8 133 ± 0% 133 ± 0% ~ (all equal)
The default bufio.Reader buffer size is 4096 bytes. That's far larger than most SDPs. Size it to match the input. We could do with smaller yet, but this is simple and easy and pretty high impact. I also filed golang/go#39332. name old time/op new time/op delta Unmarshal-8 9.44µs ± 5% 8.32µs ± 0% -11.83% (p=0.001 n=4+13) name old alloc/op new alloc/op delta Unmarshal-8 7.50kB ± 0% 4.04kB ± 0% -46.10% (p=0.000 n=4+15) name old allocs/op new allocs/op delta Unmarshal-8 133 ± 0% 133 ± 0% ~ (all equal)
The default bufio.Reader buffer size is 4096 bytes. That's far larger than most SDPs. Size it to match the input. We could do with smaller yet, but this is simple and easy and pretty high impact. I also filed golang/go#39332. name old time/op new time/op delta Unmarshal-8 9.44µs ± 5% 8.32µs ± 0% -11.83% (p=0.001 n=4+13) name old alloc/op new alloc/op delta Unmarshal-8 7.50kB ± 0% 4.04kB ± 0% -46.10% (p=0.000 n=4+15) name old allocs/op new allocs/op delta Unmarshal-8 133 ± 0% 133 ± 0% ~ (all equal)
I just came across some code that does this rather natural thing:
However,
len(buf)
is typically low, much smaller than the default buffer size of 4096. I think thatbufio.NewReader
should check whetherr
is a*bytes.Reader
, and if so, user.Len()
to size the buffer (assumingr.Len()
is less than 4096).Implementation is easy, and would be a good first contribution for someone.
Any objections?
cc @griesemer @bradfitz @ianlancetaylor per golang.org/s/owners
The text was updated successfully, but these errors were encountered: