Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: mime/multipart.Reader should use sync.Pool and reuse bytes.Buffers to reduce memory allocation pressure #21516

Closed
SunRunAway opened this issue Aug 18, 2017 · 12 comments
Labels
FrozenDueToAge Proposal WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Milestone

Comments

@SunRunAway
Copy link

var b bytes.Buffer

We have a upload server using req.ParseMultipartForm. GOGCTRACE tells us gabarage collector takes 8% cpu usage and the heap profile tells us mine/multipart.Reader.ReadForm allocates lots of memory.

qq20170818-1 2x

@odeke-em
Copy link
Member

Thank you @SunRunAway for reporting the issue.

Would you mind elaborating more on what you would like to have this issue to solve?

@SunRunAway
Copy link
Author

hi, @odeke-em
I've send a CL. https://go-review.googlesource.com/c/56990

@gopherbot
Copy link

Change https://golang.org/cl/56990 mentions this issue: mime/multipart: use sync.Pool in ReadForm

@odeke-em odeke-em changed the title mime/multipart: Reader uses sync.Pool with memory buffer proposal: mime/multipart.Reader should use sync.Pool and reuse bytes.Buffers to reduce memory allocation pressure Aug 18, 2017
@gopherbot gopherbot added this to the Proposal milestone Aug 18, 2017
@rsc
Copy link
Contributor

rsc commented Aug 21, 2017

One small issue is that we try to avoid using a sync.Pool for buffers that are ever observed by user space. So if Reader calls some arbitrary io.Reader's Read method to fill in the buffer, it should not go back into the pool, to avoid a dangling pointer bug in that Reader from corrupting some future unrelated call.

@SunRunAway
Copy link
Author

@rsc Sorry, I cannot understand. Could you give an example please?

@nightlyone
Copy link
Contributor

Alternative fixes would involve making the multipart.Reader re-usable by supporting a Reset method and store a bytes.Buffer as well as a buffer for io.CopyBuffer inside it for re-use. So the http library can do the re-use and pooling on request level.

@SunRunAway
Copy link
Author

Hi, do you have any suggestion to do or decision about this issue & CL?

@valyala
Copy link
Contributor

valyala commented Aug 28, 2017

One small issue is that we try to avoid using a sync.Pool for buffers that are ever observed by user space. So if Reader calls some arbitrary io.Reader's Read method to fill in the buffer, it should not go back into the pool, to avoid a dangling pointer bug in that Reader from corrupting some future unrelated call.

This restriction sounds strange. It is based on a weird corner case that should never happen in real code. How many io.Reader implementations exist in the wild that keep references to buffer memory passed into Read call and then modify this memory after returning from Read call? I think the number of such io.Readers is close to 0. Additionally, race detector should detect such cases.

update:
The only easy-to-miswrite case is the following:

func Read(p []byte) (int, error) {
  go fillBuffer(p)
  ...
}

But it is already broken:

  • it has data race, since p contents may be modified when the Read caller reads it.
  • it results in a memory leak if fullBuffer never returns and p is allocated before each Read call.

@rsc
Copy link
Contributor

rsc commented Aug 28, 2017

It is based on a weird corner case that should never happen in real code.

In general all reasons not to perform optimizations fall into that category. I'd still like to keep bad code in one package from causing mysterious failures in an unrelated package, and that's the motivation for the restriction I described.

On the other hand it sounds like you have a significant opportunity to make your server faster. The question is whether there's a way to do it at a slightly higher level and avoid the possible corruption in buggy code.

@SunRunAway
Copy link
Author

SunRunAway commented Aug 30, 2017

Alternative fixes would involve making the multipart.Reader re-usable by supporting a Reset method and store a bytes.Buffer as well as a buffer for io.CopyBuffer inside it for re-use. So the http library can do the re-use and pooling on request level.

The multipart.ReadForm has a for loop to handle more than one file, if we do it in the http library the bytes.Buffer stored in multipart.Reader should be used by multiple files. The code may be complex.
Another con is the re-use cycle will persist until the http library calls MultipartForm.RemoveAll() wether the file stored in temperary file or not.

@ianlancetaylor
Copy link
Contributor

Per #21516 (comment) above, do you have a revised API suggestion?

@rsc rsc added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 5, 2018
@gopherbot
Copy link

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@golang golang locked and limited conversation to collaborators Jul 5, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge Proposal WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.
Projects
None yet
Development

No branches or pull requests

7 participants