runtime: wbuf allocation increased significantly from 1.5 to 1.6 #15319

Closed
Ariemeth opened this issue Apr 15, 2016 · 13 comments

@Ariemeth

Ariemeth commented Apr 15, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    1.6, 1.6.1, 1.5.2
  2. What operating system and processor architecture are you using (go env)?
    set GOARCH=amd64
    set GOBIN=
    set GOEXE=.exe
    set GOHOSTARCH=amd64
    set GOHOSTOS=windows
    set GOOS=windows
    set GOPATH=C:\Development\Projects\go
    set GORACE=
    set GOROOT=C:\Go
    set GOTOOLDIR=C:\Go\pkg\tool\windows_amd64
    set GO15VENDOREXPERIMENT=
    set CC=gcc
    set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0
    set CXX=g++
    set CGO_ENABLED=1
  3. What did you do?
    If possible, provide a recipe for reproducing the error.
    A complete runnable program is good.
    A link on play.golang.org is best.
    Ran the following program on Windows using Go 1.6.1 and found goroutines using ~11 kB of memory each. I then ran the test again using Go 1.5.2 to see whether the amount of memory used per goroutine differed.
    https://play.golang.org/p/bgP8fs5O7q
  4. What did you expect to see?
    I expected goroutines to use approximately the same amount of memory across Go versions.
  5. What did you see instead?
    Goroutines were using ~11 kB of memory each in 1.6.1 and 9.5 kB in 1.5.2.
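
For readers without access to the playground link, here is a minimal sketch of this kind of measurement. It is not the program from the report: the function names, the goroutine count, and the use of `MemStats.Sys` as the metric are assumptions. The `runtime.GC` call is included because the thread later mentions that the test calls it.

```go
package main

import (
	"fmt"
	"runtime"
)

// sysBytes returns the total bytes the runtime has obtained from the OS.
func sysBytes() uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.Sys
}

// bytesPerGoroutine starts n goroutines that block on a channel receive
// and reports the growth of MemStats.Sys divided by n.
func bytesPerGoroutine(n int) float64 {
	ch := make(chan struct{})
	before := sysBytes()
	for i := 0; i < n; i++ {
		go func() { <-ch }()
	}
	runtime.GC() // force a collection so GC-internal memory is counted too
	after := sysBytes()
	close(ch) // release the blocked goroutines
	if after < before {
		return 0
	}
	return float64(after-before) / float64(n)
}

func main() {
	const n = 100000 // hypothetical goroutine count
	fmt.Printf("%s: %.0f bytes per goroutine\n", runtime.Version(), bytesPerGoroutine(n))
}
```

Running the same binary source under two Go releases gives a rough per-goroutine comparison; the absolute numbers depend on the platform and runtime version.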
@bradfitz
Contributor

I think you mean per goroutine. Your program has only one channel total.

@bradfitz changed the title from "Memory usage of channels increased on Windows by 20% from 1.5 to 1.6" to "runtime: memory usage of goroutines on Windows increased by 20% from 1.5 to 1.6" on Apr 15, 2016
@bradfitz added this to the Unplanned milestone on Apr 15, 2016
@bradfitz
Contributor

/cc @alexbrainman @aclements

@aclements
Member

The channel may actually be important here. On Linux, it went from 2,590 bytes in 1.5.2 to 4,709 bytes in 1.6. However, if you replace <-ch with select {}, it only goes up to 2,600 bytes in 1.6. This suggests that, at least on Linux, the channel operation used to fit in the initial 2K stack allocation and now doesn't. The explanation may be different on Windows, however, since there the initial stack is 8K, so it would grow to 16K if it did grow, which is more than the observed 11K.
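
One way to test the stack-growth hypothesis directly is to compare `MemStats.StackInuse` for goroutines blocked on `<-ch` against goroutines blocked on `select {}`. The sketch below is an assumption about how such a check could look, not the measurement used in this comment; the helper names and the sleep-based wait are illustrative.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// stackInuse returns the bytes currently held in stack spans.
func stackInuse() uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.StackInuse
}

// stackBytesPer starts n goroutines running block and returns the growth
// of MemStats.StackInuse per goroutine. The goroutines are left blocked.
func stackBytesPer(n int, block func()) float64 {
	before := stackInuse()
	for i := 0; i < n; i++ {
		go block()
	}
	time.Sleep(100 * time.Millisecond) // crude: give the goroutines time to park
	after := stackInuse()
	if after < before {
		return 0
	}
	return float64(after-before) / float64(n)
}

func main() {
	const n = 10000 // hypothetical goroutine count
	ch := make(chan struct{})
	recv := stackBytesPer(n, func() { <-ch })
	sel := stackBytesPer(n, func() { select {} })
	fmt.Printf("<-ch: %.0f stack bytes/goroutine, select {}: %.0f stack bytes/goroutine\n", recv, sel)
}
```

If both figures match the initial stack size, the stacks are not growing and the extra memory must come from somewhere else.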

@aclements
Member

Interesting. 1.6 is not growing the stack, so that's not where the extra memory is coming from.

@aclements
Member

The extra memory is all in mstats.GCSys, which went from 33,596,416 in 1.5.2 to 212,801,536 in 1.6. The majority of that is almost certainly in workbufs. I'm not surprised there are a fair number of workbufs, since it's going to pick up all of the sudogs created by the blocked goroutines during stack scanning. However, I don't know why it would have increased so much since 1.5.2.
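
The same breakdown is available to user code through the public `runtime.MemStats` API. A minimal sketch follows; the `sysBreakdown` type and `readBreakdown` function are illustrative names, not part of the runtime.

```go
package main

import (
	"fmt"
	"runtime"
)

// sysBreakdown holds a few of the per-category totals from runtime.MemStats.
type sysBreakdown struct {
	Heap, Stack, GC, Other, Total uint64
}

// readBreakdown snapshots how the memory obtained from the OS is divided
// between the heap, goroutine stacks, and GC metadata.
func readBreakdown() sysBreakdown {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return sysBreakdown{
		Heap:  ms.HeapSys,
		Stack: ms.StackSys,
		GC:    ms.GCSys, // GC metadata; the category discussed in this comment
		Other: ms.OtherSys,
		Total: ms.Sys,
	}
}

func main() {
	runtime.GC() // run a collection first so GC structures are populated
	b := readBreakdown()
	fmt.Printf("heap=%d stack=%d gc=%d other=%d total=%d\n",
		b.Heap, b.Stack, b.GC, b.Other, b.Total)
}
```

Comparing the `gc` column across Go versions for the same workload isolates GC-internal memory from stack and heap usage.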

/cc @RLH

@Ariemeth
Author

You are right. I had channels on the brain. I meant goroutine.

@Ariemeth
Author

Thank you for fixing that @bradfitz

@valyala
Contributor

valyala commented Apr 16, 2016

The channel may actually be important here. On Linux, it went from 2,590 bytes in 1.5.2 to 4,709 bytes in 1.6. However, if you replace <-ch with select {}, it only goes up to 2,600 bytes in 1.6. This suggests that, at least on Linux, the channel operation used to fit in the initial 2K stack allocation and now doesn't.

This sounds pretty bad :(
Is there a justified reason for such a large stack size for channel operations? This effectively prohibits using channels in memory-efficient, highly concurrent code running millions of goroutines.

@aclements
Member

@valyala, I confirmed (in my later comments) that it's not in fact stack growth causing this. The goroutines are still running on their initial stack allocation. What's causing the increased memory usage is that GC is allocating more internal memory (most likely work buffers), though I haven't tracked down why yet.

@aclements modified the milestones: Go1.7, Unplanned on Apr 20, 2016
@aclements changed the title from "runtime: memory usage of goroutines on Windows increased by 20% from 1.5 to 1.6" to "runtime: wbuf allocation increased significantly from 1.5 to 1.6" on May 24, 2016
@aclements
Member

Using benchmany run -n 1 -order metric -metric gc-bytes -buildcmd 'go build' go1.5..go1.6 to bisect on memstats.GCSys between 1.5 and 1.6, there are two clear change points: commit 1870572 made it go from 33.6 MB to 417 MB and commit b6c0934 made it go down to 213 MB.

This makes sense to some extent: 1870572 increased the size of the workbuf by 16x, but that was supposed to mean we had ~16x fewer of them. Commit b6c0934 then halved the workbuf size (since it started caching two of them). Instead, in this benchmark, we have almost the same number of workbufs. The next step is to figure out why they aren't being reused like they're supposed to be.

@aclements self-assigned this on May 24, 2016
@aclements
Member

This is happening because of the dispose in scanstack. Because of the runtime.GC calls, all stacks are being scanned during mark termination, which causes every scanstack to dispose its buffer. Even though there are only a few pointers in the buffer when it's disposed here, it goes to the "full" queue. Since all of the stack scans happen before we start draining mark work during mark termination, the number of work buffers is proportional to the number of stacks, rather than the number of pointers. In fact, the math works out almost exactly: 213 MB / 2048 bytes/workbuf = 1.09e5 workbufs ≈ 1e5 goroutines.
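
The arithmetic can be checked with a small toy model. This is only an illustration, not the runtime's actual bookkeeping: the pointers-per-stack value is hypothetical, and the slots-per-buffer value assumes 8-byte pointers and ignores any workbuf header. Dividing the exact GCSys figure reported earlier (212,801,536 bytes) by 2048 gives about 1.04e5, while reading 213 MB as 213 MiB gives about 1.09e5; both are close to 1e5 goroutines.

```go
package main

import "fmt"

const (
	workbufBytes = 2048      // workbuf size quoted in the thread
	gcSysBytes   = 212801536 // mstats.GCSys reported for 1.6 in the thread
)

// bufsOnePerStack models the observed behavior: every scanned stack
// disposes its own partially filled buffer, so buffers scale with stacks.
func bufsOnePerStack(stacks int) int {
	return stacks
}

// bufsPacked models the intended behavior: pointers from all stacks are
// packed into buffers filled to capacity, so buffers scale with pointers.
func bufsPacked(stacks, ptrsPerStack, slotsPerBuf int) int {
	total := stacks * ptrsPerStack
	return (total + slotsPerBuf - 1) / slotsPerBuf // ceiling division
}

func main() {
	const (
		stacks       = 100000 // roughly the goroutine count in the thread
		ptrsPerStack = 3      // hypothetical "few pointers" per stack
		slotsPerBuf  = 256    // assumption: 2048 bytes / 8-byte pointers, no header
	)
	fmt.Println("workbufs implied by GCSys:", gcSysBytes/workbufBytes) // 103907
	fmt.Println("one buffer per stack:", bufsOnePerStack(stacks), "buffers,",
		bufsOnePerStack(stacks)*workbufBytes, "bytes")
	fmt.Println("packed:", bufsPacked(stacks, ptrsPerStack, slotsPerBuf), "buffers,",
		bufsPacked(stacks, ptrsPerStack, slotsPerBuf)*workbufBytes, "bytes")
}
```

Under these assumptions the one-per-stack case needs about 205 MB of buffers, while packing the same pointers needs roughly 2.4 MB, which is the gap the comment describes.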

@aclements
Member

I have a fairly simple fix that reduces this test down to 10 MB of workbufs. I'll test and benchmark it more thoroughly tomorrow and send a CL.

@gopherbot

CL https://golang.org/cl/23391 mentions this issue.

@golang locked and limited conversation to collaborators on May 25, 2017