Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/trace: running out of memory on the Plan 9 builder #12032

Closed
0intro opened this issue Aug 5, 2015 · 13 comments
Closed

runtime/trace: running out of memory on the Plan 9 builder #12032

0intro opened this issue Aug 5, 2015 · 13 comments

Comments

@0intro
Copy link
Member

0intro commented Aug 5, 2015

Since July 22, the runtime/trace tests are regularly failing on the Plan 9 builder.

http://build.golang.org/log/15f2a83855da9c8c7f55b590a285dc483f0fb986

Failed to schedule "go_test:runtime/trace" test after 3 tries.

It seems the Plan 9 builder runs out of memory when running the runtime/trace tests.

Moving the Plan 9 builder from a n1-highcpu-2 (1.8 GB) to a n1-highcpu-4 (3.8 GB)
machine seems to work around the issue.

We should figure out why the runtime/trace tests are using so much memory on Plan 9.

@0intro 0intro added the OS-Plan9 label Aug 5, 2015
@0intro 0intro self-assigned this Aug 5, 2015
@0intro 0intro added this to the Go1.6 milestone Aug 5, 2015
@gopherbot
Copy link

CL https://golang.org/cl/13077 mentions this issue.

@0intro
Copy link
Member Author

0intro commented Aug 5, 2015

Here are the garbage collector traces:

% GODEBUG='gctrace=1' go test -v -run TestTraceStress'$'
gc 1 @0.161s 0%: 0.016+0.63+0.007+0.85+0.22 ms clock, 0.016+0.63+0+0/0.74/0+0.22 ms cpu, 4->4->0 MB, 4 MB goal, 1 P
gc 2 @0.776s 0%: 0.015+0.50+0.007+0.93+0.19 ms clock, 0.015+0.50+0+0/0.82/0+0.19 ms cpu, 4->4->0 MB, 4 MB goal, 1 P
# testing
gc 1 @0.042s 15%: 0.019+1.1+0.012+6.4+0.48 ms clock, 0.019+1.1+0+0/6.0/0+0.48 ms cpu, 4->4->2 MB, 4 MB goal, 1 P
gc 2 @0.077s 18%: 0.015+0.62+0.007+8.0+0.50 ms clock, 0.015+0.62+0+0/7.6/0+0.50 ms cpu, 4->4->3 MB, 4 MB goal, 1 P
gc 3 @0.108s 23%: 0.017+0.52+0.007+10+0.47 ms clock, 0.017+0.52+0+0/10/0+0.47 ms cpu, 6->6->5 MB, 7 MB goal, 1 P
gc 4 @0.144s 26%: 0.020+0.46+0.008+13+0.46 ms clock, 0.020+0.46+0+0/13/0+0.46 ms cpu, 9->9->6 MB, 10 MB goal, 1 P
gc 5 @0.194s 29%: 0.023+0.54+0.007+20+0.51 ms clock, 0.023+0.54+0+0/19/0+0.51 ms cpu, 13->13->9 MB, 13 MB goal, 1 P
# runtime/trace_test
gc 1 @0.040s 17%: 0.016+1.1+0.007+7.7+0.48 ms clock, 0.016+1.1+0+0/7.2/0+0.48 ms cpu, 4->4->3 MB, 4 MB goal, 1 P
gc 2 @0.079s 22%: 0.025+0.63+0.007+10+0.49 ms clock, 0.025+0.63+0+0/10/0+0.49 ms cpu, 5->5->4 MB, 6 MB goal, 1 P
gc 3 @0.115s 26%: 0.020+0.50+0.006+13+0.47 ms clock, 0.020+0.50+0+0/13/0+0.47 ms cpu, 8->8->6 MB, 8 MB goal, 1 P
# testmain
gc 1 @0.046s 14%: 0.017+0.23+0.007+7.7+0.16 ms clock, 0.017+0.23+0+0/7.6/0+0.16 ms cpu, 5->5->3 MB, 4 MB goal, 1 P
gc 2 @0.073s 23%: 0.016+0.11+0.006+12+0.18 ms clock, 0.016+0.11+0+0/12/0+0.18 ms cpu, 5->5->5 MB, 7 MB goal, 1 P
gc 3 @0.345s 10%: 0.020+0.12+0.006+17+0.15 ms clock, 0.020+0.12+0+0/17/0+0.15 ms cpu, 9->9->7 MB, 10 MB goal, 1 P
gc 4 @0.412s 15%: 0.020+0.11+0.006+28+0.19 ms clock, 0.020+0.11+0+0/28/0+0.19 ms cpu, 14->14->12 MB, 15 MB goal, 1 P
gc 5 @0.539s 19%: 0.028+0.15+0.006+49+0.14 ms clock, 0.028+0.15+0+0/49/0+0.14 ms cpu, 24->24->19 MB, 24 MB goal, 1 P
=== RUN   TestTraceStress
gc 1 @0.094s 1%: 10+0+0+0+1.1 ms clock, 10+0+0+0/0/0+1.1 ms cpu, 0->0->0 MB, 0 MB goal, 10 P (forced)
gc 2 @0.803s 0%: 0.11+0.12+59+50+0.17 ms clock, 0.11+0.12+0+0.11/0.022/0.006+0.17 ms cpu, 180->180->3 MB, 4 MB goal, 10 P
gc 3 @1.973s 0%: 0.13+0.29+39+159+0.31 ms clock, 0.13+0.29+0+0/0.15/0+0.31 ms cpu, 838->838->7 MB, 6 MB goal, 10 P
gc 4 @2.986s 0%: 0.025+0.13+0.007+0.57+0.16 ms clock, 0.025+0.13+0+0/0.48/0+0.16 ms cpu, 8->8->8 MB, 15 MB goal, 1 P
gc 5 @3.584s 0%: 0.018+0.11+0.007+2.1+0.13 ms clock, 0.018+0.11+0+0/2.0/0+0.13 ms cpu, 12->12->9 MB, 16 MB goal, 1 P
gc 6 @4.081s 0%: 0.020+0.11+0.006+4.9+0.12 ms clock, 0.020+0.11+0+0/4.8/0+0.12 ms cpu, 17->17->12 MB, 19 MB goal, 1 P
gc 7 @4.608s 0%: 0.018+0.11+0.007+9.8+0.11 ms clock, 0.018+0.11+0+0/9.7/0+0.11 ms cpu, 24->24->16 MB, 24 MB goal, 1 P
gc 8 @5.145s 0%: 0.020+0.11+0.006+15+0.098 ms clock, 0.020+0.11+0+0/14/0+0.098 ms cpu, 30->30->20 MB, 32 MB goal, 1 P
gc 9 @5.703s 0%: 0.025+0.12+0.006+24+0.13 ms clock, 0.025+0.12+0+0/23/0+0.13 ms cpu, 42->42->27 MB, 41 MB goal, 1 P
gc 10 @6.581s 0%: 0.038+0.13+0.006+24+0.11 ms clock, 0.038+0.13+0+0/24/0+0.11 ms cpu, 50->50->28 MB, 55 MB goal, 1 P
gc 11 @7.190s 0%: 0.040+0.13+0.006+113+0.14 ms clock, 0.040+0.13+0+25/32/0+0.14 ms cpu, 63->64->47 MB, 56 MB goal, 1 P
gc 12 @8.174s 0%: 0.041+0.15+0.007+57+0.15 ms clock, 0.041+0.15+0+0/57/0+0.15 ms cpu, 76->76->56 MB, 92 MB goal, 1 P
gc 13 @8.846s 0%: 0.044+0.13+0.007+55+0.16 ms clock, 0.044+0.13+0+0/55/0+0.16 ms cpu, 100->100->53 MB, 113 MB goal, 1 P
gc 14 @9.514s 0%: 0.047+0.13+0.007+69+0.17 ms clock, 0.047+0.13+0+0/69/0+0.17 ms cpu, 101->101->64 MB, 107 MB goal, 1 P
gc 15 @10.192s 1%: 0.075+1.5+0.006+156+0.26 ms clock, 0.075+1.5+0+107/39/0+0.26 ms cpu, 126->129->111 MB, 129 MB goal, 1 P
gc 16 @11.047s 2%: 0.13+0.16+0.007+354+0.59 ms clock, 0.13+0.16+0+227/85/0+0.59 ms cpu, 202->214->199 MB, 215 MB goal, 1 P
--- PASS: TestTraceStress (13.53s)
PASS
ok      runtime/trace   13.599s

@bradfitz
Copy link
Contributor

bradfitz commented Aug 5, 2015

/cc @dvyukov for trace and @aclements for reading the GC tealeaves

@dvyukov
Copy link
Member

dvyukov commented Aug 5, 2015

@0intro please also print runtime.MemStats before and after the test. Is there something similar to top to sample real memory consumption? 215MB reported by gctrace does not look too big.

@aclements
Copy link
Member

@0intro, just checking, the gctrace you posted is with current master? It looks like this may be why it's running out of memory:

gc 1 @0.094s 1%: 10+0+0+0+1.1 ms clock, 10+0+0+0/0/0+1.1 ms cpu, 0->0->0 MB, 0 MB goal, 10 P (forced)
gc 2 @0.803s 0%: 0.11+0.12+59+50+0.17 ms clock, 0.11+0.12+0+0.11/0.022/0.006+0.17 ms cpu, 180->180->3 MB, 4 MB goal, 10 P
gc 3 @1.973s 0%: 0.13+0.29+39+159+0.31 ms clock, 0.13+0.29+0+0/0.15/0+0.31 ms cpu, 838->838->7 MB, 6 MB goal, 10 P

The heap gets up to 838 MB here, and well over the heap goal. On my machine these first few cycles never get over 5 MB of heap.

@0intro
Copy link
Member Author

0intro commented Aug 5, 2015

@aclements Yes, the gctrace I posted is with current master. In my side, the first few cycles never got over 5 MB of heap on Linux. Also, I noticed the GC runs much more frequently on Linux than on Plan 9.

@dvyukov I don't know any equivalent of top(1), but I could write a program to sample real memory usage.

Beginning of TestTraceStress:

Alloc 94696
TotalAlloc 94696
Sys 2428028
Lookups 9
Mallocs 355
Frees 0

HeapAlloc 94696
HeapSys 655360
HeapIdle 352256
HeapInuse 303104
HeapReleased 0
HeapObjects 355

StackInuse 393216
StackSys 393216
MSpanInuse 2652
MSpanSys 16384
MCacheInuse 596
MCacheSys 16384
BuckHashSys 723390
GCSys 65536
OtherSys 557758

NextGC 4194304
LastGC 0
PauseTotalNs 0
PauseNs [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
PauseEnd [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
NumGC 0
EnableGC true
DebugGC false

End of TestTraceStress:

Alloc 109868536
TotalAlloc 1267486032
Sys 1122876540
Lookups 30
Mallocs 4052554
Frees 2777581

HeapAlloc 109868536
HeapSys 1051623424
HeapIdle 931725312
HeapInuse 119898112
HeapReleased 0
HeapObjects 1274973

StackInuse 1146880
StackSys 1146880
MSpanInuse 734876
MSpanSys 753664
MCacheInuse 596
MCacheSys 16384
BuckHashSys 727562
GCSys 66928384
OtherSys 1680242

NextGC 110651743
LastGC 1438809708105256767
PauseTotalNs 15331449
PauseNs [12020952 1036730 426593 217069 158400 94706 88838 93029 98896 108954 112305 120686 150858 146669 456764 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
PauseEnd [1438809706864766181 1438809706972933128 1438809707263519915 1438809707598346982 1438809707678284793 1438809707685883092 1438809707696945257 1438809707711702605 1438809707732382138 1438809707768982215 1438809707809904459 1438809707855633291 1438809707926595805 1438809708003697485 1438809708105256767 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
NumGC 15
EnableGC true
DebugGC false

@dvyukov
Copy link
Member

dvyukov commented Aug 5, 2015

It seems there is something wrong with GC, HeapSys is more than a gig.

adg pushed a commit to golang/build that referenced this issue Aug 5, 2015
Updates golang/go#12032.

Change-Id: I7239115b2e0e55f7fc1e5cdf24aa7e314f3bddac
Reviewed-on: https://go-review.googlesource.com/13077
Reviewed-by: Andrew Gerrand <adg@golang.org>
@4a6f656c
Copy link
Contributor

4a6f656c commented Oct 1, 2015

This also appears to be the cause of failures on the openbsd/arm builder - of course as soon as I run the trace tests with GODEBUG=gctrace=1 they no longer fail...

@dvyukov
Copy link
Member

dvyukov commented Oct 1, 2015

@aclements @RLH

@dvyukov
Copy link
Member

dvyukov commented Oct 1, 2015

Need to rollback https://go-review.googlesource.com/#/c/15242 once this is fixed.

@gopherbot
Copy link

CL https://golang.org/cl/15242 mentions this issue.

4a6f656c pushed a commit that referenced this issue Oct 1, 2015
Reduce allocation to avoid running out of memory on the openbsd/arm builder,
until issue/12032 is resolved.

Update issue #12032

Change-Id: Ibd513829ffdbd0db6cd86a0a5409934336131156
Reviewed-on: https://go-review.googlesource.com/15242
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
@bradfitz
Copy link
Contributor

bradfitz commented Dec 4, 2015

We don't make releases for Plan9, so removing the Go1.6 label.

@bradfitz bradfitz modified the milestones: Unreleased, Go1.6 Dec 4, 2015
@aclements
Copy link
Member

This hasn't happened since August 4th (last failure was commit 3cfc34a) and it was happening regularly before that, so I think this is no longer an issue. Please reopen if that's not the case.

$ findflakes -grep "Failed to schedule.*runtime/trace"
First observed ae1ea2a 22 Jul 15:47 2015 (1222 commits ago)
Last observed  3cfc34a 04 Aug 21:10 2015 (1032 commits ago)
0.00% chance failure is still happening
13% failure probability (25 of 191 commits)
Likely culprits:
   13% ae1ea2a runtime/trace: add new package
   11% 1a99ba5 encoding/json: fix decoding of JSON null values
   10% 1421bc1 misc/makerelease: delete
    8% 23f4e43 lib/time: update to IANA release 2015e.
    7% 6c7acdf doc: add a clause about embedded methods to go1compat
    6% fc7f6d3 cmd/dist: remove prints during test registration
    6% c97b499 cmd/go: fix typo in comment
    5% 419a6c0 cmd/go: pass an unmodified environment to a go run program
    4% 0505dfc go/types: make types.Typ a slice, unexport UniverseByte/Rune
    4% 72921c6 cmd/go: skip TestGoInstallErrorOnCrossCompileToBin in short mode
No known past failures

@golang golang locked and limited conversation to collaborators Dec 14, 2016
@rsc rsc unassigned 0intro Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants