Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/pprof: tests "out of memory" on Plan 9 #9712

Closed
0intro opened this issue Jan 28, 2015 · 4 comments
Closed

runtime/pprof: tests "out of memory" on Plan 9 #9712

0intro opened this issue Jan 28, 2015 · 4 comments

Comments

@0intro
Copy link
Member

0intro commented Jan 28, 2015

The runtime/pprof trace_parser_test.go and trace_test.go
test files were added in CL 2039.

TestTraceDoubleStart, TestTrace, TestTraceStres and
TestTraceSymbolize are failing on Plan 9 ("out of memory").

However, only TestTrace is failing when the tests are run individually.

cpu% go test -run TestTraceDoubleStart
PASS
ok      runtime/pprof   0.052s
cpu% go test -run TestTrace
fatal error: trace: out of memory
goroutine 7 [running]:
runtime.throw(0x196ac8, 0x14)
    /usr/go/src/runtime/panic.go:508 +0x71 fp=0x10397ddc sp=0x10397dd0
runtime.traceFlush(0x0, 0x0)
    /usr/go/src/runtime/trace.go:511 +0x1f0 fp=0x10397e08 sp=0x10397ddc
runtime.traceEvent(0x10d, 0x10397ec0, 0x2, 0x2)
    /usr/go/src/runtime/trace.go:411 +0x68a fp=0x10397eb0 sp=0x10397e08
runtime.traceGoCreate(0x1037a0a0, 0x21680)
    /usr/go/src/runtime/trace.go:734 +0x99 fp=0x10397ee0 sp=0x10397eb0
runtime.StartTrace(0x0, 0x0)
    /usr/go/src/runtime/trace.go:151 +0x146 fp=0x10397f3c sp=0x10397ee0
runtime/pprof.StartTrace(0x3038dbc0, 0x103ec420, 0x0, 0x0)
    /usr/go/src/runtime/pprof/pprof.go:624 +0x57 fp=0x10397f5c sp=0x10397f3c
runtime/pprof_test.TestTraceDoubleStart(0x103ec3c0)
    /usr/go/src/runtime/pprof/trace_test.go:38 +0x60 fp=0x10397fb4 sp=0x10397f5c
testing.tRunner(0x103ec3c0, 0x252300)
    /usr/go/src/testing/testing.go:448 +0xb6 fp=0x10397fe8 sp=0x10397fb4
runtime.goexit()
    /usr/go/src/runtime/asm_386.s:2408 +0x1 fp=0x10397fec sp=0x10397fe8
created by testing.RunTests
    /usr/go/src/testing/testing.go:556 +0x881
goroutine 1 [chan receive]:
testing.RunTests(0x1cc818, 0x2522a0, 0xc, 0xc, 0x1)
    /usr/go/src/testing/testing.go:557 +0x8bb
testing.(*M).Run(0x103cc060, 0x268040)
    /usr/go/src/testing/testing.go:486 +0x5d
main.main()
    runtime/pprof/_test/_testmain.go:76 +0x176
exit status: 'pprof.test 16942: 2'
FAIL    runtime/pprof   3.325s
cpu% go test -run TestTraceStress
PASS
ok      runtime/pprof   4.844s
cpu% go test -run TestTraceSymbolize
PASS
ok      runtime/pprof   9.812s
@0intro 0intro self-assigned this Jan 28, 2015
@dvyukov
Copy link
Member

dvyukov commented Jan 29, 2015

I need help figuring out why happens.
I've double checked that tracer does not leak. I've run TestTraceStress in tight loop for a minute, memory consumption is several megs larger than w/o tracing, but otherwise grows the same way.

David, can you log all calls to sysAlloc, total memory consumption and errno value?

@0intro
Copy link
Member Author

0intro commented Jan 29, 2015

The issue doesn't seem to be related to runtime/pprof specifically.
It seems sysAlloc always fail after sysFree, when this pattern happens:

sysFree (0x30493000, 65536, 0x275f80)
sysFree (0x304a3000, 65536, 0x275f80)
sysFree (0x30483000, 65536, 0x275f80)
sysAlloc (65536, 0x275f80)

For example: TestTraceDoubleStart fails when executed after TestTraceStartStop:

cpu% go test -v -short
sysReserve(0x300000, 806363136, 0xdfffee27)
sbrk(806363136)
sbrk return 0x281000
sysAlloc (262144, 0x275f80)
sbrk(262144)
sbrk return 0x30383000
sysAlloc return 0x30383000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x303c3000
sysAlloc return 0x303c3000
sysAlloc (719996, 0x275f70)
sbrk(719996)
sbrk return 0x303d3000
sysAlloc return 0x303d3000
=== RUN TestMemoryProfiler
--- PASS: TestMemoryProfiler (0.02s)
=== RUN TestCPUProfile
--- PASS: TestCPUProfile (0.00s)
=== RUN TestCPUProfileMultithreaded
--- PASS: TestCPUProfileMultithreaded (0.00s)
=== RUN TestCPUProfileWithFork
sysAlloc (1605672, 0x275f80)
sbrk(1605672)
sbrk return 0x30483000
sysAlloc return 0x30483000
--- PASS: TestCPUProfileWithFork (15.30s)
=== RUN TestGoroutineSwitch
--- SKIP: TestGoroutineSwitch (0.14s)
    pprof_test.go:96: profile too short: [0x0 0x3 0x0 0x2710 0x0 0x0 0x1 0x0]
    pprof_test.go:98: ignoring failure on plan9; see golang.org/issue/6047
=== RUN TestMathBigDivide
--- PASS: TestMathBigDivide (0.00s)
=== RUN TestBlockProfile
--- PASS: TestBlockProfile (0.21s)
=== RUN TestTraceStartStop
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x3060c000
sysAlloc return 0x3060c000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x3061c000
sysAlloc return 0x3061c000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x3062c000
sysAlloc return 0x3062c000
sysFree (0x3061c000, 65536, 0x275f80)
sysFree (0x3062c000, 65536, 0x275f80)
sysFree (0x3060c000, 65536, 0x275f80)
--- PASS: TestTraceStartStop (0.14s)
=== RUN TestTraceDoubleStart
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return nil
sysAlloc return 0x0
fatal error: trace: out of memory

goroutine 46 [running]:
runtime.throw(0x19ac08, 0x14)
    /usr/go/src/runtime/panic.go:508 +0x71 fp=0x16840de0 sp=0x16840dd4
runtime.traceFlush(0x0, 0x0)
    /usr/go/src/runtime/trace.go:511 +0x1f0 fp=0x16840e0c sp=0x16840de0
runtime.traceEvent(0x10d, 0x16840ec4, 0x2, 0x2)
    /usr/go/src/runtime/trace.go:411 +0x68a fp=0x16840eb4 sp=0x16840e0c
runtime.traceGoCreate(0x103820a0, 0x21940)
    /usr/go/src/runtime/trace.go:734 +0x99 fp=0x16840ee4 sp=0x16840eb4
runtime.StartTrace(0x0, 0x0)
    /usr/go/src/runtime/trace.go:151 +0x146 fp=0x16840f40 sp=0x16840ee4
runtime/pprof.StartTrace(0x303a1458, 0x103e5560, 0x0, 0x0)
    /usr/go/src/runtime/pprof/pprof.go:624 +0x30 fp=0x16840f5c sp=0x16840f40
runtime/pprof_test.TestTraceDoubleStart(0x103e5500)
    /usr/go/src/runtime/pprof/trace_test.go:51 +0x6c fp=0x16840fb4 sp=0x16840f5c
testing.tRunner(0x103e5500, 0x259320)
    /usr/go/src/testing/testing.go:448 +0xb4 fp=0x16840fe8 sp=0x16840fb4
runtime.goexit()
    /usr/go/src/runtime/asm_386.s:2408 +0x1 fp=0x16840fec sp=0x16840fe8
created by testing.RunTests
    /usr/go/src/testing/testing.go:556 +0x833

goroutine 1 [chan receive]:
testing.RunTests(0x1d1018, 0x2592c0, 0xc, 0xc, 0x1)
    /usr/go/src/testing/testing.go:557 +0x86b
testing.(*M).Run(0x103c4060, 0x26f060)
    /usr/go/src/testing/testing.go:486 +0x5d
main.main()
    runtime/pprof/_test/_testmain.go:76 +0x176
exit status: 'pprof.test 23897: 2'
FAIL    runtime/pprof   15.970s

TestTraceDoubleStart succeed when ran individually:

cpu% go test -run TestTraceDoubleStart
sysReserve(0x300000, 806363136, 0xdfffee27)
sbrk(806363136)
sbrk return 0x281000
sysAlloc (262144, 0x275f80)
sbrk(262144)
sbrk return 0x30383000
sysAlloc return 0x30383000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x303c3000
sysAlloc return 0x303c3000
sysAlloc (719996, 0x275f70)
sbrk(719996)
sbrk return 0x303d3000
sysAlloc return 0x303d3000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x30483000
sysAlloc return 0x30483000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x30493000
sysAlloc return 0x30493000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x304a3000
sysAlloc return 0x304a3000
sysFree (0x30493000, 65536, 0x275f80)
sysFree (0x304a3000, 65536, 0x275f80)
sysFree (0x30483000, 65536, 0x275f80)
PASS
ok      runtime/pprof   0.109s

TestTrace always fails:

cpu% go test -run TestTrace
sysReserve(0x300000, 806363136, 0xdfffee27)
sbrk(806363136)
sbrk return 0x281000
sysAlloc (262144, 0x275f80)
sbrk(262144)
sbrk return 0x30383000
sysAlloc return 0x30383000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x303c3000
sysAlloc return 0x303c3000
sysAlloc (719996, 0x275f70)
sbrk(719996)
sbrk return 0x303d3000
sysAlloc return 0x303d3000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x30483000
sysAlloc return 0x30483000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x30493000
sysAlloc return 0x30493000
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x304a3000
sysAlloc return 0x304a3000
sysFree (0x30493000, 65536, 0x275f80)
sysFree (0x304a3000, 65536, 0x275f80)
sysFree (0x30483000, 65536, 0x275f80)
sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return nil
sysAlloc return 0x0
fatal error: trace: out of memory

goroutine 7 [running]:
runtime.throw(0x19ac08, 0x14)
    /usr/go/src/runtime/panic.go:508 +0x71 fp=0x1039fde0 sp=0x1039fdd4
runtime.traceFlush(0x0, 0x0)
    /usr/go/src/runtime/trace.go:511 +0x1f0 fp=0x1039fe0c sp=0x1039fde0
runtime.traceEvent(0x10d, 0x1039fec4, 0x2, 0x2)
    /usr/go/src/runtime/trace.go:411 +0x68a fp=0x1039feb4 sp=0x1039fe0c
runtime.traceGoCreate(0x103820a0, 0x21940)
    /usr/go/src/runtime/trace.go:734 +0x99 fp=0x1039fee4 sp=0x1039feb4
runtime.StartTrace(0x0, 0x0)
    /usr/go/src/runtime/trace.go:151 +0x146 fp=0x1039ff40 sp=0x1039fee4
runtime/pprof.StartTrace(0x30394bc0, 0x103e4420, 0x0, 0x0)
    /usr/go/src/runtime/pprof/pprof.go:624 +0x30 fp=0x1039ff5c sp=0x1039ff40
runtime/pprof_test.TestTraceDoubleStart(0x103e43c0)
    /usr/go/src/runtime/pprof/trace_test.go:51 +0x6c fp=0x1039ffb4 sp=0x1039ff5c
testing.tRunner(0x103e43c0, 0x259320)
    /usr/go/src/testing/testing.go:448 +0xb4 fp=0x1039ffe8 sp=0x1039ffb4
runtime.goexit()
    /usr/go/src/runtime/asm_386.s:2408 +0x1 fp=0x1039ffec sp=0x1039ffe8
created by testing.RunTests
    /usr/go/src/testing/testing.go:556 +0x833

goroutine 1 [chan receive]:
testing.RunTests(0x1d1018, 0x2592c0, 0xc, 0xc, 0x1)
    /usr/go/src/testing/testing.go:557 +0x86b
testing.(*M).Run(0x103c4060, 0x26f060)
    /usr/go/src/testing/testing.go:486 +0x5d
main.main()
    runtime/pprof/_test/_testmain.go:76 +0x176
exit status: 'pprof.test 24086: 2'
FAIL    runtime/pprof   0.284s

@dvyukov
Copy link
Member

dvyukov commented Jan 30, 2015

There is at least one buf in mem_plan9.go that can lead to memory corruption (the same memory returned by sysAlloc twice).

The following line in sysFree:
n += (n + memRound) &^ memRound
doubles n value: n += n

But even if you replace it with
n = (n + memRound) &^ memRound
It is still incorrect, because sysAlloc does not round n to page size, but sysFree does. So freeing won't happen for blocks that are not multiple of _PAGESIZE.

I would do the following. It has much simpler invariants: both bloc and n are always page-rounded:

--- a/src/runtime/mem_plan9.go
+++ b/src/runtime/mem_plan9.go
@@ -9,21 +9,24 @@ import "unsafe"
 var bloc uintptr
 var memlock mutex

-const memRound = _PAGESIZE - 1
+func memRound(p uintptr) uintptr {
+       return (p + _PAGESIZE - 1) &^ (_PAGESIZE - 1)
+}

 func initBloc() {
-       bloc = uintptr(unsafe.Pointer(&end))
+       bloc = memRound(uintptr(unsafe.Pointer(&end)))
 }

 func sbrk(n uintptr) unsafe.Pointer {
        lock(&memlock)
        // Plan 9 sbrk from /sys/src/libc/9sys/sbrk.c
-       bl := (bloc + memRound) &^ memRound
+       bl := bloc
+       n = memRound(n)
        if brk_(unsafe.Pointer(bl+n)) < 0 {
                unlock(&memlock)
                return nil
        }
-       bloc = bl + n
+       bloc += n
        unlock(&memlock)
        return unsafe.Pointer(bl)
 }
@@ -42,7 +45,7 @@ func sysFree(v unsafe.Pointer, n uintptr, stat *uint64) {
        // from tiny/mem.c
        // Push pointer back if this is a free
        // of the most recent sysAlloc.
-       n += (n + memRound) &^ memRound
+       n = memRound(n)
        if bloc == uintptr(v)+n {
                bloc -= n
        }

I see that in your scenario the corruption thing actually happens:

sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x30493000
sysAlloc return 0x30493000
10000

sysAlloc (65536, 0x275f80)
sbrk(65536)
sbrk return 0x304a3000
sysAlloc return 0x304a3000

sysFree (0x30493000, 65536, 0x275f80)
sysFree (0x304a3000, 65536, 0x275f80)
sysFree (0x30483000, 65536, 0x275f80)

The first sysFree(0x30493000) will move bloc to 0x30493000 even if it is not the last allocated block.

However, this does not seem to explain why brk fails.
Are you sure that brk can be used for shrinking at all? Can you do brk(0x20000000); brk(0x10000000); brk(0x15000000) at startup?

@0intro
Copy link
Member Author

0intro commented Jan 30, 2015

Thanks. Your fix seems to fix the corruption issue
and brk doesn't fail anymore.

After your fix, all the tests passes successfully.

=== RUN TestTraceStartStop
--- PASS: TestTraceStartStop (0.10s)
=== RUN TestTraceDoubleStart
--- PASS: TestTraceDoubleStart (0.00s)
=== RUN TestTrace
--- PASS: TestTrace (0.00s)
=== RUN TestTraceStress
--- PASS: TestTraceStress (4.36s)
=== RUN TestTraceSymbolize
--- PASS: TestTraceSymbolize (1.47s)

I think brk can be used for shrinking as well.

@golang golang locked and limited conversation to collaborators Jun 25, 2016
@rsc rsc unassigned 0intro Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants