Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: -m flag unexpectedly affects compilation output #22378

Closed
huguesb opened this issue Oct 21, 2017 · 12 comments
Closed

cmd/compile: -m flag unexpectedly affects compilation output #22378

huguesb opened this issue Oct 21, 2017 · 12 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@huguesb
Copy link
Contributor

huguesb commented Oct 21, 2017

I am currently working on some changes to the inliner for #15561, the current state of my work is https://go-review.googlesource.com/c/go/+/72490

While testing my changes, I noticed a very odd behavior, namely that code miscompiled when my inliner changes where triggered, was compiled correctly when passing the -m flag.

The following code is reduced reproducer from a failure observed in generateTrace from cmd/trace/trace.go


import (
	"fmt"
)

type elem struct {
	f uint64
}

func f() int {
	m := make(map[uint64]*elem)
	getOrCreate := func(g uint64) *elem {
		info, ok := m[g]
		if !ok {
			info = &elem{}
			m[g] = info
		}
		return info
	}
	getOrCreate(42)
	return len(m)
}

func main() {
	fmt.Println(f())
}

Diffing a disassembly of this program, compiled with - l -l and -l -l -m respectively, gives the following:

16c16
<   0x1091a6e		4889442420		MOVQ AX, 0x20(SP)
---
>   0x1091a6e		4889442430		MOVQ AX, 0x30(SP)
27c27
<   0x1091a9a		488b442420		MOVQ 0x20(SP), AX
---
>   0x1091a9a		488b442430		MOVQ 0x30(SP), AX
47c47
<   0x1091adf		488b4c2420		MOVQ 0x20(SP), CX
---
>   0x1091adf		488b4c2430		MOVQ 0x30(SP), CX

The only other reference to the 0x20(SP) slot is, in both cases, the return value of the mapaccess call inside the inlined closure body :

0x1091a8c             e86f7ff7ff              CALL runtime.mapaccess2_fast64(SB)
0x1091a91             0fb6442420              MOVZX 0x20(SP), AX

I realize that there is probably something wrong with my inliner changes. However, I'm submitting this bug nonetheless because I find it very surprising that the -m flag somehow fixes this issue, even though it should only affect compiler logs and not the resulting binary. This seems like it might be a latent issue in the compiler that would be worth fixing.

I have manually audited all locations that read Debug['m'] inside the compiler and didn't notice anything suspicious/side-effecty so I'm pretty stumped at this point and could use another pair of eyes.

@dsnet dsnet added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Oct 22, 2017
@dsnet dsnet added this to the Go1.10 milestone Oct 22, 2017
@dsnet
Copy link
Member

dsnet commented Oct 22, 2017

\cc @randall77 @cherrymui

@cherrymui
Copy link
Member

cherrymui commented Oct 23, 2017

I cannot reproduce it, with both Go 1.9 and tip (006bc57), at least on darwin/amd64. Then binaries generated with -l -l and -l -l -m are identical.

$ go1.9 build -gcflags="-l -l" -o x1 x.go
$ go1.9 build -gcflags="-l -l -m" -o x2 x.go
# command-line-arguments
./x.go:13:17: can inline f.func1
./x.go:16:17: &elem literal escapes to heap
./x.go:12:11: f make(map[uint64]*elem) does not escape
./x.go:13:17: f func literal does not escape
./x.go:26:15: f() escapes to heap
./x.go:26:13: main ... argument does not escape
$ diff x1 x2
$ go-tip build -gcflags="-l -l" -o x1 x.go
$ go-tip build -gcflags="-l -l -m" -o x2 x.go
# command-line-arguments
./x.go:13:17: can inline f.func1
./x.go:16:17: &elem literal escapes to heap
./x.go:12:11: f make(map[uint64]*elem) does not escape
./x.go:13:17: f func literal does not escape
./x.go:26:15: f() escapes to heap
./x.go:26:13: main ... argument does not escape
$ diff x1 x2
$ go-tip version
go version devel +006bc57095 Sun Oct 22 15:50:50 2017 +0000 darwin/amd64

@huguesb, which version of Go, and what GOARCH/GOOS are you using?

@huguesb
Copy link
Contributor Author

huguesb commented Oct 23, 2017

@cherrymui I'm on darwin/amd64. This issue is reliably reproducible with CL72490 applied on top of tip (006bc57095):

$ git fetch https://go.googlesource.com/go refs/changes/90/72490/2 && git cherry-pick FETCH_HEAD
$ ./make.bash
$ PATH=~/go/bin go run -gcflags='-l -l -m' closure.go
# command-line-arguments
./closure.go:14:17: can inline f.func1
./closure.go:23:13: inlining call to f.func1
./closure.go:12:11: make(map[uint64]*elem) escapes to heap
./closure.go:23:13: &elem literal escapes to heap
./closure.go:17:17: &elem literal escapes to heap
./closure.go:14:17: f func literal does not escape
./closure.go:29:15: f() escapes to heap
./closure.go:29:13: main ... argument does not escape
1
$ PATH=~/go/bin go run -gcflags='-l -l' closure.go
unexpected fault address 0xb01dfacedebac1e
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x1 addr=0xb01dfacedebac1e pc=0x100898e]

goroutine 1 [running]:
runtime.throw(0x10c2c0c, 0x5)
	/Users/hugues/go/src/runtime/panic.go:616 +0x81 fp=0xc420045de8 sp=0xc420045dc8 pc=0x1026bc1
runtime.sigpanic()
	/Users/hugues/go/src/runtime/signal_unix.go:383 +0x200 fp=0xc420045e38 sp=0xc420045de8 pc=0x1039bd0
runtime.hashGrow(0x10a9480, 0xc420076100)
	/Users/hugues/go/src/runtime/hashmap.go:910 +0xfe fp=0xc420045e98 sp=0xc420045e38 pc=0x100898e
runtime.mapassign_fast64(0x10a9480, 0xc420076100, 0x2a, 0x1159c00)
	/Users/hugues/go/src/runtime/hashmap_fast.go:498 +0x181 fp=0xc420045ef0 sp=0xc420045e98 pc=0x100a4f1
main.f.func1(...)
	/Users/hugues/go/src/closure.go:18
main.f(0x10a1e00)
	/Users/hugues/go/src/closure.go:23 +0xc7 fp=0xc420045f30 sp=0xc420045ef0 pc=0x1091b07
main.main()
	/Users/hugues/go/src/closure.go:29 +0x22 fp=0xc420045f88 sp=0xc420045f30 pc=0x1091b72
runtime.main()
	/Users/hugues/go/src/runtime/proc.go:195 +0x200 fp=0xc420045fe0 sp=0xc420045f88 pc=0x1028200
runtime.goexit()
	/Users/hugues/go/src/runtime/asm_amd64.s:2374 +0x1 fp=0xc420045fe8 sp=0xc420045fe0 pc=0x104f291
exit status 2

@cherrymui
Copy link
Member

cherrymui commented Oct 23, 2017

Ok, it seems there is something racy here, which may or may not be related to your CL.

With your CL applied, I run go build -gcflags='-l -l' x.go in a loop, and see sporadic crashes of the compiler. Since Go 1.9 the compiler, if invoked by cmd/go, by default runs concurrently. But cmd/go disables concurrent compilation if -m is specified, so the race does not happen there.

Without concurrent compilation, it still generates identical binaries:

$ go tool compile -l -l -o x1.o x.go
$ go tool compile -l -l -m -o x2.o x.go
x.go:13:17: can inline f.func1
x.go:21:13: inlining call to f.func1
x.go:12:11: make(map[uint64]*elem) escapes to heap
x.go:21:13: &elem literal escapes to heap
x.go:16:17: &elem literal escapes to heap
x.go:13:17: f func literal does not escape
x.go:26:15: f() escapes to heap
x.go:26:13: main ... argument does not escape
$ diff x1.o x2.o
$

@randall77
Copy link
Contributor

@josharian

@huguesb
Copy link
Contributor Author

huguesb commented Oct 23, 2017

@cherrymui thanks for taking a look, I did not think of concurrent compilation. Is there an easy way of building the compiler with the race detector enabled? The naive GO_GCFLAGS=-race ./make.bash doesn't seem to work

@cherrymui
Copy link
Member

I'm actually not sure. I guess go install -race cmd/compile will build a compiler with race detector enabled, and then you can test it by using it to build other programs.

@huguesb
Copy link
Contributor Author

huguesb commented Oct 25, 2017

I tried building a race-enabled compiler with go install -a -race cmd/compile (and same for cmd/go) but it did not make any observable difference. The issue is still reproducible and I'm not seeing any logs from the race detector when I use the newly built compiler to build other programs

@huguesb
Copy link
Contributor Author

huguesb commented Oct 25, 2017

I finally figured out the problem: there was a subtle bug in my change which resulted in references to captured variables inside the inlined body using the same node as inside the regular closure body, instead of being substituted with local temporaries in every inlined location. I'm guessing the SSA phase either modifies the AST in-place or uses AST nodes as map keys, which leads to the observed issue when the regular closure body and a function with the inlined closure are compiled concurrently. Is this documented somewhere?

@huguesb huguesb closed this as completed Oct 25, 2017
@mdempsky
Copy link
Member

mdempsky commented Oct 25, 2017

I'm guessing the SSA phase either modifies the AST in-place

Yes, order and walk mutate the function's AST.

Is this documented somewhere?

I don't think so. The general rule is the goroutine that's compiling a function has exclusive ownership over that function during concurrent compilation. If you need to do interprocedural stuff, you should do it before concurrent compilation.

Some stuff is safe to inspect externally. For example, the ONAME Nodes representing the function name and parameters are safe for basic use, but these basically become read-only once the early analysis phases are done.

@huguesb
Copy link
Contributor Author

huguesb commented Oct 25, 2017

Yes, order and walk mutate the function's AST.

That's not the issue here though. Order and walk are not executed concurrently at this point, only the SSA phase is.

@cherrymui
Copy link
Member

cherrymui commented Oct 26, 2017

uses AST nodes as map keys

Yes, there are a few in cmd/compile/internal/gc/ssa.go, also plive.go.

@golang golang locked and limited conversation to collaborators Oct 26, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

6 participants