Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile/internal/gc: esc.go duplicated sink paths for re-assignments #27003

Open
quasilyte opened this issue Aug 15, 2018 · 4 comments
Open
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@quasilyte
Copy link
Contributor

Given this code:

/* 1*/ package example
/* 2*/ 
/* 3*/ var sink interface{}
/* 4*/ 
/* 5*/ func fn() {
/* 6*/ 	var x *int
/* 7*/ 	x = new(int)
/* 8*/ 	sink = x // Sink 1 at the line 8; first new(int) flows here
/* 9*/ 	x = new(int)
/*10*/ 	sink = x // Sink 2 at the line 10; second new(int) flows here
/*11*/ }

(Code annotated with line number for convenience, minimal example outlined after the issue description.)

Execute the command $ go tool compile -m=2 example.go.
The output on tip is (important bits are in bold):

example.go:5:6: can inline fn as: func() { var x *int; x = ; x = new(int); sink = x; x = new(int); sink = x }
example.go:8:7: x escapes to heap
example.go:8:7: 	from sink (assigned to top level variable) at example.go:8:7
example.go:7:9: new(int) escapes to heap
example.go:7:9: 	from x (assigned) at example.go:7:4
example.go:7:9: 	from x (interface-converted) at example.go:8:7
example.go:7:9: 	from sink (assigned to top level variable) at example.go:8:7
example.go:9:9: new(int) escapes to heap
example.go:9:9: 	from x (assigned) at example.go:9:4
example.go:9:9: 	from x (interface-converted) at example.go:8:7
example.go:9:9: 	from sink (assigned to top level variable) at example.go:8:7
example.go:10:7: x escapes to heap
example.go:10:7: 	from sink (assigned to top level variable) at example.go:10:7

Expected output would not report second sink to the same location.

This is a consequence of how graph is constructed and printed.

Simplified example:

package example

var sink *int

func fn() {
	var x *int
	x = new(int)
	sink = x
	x = new(int)
	sink = x
}

And the output:

example.go:5:6: can inline fn as: func() { var x *int; x = ; x = new(int); sink = x; x = new(int); sink = x }
example.go:7:9: new(int) escapes to heap
example.go:7:9: 	from x (assigned) at example.go:7:4
example.go:7:9: 	from sink (assigned to top level variable) at example.go:8:7
example.go:9:9: new(int) escapes to heap
example.go:9:9: 	from x (assigned) at example.go:9:4
example.go:9:9: 	from sink (assigned to top level variable) at example.go:8:7

Expected output:

example.go:5:6: can inline fn as: func() { var x *int; x = ; x = new(int); sink = x; x = new(int); sink = x }
example.go:7:9: new(int) escapes to heap
example.go:7:9: 	from x (assigned) at example.go:7:4
example.go:7:9: 	from sink (assigned to top level variable) at example.go:8:7
example.go:9:9: new(int) escapes to heap
example.go:9:9: 	from x (assigned) at example.go:9:4
example.go:9:9: 	from sink (assigned to top level variable) at example.go:10:7

For that example, we have something like this:

  • sink pseudo node has 2 source edges, both comes from x node.
  • x node has 2 source edges, they come from the two different new(int)
sink.Flowsrc == { {dst=sink, src=x, where=line8}, {dst=sink, src=x, where=line10} }
x.Flowsrc == { {dst=x, src=new(int), where=line7 }, {dst=x, src=new(int), where=line9} }

During traversal in escwalkBody, both new(int) paths are printed using first sink destination as a parent, so we get two same paths. For the second destination nothing is printed due to osrcesc variable check that is used to avoid duplicated messages.

If osrcesc is removed, both paths are printed twice (so, 4 messages instead of 2, but 2 of them are correct). Currently, osrcesc leads to 2 messages, 1 of which is incorrect.

It's not enough to just check whether destination node located before the actual starting flow point because of recursive functions:

func f8(x int, y *int) *int {
	if x <= 0 {
		return y
	}
	x--
	return f8(*y, &x)
}

Here the return y is a destination endpoint, and it comes before the tracked &x.

I have no good ideas on how to fix this one.
Hopefully, insights above can help someone to roll the solution to this.

@quasilyte
Copy link
Contributor Author

quasilyte commented Aug 15, 2018

Don't know who to CC. Maybe @josharian?
(Just for issue validation; perhaps this is known and accepted limitation of the esc.go explainer.)

@josharian
Copy link
Contributor

@dr2chase is primary for escape analysis (AFAIK); he wrote the escape analysis explainer. The other person that comes to mind is @cherrymui.

@andybons andybons added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 15, 2018
@andybons andybons added this to the Unplanned milestone Aug 15, 2018
@cherrymui
Copy link
Member

It seems to me a deeper issue is that the analysis of data flow is on the node level, where all occurrences of x are represented with the same node. For example, if I remove the second sink=x assignment, i.e.

package example

var sink *int

func fn() {
	var x *int
	x = new(int)
	sink = x
	x = new(int)
	_ = x
}

The second new(int) does not necessarily escape. With the current analysis, it does escape, because the new(int) flows to x, and x, at some point, flows to heap. In the original program, it doesn't really matter when/where x flows to heap; it just picks one.

For this, I think that the analysis would need to work on a level that tracks the order of data flow (probably some form of SSA). And more accurate location report would follow naturally.

@quasilyte
Copy link
Contributor Author

The second new(int) does not necessarily escape. With the current analysis, it does escape, because the new(int) flows to x, and x, at some point, flows to heap. In the original program, it doesn't really matter when/where x flows to heap; it just picks one.

I actually tried to make exactly this, to teach escape analysis to recognize that second new(int) is unnecessary, but encountered unexpected debug output that made me wonder.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants