Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: bad pcln associations #21098

Closed
aarzilli opened this issue Jul 20, 2017 · 12 comments
Closed

cmd/compile: bad pcln associations #21098

aarzilli opened this issue Jul 20, 2017 · 12 comments
Labels
Debugging FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@aarzilli
Copy link
Contributor

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.9beta2 linux/amd64

What did you do?

Compile this function with -gcflags '-N -l' (context http://github.com/aarzilli/diexplorer)

func openPE(path string) (*dwarf.Data, uint64, []byte) {
	file, _ := pe.Open(path)
	if file == nil {
		return nil, 0, nil
	}
	fmt.Fprintf(os.Stderr, "Found PE executable\n")
	dwarf, err := file.DWARF()
	must(err)

	var imageBase uint64
	switch oh := file.OptionalHeader.(type) {
	case *pe.OptionalHeader32:
		imageBase = uint64(oh.ImageBase)
	case *pe.OptionalHeader64:
		imageBase = oh.ImageBase
	default:
		panic(fmt.Errorf("pe file format not recognized"))
	}
	sect := file.Section(".text")
	if sect == nil {
		panic(fmt.Errorf("text section not found"))
	}
	textStart := imageBase + uint64(sect.VirtualAddress)
	textData, err := sect.Data()
	must(err)
	return dwarf, textStart, textData
}

disassembles to https://play.golang.org/p/Wn2J8N43ws. The problem is the LEA at 0x715c55 it gets assigned to main.go:64 (the return line) when it should be assigned to main.go:59 (the panic line after if sect == nil) you can see that the program jumps there from line 58 and all instructions after it are for line 59. I believe the problem is that the LEA doesn't have a position associated and it gets an automatic one assigned but since it's the first instruction emitted for its block the position is wrong.

The function main.disassemble (from the same repository, I'm not copying it here because it's too long) seems to have the opposite problem (disassembly of that function here: https://play.golang.org/p/XELhHHYCkW) the LEA at 0x714334 gets assigned to disass.go:230 but it should be assigned to disass.go:161 like all the instructions surrounding it.

This two problems aren't common enough that I can find a minimal example to reproduce them but they are common enough that I encounter them in the wild while using delve.

I made a program (http://github.com/aarzilli/badnext) to find such problems with pcln automatically, it works by comparing the CFG derived from parsing the source code with the CFG derived from the disassembly and reporting discrepancies. Usage badnext [-v] check <regex> <executable> checks all functions matching in (doesn't work on anonymous functions).

Sometimes the discrepancies it finds are justifiable but it does find legitimate problems.

The final problem I encounter with the pcln table is that with any code that looks like this:

if condition {
    body
}

the if's body is compiled into something that ends with a JMP, and that JMP is assigned the same position as the if's header, it would be better if the JMP had the same position as the last instruction of the body or the same position as the closing brace. This isn't as big a problem as the problem with LEAs above because it's regular enough that you get used to it quickly, but it would be nice if it was fixed.

@aarzilli
Copy link
Contributor Author

cc @heschik @hyangah @derekparker

@dr2chase dr2chase self-assigned this Jul 20, 2017
@dr2chase
Copy link
Contributor

I think I see how to fix the JMP, and will look into the LEA next. I assume this is aimed at 1.10; it is extremely late for 1.9 (though rc1 is still cooking).

@ianlancetaylor ianlancetaylor added this to the Go1.10 milestone Jul 20, 2017
@heschi
Copy link
Contributor

heschi commented Jul 20, 2017

I poked into the LEA for kicks. SSA dump attached.
It's correct in 1.8 and broken in 1.9. I suspect http://golang.org/cl/36207: the load is the first value in its block, and it's a rematerialization of a constant pointer. We now use copyIntoNoXPos in those situations, and I imagine that causes the compiler to carry over the textually previous value's position information when we build the line number tables.

Maybe we should use the pos of the value that's consuming the thing we're rematerializing?

@dr2chase
Copy link
Contributor

And for allocValToReg (which does the deed several times) there is a pos passed in that would be corrected.

@dr2chase
Copy link
Contributor

Hmm, not as good as I had hoped. Using the existing line-number-churn test, naive use of passed-in-pos increases churn.

@aarzilli
Copy link
Contributor Author

I think this should get the Debugging tag.

@gopherbot
Copy link

CL https://golang.org/cl/50610 mentions this issue.

@dr2chase
Copy link
Contributor

@aarzilli - if you get a chance, give it a look. There are issues with jumps acquiring the position of their target, and right now I'm also wondering why I didn't also do the same for all the conditional branches as well. I don't think it's done.

@bradfitz bradfitz added Debugging NeedsFix The path to resolution is known, but the work has not been done. labels Jul 21, 2017
@dr2chase
Copy link
Contributor

Was discussing this with @cherrymui over lunch, we wonder if blocks ought to include a "last line number" that would be assigned to them when the block was initially constructed, and (hopefully) propagated intelligently forward. The case where the JMP uses the "last" position seen in a block can pick up a target XPos instead (because of register adjustments pushed into predecessors) and if a target block as two predecessors like that, setting breakpoints might yield bad results, at least if I understand how that is supposed to work.

@dr2chase
Copy link
Contributor

Latest version of the CL might be an actual improvement.

gopherbot pushed a commit that referenced this issue Oct 7, 2017
This attempts to choose better values for values that are
rematerialized (uses the XPos of the consumer, not the
original) and for unconditional branches (uses the last
assigned XPos in the block).

The JMP branches seem to sometimes end up with a PC in the
destination block, I think because of register movement
or rematerialization that gets placed in predecessor blocks.
This may be acceptable because (eyeball-empirically) that is
often the line number of the target block, so the line number
flow is correct.

Added proper test, that checks both -N -l and regular compilation.
The test is also capable (for gdb, delve soon) of tracking
variable printing based on comments in the source code.

There's substantial room for improvement in debugger behavior.

Updates #21098.

Change-Id: I13abd48a39141583b85576a015f561065819afd0
Reviewed-on: https://go-review.googlesource.com/50610
Run-TryBot: David Chase <drchase@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
@heschi
Copy link
Contributor

heschi commented Oct 17, 2017

David, is this fixed or do you want to use it to track further work?

@dr2chase
Copy link
Contributor

We should call it fixed. I need to file additional bugs now using the shiny new debug-next regression tester.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Debugging FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

6 participants