Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: weird double-jump in code generation #15090

Closed
rasky opened this issue Apr 3, 2016 · 15 comments
Closed

cmd/compile: weird double-jump in code generation #15090

rasky opened this issue Apr 3, 2016 · 15 comments

Comments

@rasky
Copy link
Member

rasky commented Apr 3, 2016

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    go version devel +ac8d97b Sat Apr 2 12:38:00 2016 +0000 darwin/amd64
  2. What did you do?
package main

const (
    cRadixWidth = 8

    cRadixNumNodes   = 1 << cRadixWidth
    cRadixStartShift = 32 - cRadixWidth
    cRadixMask       = (1 << cRadixWidth) - 1
)

type radixNode struct {
    children [cRadixNumNodes]interface{}
}

type radixTree struct {
    root radixNode
}

func (t *radixTree) Search(key uint32) interface{} {
    var ok bool
    node := &t.root
    for {
        k := uint8(key >> cRadixStartShift)
        c := node.children[k]
        if c == nil {
            return nil
        }
        if node, ok = c.(*radixNode); !ok {
            return c
        }
        key <<= cRadixWidth
    }
}

Final generated code for (*radixTree).Search, as dumped by go tool objdump:

    radix.go:25 0x4096ca0   4883ec08        SUBQ $0x8, SP
    radix.go:29 0x4096ca4   488b4c2410      MOVQ 0x10(SP), CX
    radix.go:29 0x4096ca9   8401            TESTL AL, 0(CX)
    radix.go:30 0x4096cab   8b542418        MOVL 0x18(SP), DX
    radix.go:31 0x4096caf   89d3            MOVL DX, BX
    radix.go:31 0x4096cb1   c1eb18          SHRL $0x18, BX
    radix.go:32 0x4096cb4   8401            TESTL AL, 0(CX)
    radix.go:32 0x4096cb6   0fb6c3          MOVZX BL, AX
    radix.go:32 0x4096cb9   48c1e004        SHLQ $0x4, AX
    radix.go:32 0x4096cbd   488b5c0108      MOVQ 0x8(CX)(AX*1), BX
    radix.go:32 0x4096cc2   48891c24        MOVQ BX, 0(SP)
    radix.go:32 0x4096cc6   488b0c01        MOVQ 0(CX)(AX*1), CX
    radix.go:33 0x4096cca   4885c9          TESTQ CX, CX
    radix.go:33 0x4096ccd   7432            JE 0x4096d01
    radix.go:39 0x4096ccf   488d05cad11f00      LEAQ 0x1fd1ca(IP), AX
    radix.go:36 0x4096cd6   4839c1          CMPQ AX, CX
    radix.go:36 0x4096cd9   751d            JNE 0x4096cf8
    radix.go:36 0x4096cdb   7508            JNE 0x4096ce5
    radix.go:39 0x4096cdd   c1e208          SHLL $0x8, DX
    radix.go:36 0x4096ce0   4889d9          MOVQ BX, CX
    radix.go:31 0x4096ce3   ebca            JMP 0x4096caf
    radix.go:37 0x4096ce5   48894c2420      MOVQ CX, 0x20(SP)
    radix.go:37 0x4096cea   488b0c24        MOVQ 0(SP), CX
    radix.go:37 0x4096cee   48894c2428      MOVQ CX, 0x28(SP)
    radix.go:37 0x4096cf3   4883c408        ADDQ $0x8, SP
    radix.go:37 0x4096cf7   c3          RET
    radix.go:36 0x4096cf8   48c7c300000000      MOVQ $0x0, BX
    radix.go:36 0x4096cff   ebda            JMP 0x4096cdb
    radix.go:34 0x4096d01   48c744242000000000  MOVQ $0x0, 0x20(SP)
    radix.go:34 0x4096d0a   48c744242800000000  MOVQ $0x0, 0x28(SP)
    radix.go:34 0x4096d13   4883c408        ADDQ $0x8, SP
    radix.go:34 0x4096d17   c3          RET
    radix.go:34 0x4096d18   cc          INT $0x3
    radix.go:34 0x4096d19   cc          INT $0x3
    radix.go:34 0x4096d1a   cc          INT $0x3
    radix.go:34 0x4096d1b   cc          INT $0x3
    radix.go:34 0x4096d1c   cc          INT $0x3
    radix.go:34 0x4096d1d   cc          INT $0x3
    radix.go:34 0x4096d1e   cc          INT $0x3
    radix.go:34 0x4096d1f   cc          INT $0x3

I see the following issues:

  • double jump at 0x4096cd9
  • unused test at 0x4096ca9 (it is a nilcheck)
  • inefficient spill at 0x4096cc2
  • constant at 0x4096ccf could be hoisted before the loop
@rasky
Copy link
Member Author

rasky commented Apr 3, 2016

/cc @randall77

@robpike
Copy link
Contributor

robpike commented Apr 3, 2016

Amateur response:

The JMP at cd9 goes to cf8, which JMPs at cff to cdb. So all the jumps are used, and that's OK. Is this compiler output or final generated code, because I would have expected the linker to do some branch folding if this is the final output.

I believe the TESTLs are nil checks introduced by the compiler as guarantees that indirections through nil pointers cannot be used to access random memory.

The "register spill" is putting a value back on the stack. It may be unnecessary but it's hard to tell from this incomplete fragment.

To me, this all looks pretty normal.

@rasky
Copy link
Member Author

rasky commented Apr 4, 2016

(this is not a bug report, but a missing optimization issue, for future consideration)

The code shown is the final, complete code of Search(), dumped with go tool objdump. It's not partial, it's the full assembly output for that function.

The jumps are all used but they are inefficient compared to the minimal possible output.

The TESTL might be residual of removed nil-checks (as the flags set by those tests are not checked afterwards), but if so, they should removed as well.

The register spill is in fact used later on at 0x4096cea, but there are a dozen of registers available so there's really no reason for spilling to the stack.

@robpike
Copy link
Contributor

robpike commented Apr 4, 2016

I may be wrong but I'm pretty sure the TESTL remains there for the side effect of validating the struct's base address before indirecting through to the field. That is the nil check.

I am surprised that the linker didn't fold the branches.

@brtzsnr
Copy link
Contributor

brtzsnr commented Apr 4, 2016

@brtzsnr

radix.go:29 0x4096ca9 8401 TESTL AL, 0(CX) Is a nil check. It'll fail if CX == 0 (in the code this is t).

radix.go:32 0x4096cc2 48891c24 MOVQ BX, 0(SP) Is another case for suboptimal register allocator. Here are a few more examples: #14504.

radix.go:39 0x4096ccf 488d05cad11f00 LEAQ 0x1fd1ca(IP), AX happens because the compiler doesn't do loop hoisting. I would love to have it for https://go-review.googlesource.com/#/c/21434/, too.

radix.go:36 0x4096cd9 751d JNE 0x4096cf8 second jump happens because we don't merge consecutive ifs on the same variable. Same happens here.

func g(a bool) int {
    x := 1
    if a {
        x = 2
    }
    if a {
        x += 1
    } else {
        x += 2
    }
    return x
}

-Correction-: radix.go:36 0x4096cd9 751d JNE 0x4096cf8 happens because of the following. Will fix later this evening.

    v40 = Copy <bool> v35
    If v40 -> b14 b13

@rasky
Copy link
Member Author

rasky commented Apr 4, 2016

Sorry if I'm dense (I'm often confused by the ASM syntax you use, I'm used to a different syntax). TESTL just sets the flags; who's checking for those flags? Who's handling the result of such nilcheck? I don't see this happening in the above code.

@brtzsnr
Copy link
Contributor

brtzsnr commented Apr 4, 2016

TESTL AL, 0(CX) TESTL tests for flags but in this case it reads from 0(CX). If CX == 0 (t == nil) then 0(CX) == 0 which generates a page fault which is transformed by the runtime in a panic.

@rasky
Copy link
Member Author

rasky commented Apr 4, 2016

Thanks for clarifying, I didn't know nilchecks relied on page faults. Makes sense now.

@brtzsnr
Copy link
Contributor

brtzsnr commented Apr 4, 2016

I stand corrected, again:

    v40 = Copy <bool> v35
    If v40 -> b14 b13

was not the cause of this. It's just the lack for merging consecutive if's on the same variable. This is probably a common case where we would benefit from merging consecutive if's.

@dr2chase
Copy link
Contributor

dr2chase commented Apr 4, 2016

Loop hoisting, we could do. I got as far as writing the detector, measured hits, decided that there weren't enough to get that excited. But I could revive that, if people are interested.

Algorithm was:

  1. detect (reducible) loops
  2. determine articulation blocks (those which must execute if loop executes)
  3. find fixed point of "values not relying on within-loop values"
  4. filter out the ones that are too simple to care about

Only thing remaining is to move those into the predecessor of the loop header (assuming after critical edges eliminated). Note that this increases register pressure, so you might want a heuristic on loop size.

@josharian
Copy link
Contributor

On "merging consecutive ifs", see also #12628.

@bradfitz bradfitz added this to the Unplanned milestone Apr 7, 2016
@randall77
Copy link
Contributor

The unnecessary spill has been fixed (not sure when). I just sent out a CL to remove the redundant zero extension.

The constant hoisting is #15808
The redundant branch is #12628

I'll close this in deference to those bugs.

@gopherbot
Copy link

CL https://golang.org/cl/28191 mentions this issue.

@rasky
Copy link
Member Author

rasky commented Sep 28, 2016

@randall77 the CL has been forgotten. Do you plan to submit it?

@randall77
Copy link
Contributor

Yes, I just need to ping a reviewer. It will get in for 1.8.

gopherbot pushed a commit that referenced this issue Oct 27, 2016
var x uint64
uint8(x >> 56)

We don't need to generate any code for the uint8().

Update #15090

Change-Id: Ie1ca4e32022dccf7f7bc42d531a285521fb67872
Reviewed-on: https://go-review.googlesource.com/28191
Run-TryBot: Keith Randall <khr@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Chase <drchase@google.com>
@golang golang locked and limited conversation to collaborators Sep 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants