cmd/compile: weird double-jump in code generation #15090

rasky · 2016-04-03T21:50:22Z

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?
go version devel +ac8d97b Sat Apr 2 12:38:00 2016 +0000 darwin/amd64
What did you do?

package main

const (
    cRadixWidth = 8

    cRadixNumNodes   = 1 << cRadixWidth
    cRadixStartShift = 32 - cRadixWidth
    cRadixMask       = (1 << cRadixWidth) - 1
)

type radixNode struct {
    children [cRadixNumNodes]interface{}
}

type radixTree struct {
    root radixNode
}

func (t *radixTree) Search(key uint32) interface{} {
    var ok bool
    node := &t.root
    for {
        k := uint8(key >> cRadixStartShift)
        c := node.children[k]
        if c == nil {
            return nil
        }
        if node, ok = c.(*radixNode); !ok {
            return c
        }
        key <<= cRadixWidth
    }
}

Final generated code for (*radixTree).Search, as dumped by go tool objdump:

    radix.go:25 0x4096ca0   4883ec08        SUBQ $0x8, SP
    radix.go:29 0x4096ca4   488b4c2410      MOVQ 0x10(SP), CX
    radix.go:29 0x4096ca9   8401            TESTL AL, 0(CX)
    radix.go:30 0x4096cab   8b542418        MOVL 0x18(SP), DX
    radix.go:31 0x4096caf   89d3            MOVL DX, BX
    radix.go:31 0x4096cb1   c1eb18          SHRL $0x18, BX
    radix.go:32 0x4096cb4   8401            TESTL AL, 0(CX)
    radix.go:32 0x4096cb6   0fb6c3          MOVZX BL, AX
    radix.go:32 0x4096cb9   48c1e004        SHLQ $0x4, AX
    radix.go:32 0x4096cbd   488b5c0108      MOVQ 0x8(CX)(AX*1), BX
    radix.go:32 0x4096cc2   48891c24        MOVQ BX, 0(SP)
    radix.go:32 0x4096cc6   488b0c01        MOVQ 0(CX)(AX*1), CX
    radix.go:33 0x4096cca   4885c9          TESTQ CX, CX
    radix.go:33 0x4096ccd   7432            JE 0x4096d01
    radix.go:39 0x4096ccf   488d05cad11f00      LEAQ 0x1fd1ca(IP), AX
    radix.go:36 0x4096cd6   4839c1          CMPQ AX, CX
    radix.go:36 0x4096cd9   751d            JNE 0x4096cf8
    radix.go:36 0x4096cdb   7508            JNE 0x4096ce5
    radix.go:39 0x4096cdd   c1e208          SHLL $0x8, DX
    radix.go:36 0x4096ce0   4889d9          MOVQ BX, CX
    radix.go:31 0x4096ce3   ebca            JMP 0x4096caf
    radix.go:37 0x4096ce5   48894c2420      MOVQ CX, 0x20(SP)
    radix.go:37 0x4096cea   488b0c24        MOVQ 0(SP), CX
    radix.go:37 0x4096cee   48894c2428      MOVQ CX, 0x28(SP)
    radix.go:37 0x4096cf3   4883c408        ADDQ $0x8, SP
    radix.go:37 0x4096cf7   c3          RET
    radix.go:36 0x4096cf8   48c7c300000000      MOVQ $0x0, BX
    radix.go:36 0x4096cff   ebda            JMP 0x4096cdb
    radix.go:34 0x4096d01   48c744242000000000  MOVQ $0x0, 0x20(SP)
    radix.go:34 0x4096d0a   48c744242800000000  MOVQ $0x0, 0x28(SP)
    radix.go:34 0x4096d13   4883c408        ADDQ $0x8, SP
    radix.go:34 0x4096d17   c3          RET
    radix.go:34 0x4096d18   cc          INT $0x3
    radix.go:34 0x4096d19   cc          INT $0x3
    radix.go:34 0x4096d1a   cc          INT $0x3
    radix.go:34 0x4096d1b   cc          INT $0x3
    radix.go:34 0x4096d1c   cc          INT $0x3
    radix.go:34 0x4096d1d   cc          INT $0x3
    radix.go:34 0x4096d1e   cc          INT $0x3
    radix.go:34 0x4096d1f   cc          INT $0x3

I see the following issues:

double jump at 0x4096cd9
~~unused test at 0x4096ca9~~ (it is a nilcheck)
inefficient spill at 0x4096cc2
constant at 0x4096ccf could be hoisted before the loop

The text was updated successfully, but these errors were encountered:

rasky · 2016-04-03T21:51:10Z

/cc @randall77

robpike · 2016-04-03T22:41:50Z

Amateur response:

The JMP at cd9 goes to cf8, which JMPs at cff to cdb. So all the jumps are used, and that's OK. Is this compiler output or final generated code, because I would have expected the linker to do some branch folding if this is the final output.

I believe the TESTLs are nil checks introduced by the compiler as guarantees that indirections through nil pointers cannot be used to access random memory.

The "register spill" is putting a value back on the stack. It may be unnecessary but it's hard to tell from this incomplete fragment.

To me, this all looks pretty normal.

rasky · 2016-04-04T00:35:07Z

(this is not a bug report, but a missing optimization issue, for future consideration)

The code shown is the final, complete code of Search(), dumped with go tool objdump. It's not partial, it's the full assembly output for that function.

The jumps are all used but they are inefficient compared to the minimal possible output.

The TESTL might be residual of removed nil-checks (as the flags set by those tests are not checked afterwards), but if so, they should removed as well.

The register spill is in fact used later on at 0x4096cea, but there are a dozen of registers available so there's really no reason for spilling to the stack.

robpike · 2016-04-04T03:33:46Z

I may be wrong but I'm pretty sure the TESTL remains there for the side effect of validating the struct's base address before indirecting through to the field. That is the nil check.

I am surprised that the linker didn't fold the branches.

brtzsnr · 2016-04-04T09:40:15Z

@brtzsnr

radix.go:29 0x4096ca9 8401 TESTL AL, 0(CX) Is a nil check. It'll fail if CX == 0 (in the code this is t).

radix.go:32 0x4096cc2 48891c24 MOVQ BX, 0(SP) Is another case for suboptimal register allocator. Here are a few more examples: #14504.

radix.go:39 0x4096ccf 488d05cad11f00 LEAQ 0x1fd1ca(IP), AX happens because the compiler doesn't do loop hoisting. I would love to have it for https://go-review.googlesource.com/#/c/21434/, too.

radix.go:36 0x4096cd9 751d JNE 0x4096cf8 second jump happens because we don't merge consecutive ifs on the same variable. Same happens here.

func g(a bool) int {
    x := 1
    if a {
        x = 2
    }
    if a {
        x += 1
    } else {
        x += 2
    }
    return x
}

-Correction-: radix.go:36 0x4096cd9 751d JNE 0x4096cf8 happens because of the following. Will fix later this evening.

    v40 = Copy <bool> v35
    If v40 -> b14 b13

rasky · 2016-04-04T09:45:26Z

Sorry if I'm dense (I'm often confused by the ASM syntax you use, I'm used to a different syntax). TESTL just sets the flags; who's checking for those flags? Who's handling the result of such nilcheck? I don't see this happening in the above code.

brtzsnr · 2016-04-04T09:56:04Z

TESTL AL, 0(CX) TESTL tests for flags but in this case it reads from 0(CX). If CX == 0 (t == nil) then 0(CX) == 0 which generates a page fault which is transformed by the runtime in a panic.

rasky · 2016-04-04T10:01:21Z

Thanks for clarifying, I didn't know nilchecks relied on page faults. Makes sense now.

brtzsnr · 2016-04-04T13:42:08Z

I stand corrected, again:

    v40 = Copy <bool> v35
    If v40 -> b14 b13

was not the cause of this. It's just the lack for merging consecutive if's on the same variable. This is probably a common case where we would benefit from merging consecutive if's.

dr2chase · 2016-04-04T13:50:18Z

Loop hoisting, we could do. I got as far as writing the detector, measured hits, decided that there weren't enough to get that excited. But I could revive that, if people are interested.

Algorithm was:

detect (reducible) loops
determine articulation blocks (those which must execute if loop executes)
find fixed point of "values not relying on within-loop values"
filter out the ones that are too simple to care about

Only thing remaining is to move those into the predecessor of the loop header (assuming after critical edges eliminated). Note that this increases register pressure, so you might want a heuristic on loop size.

josharian · 2016-04-04T15:17:59Z

On "merging consecutive ifs", see also #12628.

randall77 · 2016-08-30T22:39:00Z

The unnecessary spill has been fixed (not sure when). I just sent out a CL to remove the redundant zero extension.

The constant hoisting is #15808
The redundant branch is #12628

I'll close this in deference to those bugs.

gopherbot · 2016-08-30T23:00:23Z

CL https://golang.org/cl/28191 mentions this issue.

rasky · 2016-09-28T21:04:35Z

@randall77 the CL has been forgotten. Do you plan to submit it?

randall77 · 2016-09-28T21:14:54Z

Yes, I just need to ping a reviewer. It will get in for 1.8.

var x uint64 uint8(x >> 56) We don't need to generate any code for the uint8(). Update #15090 Change-Id: Ie1ca4e32022dccf7f7bc42d531a285521fb67872 Reviewed-on: https://go-review.googlesource.com/28191 Run-TryBot: Keith Randall <khr@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: David Chase <drchase@google.com>

bradfitz added the Performance label Apr 7, 2016

bradfitz added this to the Unplanned milestone Apr 7, 2016

randall77 closed this as completed Aug 30, 2016

golang locked and limited conversation to collaborators Sep 28, 2017

gopherbot added the FrozenDueToAge label Sep 28, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cmd/compile: weird double-jump in code generation #15090

cmd/compile: weird double-jump in code generation #15090

rasky commented Apr 3, 2016

rasky commented Apr 3, 2016

robpike commented Apr 3, 2016

rasky commented Apr 4, 2016

robpike commented Apr 4, 2016

brtzsnr commented Apr 4, 2016

rasky commented Apr 4, 2016

brtzsnr commented Apr 4, 2016

rasky commented Apr 4, 2016

brtzsnr commented Apr 4, 2016

dr2chase commented Apr 4, 2016

josharian commented Apr 4, 2016

randall77 commented Aug 30, 2016

gopherbot commented Aug 30, 2016

rasky commented Sep 28, 2016

randall77 commented Sep 28, 2016

cmd/compile: weird double-jump in code generation #15090

cmd/compile: weird double-jump in code generation #15090

Comments

rasky commented Apr 3, 2016

rasky commented Apr 3, 2016

robpike commented Apr 3, 2016

rasky commented Apr 4, 2016

robpike commented Apr 4, 2016

brtzsnr commented Apr 4, 2016

rasky commented Apr 4, 2016

brtzsnr commented Apr 4, 2016

rasky commented Apr 4, 2016

brtzsnr commented Apr 4, 2016

dr2chase commented Apr 4, 2016

josharian commented Apr 4, 2016

randall77 commented Aug 30, 2016

gopherbot commented Aug 30, 2016

rasky commented Sep 28, 2016

randall77 commented Sep 28, 2016