Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: optimize slice access with constant upper/lower bound #14266

Closed
OneOfOne opened this issue Feb 8, 2016 · 7 comments
Closed

cmd/compile: optimize slice access with constant upper/lower bound #14266

OneOfOne opened this issue Feb 8, 2016 · 7 comments

Comments

@OneOfOne
Copy link
Contributor

OneOfOne commented Feb 8, 2016

For example (https://golang.org/src/encoding/binary/binary.go#L69):

func Uint64(b []byte) uint64 {
    return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
        uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
}

Gets compiled roughly to:

func Uint64(b []byte) (v uint64) {
    if len(b) > 0 { v |= uint64(b[0])   }  else { panic() }
    if len(b) > 1 { v |= uint64(b[1]) << 8 } else { panic() }
    .... and so on....
    return
}

I was wondering if that can get optimized to something more like:

func Uint64(b []byte) (v uint64) {
    if len(b) < 8 { panic("index error") }
    .....
    return
}

I tried to look at the code in src/cmd/compile/internal/gc/cgen.go but it was way above my head, if there's interest in that idea and no one wants to pick it up, I'll look more into learning how the code gen works and try to implement it.

@randall77 you might find this interesting.

@bradfitz
Copy link
Contributor

bradfitz commented Feb 8, 2016

/cc @dr2chase

@bradfitz bradfitz added this to the Unplanned milestone Feb 8, 2016
@bradfitz
Copy link
Contributor

bradfitz commented Feb 8, 2016

I wonder if this might actually come for free from existing SSA + @dr2chase work.

There are no ordering requirements for the bitwise OR operations, and there are no side effects and there's only one return path.

@randall77
Copy link
Contributor

David (drchase) can take a look at the bounds checks.
It is a separate issue to combine the loads. Once the bounds checks are gone (or even before?) we can use rewrites to combine the loads into larger units. Let's leave this bug about bounds checks and I've opened #14267 for combining the loads.

@dr2chase
Copy link
Contributor

dr2chase commented Feb 8, 2016

Grand plan is to do this in SSA. We have bounds-check removal code now that does this somewhat better. The enabling step is to search for the highest bounds check before a side-effect, and to hoist that one first where it dominates all the others.

However, I just tried that experiment by hand, and constant indices aren't working well (yet).

@dgryski
Copy link
Contributor

dgryski commented Jan 18, 2018

Can this be closed? I think the current BCE logic handles this case.

@navytux
Copy link
Contributor

navytux commented Jan 19, 2018

Imho this is not exactly the case, and that's why in endian encoding routines there is explicit prologue check for whole length, e.g.

func (littleEndian) Uint64(b []byte) uint64 {
        _ = b[7] // bounds check hint to compiler; see golang.org/issue/14808
        return uint64(b[0]) | uint64(b[1])<<8 | uint64(b[2])<<16 | uint64(b[3])<<24 |
                uint64(b[4])<<32 | uint64(b[5])<<40 | uint64(b[6])<<48 | uint64(b[7])<<56
}

which is indeed gets compiled to

TEXT ·LEUint64(SB), NOSPLIT, $8-32 // bce.go:9
        // SUBQ    $8, SP
        // MOVQ    BP, (SP) (BP save)
        // LEAQ    (SP), BP (BP init)
        // FUNCDATA $0, gclocals·4032f753396f2012ad1784f398b170f4(SB) (args)
        FUNCDATA   $1, gclocals·69c1753bd5f81501d95132d08af04464(SB) (locals)
        MOVQ       b+24(SP), AX
        CMPQ       AX, $7                 // bce.go:10
        JLS        pc45
        MOVQ       b+16(SP), AX
        MOVQ       (AX), AX               // bce.go:12
        MOVQ       AX, _r1+40(SP)         // bce.go:11
        // MOVQ    (SP), BP (BP restore)  // bce.go:12
        // ADDQ    $8, SP (SP restore)
        RET
pc45:
        PCDATA     $0, $1                 // bce.go:10
        CALL       runtime.panicindex(SB)
        UNDEF

but without that explicit _ = b[7] the code becomes

TEXT ·LEUint64(SB), NOSPLIT, $8-32 // bce.go:9
        // SUBQ    $8, SP
        // MOVQ    BP, (SP) (BP save)
        // LEAQ    (SP), BP (BP init)
        // FUNCDATA $0, gclocals·4032f753396f2012ad1784f398b170f4(SB) (args)
        FUNCDATA   $1, gclocals·69c1753bd5f81501d95132d08af04464(SB) (locals)
        MOVQ       b+24(SP), AX
        TESTQ      AX, AX                 // bce.go:11
        JLS        pc93
        CMPQ       AX, $1
        JLS        pc93
        CMPQ       AX, $2
        JLS        pc93
        CMPQ       AX, $3
        JLS        pc93
        CMPQ       AX, $4                 // bce.go:12
        JLS        pc86
        CMPQ       AX, $5
        JLS        pc86
        CMPQ       AX, $6
        JLS        pc86
        CMPQ       AX, $7
        JLS        pc86
        MOVQ       b+16(SP), AX
        MOVQ       (AX), AX
        MOVQ       AX, _r1+40(SP)         // bce.go:11
        // MOVQ    (SP), BP (BP restore)  // bce.go:12
        // ADDQ    $8, SP (SP restore)
        RET
pc86:
        PCDATA     $0, $1
        CALL       runtime.panicindex(SB)
        UNDEF

(go version devel +4dc1c491b0 Thu Jan 18 14:43:29 2018 +0000 linux/amd64)

P.S. somehow for bigendian it works ok without _ = b[7].

@rasky
Copy link
Member

rasky commented Aug 20, 2018

I think this can be closed. The current Go semantic does not allow to optimize bound accesses if they are done in forward order (eg: 0 to 7). This is why an explicit _ = b[7] is required in some cases.

So this is not a missing optimization. If you anybody wants to propose a semantic change, please file a Go 2 proposal.

@rasky rasky closed this as completed Aug 20, 2018
@golang golang locked and limited conversation to collaborators Aug 20, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants