Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: inefficient increment of global array #10432

Open
dvyukov opened this issue Apr 13, 2015 · 7 comments
Open

cmd/compile: inefficient increment of global array #10432

dvyukov opened this issue Apr 13, 2015 · 7 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.
Milestone

Comments

@dvyukov
Copy link
Member

dvyukov commented Apr 13, 2015

go version devel +a02d925 Sun Apr 12 12:14:36 2015 +0300 linux/amd64

Program is:

var tab *[256]byte

func foo(i int) {
    tab[i&255]++
}

Currently it is compiled to:

0000000000400c30 <main.foo>:
  400c30:       64 48 8b 0c 25 f8 ff    mov    %fs:0xfffffffffffffff8,%rcx
  400c37:       ff ff 
  400c39:       48 3b 61 10             cmp    0x10(%rcx),%rsp
  400c3d:       77 07                   ja     400c46 <main.foo+0x16>
  400c3f:       e8 cc 24 04 00          callq  443110 <runtime.morestack_noctxt>
  400c44:       eb ea                   jmp    400c30 <main.foo>
  400c46:       48 8b 44 24 08          mov    0x8(%rsp),%rax
  400c4b:       48 8b 1d 36 a0 0a 00    mov    0xaa036(%rip),%rbx        # 4aac88 <main.tab>
  400c52:       48 25 ff 00 00 00       and    $0xff,%rax
  400c58:       48 83 fb 00             cmp    $0x0,%rbx
  400c5c:       74 41                   je     400c9f <main.foo+0x6f>
  400c5e:       48 3d 00 01 00 00       cmp    $0x100,%rax
  400c64:       73 32                   jae    400c98 <main.foo+0x68>
  400c66:       48 8d 1c 03             lea    (%rbx,%rax,1),%rbx
  400c6a:       0f b6 2b                movzbl (%rbx),%ebp
  400c6d:       48 8b 1d 14 a0 0a 00    mov    0xaa014(%rip),%rbx        # 4aac88 <main.tab>
  400c74:       48 83 fb 00             cmp    $0x0,%rbx
  400c78:       74 1a                   je     400c94 <main.foo+0x64>
  400c7a:       48 3d 00 01 00 00       cmp    $0x100,%rax
  400c80:       73 0b                   jae    400c8d <main.foo+0x5d>
  400c82:       48 8d 1c 03             lea    (%rbx,%rax,1),%rbx
  400c86:       48 ff c5                inc    %rbp
  400c89:       40 88 2b                mov    %bpl,(%rbx)
  400c8c:       c3                      retq   
  400c8d:       e8 be a3 01 00          callq  41b050 <runtime.panicindex>
  400c92:       0f 0b                   ud2    
  400c94:       89 03                   mov    %eax,(%rbx)
  400c96:       eb e2                   jmp    400c7a <main.foo+0x4a>
  400c98:       e8 b3 a3 01 00          callq  41b050 <runtime.panicindex>
  400c9d:       0f 0b                   ud2    
  400c9f:       89 03                   mov    %eax,(%rbx)
  400ca1:       eb bb                   jmp    400c5e <main.foo+0x2e>

While it could be compiled to something along the lines of:

0000000000400c30 <main.foo>:
    mov 0x8(%rsp),%rax
    mov 0xaa036(%rip),%rbx        # 4aac88 <main.tab>
    add $1, (%rbx, %rax)
    ret

Here are several issues:

  1. Address calculation, bounds check and nil check are done twice.
  2. Nil check is not necessary here, as the increment itself will trigger paging fault if tab is nil.
  3. Bounds check is not necessary as i&255 is guaranteed to be <256.
@dvyukov
Copy link
Member Author

dvyukov commented Apr 13, 2015

FTR, the full program is:

package main

func main() {
    foo(42)
}

var tab *[256]byte

func foo(i int) {
    tab[i&255]++
}

build with -l.

@dvyukov
Copy link
Member Author

dvyukov commented Apr 13, 2015

@ianlancetaylor ianlancetaylor changed the title cmd/gc: inefficient increment of global array cmd/compile: inefficient increment of global array Jun 3, 2015
@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Jun 3, 2015
@rsc rsc modified the milestones: Unplanned, Go1.6 Nov 4, 2015
@navytux
Copy link
Contributor

navytux commented Mar 25, 2017

For the reference, today foo compiles to this:

TEXT ·foo(SB), $0-8 // dv.go:9
        NO_LOCAL_POINTERS
        // FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) (args)
        // FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (no locals)
        MOVQ       ·tab(SB), AX  // dv.go:10
        TESTB      AL, (AX)
        MOVQ       i+0(FP), CX
        MOVBLZX    CL, CX
        MOVBLZX    (AX)(CX*1), DX
        INCL       DX
        MOVB       DL, (AX)(CX*1)
        RET                       // dv.go:11

TESTB AL, (AX) seems to be not needed, and probably also MOVQ i+0(FP), CX and MOVBLZX CL, CX could be combined into one MOVBLZX from stack (not sure here), but otherwise foo looks to be much closer to optimal than it was in 2015.

@josharian
Copy link
Contributor

I'm surprised that the nil check isn't eliminated. It should be. @cherrymui?

Combining the stack load and the zero extension is #15300.

@randall77
Copy link
Contributor

I don't think we do any nil check removal on non-constant array indexing. We could.

@navytux
Copy link
Contributor

navytux commented May 4, 2018

For the reference: with today's tip (go version devel +8aa316c139 Fri May 4 17:01:04 2018 +0000 linux/amd64) original program compiles to essentially the same as in #10432 (comment):

TEXT ·foo(SB), NOSPLIT, $0-8 // dv.go:9
        NO_LOCAL_POINTERS
        // FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (args)
        // FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (no locals)
        MOVQ       ·tab(SB), AX  // dv.go:10
        TESTB      AL, (AX)
        MOVQ       i+8(SP), CX    // dv.go:9
        MOVBLZX    CL, CX         // dv.go:10
        MOVBLZX    (AX)(CX*1), DX
        INCL       DX
        MOVB       DL, (AX)(CX*1)
        RET                       // dv.go:11

There is ongoing CL 91415 which teaches nilcheckelim to understand that offsets from non-nil pointers are also non-nil. However it currently works with only fixed offsets, not non-constant indexing.

/cc @mrosier-qdt

@navytux
Copy link
Contributor

navytux commented May 4, 2018

/cc @bmakam-qdt

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.
Projects
None yet
Development

No branches or pull requests

7 participants