Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: [0:x] slicing doesn't need to protect against past-end-of-object pointers #14849

Closed
randall77 opened this issue Mar 17, 2016 · 6 comments

Comments

@randall77
Copy link
Contributor

For s[i:j:k], SSA does:

    rlen = j-i
    rcap = k-i
    p = s.ptr
    if rcap != 0 {
        p += i * elemsize
    }
    result = (SliceMake p rlen rcap)

The if is to make sure we don't end up manufacturing a pointer to the next object in the heap, and thus falsely retaining it.
If i==0, we don't need that check.

This comes up in encoding/binary.littleEndian.Uint64:

00008 (get.go:6)    MOVQ    "".b+16(FP), CX
00009 (get.go:6)    CMPQ    CX, $8
00010 (get.go:6)    JCS $0, 18
00011 (get.go:6)    CMPQ    CX, $0
00012 (get.go:6)    JEQ $0, 13
00013 (get.go:6)    MOVQ    "".b(FP), CX
00014 (get.go:6)    MOVQ    (CX), CX
00015 (get.go:6)    VARDEF  "".~r1+24(FP)
00016 (get.go:6)    MOVQ    CX, "".~r1+24(FP)
00017 (get.go:6)    RET
00018 (get.go:6)    CALL    runtime.panicslice(SB)
00019 (get.go:6)    UNDEF

That second comparison is useless, and represents the rcap != 0 comparison. Maybe we should not emit this check at all if i==0. Maybe we could detect this comparison is redundant (cx >= 8 implies cx != 0), or maybe we could detect that the branch is dumb because its taken and fallthrough go to the same place (as we've already constant-folded the i*elemsize to 0, there's no phi anymore).

@dr2chase @brtzsnr

@randall77
Copy link
Contributor Author

Note that if we change the slicing in Uint64 to b=b[:8:8], the extra comparison goes away.

@brtzsnr
Copy link
Contributor

brtzsnr commented Mar 17, 2016

This is just a missed optimization by generic.

    If v39 -> b9 b10 (likely)
  b9: <- b8
    v40 = AddPtr <*byte> v30 v29
    Plain -> b10
  b10: <- b8 b9
    v41 = Phi <*byte> v30 v40

v29 is 0 here, but AddPtr v30 v29 is not folded until the lowering pass. This means that v41 is not replaced by v30 which means fuse cannot collapse the branch.

Stupid jumps and empty branches after trim are a sign of missed opportunities for fuse.

I'm going to submit a fix tomorrow, but if anyone wants to take care make sure the proper canonicalization rule is added: (AddPtr x const) -> (AddPtr const x).

@josharian
Copy link
Contributor

I think a better canonicalization would be (AddPtr x const) -> (OffPtr [const] x), that way it can be optimized further against other OffPtrs.

@josharian
Copy link
Contributor

Or perhaps better yet, find out where that AddPtr came from. I think there are no (AddPtr x const) at the beginning, by construction, so someone introduced it; they should just introduce OffPtr instead.

@brtzsnr
Copy link
Contributor

brtzsnr commented Mar 17, 2016

(PtrIndex <t> ptr idx) && config.PtrSize == 8 -> (AddPtr ptr (Mul64 <config.fe.TypeInt()> idx (Const64 <config.fe.TypeInt()> [t.ElemType().Size()])))

The rule is legit.

@gopherbot
Copy link

CL https://golang.org/cl/20833 mentions this issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants