Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: cmd/compile: compiler-emitted itab symbols #14874

Closed
walken-google opened this issue Mar 19, 2016 · 13 comments
Closed

proposal: cmd/compile: compiler-emitted itab symbols #14874

walken-google opened this issue Mar 19, 2016 · 13 comments

Comments

@walken-google
Copy link
Contributor

I am writing this after the fact, as I already have a proposed patch set out for review. Sorry about that.

itab structures are currently always generated by the runtime. This leads to inefficiencies when converting between a known type and a known interface - the generated code could be more efficient if it could know the itab address, but it can't if the itab is generated by the runtime, so we end up generating more complicated code that calls typ2Itab and caches the result.

I propose having the compiler generate itab structures when the go code converts between a known type and a known interface, so that the generated itabs can be taken as constant addresses in the generated code. The runtime will register these structures into its itab hash tables at init time, and is still capable of creating more on the fly if required during conversions from one interface type to another.

sample code:

package p

type I interface {
        F()
}

type T struct{ i int }

func (t T) F() {}

func T2I(t T) I   { return t }
func TP2I(t *T) I { return t }

compiled with master at fc6bcde:

"".T2I t=1 size=128 args=0x18 locals=0x40
    0x0000 00000 (src/t2i.go:11)    TEXT    "".T2I(SB), $64-24
    0x0000 00000 (src/t2i.go:11)    MOVQ    (TLS), CX
    0x0009 00009 (src/t2i.go:11)    CMPQ    SP, 16(CX)
    0x000d 00013 (src/t2i.go:11)    JLS 113
    0x000f 00015 (src/t2i.go:11)    SUBQ    $64, SP
    0x0013 00019 (src/t2i.go:11)    FUNCDATA    $0, gclocals·790e5cc5051fc0affc980ade09e929ec(SB)
    0x0013 00019 (src/t2i.go:11)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0013 00019 (src/t2i.go:11)    MOVQ    "".t+72(FP), AX
    0x0018 00024 (src/t2i.go:11)    MOVQ    AX, "".autotmp_0000+56(SP)
    0x001d 00029 (src/t2i.go:11)    LEAQ    type."".T(SB), AX
    0x0024 00036 (src/t2i.go:11)    MOVQ    AX, (SP)
    0x0028 00040 (src/t2i.go:11)    LEAQ    type."".I(SB), AX
    0x002f 00047 (src/t2i.go:11)    MOVQ    AX, 8(SP)
    0x0034 00052 (src/t2i.go:11)    LEAQ    go.itab."".T."".I(SB), AX
    0x003b 00059 (src/t2i.go:11)    MOVQ    AX, 16(SP)
    0x0040 00064 (src/t2i.go:11)    LEAQ    "".autotmp_0000+56(SP), AX
    0x0045 00069 (src/t2i.go:11)    MOVQ    AX, 24(SP)
    0x004a 00074 (src/t2i.go:11)    MOVQ    $0, 32(SP)
    0x0053 00083 (src/t2i.go:11)    PCDATA  $0, $0
    0x0053 00083 (src/t2i.go:11)    CALL    runtime.convT2I(SB)
    0x0058 00088 (src/t2i.go:11)    MOVQ    48(SP), CX
    0x005d 00093 (src/t2i.go:11)    MOVQ    40(SP), DX
    0x0062 00098 (src/t2i.go:11)    MOVQ    DX, "".~r1+80(FP)
    0x0067 00103 (src/t2i.go:11)    MOVQ    CX, "".~r1+88(FP)
    0x006c 00108 (src/t2i.go:11)    ADDQ    $64, SP
    0x0070 00112 (src/t2i.go:11)    RET
    0x0071 00113 (src/t2i.go:11)    NOP
    0x0071 00113 (src/t2i.go:11)    CALL    runtime.morestack_noctxt(SB)
    0x0076 00118 (src/t2i.go:11)    JMP 0
    rel 5+4 t=14 +0
    rel 32+4 t=13 type."".T+0
    rel 43+4 t=13 type."".I+0
    rel 55+4 t=13 go.itab."".T."".I+0
    rel 84+4 t=6 runtime.convT2I+0
    rel 114+4 t=6 runtime.morestack_noctxt+0
"".TP2I t=1 size=112 args=0x18 locals=0x20
    0x0000 00000 (src/t2i.go:12)    TEXT    "".TP2I(SB), $32-24
    0x0000 00000 (src/t2i.go:12)    MOVQ    (TLS), CX
    0x0009 00009 (src/t2i.go:12)    CMPQ    SP, 16(CX)
    0x000d 00013 (src/t2i.go:12)    JLS 98
    0x000f 00015 (src/t2i.go:12)    SUBQ    $32, SP
    0x0013 00019 (src/t2i.go:12)    FUNCDATA    $0, gclocals·0b86ef39f3fed835f14ba5f4d7c62fa2(SB)
    0x0013 00019 (src/t2i.go:12)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0013 00019 (src/t2i.go:12)    MOVQ    go.itab.*"".T."".I(SB), CX
    0x001a 00026 (src/t2i.go:12)    TESTQ   CX, CX
    0x001d 00029 (src/t2i.go:12)    JEQ $0, 51
    0x001f 00031 (src/t2i.go:12)    MOVQ    CX, "".~r1+48(FP)
    0x0024 00036 (src/t2i.go:12)    MOVQ    "".t+40(FP), CX
    0x0029 00041 (src/t2i.go:12)    MOVQ    CX, "".~r1+56(FP)
    0x002e 00046 (src/t2i.go:12)    ADDQ    $32, SP
    0x0032 00050 (src/t2i.go:12)    RET
    0x0033 00051 (src/t2i.go:12)    LEAQ    type.*"".T(SB), AX
    0x003a 00058 (src/t2i.go:12)    MOVQ    AX, (SP)
    0x003e 00062 (src/t2i.go:12)    LEAQ    type."".I(SB), AX
    0x0045 00069 (src/t2i.go:12)    MOVQ    AX, 8(SP)
    0x004a 00074 (src/t2i.go:12)    LEAQ    go.itab.*"".T."".I(SB), CX
    0x0051 00081 (src/t2i.go:12)    MOVQ    CX, 16(SP)
    0x0056 00086 (src/t2i.go:12)    PCDATA  $0, $0
    0x0056 00086 (src/t2i.go:12)    CALL    runtime.typ2Itab(SB)
    0x005b 00091 (src/t2i.go:12)    MOVQ    24(SP), CX
    0x0060 00096 (src/t2i.go:12)    JMP 31
    0x0062 00098 (src/t2i.go:12)    NOP
    0x0062 00098 (src/t2i.go:12)    CALL    runtime.morestack_noctxt(SB)
    0x0067 00103 (src/t2i.go:12)    JMP 0
    rel 5+4 t=14 +0
    rel 22+4 t=13 go.itab.*"".T."".I+0
    rel 54+4 t=13 type.*"".T+0
    rel 65+4 t=13 type."".I+0
    rel 77+4 t=13 go.itab.*"".T."".I+0
    rel 87+4 t=6 runtime.typ2Itab+0
    rel 99+4 t=6 runtime.morestack_noctxt+0
go.itab."".T."".I t=33 dupok size=8
go.itab.*"".T."".I t=33 dupok size=8

Compiled with above plus the proposed changes:

"".T2I t=1 size=112 args=0x18 locals=0x38
    0x0000 00000 (src/t2i.go:11)    TEXT    "".T2I(SB), $56-24
    0x0000 00000 (src/t2i.go:11)    MOVQ    (TLS), CX
    0x0009 00009 (src/t2i.go:11)    CMPQ    SP, 16(CX)
    0x000d 00013 (src/t2i.go:11)    JLS 101
    0x000f 00015 (src/t2i.go:11)    SUBQ    $56, SP
    0x0013 00019 (src/t2i.go:11)    FUNCDATA    $0, gclocals·790e5cc5051fc0affc980ade09e929ec(SB)
    0x0013 00019 (src/t2i.go:11)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0013 00019 (src/t2i.go:11)    MOVQ    "".t+64(FP), AX
    0x0018 00024 (src/t2i.go:11)    MOVQ    AX, "".autotmp_0000+48(SP)
    0x001d 00029 (src/t2i.go:11)    LEAQ    type."".T(SB), AX
    0x0024 00036 (src/t2i.go:11)    MOVQ    AX, (SP)
    0x0028 00040 (src/t2i.go:11)    LEAQ    go.itab."".T."".I(SB), AX
    0x002f 00047 (src/t2i.go:11)    MOVQ    AX, 8(SP)
    0x0034 00052 (src/t2i.go:11)    LEAQ    "".autotmp_0000+48(SP), AX
    0x0039 00057 (src/t2i.go:11)    MOVQ    AX, 16(SP)
    0x003e 00062 (src/t2i.go:11)    MOVQ    $0, 24(SP)
    0x0047 00071 (src/t2i.go:11)    PCDATA  $0, $0
    0x0047 00071 (src/t2i.go:11)    CALL    runtime.convT2I(SB)
    0x004c 00076 (src/t2i.go:11)    MOVQ    40(SP), CX
    0x0051 00081 (src/t2i.go:11)    MOVQ    32(SP), DX
    0x0056 00086 (src/t2i.go:11)    MOVQ    DX, "".~r1+72(FP)
    0x005b 00091 (src/t2i.go:11)    MOVQ    CX, "".~r1+80(FP)
    0x0060 00096 (src/t2i.go:11)    ADDQ    $56, SP
    0x0064 00100 (src/t2i.go:11)    RET
    0x0065 00101 (src/t2i.go:11)    NOP
    0x0065 00101 (src/t2i.go:11)    CALL    runtime.morestack_noctxt(SB)
    0x006a 00106 (src/t2i.go:11)    JMP 0
    rel 5+4 t=14 +0
    rel 32+4 t=13 type."".T+0
    rel 43+4 t=13 go.itab."".T."".I+0
    rel 72+4 t=6 runtime.convT2I+0
    rel 102+4 t=6 runtime.morestack_noctxt+0
"".TP2I t=1 size=32 args=0x18 locals=0x0
    0x0000 00000 (src/t2i.go:12)    TEXT    "".TP2I(SB), $0-24
    0x0000 00000 (src/t2i.go:12)    NOP
    0x0000 00000 (src/t2i.go:12)    NOP
    0x0000 00000 (src/t2i.go:12)    FUNCDATA    $0, gclocals·0b86ef39f3fed835f14ba5f4d7c62fa2(SB)
    0x0000 00000 (src/t2i.go:12)    FUNCDATA    $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
    0x0000 00000 (src/t2i.go:12)    LEAQ    go.itab.*"".T."".I(SB), AX
    0x0007 00007 (src/t2i.go:12)    MOVQ    AX, "".~r1+16(FP)
    0x000c 00012 (src/t2i.go:12)    MOVQ    "".t+8(FP), AX
    0x0011 00017 (src/t2i.go:12)    MOVQ    AX, "".~r1+24(FP)
    0x0016 00022 (src/t2i.go:12)    RET
    rel 3+4 t=13 go.itab.*"".T."".I+0
go.itab."".T."".I t=34 dupok size=40
    0x0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    rel 0+8 t=1 type."".I+0
    rel 8+8 t=1 type."".T+0
go.itablink."".T."".I t=9 dupok size=8
    0x0000 00 00 00 00 00 00 00 00                          ........
    rel 0+8 t=1 go.itab."".T."".I+0
go.itab.*"".T."".I t=34 dupok size=40
    0x0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    rel 0+8 t=1 type."".I+0
    rel 8+8 t=1 type.*"".T+0
go.itablink.*"".T."".I t=9 dupok size=8
    0x0000 00 00 00 00 00 00 00 00                          ........
    rel 0+8 t=1 go.itab.*"".T."".I+0

The generated code for T2I is now slightly simpler because it has one less constant argument to convT2I to push on the stack - this saves 12 bytes of code for each place that converts T to I, at the cost of having the compiler generate an itab (once) which takes 32 bytes + 8 per interface method.

The generated code for TP2I is now much simpler, just moving the (known) itab pointer and the T pointer into the returned iface. In this simple case, this saves so much code that it pays for the itab size on the first use. In a more complex function, I suspect the benefits will be less but there should be an overall size reduction either way.

I am also seeing a ~1.2% size reduction for the bin/{gen,go,gofmt} binaries and a ~1.0% size reduction for the pkg/tool/linux_amd64/* binaries.

I do not have meaningful benchmarking numbers to report at this point (my results are way too noisy for such a small expected change).

@gopherbot
Copy link

CL https://golang.org/cl/20902 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/20888 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/20901 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/20889 mentions this issue.

@gopherbot
Copy link

CL https://golang.org/cl/20900 mentions this issue.

@walken-google walken-google changed the title compiler-emitted itab symbols Proposal: compiler-emitted itab symbols Mar 19, 2016
@ianlancetaylor ianlancetaylor changed the title Proposal: compiler-emitted itab symbols proposal: cmd/compile: compiler-emitted itab symbols Mar 20, 2016
@ianlancetaylor ianlancetaylor added this to the Go1.7 milestone Mar 20, 2016
@ianlancetaylor
Copy link
Contributor

CC @randall77

@ianlancetaylor
Copy link
Contributor

This is written as a proposal, but if you are seeing better code and a size reduction, it seems uncontroversial. This should go in.

@mdempsky
Copy link
Member

I want to check that I understand how this works in the presence of shared libraries.

Suppose libraries libT.so, libI.so, libX.so, and libY.so, where the only interdependencies between them are that libX.so and libY.so both depend on both of libT.so and libI.so.

Suppose libT.so declares type t.T, libI.so declares interface i.I (that t.T happens to implement), and libX.so and libY.so both include conversions from t.T to i.I.

Since libX.so and libY.so don't know about each other, they both need to provide definitions for go.itab.t.T,i.I.

How is this duplicity handled? Will one definition take priority and always be canonically used, or do we need to ensure the runtime / compiler-emitted code are safe in the presence of redundant itabs?

I notice, for example, that ifaceeq currently requires identical itab pointer values. So if we need to worry about redundant itabs, we need to at least relax that function to check for itab._type equality instead (maybe only in the shared library case).

@walken-google
Copy link
Contributor Author

I'm not entirely sure about shared libraries.

In the single-executable case, what happens is that packages X and Y will both emit a go.itab.t.T,i.I symbol which is marked as dupok. The linker then notices the two same-named symbols, throws one, keeps the other, and we end up with a unique itab in the final binary (I actually tested this).

In the case of shared libraries, I'm not sure how dupok symbols are handled, but I would assume they have to be handled already anyway - before my changes, libX.so and libY.so already both define go.type.t.T and they already need to end up with a single symbol being used. I don't quite remember ELF semantics but I think what happens is that the linker loads shared libraries in a certain order and if one redefines a previously linked global symbol, the new definition is thrown out and the symbol is made to point to the previous definition instead (essentially giving dupok semantics to all ELF global symbols).

@mdempsky
Copy link
Member

Yeah, the static executable case is easy and seems safe to me. I think it should be okay with at most simple fixes for the shared library case too, but I'd just like to confirm that first. (cc @mwhudson)

@mwhudson
Copy link
Contributor

Yes, I think it should be OK, as @walken-google says, the first symbol of a given names "wins" (at least for global symbols with default visibility), so that should be OK. It might be another thing for whichever brave soul implements plugins to think about, but I think shared libraries should be fine.

@walken-google
Copy link
Contributor Author

keith randall suggested that since convT2I now has an itab argument, we don't need the src type argument anymore (that's already in the itab). Removing that argument saves another 0.1% space in the binaries I tried - the savings are now ~1.3% for bin/{gen,go,gofmt} binaries and ~1.1% for the pkg/tool/linux_amd64/* binaries.

@josharian
Copy link
Contributor

FWIW, we know at compile time whether the final argument to convT2I will be nil, so we could in theory have two convT2I flavors, the most commonly used of which has no final arg.

gopherbot pushed a commit that referenced this issue Mar 29, 2016
See #14874

This change makes the runtime register all compiler generated itabs
(as obtained from the moduledata) during init.

Change-Id: I9969a0985b99b8bda820a631f7fe4c78f1174cdf
Reviewed-on: https://go-review.googlesource.com/20900
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
gopherbot pushed a commit that referenced this issue Mar 29, 2016
See #14874

This change tells the compiler to emit itab and itablink symbols in
situations where they could be useful; however the compiled code does
not actually make use of the new symbols yet.

Change-Id: I0db3e6ec0cb1f3b7cebd4c60229e4a48372fe586
Reviewed-on: https://go-review.googlesource.com/20888
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
gopherbot pushed a commit that referenced this issue Mar 29, 2016
See #14874

This change tells the linker to collect all the itablink symbols and
collect them so that moduledata can have a slice of all compiler
generated itabs.

The logic is shamelessly adapted from what is done with typelink symbols.

Change-Id: Ie93b59acf0fcba908a876d506afbf796f222dbac
Reviewed-on: https://go-review.googlesource.com/20889
Reviewed-by: Keith Randall <khr@golang.org>
gopherbot pushed a commit that referenced this issue Mar 29, 2016
…aped

See #14874

This change adds a compiler optimization for pointer shaped convT2I.
Since itab symbols are now emitted by the compiler, the itab address can
be directly moved into the iface structure.

Change-Id: I311483af544519ca682c5f872960717ead772f26
Reviewed-on: https://go-review.googlesource.com/20901
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
gopherbot pushed a commit that referenced this issue Mar 29, 2016
See #14874
Updates #6853

This change adds a compiler optimization for non pointer shaped convT2I.
Since itab symbols are now emitted by the compiler, the itab address can
be passed directly to convT2I instead of passing the iface type and a
cache pointer argument.

Compilebench results for the 5-commits series ending here:

name       old time/op     new time/op     delta
Template       336ms ± 4%      344ms ± 4%   +2.61%          (p=0.027 n=9+8)
Unicode        165ms ± 6%      173ms ± 7%   +5.11%          (p=0.014 n=9+9)
GoTypes        1.09s ± 1%      1.06s ± 2%   -3.29%          (p=0.000 n=9+9)
Compiler       5.09s ±10%      4.75s ±10%   -6.64%        (p=0.011 n=10+10)
MakeBash       31.1s ± 5%      30.3s ± 3%     ~           (p=0.089 n=10+10)

name       old text-bytes  new text-bytes  delta
HelloSize       558k ± 0%       558k ± 0%   +0.02%        (p=0.000 n=10+10)
CmdGoSize      6.24M ± 0%      6.11M ± 0%   -2.11%        (p=0.000 n=10+10)

name       old data-bytes  new data-bytes  delta
HelloSize      3.66k ± 0%      3.74k ± 0%   +2.41%        (p=0.000 n=10+10)
CmdGoSize       134k ± 0%       162k ± 0%  +20.76%        (p=0.000 n=10+10)

name       old bss-bytes   new bss-bytes   delta
HelloSize       126k ± 0%       126k ± 0%     ~     (all samples are equal)
CmdGoSize       149k ± 0%       146k ± 0%   -2.17%        (p=0.000 n=10+10)

name       old exe-bytes   new exe-bytes   delta
HelloSize       924k ± 0%       924k ± 0%   +0.05%        (p=0.000 n=10+10)
CmdGoSize      9.77M ± 0%      9.62M ± 0%   -1.47%        (p=0.000 n=10+10)

Change-Id: Ib230ddc04988824035c32287ae544a965fedd344
Reviewed-on: https://go-review.googlesource.com/20902
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
@walken-google walken-google self-assigned this Mar 29, 2016
@golang golang locked and limited conversation to collaborators Mar 29, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants