cmd/link: dead code elimination for side-effect free functions #14840

Open
mdempsky opened this issue Mar 16, 2016 · 17 comments
Labels: binary-size, compiler/runtime, NeedsInvestigation

@mdempsky
Member

Currently if we compile a Go program like:

package main
type dead int
func newDead() *dead { return new(dead) }
var x = newDead()
func main() {}

the linker can't dead code eliminate x or dead. This is because the "x = newDead()" initialization is compiled to an implicit init function, which causes the linker to pull in x and newDead.

(See https://golang.org/cl/20765 for a real world example.)

In general, just because x is otherwise unused, the linker can't get rid of the newDead() call because it might have side-effects. However, it should be possible for the compiler to help identify functions that are side-effect free, which could in turn let the linker be more aggressive about dead code elimination.

We would probably also need to tweak how cmd/compile generates package initializer functions for the linker to be able to eliminate individual initializers.

mdempsky added this to the Unplanned milestone Mar 16, 2016
@bradfitz
Contributor

This could help with shrinking init-time map literal construction in e.g. the unicode package.

If the unicode.init funcs that generate the unicode package's various maps were flagged as side-effect-free, then the whole init funcs could be dead-code eliminated when nobody refers to those maps.

/cc @crawshaw @josharian

@randall77
Contributor

init functions are a problem because they may reference (and initialize) otherwise dead globals. Anything we can do to initialize globals without inits will help (see https://go-review.googlesource.com/c/17398/ for an example). It doesn't look like this is an easy transformation here, because newDead() could do anything.
If instead it was:

var x = new(dead)

maybe we could rewrite that in the compiler to

var _x dead
var x = &_x

The latter wouldn't need any init code, just relocations, so if x were otherwise unused, both x and _x would be stripped out at link time.
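
As a rough illustration of that rewrite, the two forms can be written side by side in ordinary Go. This is a hand-written sketch, not compiler output; the second pair is renamed to `_y`/`y` only so that both forms fit in one file.

```go
package main

import "fmt"

type dead int

// Original form from the comment above.
var x = new(dead)

// Hand-written version of the proposed rewrite: a statically allocated
// backing variable plus a pointer to it.
var _y dead
var y = &_y

func main() {
	*x, *y = 1, 2
	// Both behave as pointers to a zero-initialized dead value;
	// y additionally aliases the named backing variable _y.
	fmt.Println(*x, *y, y == &_y) // prints "1 2 true"
}
```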

@mdempsky
Member Author

@randall77 I think during escape analysis we could conservatively detect whether a function has side effects?

@randall77
Contributor

I don't think the current escape analysis can help, as it runs after the init functions are built. You'd want to decide the side-effect-freeness of the RHS of global initializers before the init function is built.
Probably a very simple analysis would capture most of the cases we care about.
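
A sketch of what such a "very simple analysis" might look like, written against go/ast rather than the compiler's internal IR. This is an assumption-laden illustration, not the compiler's code: the helper names `sideEffectFree` and `check` are hypothetical, the check is purely syntactic, and it rejects anything it does not recognize.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// sideEffectFree reports whether e can be shown side-effect free by a purely
// syntactic check. It is deliberately conservative: unknown forms and every
// call other than new(...) are rejected.
func sideEffectFree(e ast.Expr) bool {
	switch e := e.(type) {
	case *ast.BasicLit, *ast.Ident:
		return true
	case *ast.ParenExpr:
		return sideEffectFree(e.X)
	case *ast.UnaryExpr:
		// A channel receive (<-ch) can block and changes channel state.
		return e.Op != token.ARROW && sideEffectFree(e.X)
	case *ast.BinaryExpr:
		// Only a small whitelist; division, shifts and comparisons are left
		// out because some of them can panic at run time.
		switch e.Op {
		case token.ADD, token.SUB, token.MUL:
			return sideEffectFree(e.X) && sideEffectFree(e.Y)
		}
		return false
	case *ast.KeyValueExpr:
		return sideEffectFree(e.Key) && sideEffectFree(e.Value)
	case *ast.CompositeLit:
		for _, elt := range e.Elts {
			if !sideEffectFree(elt) {
				return false
			}
		}
		return true
	case *ast.CallExpr:
		// Assumes "new" is the predeclared builtin and has not been shadowed.
		id, ok := e.Fun.(*ast.Ident)
		return ok && id.Name == "new" && len(e.Args) == 1 && sideEffectFree(e.Args[0])
	}
	return false
}

// check parses src as an expression and applies sideEffectFree to it.
func check(src string) bool {
	e, err := parser.ParseExpr(src)
	if err != nil {
		return false
	}
	return sideEffectFree(e)
}

func main() {
	for _, src := range []string{
		`new(dead)`,
		`newDead()`,
		`map[string]int{"foo": 1, "bar": 2}`,
		`<-ch`,
	} {
		fmt.Printf("%-38s %v\n", src, check(src))
	}
}
```

Note that this rejects `newDead()` from the original example: catching that case would require looking inside the callee, which a per-expression check like this cannot do.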

And then you'd have to have an init function per global, have it associated with the global, and only link it in if the global was otherwise reachable.

@josharian
Contributor

The compiler already has samesafeexpr. We could remove the "same" part and steal that.

I wonder whether having an init function per global (with all the associated overhead) would end up outweighing the other good done.

@mdempsky
Member Author

Overhead from too many init functions is a fair concern. I'd imagine a lot of the functions could be compiled as nosplit, which might reduce the overhead.

Also, traditionally ELF compilers generate .init sections with straight line instructions that are just pasted together, skipping function call overhead for simple functions. (The .init code still ends up forming a proper function because the linker will also include C runtime files like crti.o and crtn.o, which supply the appropriate function prologue/epilogue in their .init sections.)

@crawshaw
Member

Anyone writing code for this? If not, I'll spend a couple of days on it.

Assuming each global with side-effect free initialization is split off, I suspect the linker will need to merge the safe init functions together after reachability analysis. But that should be safe to do as long as we distinguish the safe generated init code from general init functions.

@bradfitz
Contributor

All yours.

@mdempsky
Member Author

@crawshaw Go for it.

@crawshaw
Member

I made a bit of progress on this, but after looking closer I don't know how it will help binary size.

The obvious target is the range tables in the unicode package. It would be nice to make it possible for the linker's deadcode pass to catch exported symbols like Nl, Nd, etc. But: all of these tables are included in the unicode.Categories map, which is used by the regexp package.

All the real world programs I looked at use package regexp. Even tiny ones like objdump. So while this will make the "hello world" program from #6853 look good, I don't see how it will help in practice.

If anyone can convince me otherwise I'll dust this off, but for now this looks like an academic exercise to me.

@mdempsky
Member Author

I'm going to look into extending escape analysis to keep track of side-effect free functions for Go 1.17.

ianlancetaylor added the NeedsInvestigation label Apr 19, 2021
ianlancetaylor modified the milestones: Go1.17, Backlog Apr 19, 2021
@ianlancetaylor
Contributor

Not sure if this is happening, but moving milestone to backlog as it is a long-standing issue.

@bradfitz
Contributor

bradfitz commented Aug 5, 2021

As a datapoint, I just tried changing the representation of unicode.RangeTable from a struct to a string containing the binary packed contents of the RangeTables.

Then, instead of vars, I made them all consts. And instead of the five map[string]*RangeTable global variables, I made them funcs that return a map lazily initialized via sync.Once (which the linker knows how to eliminate if unused).

I of course also updated all the funcs taking a *RangeTable pointer to instead just take a RangeTable string.

After minor corresponding updates to the regexp and encoding/xml packages (which both used *RangeTable), the binary size of a fmt-using Hello World program dropped by about 100 KB (98,904 bytes) on Linux.

bradfitz@tsdev:~$ ls -l hello.before hello.after
-rwxr-xr-x 1 bradfitz bradfitz 1843533 Aug  4 22:20 hello.after
-rwxr-xr-x 1 bradfitz bradfitz 1942437 Aug  4 22:20 hello.before

bradfitz@tsdev:~$ size hello.before hello.after
   text    data     bss     dec     hex filename
1261840   88756  207456 1558052  17c624 hello.before
1234832   25752  207424 1468008  166668 hello.after

bradfitz@tsdev:~$ go tool nm hello.before | grep -F unicode. | wc -l
242
bradfitz@tsdev:~$ go tool nm hello.before | grep -F unicode. | head -20
  536f40 D unicode..inittask
  544318 D unicode.ASCII_Hex_Digit
  544320 D unicode.Adlam
  544328 D unicode.Ahom
  544330 D unicode.Anatolian_Hieroglyphs
  544338 D unicode.Arabic
  544340 D unicode.Armenian
  544348 D unicode.Avestan
  544350 D unicode.Balinese
  544358 D unicode.Bamum
  544360 D unicode.Bassa_Vah
  544368 D unicode.Batak
  544370 D unicode.Bengali
  544378 D unicode.Bhaiksuki
  544380 D unicode.Bidi_Control
  544388 D unicode.Bopomofo
  544390 D unicode.Brahmi
  544398 D unicode.Braille
  5443a0 D unicode.Buginese
  5443a8 D unicode.Buhid

bradfitz@tsdev:~$ go tool nm hello.after | grep -F unicode. | wc -l
1
bradfitz@tsdev:~$ go tool nm hello.after | grep -F unicode. 
  530280 D unicode..inittask

Which is all to say: there are wins to be had here if anybody is curious.

gopherbot added the compiler/runtime label Jul 13, 2022
@gopherbot

Change https://go.dev/cl/461315 mentions this issue: cmd/compile,cmd/link: enable deadcode of unreferenced large global maps

@gopherbot

Change https://go.dev/cl/463395 mentions this issue: cmd/link: linker portion of dead map removal (patch 2 of 2)

@gopherbot

Change https://go.dev/cl/463855 mentions this issue: cmd/internal/obj: flag init functions in object file

gopherbot pushed a commit that referenced this issue Feb 6, 2023
Introduce a flag in the object file indicating whether a given
function corresponds to a compiler-generated (not user-written) init
function, such as "os.init" or "syscall.init". Add code to the
compiler to fill in the correct value for the flag, and add support to
the loader package in the linker for testing the flag. The new loader
API is currently unused, but will be needed in the next CL in this
stack.

Updates #2559.
Updates #36021.
Updates #14840.

Change-Id: Iea7ad2adda487e4af7a44f062f9817977c53b394
Reviewed-on: https://go-review.googlesource.com/c/go/+/463855
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Than McIntosh <thanm@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Feb 6, 2023
This patch changes the compiler's pkg init machinery to pick out large
initialization assignments to global maps, e.g.

   var mymap = map[string]int{"foo":1, "bar":2, ... }

and extract the map init code into a separate outlined function, which is
then called from the main init function with a weak relocation:

   var mymap map[string]int   // KEEP reloc -> map.init.0

   func init() {
      map.init.0() // weak relocation
   }

   func map.init.0() {
     mymap = map[string]int{"foo":1, "bar":2}
   }

The map init outlining is done selectively (only in the case where the
RHS code exceeds a size limit of 20 IR nodes).

In order to ensure that a given map.init.NNN function is included when
its corresponding map is live, we add a dummy R_KEEP relocation from the
map variable to the map init function.

This first patch includes the main compiler changes, with the weak
relocation addition disabled. A subsequent patch includes the required
linker changes along with switching the call to the outlined routine
to a weak relocation. See the later linker change for associated
compile time performance numbers.

Updates #2559.
Updates #36021.
Updates #14840.

Change-Id: I1fd6fd6397772be1ebd3eb397caf68ae9a3147e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/461315
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Feb 6, 2023
This patch contains the linker changes needed to enable deadcoding of
large unreferenced map variables, in combination with a previous
compiler change. We add a new cleanup function that runs just after
deadcode that looks for relocations in "init" funcs that are weak, of
type R_CALL (and siblings), and are targeting an unreachable function.
If we find such a relocation, after checking to make sure it targets a
map.init.XXX helper, we redirect the relocation to point to a no-op
routine ("mapinitnoop") in the runtime.
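
The interaction between weak call relocations, keep-alive relocations and the
no-op redirect can be modeled with a small toy graph. This is only a sketch of
the idea, not cmd/link code: the types `reloc` and `symbols`, the helper names,
the symbol names in `demoSymbols`, and the substring test for map.init helpers
are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// reloc is a toy stand-in for a relocation from one symbol to another.
type reloc struct {
	target string
	weak   bool // weak relocations do not keep their target alive
}

// symbols maps each symbol name to its outgoing relocations.
type symbols map[string][]reloc

// reachable marks every symbol reachable from roots via non-weak relocations.
func reachable(syms symbols, roots ...string) map[string]bool {
	live := map[string]bool{}
	work := append([]string(nil), roots...)
	for len(work) > 0 {
		s := work[len(work)-1]
		work = work[:len(work)-1]
		if live[s] {
			continue
		}
		live[s] = true
		for _, r := range syms[s] {
			if !r.weak {
				work = append(work, r.target)
			}
		}
	}
	return live
}

// redirectDeadMapInits rewrites weak relocations in initFn that target an
// unreachable map.init helper so that they point at noop instead.
func redirectDeadMapInits(syms symbols, live map[string]bool, initFn, noop string) {
	rs := syms[initFn]
	for i := range rs {
		r := &rs[i]
		if r.weak && !live[r.target] && strings.Contains(r.target, ".map.init.") {
			r.target = noop
			live[noop] = true
		}
	}
}

// demoSymbols models one map that main uses and one that nothing references.
func demoSymbols() symbols {
	return symbols{
		"main.main": {{target: "main.usedMap"}},
		"main.init": {
			{target: "main.map.init.0", weak: true},
			{target: "main.map.init.1", weak: true},
		},
		"main.usedMap":   {{target: "main.map.init.0"}}, // keep-alive edge
		"main.unusedMap": {{target: "main.map.init.1"}}, // never reached
	}
}

func main() {
	syms := demoSymbols()
	live := reachable(syms, "main.main", "main.init")
	redirectDeadMapInits(syms, live, "main.init", "runtime.mapinitnoop")
	fmt.Println(live["main.map.init.0"], live["main.map.init.1"])
	for _, r := range syms["main.init"] {
		fmt.Println("main.init calls", r.target)
	}
}
```

In this model the helper for the used map stays live through the keep-alive
edge from the map variable, while the helper for the unused map is never
marked and its call site in main.init ends up pointing at the no-op routine.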

Compilebench results for this change:

			  │ out.base.txt │            out.wrap.txt            │
			  │    sec/op    │   sec/op     vs base               │
 Template                   218.6m ±  2%   221.1m ± 1%       ~ (p=0.129 n=39)
 Unicode                    180.5m ±  1%   178.9m ± 1%  -0.93% (p=0.006 n=39)
 GoTypes                     1.162 ±  1%    1.156 ± 1%       ~ (p=0.850 n=39)
 Compiler                   143.6m ±  1%   142.6m ± 1%       ~ (p=0.743 n=39)
 SSA                         8.698 ±  1%    8.719 ± 1%       ~ (p=0.145 n=39)
 Flate                      142.6m ±  1%   143.9m ± 3%       ~ (p=0.287 n=39)
 GoParser                   247.7m ±  1%   248.8m ± 1%       ~ (p=0.265 n=39)
 Reflect                    588.0m ±  1%   590.4m ± 1%       ~ (p=0.269 n=39)
 Tar                        198.5m ±  1%   201.3m ± 1%  +1.38% (p=0.005 n=39)
 XML                        259.1m ±  1%   260.0m ± 1%       ~ (p=0.376 n=39)
 LinkCompiler               746.8m ±  2%   747.8m ± 1%       ~ (p=0.706 n=39)
 ExternalLinkCompiler        1.906 ±  0%    1.902 ± 1%       ~ (p=0.207 n=40)
 LinkWithoutDebugCompiler   522.4m ± 21%   471.1m ± 1%  -9.81% (p=0.000 n=40)
 StdCmd                      21.32 ±  0%    21.39 ± 0%  +0.32% (p=0.035 n=40)
 geomean                    609.2m         606.0m       -0.53%

			  │ out.base.txt │            out.wrap.txt            │
			  │ user-sec/op  │ user-sec/op  vs base               │
 Template                    401.9m ± 3%   406.9m ± 2%       ~ (p=0.291 n=39)
 Unicode                     191.9m ± 6%   196.1m ± 3%       ~ (p=0.052 n=39)
 GoTypes                      3.967 ± 3%    4.056 ± 1%       ~ (p=0.099 n=39)
 Compiler                    171.1m ± 3%   173.4m ± 3%       ~ (p=0.415 n=39)
 SSA                          30.04 ± 4%    30.25 ± 4%       ~ (p=0.106 n=39)
 Flate                       246.3m ± 3%   248.9m ± 4%       ~ (p=0.499 n=39)
 GoParser                    518.7m ± 1%   520.6m ± 2%       ~ (p=0.531 n=39)
 Reflect                      1.670 ± 1%    1.656 ± 2%       ~ (p=0.137 n=39)
 Tar                         352.7m ± 2%   360.3m ± 2%       ~ (p=0.117 n=39)
 XML                         528.8m ± 2%   521.1m ± 2%       ~ (p=0.296 n=39)
 LinkCompiler                 1.128 ± 2%    1.140 ± 2%       ~ (p=0.324 n=39)
 ExternalLinkCompiler         2.165 ± 2%    2.147 ± 2%       ~ (p=0.537 n=40)
 LinkWithoutDebugCompiler    484.2m ± 4%   490.7m ± 3%       ~ (p=0.897 n=40)
 geomean                     818.5m        825.1m       +0.80%

	   │ out.base.txt │             out.wrap.txt              │
	   │  text-bytes  │  text-bytes   vs base                 │
 HelloSize   766.0Ki ± 0%   766.0Ki ± 0%       ~ (p=1.000 n=40) ¹
 CmdGoSize   10.02Mi ± 0%   10.02Mi ± 0%  -0.03% (n=40)
 geomean     2.738Mi        2.738Mi       -0.01%
 ¹ all samples are equal

	   │ out.base.txt │             out.wrap.txt              │
	   │  data-bytes  │  data-bytes   vs base                 │
 HelloSize   14.17Ki ± 0%   14.17Ki ± 0%       ~ (p=1.000 n=40) ¹
 CmdGoSize   308.3Ki ± 0%   298.5Ki ± 0%  -3.19% (n=40)
 geomean     66.10Ki        65.04Ki       -1.61%
 ¹ all samples are equal

	   │ out.base.txt │             out.wrap.txt              │
	   │  bss-bytes   │  bss-bytes    vs base                 │
 HelloSize   197.3Ki ± 0%   197.3Ki ± 0%       ~ (p=1.000 n=40) ¹
 CmdGoSize   228.2Ki ± 0%   228.1Ki ± 0%  -0.01% (n=40)
 geomean     212.2Ki        212.1Ki       -0.01%
 ¹ all samples are equal

	   │ out.base.txt │            out.wrap.txt             │
	   │  exe-bytes   │  exe-bytes    vs base               │
 HelloSize   1.192Mi ± 0%   1.192Mi ± 0%  +0.00% (p=0.000 n=40)
 CmdGoSize   14.85Mi ± 0%   14.83Mi ± 0%  -0.09% (n=40)
 geomean     4.207Mi        4.205Mi       -0.05%

Also tested for any linker changes by benchmarking relink of k8s
"kubelet"; no changes to speak of there.

Updates #2559.
Updates #36021.
Updates #14840.

Change-Id: I4cc5370b3f20679a1065aaaf87bdf2881e257631
Reviewed-on: https://go-review.googlesource.com/c/go/+/463395
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@thanm
Contributor

thanm commented Feb 7, 2023

For posterity, linker dead code elimination of unused maps is now implemented in this stack.

Getting back to the original example:

package main
type dead int
func newDead() *dead { return new(dead) }
var x = newDead()
func main() {}

In principle the same strategy used for maps could be used here for variables like "x", e.g. outline the code that performs the init (e.g. call to newDead), then have a weak relocation from main.init to the outlined init func, and a strong relocation from "x" to the outlined init func.
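
Written out by hand, the outlined shape for "x" might look like the sketch below. The name `xInit` is hypothetical, and ordinary Go source cannot express the relocation kinds involved, so those appear only as comments.

```go
package main

import "fmt"

type dead int

func newDead() *dead { return new(dead) }

// In the proposed scheme, x would carry a strong (keep-alive) relocation
// to xInit, so the initializer survives whenever x itself is live.
var x *dead

// xInit plays the role of the outlined initializer (hypothetical name).
func xInit() { x = newDead() }

func init() {
	// In the proposed scheme this call would use a weak relocation, so it
	// alone would not keep xInit (or newDead) in the binary.
	xInit()
}

func main() {
	fmt.Println(x != nil) // prints "true"
}
```

As the surrounding discussion notes, this is only sound if the outlined initializer is known to be side-effect free, which for a call like newDead() requires looking into the callee.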

The existing code that checks for side effects is here; in theory, if we were to extend this code to work interprocedurally, we might be able to catch this case.

johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 12, 2023
johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 12, 2023
johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 12, 2023
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
This patch changes the compiler's pkg init machinery to pick out large
initialization assignments to global maps (e.g.

   var mymap = map[string]int{"foo":1, "bar":2, ... }

and extract the map init code into a separate outlined function, which is
then called from the main init function with a weak relocation:

   var mymap map[string]int   // KEEP reloc -> map.init.0

   func init() {
      map.init.0() // weak relocation
   }

   func map.init.0() {
     mymap = map[string]int{"foo":1, "bar":2}
   }

The map init outlining is done selectively (only in the case where the
RHS code exceeds a size limit of 20 IR nodes).

In order to ensure that a given map.init.NNN function is included when
its corresponding map is live, we add a dummy R_KEEP relocation from the
map variable to the map init function.
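
The outlined shape can be imitated by hand in ordinary Go, which may make the transformation easier to picture. This is only a sketch: `colors` and `mapInit0` are hypothetical names, the call here is a normal call rather than a weak relocation, and nothing in user code can express the R_KEEP relocation, so it appears only as a comment.

```go
package main

import "fmt"

// colors stands in for a global map that the source would normally write as
//
//	var colors = map[string]int{"red": 1, "green": 2, "blue": 3}
//
// After outlining, the variable is declared without an initializer.
// (In the compiler's version, the variable would also carry a keep-style
// relocation to the helper below; that cannot be written in Go source.)
var colors map[string]int

// mapInit0 is a hand-written stand-in for the compiler-generated
// map.init.0 helper: it contains only the map literal assignment.
func mapInit0() {
	colors = map[string]int{"red": 1, "green": 2, "blue": 3}
}

// init plays the role of the package init function, which calls the
// outlined helper instead of building the map inline.
func init() {
	mapInit0()
}

func main() {
	fmt.Println(len(colors)) // prints 3
}
```

Keeping the literal in its own function is what gives the linker a single symbol to drop when the map itself is unreachable.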

This first patch includes the main compiler changes, with the weak
relocation addition disabled. A subsequent patch includes the required
linker changes and switches the call to the outlined routine to a weak
relocation. See the later linker change for associated compile time
performance numbers.

Updates golang#2559.
Updates golang#36021.
Updates golang#14840.

Change-Id: I1fd6fd6397772be1ebd3eb397caf68ae9a3147e9
Reviewed-on: https://go-review.googlesource.com/c/go/+/461315
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
This patch contains the linker changes needed to enable deadcoding of
large unreferenced map variables, in combination with a previous
compiler change. We add a new cleanup function that runs just after
deadcode that looks for relocations in "init" funcs that are weak, of
type R_CALL (and siblings), and are targeting an unreachable function.
If we find such a relocation, after checking to make sure it targets a
map.init.XXX helper, we redirect the relocation to point to a no-op
routine ("mapinitnoop") in the runtime.

Compilebench results for this change:

			  │ out.base.txt │            out.wrap.txt            │
			  │    sec/op    │   sec/op     vs base               │
 Template                   218.6m ±  2%   221.1m ± 1%       ~ (p=0.129 n=39)
 Unicode                    180.5m ±  1%   178.9m ± 1%  -0.93% (p=0.006 n=39)
 GoTypes                     1.162 ±  1%    1.156 ± 1%       ~ (p=0.850 n=39)
 Compiler                   143.6m ±  1%   142.6m ± 1%       ~ (p=0.743 n=39)
 SSA                         8.698 ±  1%    8.719 ± 1%       ~ (p=0.145 n=39)
 Flate                      142.6m ±  1%   143.9m ± 3%       ~ (p=0.287 n=39)
 GoParser                   247.7m ±  1%   248.8m ± 1%       ~ (p=0.265 n=39)
 Reflect                    588.0m ±  1%   590.4m ± 1%       ~ (p=0.269 n=39)
 Tar                        198.5m ±  1%   201.3m ± 1%  +1.38% (p=0.005 n=39)
 XML                        259.1m ±  1%   260.0m ± 1%       ~ (p=0.376 n=39)
 LinkCompiler               746.8m ±  2%   747.8m ± 1%       ~ (p=0.706 n=39)
 ExternalLinkCompiler        1.906 ±  0%    1.902 ± 1%       ~ (p=0.207 n=40)
 LinkWithoutDebugCompiler   522.4m ± 21%   471.1m ± 1%  -9.81% (p=0.000 n=40)
 StdCmd                      21.32 ±  0%    21.39 ± 0%  +0.32% (p=0.035 n=40)
 geomean                    609.2m         606.0m       -0.53%

			  │ out.base.txt │            out.wrap.txt            │
			  │ user-sec/op  │ user-sec/op  vs base               │
 Template                    401.9m ± 3%   406.9m ± 2%       ~ (p=0.291 n=39)
 Unicode                     191.9m ± 6%   196.1m ± 3%       ~ (p=0.052 n=39)
 GoTypes                      3.967 ± 3%    4.056 ± 1%       ~ (p=0.099 n=39)
 Compiler                    171.1m ± 3%   173.4m ± 3%       ~ (p=0.415 n=39)
 SSA                          30.04 ± 4%    30.25 ± 4%       ~ (p=0.106 n=39)
 Flate                       246.3m ± 3%   248.9m ± 4%       ~ (p=0.499 n=39)
 GoParser                    518.7m ± 1%   520.6m ± 2%       ~ (p=0.531 n=39)
 Reflect                      1.670 ± 1%    1.656 ± 2%       ~ (p=0.137 n=39)
 Tar                         352.7m ± 2%   360.3m ± 2%       ~ (p=0.117 n=39)
 XML                         528.8m ± 2%   521.1m ± 2%       ~ (p=0.296 n=39)
 LinkCompiler                 1.128 ± 2%    1.140 ± 2%       ~ (p=0.324 n=39)
 ExternalLinkCompiler         2.165 ± 2%    2.147 ± 2%       ~ (p=0.537 n=40)
 LinkWithoutDebugCompiler    484.2m ± 4%   490.7m ± 3%       ~ (p=0.897 n=40)
 geomean                     818.5m        825.1m       +0.80%

	   │ out.base.txt │             out.wrap.txt              │
	   │  text-bytes  │  text-bytes   vs base                 │
 HelloSize   766.0Ki ± 0%   766.0Ki ± 0%       ~ (p=1.000 n=40) ¹
 CmdGoSize   10.02Mi ± 0%   10.02Mi ± 0%  -0.03% (n=40)
 geomean     2.738Mi        2.738Mi       -0.01%
 ¹ all samples are equal

	   │ out.base.txt │             out.wrap.txt              │
	   │  data-bytes  │  data-bytes   vs base                 │
 HelloSize   14.17Ki ± 0%   14.17Ki ± 0%       ~ (p=1.000 n=40) ¹
 CmdGoSize   308.3Ki ± 0%   298.5Ki ± 0%  -3.19% (n=40)
 geomean     66.10Ki        65.04Ki       -1.61%
 ¹ all samples are equal

	   │ out.base.txt │             out.wrap.txt              │
	   │  bss-bytes   │  bss-bytes    vs base                 │
 HelloSize   197.3Ki ± 0%   197.3Ki ± 0%       ~ (p=1.000 n=40) ¹
 CmdGoSize   228.2Ki ± 0%   228.1Ki ± 0%  -0.01% (n=40)
 geomean     212.2Ki        212.1Ki       -0.01%
 ¹ all samples are equal

	   │ out.base.txt │            out.wrap.txt             │
	   │  exe-bytes   │  exe-bytes    vs base               │
 HelloSize   1.192Mi ± 0%   1.192Mi ± 0%  +0.00% (p=0.000 n=40)
 CmdGoSize   14.85Mi ± 0%   14.83Mi ± 0%  -0.09% (n=40)
 geomean     4.207Mi        4.205Mi       -0.05%

Also tested for any linker changes by benchmarking relink of k8s
"kubelet"; no changes to speak of there.

Updates golang#2559.
Updates golang#36021.
Updates golang#14840.

Change-Id: I4cc5370b3f20679a1065aaaf87bdf2881e257631
Reviewed-on: https://go-review.googlesource.com/c/go/+/463395
Run-TryBot: Than McIntosh <thanm@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>