Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: large string constants being copied unnecessarily #15729

Closed
SlyMarbo opened this issue May 18, 2016 · 19 comments
Closed

cmd/compile: large string constants being copied unnecessarily #15729

SlyMarbo opened this issue May 18, 2016 · 19 comments
Milestone

Comments

@SlyMarbo
Copy link

SlyMarbo commented May 18, 2016

  1. What version of Go are you using (go version)?
    • go/1.6
  2. What operating system and processor architecture are you using (go env)?
    • linux/amd64, darwin/amd64
  3. What did you do?
  4. What did you expect to see?
    • Compile size not to change noticeably after uncommenting the commented code.
  5. What did you see instead?
    • The binary size increased by len(data) bytes.

It appears that using a large string constant multiple times within a function results in that constant being copied into the compile binary multiple times. While this may be fine with small strings, it seems unnecessary with large strings and inflates the binary size and compile time significantly.

In the code in which this issue was discovered, the string constant was nearly 30 MB and resulted in the compile time increasing from ~6s to ~13s and the binary from ~27 MB to ~66 MB.

@bradfitz
Copy link
Contributor

/cc @randall77 @crawshaw

@bradfitz
Copy link
Contributor

Do you see it twice with go tool nm $YOURBINARY?

@bradfitz bradfitz added this to the Go1.8 milestone May 18, 2016
@SlyMarbo
Copy link
Author

No, I just used go build twice and observed the difference with ls -lh; I wasn't aware of go tool nm, but running that now I do indeed see it twice:

$ go build
$ ls -lh string-inline-bug
-rwxr-xr-x 1 user users 2.2M May 18 16:10 string-inline-bug
$ go tool nm string-inline-bug | grep main
  451560 T main
  4bc640 r main..gostring.1
  401060 T main.get
  4baa00 r main.hdr..gostring.1
  401170 T main.init
  5d78c0 D main.initdone.
  401000 T main.main
  426620 T runtime.main
  4ba580 R runtime.main.f
  445f10 T runtime.main.func1
  4ba568 R runtime.main.func1.f
  445f50 T runtime.main.func2
  4ba570 R runtime.main.func2.f
  4ba578 R runtime.mainPC
  5bd318 D runtime.main_init_done
$ # uncomment the line
$ go build
$ ls -lh string-inline-bug
-rwxr-xr-x 1 user users 3.0M May 18 16:11 string-inline-bug
$ go tool nm string-inline-bug | grep main
  4515d0 T main
  4bc660 r main..gostring.1
  584340 r main..gostring.2
  401060 T main.get
  4baa00 r main.hdr..gostring.1
  4baa10 r main.hdr..gostring.2
  4011e0 T main.init
  69f8c0 D main.initdone.
  401000 T main.main
  426690 T runtime.main
  4ba580 R runtime.main.f
  445f80 T runtime.main.func1
  4ba568 R runtime.main.func1.f
  445fc0 T runtime.main.func2
  4ba570 R runtime.main.func2.f
  4ba578 R runtime.mainPC
  685318 D runtime.main_init_done

@josharian
Copy link
Contributor

This should already be fixed in go 1.7.

@josharian
Copy link
Contributor

@SlyMarbo
Copy link
Author

This is still occurring as of tip (4d2ac54).

@josharian josharian reopened this May 18, 2016
@josharian
Copy link
Contributor

Huh. Thanks, reopened.

@mwhudson
Copy link
Contributor

mwhudson commented May 18, 2016

I can't reproduce it at tip. (I can with 1.6 though).

$ go version
go version devel +ebbe4f8 Wed May 18 21:20:33 2016 +0000 linux/amd64

@SlyMarbo
Copy link
Author

SlyMarbo commented May 18, 2016

Definitely still occurring for me (on ebbe4f8). If it makes any difference, I have 818374 Bytes of data; the bug was harder to detect with smaller strings, but it is still identified with go tool nm.

@mwhudson
Copy link
Contributor

Well I don't know what the difference is then:

mwhudson@aeglos:/opt/opensource/go-test-cases$ ls -l largestring1.go largestring2.go 
-rw-rw-r-- 1 mwhudson mwhudson 3817610 May 19 12:22 largestring1.go
-rw-rw-r-- 1 mwhudson mwhudson 3817602 May 19 12:22 largestring2.go
mwhudson@aeglos:/opt/opensource/go-test-cases$ diff -u largestring1.go largestring2.go 
--- largestring1.go 2016-05-19 12:22:24.472972006 +1200
+++ largestring2.go 2016-05-19 12:22:46.479056042 +1200
@@ -16,10 +16,10 @@
    index := rand.Intn((len(data) - 10) / 3)

    // Uncommenting this increases binary size by len(data) bytes.
-   //word := data[index*3 : index*3+10]
-   //if word[0] == 'a' {
-   //  return word
-   //}
+   word := data[index*3 : index*3+10]
+   if word[0] == 'a' {
+       return word
+   }

    for i := index; i > 0; i-- {
        word := data[i*3 : i*3+10]
mwhudson@aeglos:/opt/opensource/go-test-cases$ go build largestring1.go
mwhudson@aeglos:/opt/opensource/go-test-cases$ go build largestring2.go
mwhudson@aeglos:/opt/opensource/go-test-cases$ ls -l largestring1 largestring2
-rwxrwxr-x 1 mwhudson mwhudson 4947272 May 19 12:23 largestring1
-rwxrwxr-x 1 mwhudson mwhudson 4947272 May 19 12:23 largestring2

@SlyMarbo
Copy link
Author

SlyMarbo commented May 19, 2016

I don't know what to suggest; running this on a different machine I get the same result:

% go version
go version devel +1f7a0d4 Thu May 19 04:37:45 2016 +0000 darwin/amd64
% go build good.go 
% go build bad.go
% diff -u good.go bad.go
--- good.go 2016-05-19 09:28:59.000000000 +0100
+++ bad.go  2016-05-19 09:28:52.000000000 +0100
@@ -16,10 +16,10 @@
    index := rand.Intn((len(data) - 10) / 3)

    // Uncommenting this increases binary size by len(data) bytes.
-   //word := data[index*3 : index*3+10]
-   //if word[0] == 'a' {
-   //  return word
-   //}
+   word := data[index*3 : index*3+10]
+   if word[0] == 'a' {
+       return word
+   }

    for i := index; i > 0; i-- {
        word := data[i*3 : i*3+10]
% ls -l good bad
-rwxr-xr-x  1 user  staff  4018720 May 19 09:40 bad
-rwxr-xr-x  1 user  staff  2724304 May 19 09:40 good
% go tool nm bad | grep main
   51750 T _main
   b9f00 R main..gostring.1
  1f5f00 R main..gostring.2
    20a0 T main.get
   b8360 R main.hdr..gostring.1
   b8370 R main.hdr..gostring.2
    2220 T main.init
  384300 B main.initdone.
    2040 T main.main
   27750 T runtime.main
   b7ee8 R runtime.main.f
   46790 T runtime.main.func1
   b7ed0 R runtime.main.func1.f
   467d0 T runtime.main.func2
   b7ed8 R runtime.main.func2.f
   b7ee0 R runtime.mainPC
  369d70 B runtime.main_init_done
% go tool nm good | grep main
   516e0 T _main
   b9ea0 R main..gostring.1
    20a0 T main.get
   b8300 R main.hdr..gostring.1
    21b0 T main.init
  248300 B main.initdone.
    2040 T main.main
   276e0 T runtime.main
   b7e88 R runtime.main.f
   46720 T runtime.main.func1
   b7e70 R runtime.main.func1.f
   46760 T runtime.main.func2
   b7e78 R runtime.main.func2.f
   b7e80 R runtime.mainPC
  22dd70 B runtime.main_init_done

@mwhudson
Copy link
Contributor

This is very strange, I don't understand what is going on at all. I don't even understand where the symbol names in your output come from. Maybe @josharian can try on his system?

@josharian
Copy link
Contributor

@SlyMarbo I just tried this again with tip (82ec4cd). I used your exactly playground code, but I substituted a 65536 byte constant (using "a"*65536 in python). I still can't reproduce.

I ran:

``bash
$ go build foo.go && ls -l foo
-rwxr-xr-x 1 josh staff 1162496 May 20 17:41 foo


Then I uncommented the three lines and ran:

```bash
$ go build foo.go && ls -l foo
-rwxr-xr-x  1 josh  staff  1162496 May 20 17:42 foo

Also, the symbols called main.hdr..gostring.1 are no longer produced at tip. Please see the linked CL; they are instead given content-addressable names and end up in the generic go.string.* symbol, where their individual names are not visible.

The fact that you are seeing main.hdr..gostring.1 in your nm output says to me that you're accidentally using an old (e.g. 1.6) compiler. Would you mind double-checking? Thanks.

@SlyMarbo
Copy link
Author

@josharian I'm not sure what's going on. The fact that neither you nor @mwhudson can reproduce definitely suggests I'm doing something wrong, but I'm sure I'm using tip on both machines. Is it possible my tooling has become desynchronised, with go version reporting tip, but using go1.6's go tool compile or similar?

@SlyMarbo
Copy link
Author

@josharian Yes, that seems to be what happened; I changed GOROOT to point to tip and it now acts as expected. Sorry for the confusion.

@davecheney
Copy link
Contributor

You don't need to change GOROOT to switch go versions. In fact, the easiest
way is to never set GOROOT, just compile different versions of go in
different directories, then call their respective go tools explicitly, ie
$HOME/go1.5/bin/go, etc

On Sat, 21 May 2016, 20:19 Jamie Hall, notifications@github.com wrote:

@josharian https://github.com/josharian Yes, that seems to be what
happened; I changed GOROOT to point to tip and it now acts as expected.
Sorry for the confusion.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#15729 (comment)

@SlyMarbo
Copy link
Author

Yeah, my mistake was to leave my existing Go installation (1.6) as it was, clone tip into a different directory and run all.bash, which put the newer go tool in my GOBIN, but hadn't affected anything else, so it was still using the old subtools. I blew away my old tooling and moved tip there, re-ran all.bash, and it worked as expected.

I agree that I shouldn't have set GOROOT; it was a hangover from when I was trying to get GoSublime to work nicely.

@davecheney
Copy link
Contributor

As a suggestion you can compile many versions of go into separate
directories then symlink their respective go binaries into somewhere
convenient in your $PATH. I use this technique to have various versions at
my fingertips when investigating regressions, ie

go test // uh oh
go1.4 test // hmm, passes here
go1.5 test // and here
go1.6 test // and here, time to raise a bug

On Sat, 21 May 2016, 21:02 Jamie Hall, notifications@github.com wrote:

Yeah, my mistake was to leave my existing Go installation (1.6) as it was,
clone tip into a different directory and run all.bash, which put the
newer go tool in my GOBIN, but hadn't affected anything else, so it was
still using the old subtools. I blew away my old tooling and moved tip
there, re-ran all.bash, and it worked as expected.

I agree that I shouldn't have set GOROOT; it was a hangover from when I
was trying to get GoSublime to work nicely.


You are receiving this because you commented.

Reply to this email directly or view it on GitHub
#15729 (comment)

@SlyMarbo
Copy link
Author

Great suggestion, thanks @davecheney :)

@golang golang locked and limited conversation to collaborators May 21, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants